HTTP header based phishing attack detection using machine learning

TRANSACTIONS ON EMERGING TELECOMMUNICATIONS TECHNOLOGIES (2024)

Abstract
Many anti-phishing techniques have been used in the past, including blacklisting/whitelisting, third-party services, search engines, visual similarity, heuristics, URL features, and website content. Search engine-based tools, third-party assisted tools, and blacklists/whitelists fail to identify new phishing attacks, resulting in a high FPR. Heuristic and visual similarity approaches are slow, whereas URL and web content-based techniques do not capture the dynamic content of current websites and hence cannot stop zero-day attacks. We conducted a study of the critical features used in earlier anti-phishing work and identified 16 novel HTTP header features. In this paper, we develop a real-time, highly scalable, feature-rich, ML-based anti-phishing detection technique that extracts the HTTP headers (predominantly security headers) of web pages to classify them as legitimate or phishing. We observe that phishing sites are short-lived and are created to achieve a specific objective, such as stealing a user's credentials. Once the goal is met, the sites are taken down immediately. Hence these sites do not take the trouble to use the security features of web technology and focus only on making the site as similar as possible to the original website. Test results based on our novel features show a high accuracy of 97.8% with an average response time of 1.57 s. We also created multiple datasets for different scenarios: a dataset of websites created with phishing tools and a new dataset for testing unseen phishing attacks. The detection accuracy obtained on these datasets is 99% and 95%, respectively.

Graphical abstract: System architecture of the phishing detection approach, in which the HTTP header extraction module extracts the novel HTTP headers and ML classifies the pages as legitimate or phishing with high detection accuracy.
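The header-based idea described in the abstract can be illustrated with a minimal sketch that turns a response's HTTP headers into presence flags for a classifier. This is not the authors' implementation: the abstract does not list the 16 features, so the `SECURITY_HEADERS` tuple below is a hypothetical selection of widely used security headers, and `header_features` is an illustrative helper name. A real pipeline would feed such vectors, gathered from labeled legitimate and phishing pages, to an ML classifier.

```python
from typing import List, Mapping

# Hypothetical feature list (assumption): common security response headers.
# The 16 features used in the paper are not given in the abstract.
SECURITY_HEADERS = (
    "strict-transport-security",
    "content-security-policy",
    "x-frame-options",
    "x-content-type-options",
    "referrer-policy",
    "permissions-policy",
)


def header_features(headers: Mapping[str, str]) -> List[int]:
    """Return one 0/1 flag per entry in SECURITY_HEADERS (1 = header present)."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    present = {name.strip().lower() for name in headers}
    return [1 if name in present else 0 for name in SECURITY_HEADERS]


# Illustrative header sets (made up for this example, not taken from the paper).
legit_like = {
    "Strict-Transport-Security": "max-age=31536000",
    "Content-Security-Policy": "default-src 'self'",
    "X-Frame-Options": "DENY",
    "X-Content-Type-Options": "nosniff",
}
bare = {"Content-Type": "text/html", "Server": "nginx"}

print(header_features(legit_like))  # [1, 1, 1, 1, 0, 0]
print(header_features(bare))        # [0, 0, 0, 0, 0, 0]
```

Under the abstract's observation that short-lived phishing sites rarely configure security features, a page resembling `bare` would look more suspicious than one resembling `legit_like`, although the actual decision is left to the trained model.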
Keywords
Cyber Security, Machine Learning, Phishing Attacks, Security and Privacy