Applying Machine Learning Techniques to Understand User Behaviors When Phishing Attacks Occur

Yi Li; Kaiqi Xiong; Xiangyang Li

doi:10.4108/eai.13-7-2018.162809

Authors

Yi Li University of South Florida
Kaiqi Xiong University of South Florida
Xiangyang Li Johns Hopkins University

DOI:

https://doi.org/10.4108/eai.13-7-2018.162809

Keywords:

User behavior, phishing emails, machine learning, security attacks

Abstract

Email is widely used in daily life, so it is important to understand how users behave when they assess the security of the emails they receive. However, studying email user behavior is challenging, and existing studies are limited. To study security-related user behaviors, we design and investigate an email test platform to understand how users behave differently when they read emails, some of which are phishing. Specifically, we conduct two experimental studies: an on-site study, in which participants take part in a contained lab environment, and an online study, in which participants take part through Amazon Mechanical Turk. For both studies, we design questionnaires and use a set of emails, including real-world phishing emails modified as necessary to protect personal information. Furthermore, we develop the software tools needed to collect experimental data, including participants’ basic background information, time measurements, mouse movements, and answers to survey questions. Based on the collected data, we investigate which factors, such as intervention, phishing type, and an incentive mechanism, play a key role in user behaviors when phishing attacks occur. The difficulty of this investigation stems from the qualitative nature of user-behavior analysis and the limited amount of data in the on-site study. For these reasons, we develop an approach that quantifies user behavior metrics and reduces the number of user attributes by evaluating the significance of each attribute and analyzing the correlation among attributes. Moreover, we propose a machine learning framework, which includes attribute reduction, to find a critical point that classifies the performance of a participant as either ‘good’ or ‘bad’ through 10-fold cross-validation with models built on randomly selected attributes. The proposed machine learning model can be used to predict the performance of a user based on the user profile. Our data analysis shows that intervention and an incentive mechanism play a significant role, while phishing type I is more harmful to users than the other two types. The findings of this research can be used to help a user identify a phishing attack and prevent the user from becoming a victim of such an attack.
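The abstract does not spell out how attribute reduction and 10-fold cross-validation fit together, so the following is only a minimal sketch of one way such a pipeline could look, not the authors' implementation. The attribute names (`time_on_email`, `mouse_distance`, `experience`), the correlation threshold of 0.9, the nearest-centroid classifier, and the synthetic data are all assumptions made for illustration; none of them come from the paper.

```python
import random
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def reduce_attributes(columns, threshold=0.9):
    """Greedy reduction: keep an attribute only if its absolute correlation
    with every attribute kept so far is below `threshold` (assumed value)."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_accuracy(rows, labels, folds):
    """Mean held-out accuracy of a nearest-centroid classifier
    (a stand-in model, chosen only to keep the sketch dependency-free)."""
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(rows)) if i not in held_out]
        centroids = {}
        for label in {labels[i] for i in train_idx}:
            members = [rows[i] for i in train_idx if labels[i] == label]
            centroids[label] = [mean(col) for col in zip(*members)]
        correct = 0
        for i in test_idx:
            pred = min(
                centroids,
                key=lambda c: sum((a - b) ** 2 for a, b in zip(rows[i], centroids[c])),
            )
            correct += int(pred == labels[i])
        scores.append(correct / len(test_idx))
    return mean(scores)


# Synthetic, hypothetical participant data (not from the study).
rng = random.Random(1)
n = 40
time_on_email = [rng.uniform(5, 60) for _ in range(n)]
# Deliberately almost a rescaled copy of time_on_email, so it is redundant.
mouse_distance = [2.0 * t + rng.uniform(-1, 1) for t in time_on_email]
experience = [rng.uniform(0, 10) for _ in range(n)]
columns = {
    "time_on_email": time_on_email,
    "mouse_distance": mouse_distance,
    "experience": experience,
}
labels = ["good" if e > 5 else "bad" for e in experience]

kept = reduce_attributes(columns)
rows = [[columns[name][i] for name in kept] for i in range(n)]
folds = k_fold_indices(n, k=10)
accuracy = cross_validated_accuracy(rows, labels, folds)
print("kept attributes:", kept)
print("10-fold accuracy:", round(accuracy, 3))
```

In this sketch the redundant `mouse_distance` column is dropped because it is almost perfectly correlated with `time_on_email`, and the remaining attributes are evaluated on ten disjoint held-out folds. A real analysis would also weigh attribute significance, as the abstract mentions, and would likely standardize attributes before computing distances.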
