Leveraging attention-based deep neural networks for security vetting of Android applications
DOI: https://doi.org/10.4108/eai.27-9-2021.171168
Keywords: Android Apps, Android Security, Malware Detection, Deep Neural Networks, Attention
Abstract
Many traditional machine learning and deep learning algorithms work as black boxes and lack interpretability. Attention-based mechanisms can improve the interpretability of such models by providing insights into the features that a model uses to make its decisions. The recent success of attention-based mechanisms in natural language processing motivates us to apply the idea to the security vetting of Android apps. An Android app's code contains API-calls that can provide clues about whether the app is malicious or benign. By observing the pattern of the API-calls being invoked, we can interpret the predictions of a model trained to separate benign apps from malicious apps. In this paper, we use the attention mechanism to find the API-calls that are predictive of the maliciousness of Android apps. More specifically, we aim to identify a set of API-calls that malicious apps exploit, which might help the community discover new malware signatures. In our experiments, we work with two attention-based models: Bi-LSTM Attention and Self-Attention. Our classification models achieve high accuracy in malware detection. Using the attention weights, we also extract the top 200 API-calls (those that reflect the malicious behavior of the apps) from each of the two models, and we observe significant overlap between the two sets. This result increases our confidence that the top 200 API-calls can be used to improve the interpretability of the models.
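The last step of the abstract, ranking API-calls by attention weight and comparing the lists from the two models, can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the abstract does not say how per-app attention weights are aggregated or how overlap is measured, so the sketch uses the mean weight per API-call and the Jaccard index. The function names `top_k_api_calls` and `jaccard_overlap` are hypothetical, and the weights are toy values, not results from the paper.

```python
from collections import defaultdict


def top_k_api_calls(samples, k):
    """Rank API-calls by mean attention weight and return the top k.

    samples: iterable of (api_calls, weights) pairs, one pair per app,
             where weights[i] is the attention weight given to api_calls[i].
    Assumption: mean aggregation; the paper may aggregate differently.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for api_calls, weights in samples:
        if len(api_calls) != len(weights):
            raise ValueError("each API-call needs exactly one weight")
        for call, weight in zip(api_calls, weights):
            totals[call] += weight
            counts[call] += 1
    mean = {call: totals[call] / counts[call] for call in totals}
    # Sort by descending mean weight; break ties by name for determinism.
    ranked = sorted(mean, key=lambda call: (-mean[call], call))
    return ranked[:k]


def jaccard_overlap(first, second):
    """Return |A ∩ B| / |A ∪ B| for two lists of API-calls."""
    a, b = set(first), set(second)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Toy attention weights for two malicious apps under each model (hypothetical).
bilstm_samples = [
    (["SmsManager.sendTextMessage", "Log.d", "TelephonyManager.getDeviceId"],
     [0.6, 0.1, 0.3]),
    (["Log.d", "Runtime.exec", "SmsManager.sendTextMessage"],
     [0.1, 0.4, 0.5]),
]
self_attention_samples = [
    (["SmsManager.sendTextMessage", "Log.d", "TelephonyManager.getDeviceId"],
     [0.5, 0.1, 0.4]),
    (["Log.d", "Runtime.exec", "SmsManager.sendTextMessage"],
     [0.2, 0.2, 0.6]),
]

top_bilstm = top_k_api_calls(bilstm_samples, k=2)
top_self_attention = top_k_api_calls(self_attention_samples, k=2)
overlap = jaccard_overlap(top_bilstm, top_self_attention)
```

With k set to 200 and real attention weights, the same comparison would show how strongly the two models agree on which API-calls matter; a high overlap is the kind of agreement the abstract reports.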
License
Copyright (c) 2022 EAI Endorsed Transactions on Security and Safety
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
This is an open-access article distributed under the terms of the Creative Commons Attribution CC BY 3.0 license, which permits unlimited use, distribution, and reproduction in any medium so long as the original work is properly cited.
Funding data
National Science Foundation, grant number 1718214