Android Malware Detection Using API Calls: A Comparison of Feature Selection and Machine Learning Models

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

Android has become a major target for malware attacks due to its popularity and the ease of distributing applications. According to a recent study, around 11,000 new malware samples appear online on a daily basis. Machine learning approaches have been shown to perform well in detecting malware. In particular, API calls have been found to be one of the best performing features in malware detection. However, due to the functionality provided by the Android SDK, applications can use many API calls, creating a computational overhead when training machine learning models. In this study, we look at the benefits of using feature selection to reduce this overhead. We consider three feature selection algorithms (mutual information, variance threshold and Pearson correlation coefficient) in combination with five machine learning models: support vector machines, decision trees, random forests, Naïve Bayes and AdaBoost. We collected a dataset of 40,000 Android applications that used 134,207 different API calls. Our results show that the number of API calls can be reduced by approximately 95% whilst still achieving higher accuracy than when the full API feature set is used. Random forests achieve the best discrimination between malware and benign applications, with an accuracy of 96.1%.
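As a rough illustration of the three feature selection criteria named in the abstract, the sketch below scores binary API-call features (1 if an application uses the call, 0 otherwise) against a malware/benign label. It is not the authors' implementation: the toy matrix `X`, the labels `y`, and the helper names `variance_threshold` and `select_top_k` are assumptions made up for this example, and the study's actual thresholds and tooling are not stated here.

```python
import math

def variance(col):
    """Population variance of one feature column."""
    mean = sum(col) / len(col)
    return sum((x - mean) ** 2 for x in col) / len(col)

def pearson(col, labels):
    """Pearson correlation between a feature column and the labels."""
    n = len(col)
    mx, my = sum(col) / n, sum(labels) / n
    cov = sum((x - mx) * (t - my) for x, t in zip(col, labels))
    sx = math.sqrt(sum((x - mx) ** 2 for x in col))
    sy = math.sqrt(sum((t - my) ** 2 for t in labels))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (sx * sy)

def mutual_information(col, labels):
    """Mutual information (in bits) between a binary feature and a binary label."""
    n = len(col)
    mi = 0.0
    for x in (0, 1):
        for t in (0, 1):
            pxy = sum(1 for a, b in zip(col, labels) if a == x and b == t) / n
            px = sum(1 for a in col if a == x) / n
            py = sum(1 for b in labels if b == t) / n
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px * py))
    return mi

def columns_of(X):
    """Transpose a row-major matrix into a list of feature columns."""
    return [[row[j] for row in X] for j in range(len(X[0]))]

def variance_threshold(X, threshold):
    """Indices of features whose variance exceeds the threshold."""
    return [j for j, col in enumerate(columns_of(X)) if variance(col) > threshold]

def select_top_k(X, labels, score_fn, k):
    """Indices of the k highest-scoring features, in ascending index order."""
    scores = [score_fn(col, labels) for col in columns_of(X)]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

# Toy data (invented for illustration): 8 apps x 4 API-call features.
# Feature 0 matches the label exactly, feature 1 is constant,
# feature 2 agrees with the label on 6 of 8 apps, feature 3 is unrelated.
X = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = malware, 0 = benign

kept_by_variance = variance_threshold(X, 0.0)
kept_by_mi = select_top_k(X, y, mutual_information, 2)
kept_by_pearson = select_top_k(X, y, lambda c, t: abs(pearson(c, t)), 2)
print(kept_by_variance, kept_by_mi, kept_by_pearson)
```

Note that the variance threshold ignores the labels and only discards the constant feature, whereas mutual information and Pearson correlation rank features by their relationship to the malware label. The reduced column set would then be passed to a classifier such as a random forest.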
Original language: English
Title of host publication: Proceedings of the International Conference on Applied CyberSecurity (ACS) 2021
Editors: Hani Ragab Hassen, Hadj Batatia
Publisher: Springer
Pages: 3-12
Number of pages: 10
ISBN (Electronic): 9783030959180
ISBN (Print): 9783030959173
Publication status: Published - 2 Feb 2022

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 378
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Signal Processing
  • Computer Networks and Communications
