Description
We collected Android applications released between 2019 and 2021 from stores and repositories commonly used by end users. Because no scripts for building such datasets were available, we developed platform-independent Python scripts to crawl these stores and download the applications. The scripts can be rerun periodically to keep the dataset up to date. The targeted stores included UpToDown, APKMirror, and F-Droid, and the scripts crawled each site for all applications, regardless of category. To the best of our knowledge, this is the most realistic and up-to-date benign dataset currently available.
We collected malware from VirusShare and also identified malware among the applications downloaded from the application stores. We used VirusTotal reports to label all applications. To avoid mislabelled samples in the benign dataset, we included only applications that had zero positive detections from anti-malware engines in their VirusTotal reports.
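The zero-detection rule for the benign set can be illustrated with a minimal Python sketch. It is not the authors' script: the `ScanSummary` type, its `positives` field, and the helper names are hypothetical, and it assumes the number of engines that flagged each sample has already been extracted from the VirusTotal report. The description does not say how applications with one or more detections were handled beyond being excluded from the benign set, so the sketch only separates them out as "flagged".

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class ScanSummary:
    """Hypothetical per-application summary derived from a scan report."""
    sha256: str
    positives: int  # assumed: count of engines that flagged the sample


def is_benign_candidate(summary: ScanSummary) -> bool:
    # An application qualifies for the benign set only with zero detections.
    return summary.positives == 0


def split_by_detections(
    summaries: Iterable[ScanSummary],
) -> Tuple[List[str], List[str]]:
    """Return (benign, flagged) lists of SHA-256 hashes."""
    benign: List[str] = []
    flagged: List[str] = []
    for summary in summaries:
        target = benign if is_benign_candidate(summary) else flagged
        target.append(summary.sha256)
    return benign, flagged


if __name__ == "__main__":
    # Made-up placeholder hashes, for illustration only.
    samples = [
        ScanSummary("aaa", 0),
        ScanSummary("bbb", 3),
        ScanSummary("ccc", 1),
    ]
    benign, flagged = split_by_detections(samples)
    print("benign:", benign)
    print("flagged:", flagged)
```

Requiring zero detections is deliberately strict: a single engine's false alarm is enough to exclude an application, which trades dataset size for a cleaner benign label.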
Date made available | 2024 |
---|---|
Publisher | Heriot-Watt University |
Date of data production | 2019 - 2021 |