Balancing Approaches towards ML for IDS: A Survey for the CSE-CIC IDS Dataset

Subiksha Srinivasa Gopalan, Dharshini Ravikumar, Dino Linekar, Ali Raza, Maheen Hasib

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

30 Citations (Scopus)

Abstract

Dataset balance plays a key role in the bias that machine learning algorithms exhibit in classification and prediction. The CSE-CIC IDS datasets published in 2017 and 2018 have both attracted considerable scholarly attention in intrusion detection systems research. Recent work using these datasets pays little attention to their imbalance. The study presented in this paper sets out to explore the degree to which imbalance has been treated and to provide a taxonomy of the machine learning approaches developed using these datasets. A survey of published works related to these datasets was carried out, combining qualitative and quantitative methods in the analysis used to derive the taxonomy. The research presented here confirms that the impact of bias due to imbalanced datasets is rarely addressed. These findings support further research and development of supervised machine learning techniques that reduce the impact of bias in classification or prediction due to these imbalanced datasets.
Original language: English
Title of host publication: 2020 International Conference on Communications, Signal Processing, and their Applications (ICCSPA)
Publisher: IEEE
ISBN (Electronic): 9781728165356
ISBN (Print): 9781728165363
Publication status: Published - 2 Apr 2021
Event: International Conference on Communications, Signal Processing, and their Applications 2020 - Sharjah, United Arab Emirates
Duration: 16 Mar 2021 to 18 Mar 2021

Conference

Conference: International Conference on Communications, Signal Processing, and their Applications 2020
Abbreviated title: ICCSPA 2020
Country/Territory: United Arab Emirates
City: Sharjah
Period: 16/03/21 to 18/03/21

Keywords

  • Measurement
  • Machine learning algorithms
  • Taxonomy
  • Machine learning
  • Signal processing
  • Predictive models
  • Research and Development
  • Balance
  • Dataset
  • Intrusion detection system
