A Model for Generating Synthetic Network Flows and Accuracy Index for Evaluation of Anomaly Network Intrusion Detection Systems
Objectives: This study proposes a model for generating synthetic network flows with randomly inserted malicious fragments, and a new metric for measuring the performance of an Anomaly Network Intrusion Detection System (ANIDS). Method: A simulation model is developed for generating synthetic network flows with inserted malicious fragments that reflect Denial of Service (DoS) and Probe attacks. An ANIDS should maximize true positives and true negatives, which is equivalent to minimizing Type-I and Type-II errors. The geometric mean of the True Positive Rate (TPR) and the True Negative Rate (TNR) is proposed as a metric, namely the Geometric Mean Accuracy Index (GMAI), for measuring the performance of any proposed ANIDS. Findings: The task of detecting anomalous network flows by inspecting them at the fragment level reduces to a discrete binary classification problem. The Receiver Operating Characteristic (ROC) curve considers only the False Positive Rate (FPR) and the True Positive Rate (TPR). It does not reflect the minimization of Type-I and Type-II errors. Maximizing GMAI amounts to minimizing 1 - GMAI, which is equivalent to minimizing Type-I and Type-II errors. Further, the GMAI can be employed as a service level for evaluating acceptance-sampling-based ANIDS. The domain of DoS and Probe attacks, which intruders mostly employ at the fragment level, is studied. A conceptual simulation model is developed for generating synthetic network flows that incorporate malicious fragments chosen randomly from the domain of DoS and Probe attacks. The conceptual model is translated into an operational model (a set of computer programs). Using the operational model, 1000 synthetic network flows are generated for each percentage of anomalous flows, varying from 0.1 to 0.9, with a discrete uniform probability distribution used to select the fragment that is transformed into a malicious one.
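The generation step described above can be illustrated with a minimal Python sketch. It is not the authors' operational model: the function names, the list-of-labels representation of a flow, the choice of 10 fragments per flow, the single malicious fragment per anomalous flow, and the reading of 0.3 as a fraction of flows are all assumptions made for illustration. Only the use of a discrete uniform distribution to pick the fragment, and the DoS and Probe attack domain, come from the abstract.

```python
import random
from collections import Counter

ATTACK_TYPES = ("DoS", "Probe")  # attack domain named in the abstract


def generate_flows(n_flows, anomalous_fraction, fragments_per_flow, rng):
    """Return a list of flows; each flow is a list of fragment labels.

    Hypothetical representation: a flow is a list of strings, "normal"
    for a benign fragment or an attack name for a malicious one.
    """
    n_anomalous = round(n_flows * anomalous_fraction)
    # Choose which flows become anomalous, without replacement.
    anomalous_ids = set(rng.sample(range(n_flows), n_anomalous))
    flows = []
    for flow_id in range(n_flows):
        fragments = ["normal"] * fragments_per_flow
        if flow_id in anomalous_ids:
            # Discrete uniform choice of the fragment to make malicious.
            position = rng.randrange(fragments_per_flow)
            fragments[position] = rng.choice(ATTACK_TYPES)
        flows.append(fragments)
    return flows


def malicious_positions(flows):
    """Count how often each fragment position was made malicious."""
    counts = Counter()
    for fragments in flows:
        for position, label in enumerate(fragments):
            if label != "normal":
                counts[position] += 1
    return counts


rng = random.Random(42)  # fixed seed so the run is repeatable
flows = generate_flows(1000, 0.3, 10, rng)
histogram = malicious_positions(flows)
for position in range(10):
    print(position, histogram[position])
```

If the selection is uniform, the printed counts should be roughly equal across positions, which is the kind of check a histogram of the generated flows supports.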
The generated network flows for each percentage of anomalous flows are represented graphically as a histogram. It is found that they follow a discrete uniform distribution. Hence, the model is validated. Applications: The simulation model can be used for generating synthetic network flows for evaluating ANIDS. The GMAI can be used as a service level for evaluating a discrete binary classifier irrespective of domain.
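As a rough illustration of the proposed metric, the sketch below computes GMAI as the geometric mean of TPR and TNR from confusion-matrix counts. The function name `gmai`, its argument names, and the example counts are hypothetical. The sketch also assumes the usual convention in which "normal" is the null hypothesis, so that the Type-I error rate is 1 - TNR and the Type-II error rate is 1 - TPR.

```python
import math


def gmai(tp, fn, tn, fp):
    """Geometric mean of TPR and TNR from confusion-matrix counts.

    tp, fn: anomalous flows detected / missed
    tn, fp: normal flows passed / wrongly flagged
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present to compute GMAI")
    tpr = tp / (tp + fn)  # 1 - Type-II error rate (assumed convention)
    tnr = tn / (tn + fp)  # 1 - Type-I error rate (assumed convention)
    return math.sqrt(tpr * tnr)


# Hypothetical counts: 100 anomalous and 900 normal flows.
balanced = gmai(tp=90, fn=10, tn=810, fp=90)

# A detector that labels every flow as normal on the same data.
all_normal = gmai(tp=0, fn=100, tn=900, fp=0)
all_normal_accuracy = (0 + 900) / 1000

print(balanced, all_normal, all_normal_accuracy)
```

In the second case plain accuracy is 0.9 although no anomalous flow is detected, while GMAI is 0, because the geometric mean collapses whenever either rate is zero.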
Anomalous Flows, Geometric Mean Accuracy Index, Network Intrusion Detection Systems, Synthetic Network Flows, Simulation Model
This work is licensed under a Creative Commons Attribution 3.0 License.