Journal of Electrical and Computer Engineering Innovations (JECEI)
Article 17, Volume 13, Issue 1, Farvardin 2025, Pages 209-224
Article type: Original Research Paper
DOI: 10.22061/jecei.2024.11110.767
Authors
K. Moeenfar; V. Kiani*; A. Soltani; R. Ravanifard
Computer Engineering Department, Faculty of Engineering, University of Bojnord, Bojnord, Iran.
Received: 05 Mordad 1403
Revised: 07 Aban 1403
Accepted: 27 Aban 1403
Abstract
Background and Objectives: This paper proposes EiForestASD, a novel and efficient unsupervised machine learning algorithm for distinguishing anomalies from normal data in data streams. The algorithm uses a forest of isolation trees to detect anomalous data instances.

Methods: EiForestASD uses an isolation forest as an adaptable detector model that adjusts to new data over time. To handle concept drift in the data stream, it employs window-based concept drift detection and discards only those isolation trees that are incompatible with the new concept. The method is implemented in Python using the Scikit-Multiflow library.

Results: Experimental evaluations were conducted on six real-world and two synthetic data streams. Compared with the baseline method iForestASD, EiForestASD reduces computation time by 19% and improves the anomaly detection rate by 9%. These results highlight the efficacy and efficiency of EiForestASD for anomaly detection in data streams.

Conclusion: EiForestASD handles concept change selectively: only the trees in the detector model that are incompatible with the new concept are removed and rebuilt. This modification of the concept drift handling mechanism significantly reduces computation time and improves anomaly detection accuracy.
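The selective-rebuild idea described in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation and does not use Scikit-Multiflow. The drift signal (window anomaly rate above `drift_rate`), the per-tree incompatibility test (a tree flags more than `tree_rate` of the window on its own), and all function names and parameter values are assumptions made for illustration only; the abstract does not specify them.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c(n):
    """Average path length of an unsuccessful search among n points (iForest normaliser)."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, rng, depth=0, max_depth=8):
    """Grow one isolation tree with random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # identical values cannot be split further
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, rng, depth + 1, max_depth),
            "right": build_tree(right, rng, depth + 1, max_depth)}


def path_length(tree, x):
    """Depth at which x reaches a leaf, plus the usual correction for the leaf size."""
    depth = 0
    while "dim" in tree:
        tree = tree["left"] if x[tree["dim"]] < tree["split"] else tree["right"]
        depth += 1
    return depth + c(tree["size"])


def forest_score(forest, x, sample_size):
    """Standard isolation-forest anomaly score in (0, 1]; higher means more anomalous."""
    mean_path = sum(path_length(t, x) for t in forest) / len(forest)
    return 2.0 ** (-mean_path / c(sample_size))


def process_window(forest, window, rng, sample_size=256, threshold=0.6,
                   drift_rate=0.5, tree_rate=0.5):
    """Score one window; on suspected drift, rebuild only the trees that disagree with it.

    drift_rate and tree_rate are hypothetical knobs, not values from the paper.
    Returns (window anomaly rate, number of trees rebuilt).
    """
    norm = c(sample_size)
    flagged = sum(forest_score(forest, x, sample_size) > threshold for x in window)
    rate = flagged / len(window)
    rebuilt = 0
    if rate > drift_rate:  # assumed drift signal: too many anomalies in this window
        for i, tree in enumerate(forest):
            own = sum(2.0 ** (-path_length(tree, x) / norm) > threshold for x in window)
            if own / len(window) > tree_rate:  # assumed "incompatible tree" test
                sample = rng.sample(window, min(sample_size, len(window)))
                forest[i] = build_tree(sample, rng)
                rebuilt += 1
    return rate, rebuilt


if __name__ == "__main__":
    rng = random.Random(42)
    old = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(256)]
    new = [[rng.gauss(10, 1), rng.gauss(10, 1)] for _ in range(256)]
    forest = [build_tree(old, rng) for _ in range(50)]
    print("same concept:", process_window(forest, old, rng))
    print("new concept: ", process_window(forest, new, rng))
```

Trees that still score the new window as normal are kept, so only part of the forest is retrained after a drift. That is the kind of saving the abstract attributes to EiForestASD relative to rebuilding the whole model.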
Keywords
Anomaly Detection; Data Streams; Concept Drift; Sliding Window; Isolation Tree