Deep Learning Attention-based Framework for Integrating EEG and Image Information in Visual Content Recognition

Hakkak, Hamed; Khalilzadeh, Mohammad Mahdi; Azarnoosh, Mahdi; Kobravi, Hamid Reza

doi:10.22061/jecei.2026.12557.890


Journal of Electrical and Computer Engineering Innovations (JECEI)
Article 19, Volume 14, Issue 2, Mehr 2026, Pages 565-582
Article type: Original Research Paper
Digital Object Identifier (DOI): 10.22061/jecei.2026.12557.890
Authors
Hamed Hakkak; Mohammad Mahdi Khalilzadeh*; Mahdi Azarnoosh; Hamid Reza Kobravi
Department of Biomedical Engineering, Ma.C., Islamic Azad University, Mashhad, Iran.
Received: 03 Bahman 1404; Revised: 22 Ordibehesht 1405; Accepted: 05 Khordad 1405
Abstract
Background and Objectives: Deep learning has significantly advanced visual content recognition, but existing models rely primarily on image data alone and neglect the cognitive context embedded in neural responses. This study aimed to develop and validate a framework that integrates electroencephalography (EEG) signals with visual features to improve accuracy in multiclass image recognition.

Methods: We designed a hierarchical attention-based deep learning architecture to fuse neural and visual information. EEG data, recorded during visual stimulus presentation as part of a new dataset developed by the authors, were preprocessed and analyzed with temporal models (RNN-CNN and LSTM) to extract neural features. Concurrently, visual features were extracted from the stimulus images using the ResNet101 and DenseNet201 architectures. The proposed attention mechanism dynamically weighted and integrated these multimodal features, prioritizing the most salient information from each modality.

Results: The proposed framework significantly outperformed conventional unimodal approaches. The hybrid RNN-CNN + ResNet101 model achieved the peak classification accuracy. A feature contribution analysis showed that optimal performance was attained with approximately 60% of the contribution coming from image-derived features and 40% from EEG-derived features, demonstrating the complementary value of neural data.

Conclusion: This study confirms that structured, attention-based fusion of neurophysiological and visual data substantially enhances visual content recognition. The findings provide a robust and effective framework for advanced cognitive assessment applications and establish a new benchmark for multimodal integration in machine learning, highlighting the potential of EEG data to complement and improve computer vision tasks.
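
The attention-based weighting described in the Methods can be illustrated with a minimal sketch of modality-level soft attention. This is not the authors' hierarchical architecture, whose details the abstract does not give. It only shows the general idea of scoring each modality, normalizing the scores with a softmax, and forming a weighted sum. The function names, the scoring vector, and the toy feature values are all hypothetical, and the sketch assumes the EEG and image features have already been projected to a common dimension.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, scoring_vector):
    """Fuse per-modality feature vectors using one attention weight per modality.

    features: dict mapping modality name -> feature vector (all the same length).
    scoring_vector: hypothetical learned parameters, same length as the features.
    Returns (weights, fused_vector).
    """
    names = list(features)
    # Score each modality: dot product of the scoring vector with tanh(feature).
    scores = [
        sum(w * math.tanh(x) for w, x in zip(scoring_vector, features[name]))
        for name in names
    ]
    # Normalize the scores so the modality weights are positive and sum to 1.
    weights = dict(zip(names, softmax(scores)))
    # The fused vector is the attention-weighted sum of the modality vectors.
    fused = [
        sum(weights[name] * features[name][i] for name in names)
        for i in range(len(scoring_vector))
    ]
    return weights, fused


# Toy values standing in for projected EEG and image embeddings (assumed, not from the paper).
eeg_features = [0.2, -0.1, 0.4, 0.0]
image_features = [0.5, 0.3, -0.2, 0.6]
scoring_vector = [0.8, 0.5, 0.1, 0.9]

weights, fused = attention_fuse(
    {"eeg": eeg_features, "image": image_features}, scoring_vector
)
print(weights)
print(fused)
```

In a trained model the scoring parameters would be learned jointly with the feature extractors, and the resulting per-modality weights are one way a contribution split such as the reported 60% image and 40% EEG could be read off, although the abstract does not state how the authors computed that analysis.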
Keywords
EEG–image Fusion; Attention-based Deep Learning; Multi-class Visual Content Classification; Hierarchical Attention Mechanism; RNN-CNN
