Deep Reinforcement Learning for Efficient Multilingual Dialogue Management
Journal of Electrical and Computer Engineering Innovations (JECEI)
Articles in Press, Accepted Manuscript, Available Online from 14 Ordibehesht 1404 (4 May 2025)
Article Type: Original Research Paper
DOI: 10.22061/jecei.2025.11348.814
Authors
Mohammad Javad Nasri-Lowshani; Javad Salimi Sartakhti*; Hossein Ebrahimpour-Komole
Department of Artificial Intelligence, Faculty of Electrical and Computer Engineering, University of Kashan, Kashan, Iran.
Received: 28 Dey 1403 (17 January 2025)
Revised: 06 Farvardin 1404 (26 March 2025)
Accepted: 19 Farvardin 1404 (8 April 2025)
Abstract
Background and Objectives: Developing efficient task-oriented dialogue systems that can handle multilingual interactions is a growing area of research in natural language processing (NLP). In this paper, we propose SenSimpleDS, a joint task-oriented dialogue system based on deep reinforcement learning and designed for multilingual conversations.

Methods: The system uses a deep Q-network and the SBERT model to represent the dialogue environment. We introduce two variants, SenSimpleDS+ and SenSimpleDS-NSP, which modify the ε-greedy method and use BERT-based next sequence prediction (NSP) to refine the reward function. These methods are evaluated on datasets in English, Persian, Spanish, and German, and compared with the baseline methods SimpleDS and SCGSimpleDS.

Results: Our experiments show that the proposed methods outperform the baselines in average collected reward and require fewer learning steps to reach optimal dialogue policies. Notably, incorporating NSP significantly improves performance by optimizing reward collection. The multilingual SenSimpleDS further shows that the system can operate across languages, using a random forest classifier for language detection and MPNet for environment construction. In addition to the system evaluations, we introduce a new Persian dataset for task-oriented dialogue in the restaurant domain, expanding the resources available for developing dialogue systems in low-resource languages.

Conclusion: SenSimpleDS, a joint task-oriented dialogue system based on deep reinforcement learning, outperforms the baseline methods by combining deep Q-networks with SBERT. The integration of next sequence prediction (NSP) significantly enhances reward optimization, enabling faster convergence to optimal dialogue policies. This work establishes a foundation for future research in multilingual dialogue systems, with potential applications across diverse service domains.
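
The ingredients named in the abstract (ε-greedy exploration, a reward refined by a coherence score, and a deep Q-network update target) can be illustrated with a minimal Python sketch. The abstract does not specify how SenSimpleDS+ modifies ε-greedy or how the NSP score enters the reward, so the linear decay schedule, the `shaped_reward` blend, its `weight` parameter, and all function names below are assumptions made for illustration, not the authors' implementation.

```python
import random


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=1000):
    """Linearly anneal epsilon from eps_start to eps_end (assumed schedule)."""
    frac = min(max(step / decay_steps, 0.0), 1.0)
    return eps_start + frac * (eps_end - eps_start)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise take the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def shaped_reward(task_reward, coherence_prob, weight=0.5):
    """Blend a task reward with a coherence score in [0, 1].

    Hypothetical: coherence_prob stands in for an NSP-style probability
    that the system response follows the user utterance.
    """
    return (1.0 - weight) * task_reward + weight * coherence_prob


def td_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In a full system, `q_values` would come from a Q-network applied to a sentence-embedding state vector, and `coherence_prob` from a pretrained language model; both are left as plain numbers here so that the sketch stays self-contained.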
Keywords
Task-Oriented Dialogue Systems; Deep Reinforcement Learning; Multilingual Dialogue Management; State Representation Learning; Reward Function Optimization