Department of Computer Science, Faculty of Mathematical Sciences, Vali-e-Asr University of Rafsanjan, Rafsanjan, Iran
Received: 13 Bahman 1396; Revised: 22 Ordibehesht 1397; Accepted: 22 Aban 1397
Abstract
Background and Objectives: Numerical iterative methods are widely used to compute reachability probabilities and expected rewards in probabilistic model checking of Markov decision processes (MDPs). Several approaches have been proposed to improve the performance of these iterative methods. Reducing the total number of iterations is an important goal, and this paper pursues it with several techniques.

Methods: We consider MDPs with different degrees of non-determinism and show that modified policy iteration outperforms the other standard iterative methods as the degree of non-determinism increases. We propose several novel methods to improve the performance of modified policy iteration. Our approach is to define criteria that avoid useless iterations and value updates. The first criterion uses the graphical structure of the model to determine the number of iterations performed after each policy modification. The second is a dynamic criterion that determines when a policy should be updated. The third approach uses several priority heaps to select states for value updates.

Results: Our proposed methods are implemented in the PRISM model checker and applied to several standard case studies. The experiments show that the running times of our approaches are lower than those of the standard and previous methods for most case studies, mainly because our approaches reduce the total numbers of iterations and updates. While the running times are reduced, the precision of the computations is preserved in most cases.

Conclusion: The proposed techniques reduce the number of iterations and accelerate convergence to the optimal policies. They also prioritize the computations to reduce the total number of updates.
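For readers unfamiliar with the baseline method, the sketch below illustrates the general scheme of modified policy iteration for maximal reachability probabilities, which the paper's criteria refine. It is a minimal illustrative sketch, not the paper's PRISM implementation: the mdp interface (states, actions, succ) and the parameters k and epsilon are hypothetical names introduced here; the paper replaces the fixed sweep count k with structure-based and dynamic criteria.

def modified_policy_iteration(mdp, goal, k=20, epsilon=1e-6):
    """Hypothetical interface: mdp.states is an iterable of states,
    mdp.actions(s) returns the enabled actions of s (assumed non-empty),
    and mdp.succ(s, a) returns (successor, probability) pairs."""
    # Values start at 1 for goal states and 0 elsewhere; goal states are
    # never updated, so their value stays fixed at 1.
    value = {s: 1.0 if s in goal else 0.0 for s in mdp.states}
    policy = {s: next(iter(mdp.actions(s))) for s in mdp.states if s not in goal}

    while True:
        # Partial policy evaluation: k value-update sweeps under the
        # fixed policy, instead of solving the linear system exactly.
        for _ in range(k):
            for s in policy:
                value[s] = sum(p * value[t] for t, p in mdp.succ(s, policy[s]))

        # Policy improvement: switch each state to a greedy action
        # whenever it improves the current value by more than epsilon.
        changed = False
        for s in policy:
            best_a = max(mdp.actions(s),
                         key=lambda a: sum(p * value[t] for t, p in mdp.succ(s, a)))
            best_v = sum(p * value[t] for t, p in mdp.succ(s, best_a))
            if best_v > value[s] + epsilon:
                policy[s] = best_a
                changed = True
        if not changed:
            return value, policy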