Journal of Systems Engineering and Electronics, 2022, Vol. 33, Issue (5): 1249-1267. doi: 10.23919/JSEE.2022.000119



  • Received: 2020-10-12 Accepted: 2022-06-16 Published: 2022-10-27 Online: 2022-10-27

Research on virtual entity decision model for LVC tactical confrontation of army units

Ang GAO, Qisheng GUO, Zhiming DONG*, Zaijiang TANG, Ziwei ZHANG, Qiqi FENG

  1. Military Exercise and Training Center, Army Academy of Armored Forces, Beijing 100072, China
  • Contact: Zhiming DONG E-mail:15689783388@163.com;236211566@qq.com;dong_zhiming@163.com;tangzaijiang@sina.com;gaoang370829@sohu.com;594472717@qq.com
  • About author:
    GAO Ang was born in 1988. He received his Ph.D. degree in science of military equipment from Army Academy of Armored Forces. He is a Ph.D. candidate in Army Academy of Armored Forces. His research interest is intelligent decision of computer generated force based on multi-agent deep reinforcement learning. E-mail: 15689783388@163.com
    GUO Qisheng was born in 1962. He received his Ph.D. degree in science of military equipment from Tsinghua University. His research interests are equipment requirement demonstration and equipment test. E-mail: 236211566@qq.com
    DONG Zhiming was born in 1977. He received his Ph.D. degree in science of military equipment from Army Academy of Armored Forces. His research interests are equipment requirement demonstration and equipment test. E-mail: dong_zhiming@163.com
    TANG Zaijiang was born in 1976. He received his Ph.D. degree in science of military equipment from Army Academy of Armored Forces. His research interest is battle simulation. E-mail: tangzaijiang@sina.com
    ZHANG Ziwei was born in 1986. He received his Ph.D. degree in science of military equipment from Army Academy of Armored Forces. He is a Ph.D. candidate in Army Academy of Armored Forces. His research interest is equipment test evaluation. E-mail: gaoang370829@sohu.com
    FENG Qiqi was born in 1992. She received her M.S. degree in science of military equipment from Army Academy of Armored Forces. She is pursuing her Ph.D. degree in Army Academy of Armored Forces. Her research interest is real-time research of live-virtual-constructive. E-mail: 594472717@qq.com
  • Supported by:
    This work was supported by the Military Scientific Research Project (41405030302; 41401020301).


The live-virtual-constructive (LVC) tactical confrontation (TC) places four requirements on the virtual entity (VE) decision model: graded combat capability, diversified actions, real-time decision-making, and generalization with respect to the enemy. To meet these requirements, the confrontation process is modeled as a zero-sum stochastic game (ZSG). Introducing the theory of the dynamic relative power potential field addresses the problem of reward sparsity in the model, and reward shaping addresses the problem of credit assignment between agents. Based on the idea of meta-learning, an extensible multi-agent deep reinforcement learning (EMADRL) framework and solution method are proposed to improve the effectiveness and efficiency of solving the model. Experiments show that the model meets the requirements well and that the algorithm learns efficiently.
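To make the reward-shaping idea in the abstract concrete, the following is a minimal Python sketch of generic potential-based reward shaping in a two-sided zero-sum setting. It is not the paper's dynamic relative power potential field, whose definition is not given in the abstract. The `ForceState` type, the `potential` function (one side's share of total remaining power), and the discount `gamma` are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForceState:
    """Hypothetical state summary: total remaining power of each side."""
    red_power: float
    blue_power: float


def potential(state: ForceState) -> float:
    """Hypothetical potential: red's share of the total remaining power, in [0, 1]."""
    total = state.red_power + state.blue_power
    if total == 0:
        return 0.5  # assumption: treat mutual annihilation as neutral
    return state.red_power / total


def shaped_reward(env_reward: float, state: ForceState,
                  next_state: ForceState, gamma: float = 0.99) -> float:
    """Generic potential-based shaping: r + gamma * phi(s') - phi(s).

    The shaping term gives a dense signal on every step, even when the
    environment reward is zero until the confrontation ends.
    """
    return env_reward + gamma * potential(next_state) - potential(state)


def zero_sum_rewards(red_reward: float) -> tuple[float, float]:
    """In a zero-sum game, blue's reward is the negation of red's."""
    return red_reward, -red_reward


if __name__ == "__main__":
    before = ForceState(red_power=2.0, blue_power=2.0)
    after = ForceState(red_power=2.0, blue_power=1.0)  # blue lost one unit of power
    red = shaped_reward(0.0, before, after, gamma=1.0)
    print(zero_sum_rewards(red))
```

In this sketch the sparse environment reward is zero on the transition, yet red still receives a positive shaped reward because the balance of power moved in its favor. Negating the same shaped value for blue keeps the game zero-sum.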

Key words: live-virtual-constructive (LVC), army unit, tactical confrontation (TC), intelligent decision model, multi-agent deep reinforcement learning