Journal of Systems Engineering and Electronics, 2022, Vol. 33, Issue (5): 1186-1194. doi: 10.23919/JSEE.2022.000114
• CONTROL THEORY AND APPLICATION •
Xiaofeng LI1,2, Lu DONG3, Changyin SUN1,2,*
Received: 2021-01-13
Accepted: 2022-07-22
Online: 2022-10-27
Published: 2022-10-27
Contact: Changyin SUN
E-mail: 230169413@seu.edu.cn; ldong90@seu.edu.cn; cysun@seu.edu.cn
Xiaofeng LI, Lu DONG, Changyin SUN. Hybrid Q-learning for data-based optimal control of non-linear switching system[J]. Journal of Systems Engineering and Electronics, 2022, 33(5): 1186-1194.
[1] Bohao LI, Yunjie WU, Guofei LI. Hierarchical reinforcement learning guidance with threat avoidance [J]. Journal of Systems Engineering and Electronics, 2022, 33(5): 1173-1185.
[2] Ang GAO, Qisheng GUO, Zhiming DONG, Zaijiang TANG, Ziwei ZHANG, Qiqi FENG. Research on virtual entity decision model for LVC tactical confrontation of army units [J]. Journal of Systems Engineering and Electronics, 2022, 33(5): 1249-1267.
[3] Jingyu CAO, Lu DONG, Changyin SUN. Day-ahead scheduling based on reinforcement learning with hybrid action space [J]. Journal of Systems Engineering and Electronics, 2022, 33(3): 693-705.
[4] Xiangyang LIN, Qinghua XING, Fuxian LIU. Choice of discount rate in reinforcement learning with long-delay rewards [J]. Journal of Systems Engineering and Electronics, 2022, 33(2): 381-392.
[5] Wenzhang LIU, Lu DONG, Jian LIU, Changyin SUN. Knowledge transfer in multi-agent reinforcement learning with incremental number of agents [J]. Journal of Systems Engineering and Electronics, 2022, 33(2): 447-460.
[6] Wanping SONG, Zengqiang CHEN, Mingwei SUN, Qinglin SUN. Reinforcement learning based parameter optimization of active disturbance rejection control for autonomous underwater vehicle [J]. Journal of Systems Engineering and Electronics, 2022, 33(1): 170-179.
[7] Jiandong ZHANG, Qiming YANG, Guoqing SHI, Yi LU, Yong WU. UAV cooperative air combat maneuver decision based on multi-agent reinforcement learning [J]. Journal of Systems Engineering and Electronics, 2021, 32(6): 1421-1438.
[8] Kaifang WAN, Bo LI, Xiaoguang GAO, Zijian HU, Zhipeng YANG. A learning-based flexible autonomous motion control method for UAV in dynamic unknown environments [J]. Journal of Systems Engineering and Electronics, 2021, 32(6): 1490-1508.
[9] Xin ZENG, Yanwei ZHU, Leping YANG, Chengming ZHANG. A guidance method for coplanar orbital interception based on reinforcement learning [J]. Journal of Systems Engineering and Electronics, 2021, 32(4): 927-938.
[10] Ye MA, Tianqing CHANG, Wenhui FAN. A single-task and multi-decision evolutionary game model based on multi-agent reinforcement learning [J]. Journal of Systems Engineering and Electronics, 2021, 32(3): 642-657.
[11] Zongxing LI, Rui ZHANG. Time-varying sliding mode control of missile based on suboptimal method [J]. Journal of Systems Engineering and Electronics, 2021, 32(3): 700-710.
[12] Shengnan FU, Xiaodong LIU, Wenjie ZHANG, Qunli XIA. Multiconstraint adaptive three-dimensional guidance law using convex optimization [J]. Journal of Systems Engineering and Electronics, 2020, 31(4): 791-803.
[13] Dariush TAVAKOLIFAR, Hamid KHALOOZADEH, Roya AMJADIFARD. Stabilization of switched systems with all unstable modes: application to the aircraft team problem [J]. Journal of Systems Engineering and Electronics, 2019, 30(4): 792-798.
[14] Rong WANG, Yahui WU, Hongbin HUANG, Su DENG. Cooperative transmission in delay tolerant network [J]. Journal of Systems Engineering and Electronics, 2019, 30(1): 30-36.
[15] Bin FU, Hang GUO, Kang CHEN, Wenxing FU, Xingyu WU, Jie YAN. Aero-thermal heating constrained midcourse guidance using state-constrained model predictive static programming method [J]. Journal of Systems Engineering and Electronics, 2018, 29(6): 1263-1270.