Journal of Systems Engineering and Electronics, 2023, Vol. 34, Issue (2): 439-459. doi: 10.23919/JSEE.2023.000051
• CONTROL THEORY AND APPLICATION •
Lu DONG1, Zichen HE2,3, Chunwei SONG3, Changyin SUN3,4,*
Received: 2022-03-08
Online: 2023-04-18
Published: 2023-04-18
Contact: Changyin SUN
E-mail: ldong90@seu.edu.cn; 1910646@tongji.edu.cn; 2030739@tongji.edu.cn; cysun@seu.edu.cn
Lu DONG, Zichen HE, Chunwei SONG, Changyin SUN. A review of mobile robot motion planning methods: from classical motion planning workflows to reinforcement learning-based architectures[J]. Journal of Systems Engineering and Electronics, 2023, 34(2): 439-459.
[1] XIAO X S, LIU B, WARNELL G, et al. Motion control for mobile robot navigation using machine learning: a survey. https://arxiv.org/abs/2011.13112.
[2] SUN C Y, LIU W Z, DONG L. Reinforcement learning with task decomposition for cooperative multiagent systems. IEEE Trans. on Neural Networks and Learning Systems, 2021, 32(5): 2054−2065. doi: 10.1109/TNNLS.2020.2996209.
[3] ARADI S. Survey of deep reinforcement learning for motion planning of autonomous vehicles. IEEE Trans. on Intelligent Transportation Systems, 2020, 23(2): 740−759.
[4] ARULKUMARAN K, DEISENROTH M P, BRUNDAGE M, et al. Deep reinforcement learning: a brief survey. IEEE Signal Processing Magazine, 2017, 34(6): 26−38. doi: 10.1109/MSP.2017.2743240.
[5] DONG L, ZHONG X N, SUN C Y, et al. Event-triggered adaptive dynamic programming for continuous-time systems with control constraints. IEEE Trans. on Neural Networks and Learning Systems, 2017, 28(8): 1941−1952. doi: 10.1109/TNNLS.2016.2586303.
[6] DONG L, YUAN X, SUN C Y. Event-triggered receding horizon control via actor-critic design. Science China Information Sciences, 2020, 63(5): 1869−1919.
[7] OLEYNIKOVA H, TAYLOR Z, FEHR M, et al. Voxblox: incremental 3D Euclidean signed distance fields for on-board MAV planning. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2017: 1366−1373.
[8] HAN L X, GAO F, ZHOU B Y, et al. FIESTA: fast incremental Euclidean distance fields for online motion planning of aerial robots. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019: 4423−4430.
[9] QUAN L, HAN L X, ZHOU B Y, et al. Survey of UAV motion planning. IET Cyber-Systems and Robotics, 2020, 2(1): 14−21. doi: 10.1049/iet-csr.2020.0004.
[10] CLAUSSMANN L, REVILLOUD M, GRUYER D, et al. A review of motion planning for highway autonomous driving. IEEE Trans. on Intelligent Transportation Systems, 2019, 21(5): 1826−1848.
[11] CHOSET H M, LYNCH K M, HUTCHINSON S, et al. Principles of robot motion: theory, algorithms, and implementation. London: MIT Press, 2005.
[12] GONZALEZ D, PEREZ J, MILANES V, et al. A review of motion planning techniques for automated vehicles. IEEE Trans. on Intelligent Transportation Systems, 2016, 17(4): 1135−1145. doi: 10.1109/TITS.2015.2498841.
[13] WANG H J, YU Y, YUAN Q B. Application of Dijkstra algorithm in robot path-planning. Proc. of the International Conference on Mechanic Automation and Control Engineering, 2011: 1067−1069.
[14] HART P E, NILSSON N J, RAPHAEL B. A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. on Systems Science and Cybernetics, 1968, 4(2): 100−107. doi: 10.1109/TSSC.1968.300136.
[15] STENTZ A. Optimal and efficient path planning for partially known environments. Intelligent Unmanned Ground Vehicles, 1997, 388: 203−220.
[16] KOENIG S, LIKHACHEV M, FURCY D. Lifelong planning A*. Artificial Intelligence, 2004, 155(1/2): 93−146. doi: 10.1016/j.artint.2003.12.001.
[17] BELANOVA D, MACH M, SINCAK P, et al. Path planning on robot based on D* lite algorithm. Proc. of the World Symposium on Digital Intelligence for Systems and Machines, 2018: 125−130.
[18] HARABOR D, GRASTIEN A. Online graph pruning for pathfinding on grid maps. Proc. of the AAAI Conference on Artificial Intelligence, 2011: 1114−1119.
[19] HE Z C, DONG L, SUN C Y, et al. Asynchronous multithreading reinforcement-learning-based path planning and tracking for unmanned underwater vehicle. IEEE Trans. on Systems, Man, and Cybernetics: Systems, 2021, 52(5): 2757−2769.
[20] JIANG C J, SUN S F, LIU J L, et al. Global path planning of mobile robot based on improved JPS+ algorithm. Proc. of the Chinese Automation Congress, 2020: 2387−2392.
[21] BULITKO V, LEE G. Learning in real-time search: a unifying framework. Journal of Artificial Intelligence Research, 2006, 25(1): 119−157.
[22] KOENIG S, LIKHACHEV M. Real-time adaptive A*. Proc. of the 5th International Joint Conference on Autonomous Agents and Multiagent Systems, 2006: 281−288.
[23] PIVTORAIKO M, KELLY A. Generating state lattice motion primitives for differentially constrained motion planning. Proc. of the International Conference on Intelligent Robots and Systems, 2012: 101−108.
[24] PIVTORAIKO M, KELLY A. Kinodynamic motion planning with state lattice motion primitives. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2011: 2172−2179.
[25] ZHOU B Y, GAO F, WANG L Q, et al. Robust and efficient quadrotor trajectory generation for fast autonomous flight. IEEE Robotics and Automation Letters, 2019, 4(4): 3529−3536. doi: 10.1109/LRA.2019.2927938.
[26] GONZALEZ D, PEREZ J, MILANES V, et al. A review of motion planning techniques for automated vehicles. IEEE Trans. on Intelligent Transportation Systems, 2015, 17(4): 1135−1145.
[27] KARAMAN S, FRAZZOLI E. Sampling-based algorithms for optimal motion planning. The International Journal of Robotics Research, 2011, 30(7): 846−894. doi: 10.1177/0278364911406761.
[28] NASIR J, ISLAM F, MALIK U, et al. RRT*-SMART: a rapid convergence implementation of RRT*. International Journal of Advanced Robotic Systems, 2013, 10(7): 299. doi: 10.5772/56718.
[29] ARSLAN O, TSIOTRAS P. Use of relaxation methods in sampling-based algorithms for optimal motion planning. Proc. of the IEEE International Conference on Robotics and Automation, 2013: 2421−2428.
[30] WEBB D J, VAN DEN BERG J. Kinodynamic RRT*: asymptotically optimal motion planning for robots with linear dynamics. Proc. of the IEEE International Conference on Robotics and Automation, 2013: 5054−5061.
[31] GAMMELL J D, SRINIVASA S S, BARFOOT T D. Informed RRT*: optimal sampling-based path planning focused via direct sampling of an admissible ellipsoidal heuristic. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014: 2997−3004.
[32] GAMMELL J D, SRINIVASA S S, BARFOOT T D. Batch informed trees (BIT*): sampling-based optimal planning via the heuristically guided search of implicit random geometric graphs. Proc. of the IEEE International Conference on Robotics and Automation, 2015: 3067−3074.
[33] JANSON L, SCHMERLING E, CLARK A, et al. Fast marching tree: a fast marching sampling-based method for optimal motion planning in many dimensions. The International Journal of Robotics Research, 2015, 34(7): 883−921. doi: 10.1177/0278364915577958.
[34] STRUB M P, GAMMELL J D. Advanced BIT* (ABIT*): sampling-based planning with advanced graph-search techniques. Proc. of the IEEE International Conference on Robotics and Automation, 2020: 130−136.
[35] STRUB M P, GAMMELL J D. Adaptively informed trees (AIT*): fast asymptotically optimal path planning through adaptive heuristics. Proc. of the IEEE International Conference on Robotics and Automation, 2020: 3191−3198.
[36] NADERI K, RAJAMAKI J, HAMALAINEN P. RT-RRT*: a real-time path planning algorithm based on RRT*. Proc. of the ACM SIGGRAPH Conference on Motion in Games, 2015: 113−118.
[37] PIMENTEL J M, ALVIM M S, CAMPOS M F, et al. Information-driven rapidly-exploring random tree for efficient environment exploration. Journal of Intelligent & Robotic Systems, 2018, 91(2): 313−331.
[38] FRAICHARD T, SCHEUER A. From Reeds and Shepp’s to continuous-curvature paths. IEEE Trans. on Robotics, 2004, 20(6): 1025−1035. doi: 10.1109/TRO.2004.833789.
[39] BREZAK M, PETROVIC I. Real-time approximation of clothoids with bounded error for path planning applications. IEEE Trans. on Robotics, 2013, 30(2): 507−515.
[40] GAO F, WU W, LIN Y, et al. Online safe trajectory generation for quadrotors using fast marching method and Bernstein basis polynomial. Proc. of the IEEE International Conference on Robotics and Automation, 2018: 344−351.
[41] MELLINGER D, KUMAR V. Minimum snap trajectory generation and control for quadrotors. Proc. of the IEEE International Conference on Robotics and Automation, 2011: 2520−2525.
[42] RICHTER C, BRY A, ROY N. Polynomial trajectory planning for aggressive quadrotor flight in dense indoor environments. Robotics Research, 2016, 114: 649−666.
[43] CHEN J, LIU T B, SHEN S J. Online generation of collision-free trajectories for quadrotor flight in unknown cluttered environments. Proc. of the IEEE International Conference on Robotics and Automation, 2016: 1476−1483.
[44] GAO F, LIN Y, SHEN S J. Gradient-based online safe trajectory generation for quadrotor flight in complex environments. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2017: 3681−3688.
[45] MAJD K, RAZEGHI-JAHROMI M, HOMAIFAR A. A stable analytical solution method for car-like robot trajectory tracking and optimization. IEEE/CAA Journal of Automatica Sinica, 2020, 7(1): 39−47. doi: 10.1109/JAS.2019.1911816.
[46] SUN B, ZHU D Q, YANG S X. A bioinspired filtered backstepping tracking control of 7000-m manned submarine vehicle. IEEE Trans. on Industrial Electronics, 2013, 61(7): 3682−3693.
[47] IBRAHIM A E-S B. Wheeled mobile robot trajectory tracking using sliding mode control. Journal of Computer Science, 2016, 12(1): 48−55. doi: 10.3844/jcssp.2016.48.55.
[48] OSUSKY J, CIGANEK J. Trajectory tracking robust control for two wheels robot. Proc. of the 2018 Cybernetics & Informatics, 2018. doi: 10.1109/CYBERI.2018.8337559.
[49] MEVO B B, SAAD M R, FAREH R. Adaptive sliding mode control of wheeled mobile robot with nonlinear model and uncertainties. Proc. of the IEEE Canadian Conference on Electrical and Computer Engineering, 2018. doi: 10.1109/CCECE.2018.8447570.
[50] NASCIMENTO T P, DOREA C E, GONCALVES L M G. Nonholonomic mobile robots’ trajectory tracking model predictive control: a survey. Robotica, 2018, 36(5): 676−696. doi: 10.1017/S0263574717000637.
[51] GAVILAN F, VAZQUEZ R, CAMACHO E F. An iterative model predictive control algorithm for UAV guidance. IEEE Trans. on Aerospace and Electronic Systems, 2015, 51(3): 2406−2419. doi: 10.1109/TAES.2015.140153.
[52] ISKANDER A, ELKASSED O, ELBADAWY A. Minimum snap trajectory tracking for a quadrotor UAV using nonlinear model predictive control. Proc. of the Novel Intelligent and Leading Emerging Sciences Conference, 2020: 344−349.
[53] LINDQVIST B, MANSOURI S S, AGHAMOHAMMADI A, et al. Nonlinear MPC for collision avoidance and control of UAVs with dynamic obstacles. IEEE Robotics and Automation Letters, 2020, 5(4): 6001−6008. doi: 10.1109/LRA.2020.3010730.
[54] CUI C C, ZHU D Q, SUN B. Trajectory re-planning and tracking control of unmanned underwater vehicles on dynamic model. Proc. of the Chinese Control and Decision Conference, 2018: 1971−1976.
[55] DI W, LI C H, NA G, et al. Local path planning of mobile robot based on artificial potential field. Proc. of the Chinese Control Conference, 2020: 3677−3682.
[56] MINGUEZ J, MONTANO L. Nearness diagram navigation (ND): a new real-time collision avoidance approach. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2000: 2094−2100.
[57] SEDER M, PETROVIC I. Dynamic window based approach to mobile robot motion control in the presence of moving obstacles. Proc. of the IEEE International Conference on Robotics and Automation, 2007: 1986−1991.
[58] CHEN W D, ZHU Q G. Mobile robot path planning based on fuzzy algorithms. Acta Electronica Sinica, 2011, 39(4): 971−974.
[59] FAUST A, PALUNKO I, CRUZ P, et al. Learning swing-free trajectories for UAVs with a suspended load. Proc. of the IEEE International Conference on Robotics and Automation, 2013: 4902−4909.
[60] FAUST A, PALUNKO I, CRUZ P, et al. Automated aerial suspended cargo delivery through reinforcement learning. Artificial Intelligence, 2017, 247(1): 381−398.
[61] FAUST A, OSLUND K, RAMIREZ O, et al. PRM-RL: long-range robotic navigation tasks by combining reinforcement learning and sampling-based planning. Proc. of the IEEE International Conference on Robotics and Automation, 2018: 5113−5120.
[62] SILVER D, LEVER G, HEESS N, et al. Deterministic policy gradient algorithms. Proc. of the International Conference on Machine Learning, 2014: 387−395.
[63] FAUST A, RUYMGAART P, SALMAN M, et al. Continuous action reinforcement learning for control-affine systems with unknown dynamics. IEEE/CAA Journal of Automatica Sinica, 2014, 1(3): 323−336. doi: 10.1109/JAS.2014.7004690.
[64] CHIANG H-T L, HSU J, FISER M, et al. RL-RRT: kinodynamic motion planning via learning reachability estimators from RL policies. IEEE Robotics and Automation Letters, 2019, 4(4): 4298−4305. doi: 10.1109/LRA.2019.2931199.
[65] CHIANG H-T L, FAUST A, FISER M, et al. Learning navigation behaviors end-to-end with AutoRL. IEEE Robotics and Automation Letters, 2019, 4(2): 2007−2014. doi: 10.1109/LRA.2019.2899918.
[66] FRANCIS A, FAUST A, CHIANG H-T L, et al. Long-range indoor navigation with PRM-RL. IEEE Trans. on Robotics, 2020, 36(4): 1115−1134. doi: 10.1109/TRO.2020.2975428.
[67] PATEL U, KUMAR N, SATHYAMOORTHY A J, et al. Dynamically feasible deep reinforcement learning policy for robot navigation in dense mobile crowds. https://arxiv.org/abs/2010.14838.
[68] ROUSSEAS P, BECHLIOULIS C, KYRIAKOPOULOS K J. Harmonic-based optimal motion planning in constrained workspaces using reinforcement learning. IEEE Robotics and Automation Letters, 2021, 6(2): 2005−2011. doi: 10.1109/LRA.2021.3060711.
[69] CHANG L, SHAN L, JIANG C, et al. Reinforcement based mobile robot path planning with improved dynamic window approach in unknown environment. Autonomous Robots, 2021, 45(1): 51−76. doi: 10.1007/s10514-020-09947-4.
[70] PEREZ-D'ARPINO C, LIU C, GOEBEL P, et al. Robot navigation in constrained pedestrian environments using reinforcement learning. Proc. of the IEEE International Conference on Robotics and Automation, 2021: 1140−1146.
[71] ICHTER B, PAVONE M. Robot motion planning in learned latent spaces. IEEE Robotics and Automation Letters, 2019, 4(3): 2407−2414. doi: 10.1109/LRA.2019.2901898.
[72] QURESHI A H, SIMEONOV A, BENCY M J, et al. Motion planning networks. Proc. of the IEEE International Conference on Robotics and Automation, 2019: 2118−2124.
[73] HE Z C, DONG L, SONG C W, et al. Multiagent soft actor-critic based hybrid motion planner for mobile robots. IEEE Trans. on Neural Networks and Learning Systems, 2022. doi: 10.1109/TNNLS.2022.3172168.
[74] EVERETT M, CHEN Y F, HOW J P. Motion planning among dynamic, decision-making agents with deep reinforcement learning. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018: 3052−3059.
[75] WANG R E, EVERETT M, HOW J P. R-MADDPG for partially observable environments and limited communication. https://arxiv.org/abs/2002.06684.
[76] ZHELO O, ZHANG J W, TAI L, et al. Curiosity-driven exploration for mapless navigation with deep reinforcement learning. https://arxiv.org/pdf/1804.00456.
[77] MIROWSKI P, PASCANU R, VIOLA F, et al. Learning to navigate in complex environments. https://arxiv.org/pdf/1611.03673.
[78] LONG P X, FAN T X, LIAO X Y, et al. Towards optimally decentralized multi-robot collision avoidance via deep reinforcement learning. Proc. of the IEEE International Conference on Robotics and Automation, 2018: 6252−6259.
[79] PATHAK D, AGRAWAL P, EFROS A A, et al. Curiosity-driven exploration by self-supervised prediction. Proc. of the International Conference on Machine Learning, 2017: 2778−2787.
[80] SHI H B, SHI L, XU M, et al. End-to-end navigation strategy with deep reinforcement learning for mobile robots. IEEE Trans. on Industrial Informatics, 2019, 16(4): 2393−2402.
[81] WANG Y D, HE H B, SUN C Y. Learning to navigate through complex dynamic environment with modular deep reinforcement learning. IEEE Trans. on Games, 2018, 10(4): 400−412. doi: 10.1109/TG.2018.2849942.
[82] WANG N, ZHANG D Y, WANG Y. Learning to navigate for mobile robot with continual reinforcement learning. Proc. of the 39th Chinese Control Conference, 2020: 3701−3706.
[83] TAI L, PAOLO G, LIU M. Virtual-to-real deep reinforcement learning: continuous control of mobile robots for mapless navigation. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2017: 31−36.
[84] ROHMER E, SINGH S P, FREESE M. V-REP: a versatile and scalable robot simulation framework. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013: 1321−1326.
[85] XIE L H, WANG S, ROSA S, et al. Learning with training wheels: speeding up training with a simple controller for deep reinforcement learning. Proc. of the IEEE International Conference on Robotics and Automation, 2018: 6276−6283.
[86] LUONG M, PHAM C. Incremental learning for autonomous navigation of mobile robots based on deep reinforcement learning. Journal of Intelligent & Robotic Systems, 2021, 101: 1. doi: 10.1007/s10846-020-01262-5.
[87] CUI Y X, ZHANG H D, WANG Y, et al. Learning world transition model for socially aware robot navigation. Proc. of the IEEE International Conference on Robotics and Automation, 2021: 9262−9268.
[88] JIN J, NGUYEN N M, SAKIB N, et al. Mapless navigation among dynamics with social-safety-awareness: a reinforcement learning approach from 2D laser scans. Proc. of the IEEE International Conference on Robotics and Automation, 2020: 6979−6985.
[89] ZHOU Y Y, LI S J, GARCKE J. R-SARL: crowd-aware navigation based deep reinforcement learning for nonholonomic robot in complex environments. https://arxiv.org/abs/2105.13409.
[90] MIROWSKI P, GRIMES M K, MALINOWSKI M, et al. Learning to navigate in cities without a map. https://arxiv.org/abs/1804.00168.
[91] KULHANEK J, DERNER E, BABUSKA R. Visual navigation in real-world indoor environments using end-to-end deep reinforcement learning. IEEE Robotics and Automation Letters, 2021, 6(3): 4345−4352. doi: 10.1109/LRA.2021.3068106.
[92] ZHU Y K, MOTTAGHI R, KOLVE E, et al. Target-driven visual navigation in indoor scenes using deep reinforcement learning. Proc. of the IEEE International Conference on Robotics and Automation, 2017: 3357−3364.
[93] WU Y C, RAO Z H, ZHANG W, et al. Exploring the task cooperation in multi-goal visual navigation. Proc. of the 28th International Joint Conference on Artificial Intelligence, 2019: 609−615.
[94] LYU Y L, SHI Y M, ZHANG X G. Improving target-driven visual navigation with attention on 3D spatial relationships. Neural Processing Letters, 2022, 54: 3979−3998. doi: 10.1007/s11063-022-10796-8.
[95] LUO W H, SUN P, ZHONG F W, et al. End-to-end active object tracking via reinforcement learning. Proc. of the International Conference on Machine Learning, 2018: 3286−3295.
[96] LUO W H, SUN P, ZHONG F W, et al. End-to-end active object tracking and its real-world deployment via reinforcement learning. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2020, 42(6): 1317−1332. doi: 10.1109/TPAMI.2019.2899570.
[97] TAMPUU A, MATIISEN T, KODELJA D, et al. Multiagent cooperation and competition with deep reinforcement learning. PLoS One, 2017, 12(4): e0172395. doi: 10.1371/journal.pone.0172395.
[98] SIVANATHAN K, VINAYAGAM B K, SAMAK T, et al. Decentralized motion planning for multi-robot navigation using deep reinforcement learning. Proc. of the 3rd International Conference on Intelligent Sustainable Systems, 2020: 709−716.
[99] RASHID T, SAMVELYAN M, SCHROEDER C, et al. QMIX: monotonic value function factorization for deep multi-agent reinforcement learning. Proc. of the International Conference on Machine Learning, 2018: 4295−4304.
[100] LOWE R, WU Y, TAMAR A, et al. Multi-agent actor-critic for mixed cooperative-competitive environments. Proc. of the 31st International Conference on Neural Information Processing Systems, 2017: 6382−6393.
[101] FAN T X, LONG P X, LIU W X, et al. Distributed multi-robot collision avoidance via deep reinforcement learning for navigation in complex scenarios. International Journal of Robotics Research, 2020, 39(7): 856−892. doi: 10.1177/0278364920916531.
[102] YU C, VELU A, VINITSKY E, et al. The surprising effectiveness of MAPPO in cooperative, multi-agent games. https://arxiv.org/abs/2103.01955.
[103] CHEN Y F, LIU M, EVERETT M, et al. Decentralized non-communicating multiagent collision avoidance with deep reinforcement learning. Proc. of the IEEE International Conference on Robotics and Automation, 2017: 285−292.
[104] CHEN Y F, EVERETT M, LIU M, et al. Socially aware motion planning with deep reinforcement learning. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2017: 1343−1350.
[105] EVERETT M, CHEN Y F, HOW J P. Collision avoidance in pedestrian-rich environments with deep reinforcement learning. IEEE Access, 2021, 9: 10357−10377. doi: 10.1109/ACCESS.2021.3050338.
[106] SEMNANI S H, LIU H, EVERETT M, et al. Multi-agent motion planning for dense and dynamic environments via deep reinforcement learning. IEEE Robotics and Automation Letters, 2020, 5(2): 3221−3226. doi: 10.1109/LRA.2020.2974695.
[107] SEMNANI S H, DE RUITER A H, LIU H H. Force-based algorithm for motion planning of large agent. IEEE Trans. on Cybernetics, 2022, 52(1): 654−665. doi: 10.1109/TCYB.2020.2994122.
[108] TANG S, THOMAS J, KUMAR V. Hold or take optimal plan (HOOP): a quadratic programming approach to multi-robot trajectory generation. The International Journal of Robotics Research, 2018, 37(9): 1062−1084. doi: 10.1177/0278364917741532.
[109] ZHAO W S, QUERALTA J P, WESTERLUND T. Sim-to-real transfer in deep reinforcement learning for robotics: a survey. Proc. of the IEEE Symposium Series on Computational Intelligence, 2020: 737−744.
[110] ZHANG J W, TAI L, YUN P, et al. VR-goggles for robots: real-to-sim domain adaptation for visual control. IEEE Robotics and Automation Letters, 2019, 4(2): 1148−1155. doi: 10.1109/LRA.2019.2894216.
[111] LUTJENS B, EVERETT M, HOW J P. Safe reinforcement learning with model uncertainty estimates. Proc. of the IEEE International Conference on Robotics and Automation, 2019: 8662−8668.
[112] CHAFFRE T, MORAS J, CHAN-HON-TONG A, et al. Sim-to-real transfer with incremental environment complexity for reinforcement learning of depth-based robot navigation. https://arxiv.org/abs/2004.14684v1.
[113] TRAORE R, CASELLES-DUPRE H, LESORT T, et al. Continual reinforcement learning deployed in real-life using policy distillation and sim2real transfer. https://arxiv.org/abs/1906.04452.
[114] RUSU A A, VECERIK M, ROTHORL T, et al. Sim-to-real robot learning from pixels with progressive nets. Proc. of the Conference on Robot Learning, 2017: 262−270.
[115] ZHU Y F, SCHWAB D, VELOSO M. Learning primitive skills for mobile robots. Proc. of the IEEE International Conference on Robotics and Automation, 2019: 7597−7603.
[116] LIANG J, PATEL U, SATHYAMOORTHY A J, et al. Realtime collision avoidance for mobile robots in dense crowds using implicit multi-sensor fusion and deep reinforcement learning. https://arxiv.org/abs/2004.03089.
[117] ANDRYCHOWICZ M, CROW D, RAY A, et al. Hindsight experience replay. Proc. of the 31st Conference on Neural Information Processing Systems, 2017: 5055−5065.
[118] SUN Y S, CHENG J H, ZHANG G C, et al. Mapless motion planning system for an autonomous underwater vehicle using policy gradient-based deep reinforcement learning. Journal of Intelligent & Robotic Systems, 2019, 96(3/4): 591−601.
[119] NG A Y, HARADA D, RUSSELL S J. Policy invariance under reward transformations: theory and application to reward shaping. Proc. of the International Conference on Machine Learning, 1999: 278−287.
[120] QIAO Z Q, SCHNEIDER J, DOLAN J M. Behavior planning at urban intersections through hierarchical reinforcement learning. Proc. of the IEEE International Conference on Robotics and Automation, 2021: 2667−2673.
[121] CHRISTEN S, JENDELE L, AKSAN E, et al. Learning functionally decomposed hierarchies for continuous control tasks with path planning. IEEE Robotics and Automation Letters, 2021, 6(2): 3623−3630. doi: 10.1109/LRA.2021.3060403.
[122] CHI Z J, ZHU L, ZHOU F, et al. A collision-free path planning method using direct behavior cloning. Proc. of the International Conference on Intelligent Robotics and Applications, 2019: 529−540.
[123] YOU C X, LU J B, FILEV D, et al. Advanced planning for autonomous vehicles using reinforcement learning and deep inverse reinforcement learning. Robotics and Autonomous Systems, 2019, 114: 1−18. doi: 10.1016/j.robot.2019.01.003.
[124] ROSBACH S, JAMES V, GROSSJOHANN S, et al. Driving with style: inverse reinforcement learning in general-purpose planning for automated driving. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019: 2658−2665.
[125] COBBE K, KLIMOV O, HESSE C, et al. Quantifying generalization in reinforcement learning. Proc. of the International Conference on Machine Learning, 2019: 1282−1289.
[126] LASKIN M, LEE K, STOOKE A, et al. Reinforcement learning with augmented data. https://arxiv.org/abs/2004.14990.
[127] LEE K, LEE K, SHIN J, et al. Network randomization: a simple technique for generalization in deep reinforcement learning. https://arxiv.org/abs/1910.05396.
[128] SCHAUL T, QUAN J, ANTONOGLOU I, et al. Prioritized experience replay. https://arxiv.org/pdf/1511.05952.pdf.
[129] HU Z J, GAO X G, WAN K F, et al. Relevant experience learning: a deep reinforcement learning method for UAV autonomous motion planning in complex unknown environments. Chinese Journal of Aeronautics, 2021, 34(12): 187−204. doi: 10.1016/j.cja.2020.12.027.
[130] HE Z C, DONG L, SUN C Y, et al. Reinforcement learning based multi-robot formation control under separation bearing orientation scheme. Proc. of the Chinese Automation Congress, 2020: 3792−3797.
[131] LASKIN M, SRINIVAS A, ABBEEL P. CURL: contrastive unsupervised representations for reinforcement learning. Proc. of the International Conference on Machine Learning, 2020: 5639−5650.
[132] KOSTRIKOV I, YARATS D, FERGUS R. Image augmentation is all you need: regularizing deep reinforcement learning from pixels. https://arxiv.org/abs/2004.13649.
[133] SCHWARZER M, ANAND A, GOEL R, et al. Data-efficient reinforcement learning with self-predictive representations. https://arxiv.org/abs/2007.05929.
[134] TRAUTMAN P, KRAUSE A. Unfreezing the robot: navigation in dense, interacting crowds. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010: 797−803.
[135] FERRER G, GARRELL A, SANFELIU A. Robot companion: a social-force based approach with human awareness navigation in crowded environments. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013: 1688−1694.
[136] FERRER G, SANFELIU A. Behavior estimation for a complete framework for human motion prediction in crowded environments. Proc. of the IEEE International Conference on Robotics and Automation, 2014: 5940−5945.
[137] VAN DEN BERG J, LIN M, MANOCHA D. Reciprocal velocity obstacles for real-time multi-agent navigation. Proc. of the IEEE International Conference on Robotics and Automation, 2008: 1928−1935.
[138] VAN DEN BERG J, GUY S J, LIN M, et al. Reciprocal n-body collision avoidance. Robotics Research, 2011, 70: 3−19.
[139] PHILLIPS M, LIKHACHEV M. SIPP: safe interval path planning for dynamic environments. Proc. of the IEEE International Conference on Robotics and Automation, 2011: 5628−5635.
[140] TAI L, ZHANG J W, LIU M, et al. Socially compliant navigation through raw depth inputs with generative adversarial imitation learning. Proc. of the IEEE International Conference on Robotics and Automation, 2018: 1111−1117.
[141] CHEN C G, LIU Y J, KREISS S, et al. Crowd-robot interaction: crowd-aware robot navigation with attention-based deep reinforcement learning. Proc. of the International Conference on Robotics and Automation, 2019: 6015−6022.
[142] XIE L H, MIAO Y S, WANG S, et al. Learning with stochastic guidance for robot navigation. IEEE Trans. on Neural Networks and Learning Systems, 2020, 32(1): 166−176.
[143] LEIVA F, RUIZ-DEL-SOLAR J. Robust RL-based map-less local planning: using 2D point clouds as observations. IEEE Robotics and Automation Letters, 2020, 5(4): 5787−5794. doi: 10.1109/LRA.2020.3010732.
[144] ZHANG W, ZHANG Y F, LIU N. Enhancing the generalization performance and speed up training for DRL-based mapless navigation. https://arxiv.org/abs/2103.11686v1.
[145] KIRKPATRICK J, PASCANU R, RABINOWITZ N, et al. Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, 2017, 114(13): 3521−3526. doi: 10.1073/pnas.1611835114.
[146] SHIN H, LEE J K, KIM J, et al. Continual learning with deep generative replay. https://arxiv.org/abs/1705.08690.
[147] MALLYA A, DAVIS D, LAZEBNIK S. Piggyback: adapting a single network to multiple tasks by learning to mask weights. Proc. of the European Conference on Computer Vision, 2018: 67−82.
[148] FARAJTABAR M, AZIZAN N, MOTT A, et al. Orthogonal gradient descent for continual learning. Proc. of the International Conference on Artificial Intelligence and Statistics, 2020: 3762−3773.