Deep deterministic policy gradient based cooperative platoon longitudinal control strategy
Journal of Chang'an University (Natural Science Edition) [ISSN: 1006-6977 / CN: 61-1281/TN]
- Issue:
- 2021, Issue 4
- Page:
- 90-100
- Research Field:
- Traffic engineering
- Publishing date:
Info
- Title:
- Deep deterministic policy gradient based cooperative platoon longitudinal control strategy
- Author(s):
- MIN Haigen1,2; YANG Yiming1; WANG Wuqi1; FANG Yukun1; SONG Xiaopeng3
- (1. School of Information & Engineering, Chang'an University, Xi'an 710064, Shaanxi, China; 2. Joint Laboratory for Internet of Vehicles, Ministry of Education-China Mobile Communications Corporation, Chang'an University, Xi'an 710064, Shaanxi, China; 3. Zhejiang Transportation Planning and Design Institute Co., Ltd., Hangzhou 310017, Zhejiang, China)
- Keywords:
- traffic engineering; deep reinforcement learning; platoon longitudinal control; deep deterministic policy gradient; platoon string stability
- PACS:
- -
- DOI:
- -
- Abstract:
- To achieve continuous, accurate platoon control and string stability during platoon travel, a deep reinforcement learning (DRL)-based longitudinal control strategy for platoons at moderate speed was proposed. The strategy accounts for three key factors, namely spacing, vehicle speed, and acceleration, and incorporates vehicle dynamics and ride comfort into the learning process. First, the platoon control process was modeled and the reinforcement learning algorithm was described. Second, a DRL-based method for determining the optimal platoon longitudinal control strategy was proposed. In particular, a multi-objective reward function was designed that integrates the rewards corresponding to the distance error, the speed error, and the acceleration constraints. Third, the deep deterministic policy gradient (DDPG) algorithm was adopted to solve the platoon longitudinal control problem. DDPG combines the actor-critic (AC) architecture with the deep Q-network (DQN), which allows it to handle platoon control in continuous state and action spaces. The results show that the proposed reinforcement learning based platoon control method has the same control accuracy as the distributed model predictive control algorithm and can achieve platoon string stability under the leader-follower communication topology. 3 tabs, 11 figs, 19 refs.
Last Update: 2021-08-12