
Yuanheng Zhu
State Key Laboratory for Management and Control of Complex Systems
Institute of Automation, Chinese Academy of Sciences
Beijing 100190, China
Phone: +86-130-0118-1922; Fax: +86-10-82544799
Email: yuanheng.zhu@ia.ac.cn
Research Areas
Multi-agent reinforcement learning
Deep reinforcement learning
Sequential games
Cooperation and competition
Swarm intelligence
Education
09/2010--07/2015, Institute of Automation, Chinese Academy of Sciences, PhD
09/2006--07/2010, Nanjing University, B.S.
Experience
07/2015--present, Institute of Automation, Chinese Academy of Sciences, Assistant Researcher, Associate Researcher
12/2017--12/2018, University of Rhode Island, Visiting Scholar
Teaching Experience
2018/2019, University of Chinese Academy of Sciences, Reinforcement Learning (with Prof. Dongbin Zhao)
2019/2020, 2020/2021, University of Chinese Academy of Sciences, Reinforcement Learning (with Profs. Dongbin Zhao and Qichao Zhang)
Publications
Papers
[1] Synthesis of Cooperative Adaptive Cruise Control with Feedforward Strategies, IEEE Transactions on Vehicular Technology, 2020-02, First Author.
[2] Vision-based control in the open racing car simulator with deep and reinforcement learning, Journal of Ambient Intelligence and Humanized Computing, 2019-09, First Author.
[3] LMI-Based Synthesis of String-Stable Controller for Cooperative Adaptive Cruise Control, IEEE Transactions on Intelligent Transportation Systems, 2019-08, First Author.
[4] Control-limited adaptive dynamic programming for multi-battery energy storage systems, IEEE Transactions on Smart Grid, 2019-07, First Author.
[5] Adaptive optimal control of heterogeneous CACC system with uncertain dynamics, IEEE Transactions on Control Systems Technology, 2019-07, First Author.
[6] Invariant Adaptive Dynamic Programming for Discrete-Time Optimal Control, IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2019-04, First Author.
[7] StarCraft Micromanagement With Reinforcement Learning and Curriculum Transfer Learning, IEEE Transactions on Emerging Topics in Computational Intelligence, 2019-02, Second Author.
[8] Comprehensive comparison of online ADP algorithms for continuous-time optimal control, Artificial Intelligence Review, 2018-04, First Author.
[9] Recent progress of deep reinforcement learning: from AlphaGo to AlphaGo Zero (深度强化学习进展: 从 AlphaGo 到 AlphaGo Zero), Control Theory & Applications (控制理论与应用), 2017-12, Fourth Author.
[10] Adaptive dynamic programming for robust neural control of unknown continuous-time non-linear systems, IET Control Theory & Applications, 2017-09, Fourth Author.
[11] Event-Triggered Optimal Control for Partially Unknown Constrained-Input Systems via Adaptive Dynamic Programming, IEEE Transactions on Industrial Electronics, 2017-05, First Author.
[12] Iterative Adaptive Dynamic Programming for Solving Unknown Nonlinear Zero-Sum Game Based on Online Data, IEEE Transactions on Neural Networks and Learning Systems, 2017-03, First Author.
[13] Policy iteration for H∞ optimal control of polynomial nonlinear systems via sum of squares programming, IEEE Transactions on Cybernetics, 2017-02, First Author.
[14] Data-driven adaptive dynamic programming for continuous-time fully cooperative games with partially constrained inputs, Neurocomputing, 2017-02, Third Author.
[15] Probably approximately correct reinforcement learning solving continuous-state control problem, Control Theory & Applications (控制理论与应用), 2016-12, First Author.
[16] Using reinforcement learning techniques to solve continuous-time non-linear optimal tracking problem without system dynamics, IET Control Theory & Applications, 2016-07, First Author.
[17] Convergence Proof of Approximate Policy Iteration for Undiscounted Optimal Control of Discrete-Time Systems, Cognitive Computation, 2015-06, First Author.
[18] A data-based online reinforcement learning algorithm satisfying probably approximately correct principle, Neural Computing and Applications, 2015-04, First Author.
[19] MEC-A Near-Optimal Online Reinforcement Learning Algorithm for Continuous Deterministic Systems, IEEE Transactions on Neural Networks and Learning Systems, 2015-02, Second Author.
[20] Convergence analysis and application of fuzzy-HDP for nonlinear discrete-time HJB systems, Neurocomputing, 2015-02, First Author.
Patents
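[2] Optimal control method, system, and storage medium for multi-battery energy storage systems, Invention, 2020, First Inventor, Patent No.: 201810967603.7
[3] Lane-keeping method and system for intelligent driving, Invention, 2018, Fifth Inventor, Patent No.: 201811260601.0
[4] Robust tracking control method for a spring-mass-damper, Invention, 2018, Third Inventor, Patent No.: 201810004181.3
[5] Data-based Q-function adaptive dynamic programming method, Invention, 2013, Second Inventor, Patent No.: 201310036976.X
[6] Method and system for detecting abnormal charge/discharge behavior of energy storage batteries, Invention, 2016, Third Inventor, Patent No.: 201610687158.X
[7] Multi-agent deep reinforcement learning method and system based on counterfactual return, Invention, 2020, Third Inventor, Patent No.: 201911343902.4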