Zhi WANG (王志)
Reinforcement Learning | Robotics

I am currently a Ph.D. candidate at City University of Hong Kong, Hong Kong, China, where I am advised by Han-Xiong Li. I work on reinforcement learning, machine learning, and system modeling.

I received my bachelor's degree in engineering from Nanjing University, Nanjing, China, in 2015, where I was advised by Chunlin Chen. There I worked on reinforcement learning and robotics.

Email  /  CV  /  Chinese CV  /  Biography  /  Google Scholar  /  Github


I'm interested in reinforcement learning (RL), machine learning, and robotics. Specifically, I work on how learning algorithms can scale RL agents to dynamic environments, allowing them to autonomously adapt to the non-stationary task distributions in real-world domains. This includes a wide range of topics such as incremental learning, online learning, continual learning, transfer learning, model-based learning, and meta-learning. I have also worked on learning-based intelligent modeling of distributed parameter systems (DPSs).

Journal Articles

Incremental reinforcement learning with prioritized sweeping for dynamic environments
Zhi Wang, Chunlin Chen, Han-Xiong Li, Daoyi Dong, and Tzyh-Jong Tarn
IEEE/ASME Transactions on Mechatronics, 2019
pdf / code / BibTeX / notes / Chinese notes

Traditional RL algorithms focus on learning in a stationary environment. We propose a novel Incremental Reinforcement Learning (IRL) algorithm for learning in dynamic environments where the reward function may change over time. IRL offers a way to save a significant amount of computational resources, and the dynamic-environment setting it addresses is expected to hold in many challenging real-world domains.


Reinforcement learning based optimal sensor placement for spatiotemporal modeling
Zhi Wang, Han-Xiong Li, and Chunlin Chen
IEEE Transactions on Cybernetics, 2019
pdf / BibTeX

Optimizing the sensor locations within a distributed process is challenging, since most distributed processes are intrinsically nonlinear and infinite-dimensional. Because RL can learn by itself in unknown environments, it is a promising candidate for the optimization or control of real systems. In this paper, we develop an integral RL-based optimal sensor placement method for spatiotemporal modeling of DPSs. The sensor placement configuration is mathematically formulated as a Markov decision process (MDP) with specified elements, and the sensor locations are optimized by learning the optimal policies of the MDP according to the spatial objective function.

Incremental learning for online modeling of distributed parameter systems
Zhi Wang and Han-Xiong Li
IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2018
pdf / BibTeX

Traditional spatiotemporal modeling methods operate in batch mode, which limits their online applications. We propose an incremental learning method that recursively updates the spatial basis functions and the temporal model. In this way, the model synthesis is inherited and updated efficiently as streaming data accumulates over time in an online setting.

Conference Papers

A novel incremental learning scheme for reinforcement learning in dynamic environments
Zhi Wang, Chunlin Chen, Han-Xiong Li, Daoyi Dong, and Tzyh-Jong Tarn
In: 12th World Congress on Intelligent Control and Automation, 2016
pdf / BibTeX

We introduce the concept of incremental learning to the RL community and propose a new scheme that automatically adjusts the optimal policy to adapt to an ever-changing environment.

Invited Talks

Learning based intelligent modeling for distributed parameter systems
Zhi Wang, Department of Control and Systems Engineering, Nanjing University, Oct. 2018

- Reinforcement learning based optimal sensor placement
- Incremental learning for online modeling
- Multimode modeling for complex distributed processes
