Zhi WANG (王志)
Reinforcement Learning | Robotics

I am currently a Ph.D. candidate at City University of Hong Kong, Hong Kong, China, where I am advised by Han-Xiong Li. I work on reinforcement learning, machine learning, and system modeling. I will join the faculty of the Department of Control and Systems Engineering, Nanjing University, full time, starting in Fall 2019.

I received my bachelor's degree in engineering from Nanjing University, Nanjing, China, in 2015, where I was advised by Chunlin Chen. I worked on reinforcement learning and robotics.

Email  /  CV  /  Biography  /  Google Scholar  /  GitHub


Research

I'm interested in reinforcement learning (RL), machine learning, and robotics. Specifically, I work on how learning algorithms can scale RL agents to dynamic environments, allowing them to autonomously adapt to the non-stationary task distributions in real-world domains. This includes a wide range of topics such as incremental learning, online learning, continual learning, transfer learning, model-based learning, and meta-learning. Representative papers are highlighted.

Journal Articles

Incremental reinforcement learning in continuous spaces via policy relaxation and importance weighting,
Zhi Wang, Han-Xiong Li, and Chunlin Chen,
IEEE Transactions on Neural Networks and Learning Systems, 2019.
pdf / code / BibTeX / notes / Chinese notes

As intelligent agents become more ubiquitous, an increasing number of real-world scenarios require new learning mechanisms that are amenable to fast adaptation to environments that may drift or change from their nominal situations. Many of today’s data-intensive computing applications require the autonomous RL agent to adapt its behavior incrementally as the environment changes around it, continuously using previous knowledge to benefit future decision-making. This paper investigates the incremental reinforcement learning problem in continuous spaces, with the aim of achieving fast adaptation to dynamic environments.

Incremental reinforcement learning with prioritized sweeping for dynamic environments,
Zhi Wang, Chunlin Chen, Han-Xiong Li, Daoyi Dong, and Tzyh-Jong Tarn,
IEEE/ASME Transactions on Mechatronics, 2019.
pdf / code / BibTeX / notes / Chinese notes

Traditional RL algorithms focus on learning in a stationary environment. We propose a novel Incremental Reinforcement Learning (IRL) algorithm for learning in dynamic environments where the reward function may change over time. IRL offers an appealing way to save a significant amount of computational resources, and the dynamic-environment setting is expected to hold in many challenging real-world domains.

Reinforcement learning based optimal sensor placement for spatiotemporal modeling,
Zhi Wang, Han-Xiong Li, and Chunlin Chen,
IEEE Transactions on Cybernetics, 2019.
pdf / BibTeX

Optimizing the sensor locations within a distributed process is challenging since most distributed processes are intrinsically nonlinear with infinite dimensions. Its ability to learn from unknown environments makes RL a promising candidate for the optimization or control of real systems. In this paper, we develop an integral RL-based optimal sensor placement method for spatiotemporal modeling of distributed parameter systems (DPSs). The sensor placement configuration is mathematically formulated as a Markov decision process (MDP) with specified elements, and the sensor locations are optimized by learning the optimal policies of the MDP according to the spatial objective function.

Dissimilarity analysis based multimode modeling for complex distributed parameter systems,
Zhi Wang and Han-Xiong Li,
IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2019.
pdf / BibTeX

For complex distributed parameter systems (DPSs) with strong nonlinearities and time-varying dynamics, conventional spatiotemporal modeling methods become ill-suited because their basic assumption, that the process data follow a unimodal Gaussian distribution, usually no longer holds. To handle such strong nonlinearities and time-varying dynamics, we propose a multimode method with a two-step solution: subspace decomposition via modified dissimilarity analysis and local model ensemble via principal component regression.

Incremental learning for online modeling of distributed parameter systems,
Zhi Wang and Han-Xiong Li,
IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2018.
pdf / BibTeX

Traditional spatiotemporal modeling methods operate in batch mode, which limits their online applications. We propose an incremental learning method that recursively updates the spatial basis functions and the temporal model. In this way, the model is inherited and updated efficiently as streaming data accumulate over time in an online setting.

Conference Papers

Incremental learning based subspace modeling for distributed parameter systems,
Zhi Wang and Han-Xiong Li,
In: International Joint Conference on Neural Networks (IJCNN), 2019.
pdf / BibTeX

In this paper, we develop an integral incremental subspace modeling method based on dissimilarity analysis and local model updating for DPSs. The streaming snapshots are collected into small batches at a preset time interval in the online environment. Dissimilarity analysis is used to assign each new batch to the subspace it belongs to. The whole modeling structure is inherited and updated incrementally whenever a new data batch is available. Finally, all the local models are combined into an ensemble to approximate the system's dynamics in real time.

A novel incremental learning scheme for reinforcement learning in dynamic environments,
Zhi Wang, Chunlin Chen, Han-Xiong Li, Daoyi Dong, and Tzyh-Jong Tarn,
In: 12th World Congress on Intelligent Control and Automation (WCICA), 2016.
pdf / BibTeX

We introduce the concept of incremental learning to the RL community and propose a new scheme that automatically adjusts the optimal policy to adapt to an ever-changing environment.


Invited Talks

Incremental reinforcement learning for dynamic environments,
Zhi Wang, School of Engineering and Information Technology, University of New South Wales, Canberra, Apr. 2019.

- Changing factors and unexpected perturbations are very common in real-world scenarios
- Avoid repeated training, saving substantial computational resources
- Maintain and update learned knowledge for online applications

Learning based intelligent modeling for distributed parameter systems,
Zhi Wang, Department of Control and Systems Engineering, Nanjing University, Oct. 2018.

- Reinforcement learning based optimal sensor placement
- Incremental learning for online modeling
- Multimode modeling for complex distributed processes