Zhuoran Yang   杨卓然

I am a Ph.D. candidate in the Department of Operations Research and Financial Engineering at Princeton University, advised by Professor Jianqing Fan and Professor Han Liu. Prior to attending Princeton, I obtained a Bachelor of Mathematics degree from Tsinghua University.

My research interests lie at the interface of machine learning, statistics, and optimization. The primary goal of my research is to design efficient learning algorithms, with both statistical and computational guarantees, for large-scale decision-making problems that arise in reinforcement learning and stochastic games. I am also interested in applications of reinforcement learning, such as computer games and robotics.


Selected Recent Papers

*: equal contribution or alphabetic ordering.

Is Pessimism Provably Efficient for Offline RL?
Ying Jin*, Zhuoran Yang*, Zhaoran Wang*
Submitted, 2020   [arXiv] [slides] [Ying's talk at RL theory seminars]
TL;DR: We propose a pessimistic variant of the value iteration algorithm (PEVI), which incorporates an uncertainty quantifier as a penalty function. This algorithm is shown to achieve the best possible performance in the face of an arbitrary dataset.
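As a rough illustration of the idea (not the paper's implementation), here is a tabular sketch of value iteration with a pessimistic penalty; the count-based form of the uncertainty quantifier and the scale `beta` are illustrative assumptions:

```python
import numpy as np

def pevi(dataset, S, A, H, beta=1.0):
    """Tabular pessimistic value iteration (illustrative sketch).

    dataset: list of trajectories; trajectory[h] = (s, a, r, s_next) at step h.
    beta:    scale of the uncertainty quantifier.
    """
    V = np.zeros((H + 1, S))           # V[H] = 0 at the terminal step
    pi = np.zeros((H, S), dtype=int)
    for h in reversed(range(H)):
        count = np.zeros((S, A))
        target_sum = np.zeros((S, A))
        for traj in dataset:
            s, a, r, s_next = traj[h]
            count[s, a] += 1
            target_sum[s, a] += r + V[h + 1, s_next]
        n = np.maximum(count, 1)
        q_hat = target_sum / n                 # empirical Bellman target
        bonus = beta / np.sqrt(n)              # uncertainty quantifier
        q = np.clip(q_hat - bonus, 0.0, None)  # pessimism: subtract the penalty
        pi[h] = q.argmax(axis=1)
        V[h] = q.max(axis=1)
    return pi, V
```

Subtracting the bonus discourages the greedy policy from choosing actions that the dataset covers poorly, which is the key mechanism behind pessimism.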
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang, Chi Jin, Zhaoran Wang, Mengdi Wang, Michael I. Jordan
Submitted, 2020   [arXiv] [slides]
TL;DR: We propose an optimistic variant of the value iteration algorithm which incorporates powerful function approximators such as kernels and neural networks. The algorithm provably enjoys sample efficiency and computational tractability in the online setting.
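To illustrate optimism with function approximation, here is a sketch of one backward step of least-squares value iteration with an elliptical confidence bonus, in the linear special case; the function names and the exact form of the bonus are illustrative, not taken from the paper:

```python
import numpy as np

def lsvi_ucb_step(Phi, rewards, v_next, beta=1.0, lam=1.0):
    """One backward step of optimistic least-squares value iteration (sketch).

    Phi:     (n, d) feature matrix of visited (state, action) pairs
    rewards: (n,) observed rewards
    v_next:  (n,) estimated next-state values for those transitions
    Returns the regression weights and an optimistic Q-value function.
    """
    n, d = Phi.shape
    Lambda = Phi.T @ Phi + lam * np.eye(d)   # regularized Gram matrix
    w = np.linalg.solve(Lambda, Phi.T @ (rewards + v_next))
    Lambda_inv = np.linalg.inv(Lambda)

    def q_optimistic(phi):
        # Elliptical bonus: large along directions the data has not explored.
        bonus = beta * np.sqrt(phi @ Lambda_inv @ phi)
        return phi @ w + bonus

    return w, q_optimistic
```

Adding the bonus (rather than subtracting it, as in the offline setting) encourages the agent to try under-explored directions of the feature space.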
Variational Transport: A Convergent Particle-Based Algorithm for Distributional Optimization
Zhuoran Yang, Yufeng Zhang, Yongxin Chen, Zhaoran Wang
Submitted, 2020   [arXiv]
TL;DR: This paper proposes a particle-based algorithm for optimization problems defined over a family of probability distributions, where the objective of interest admits a variational representation. Our particle-based algorithm can be viewed as an implementable approximation of Wasserstein gradient descent.
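As a minimal sketch of the particle-based view (assuming, for simplicity, a potential functional F(mu) = E_{x ~ mu}[f(x)], whose Wasserstein gradient is the vector field grad f):

```python
import numpy as np

def wasserstein_gradient_step(particles, grad_f, eta=0.1):
    """One step of particle-based Wasserstein gradient descent (illustrative).

    The distribution is represented by a cloud of particles; each particle
    is pushed along the negative gradient field -grad_f, which transports
    the empirical distribution toward lower objective values.
    """
    return particles - eta * grad_f(particles)
```

For example, with f(x) = ||x||^2 / 2 (so grad_f is the identity map), repeated steps contract the particle cloud toward the origin, the minimizer of the functional.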
A Theoretical Analysis of Deep Q-Learning
Jianqing Fan*, Zhaoran Wang*, Yuchen Xie*, Zhuoran Yang*
Submitted, 2020   [arXiv]
TL;DR: This work establishes the sample complexity of fitted Q iteration with the value functions represented by deep neural networks. In particular, we characterize the bias and variance involved in estimating the Bellman target using neural networks.
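The algorithmic skeleton being analyzed can be sketched as follows; here `fit` stands in for the supervised regression step (a deep network in the paper), and the interface is an assumption for illustration:

```python
import numpy as np

def fitted_q_iteration(data, fit, q0, gamma=0.99, K=50, actions=(0, 1)):
    """Fitted Q iteration (sketch): repeated regression onto Bellman targets.

    data: arrays (s, a, r, s_next); fit(s, a, targets) returns a new
    function q(s, a). The paper takes the regressor to be a deep neural
    network; here `fit` is any supervised learner supplied by the caller.
    """
    s, a, r, s_next = data
    q = q0
    for _ in range(K):
        # Bellman target: r + gamma * max_{a'} Q(s', a')
        best_next = np.max(
            np.stack([q(s_next, np.full_like(a, b)) for b in actions]), axis=0)
        q = fit(s, a, r + gamma * best_next)
    return q
```

The bias-variance decomposition in the paper concerns exactly this regression step: how well the fitted `q` approximates the Bellman target built from the previous iterate.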
Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval
Jianqing Fan*, Han Liu*, Zhaoran Wang*, Zhuoran Yang*
Submitted, 2018   [arXiv]
TL;DR: Under the oracle computational model, this paper rigorously delineates the gap between the information-theoretically optimal statistical rate and the optimal statistical performance achieved by the class of computationally efficient estimators.