I am a Ph.D. candidate in the Department of Operations Research and Financial Engineering at Princeton University, advised by Professor Jianqing Fan and Professor Han Liu. Before coming to Princeton, I received a B.S. in Mathematics from Tsinghua University.
My research interests lie at the interface of machine learning, statistics, and optimization. The primary goal of my research is to design efficient learning algorithms, with both statistical and computational guarantees, for large-scale decision-making problems that arise in reinforcement learning and stochastic games. I am also interested in applications of reinforcement learning, such as computer games and robotics.
Education
Princeton University.   2015 - present
Ph.D. Candidate in Operations Research and Financial Engineering
Tsinghua University.   2011 - 2015
B.S. in Mathematics
École Normale Supérieure (Paris).   Spring 2015
Exchange Student
University of Wisconsin-Madison.   Spring 2014
Exchange Student
Journal Publications and Submissions
*: equal contribution or alphabetic ordering.
A Theoretical Analysis of Deep Q-Learning
Submitted to Annals of Statistics, 2020   [arXiv]
High-dimensional Varying Index Coefficient Models via Stein’s Identity
Journal of Machine Learning Research, 2019   [arXiv]
Misspecified Nonconvex Statistical Optimization for Phase Retrieval
Mathematical Programming, 2019   [arXiv]
On Semiparametric Exponential Family Graphical Models
Journal of Machine Learning Research, 19(57):1–59, 2018   [Link]
Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval
Submitted to Annals of Statistics, 2018   [arXiv]
Tensor Methods for Additive Index Models under Discordance and Heterogeneity
Submitted to Annals of Statistics, 2018   [arXiv]
Provably Efficient Reinforcement Learning with Linear Function Approximation
Submitted, 2019   [arXiv]
Robust One-Bit Recovery via ReLU Generative Networks: Improved Statistical Rates and Global Landscape Analysis
Submitted, 2019   [arXiv]
Conference Publications
Provably Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost
Advances in Neural Information Processing Systems (NeurIPS), 2019   [arXiv]
Neural Proximal Policy Optimization Attains Optimal Policy
Advances in Neural Information Processing Systems (NeurIPS), 2019   [arXiv]
Neural Temporal-Difference Learning Converges to Global Optima
Advances in Neural Information Processing Systems (NeurIPS), 2019   [arXiv]
Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games
Advances in Neural Information Processing Systems (NeurIPS), 2019   [arXiv]
Statistical-Computational Tradeoff in Single Index Models
Advances in Neural Information Processing Systems (NeurIPS), 2019
Variance Reduced Policy Evaluation with Smooth Function Approximation
Advances in Neural Information Processing Systems (NeurIPS), 2019
Convergent Policy Optimization for Safe Reinforcement Learning
Advances in Neural Information Processing Systems (NeurIPS), 2019
On the Statistical Rate of Nonlinear Recovery in Generative Models with Heavy-Tailed Data
International Conference on Machine Learning (ICML), 2019   [Link]
A Finite Sample Analysis of the Actor-Critic Algorithm
IEEE Conference on Decision and Control (CDC), 2018   [Link]
Networked Multi-Agent Reinforcement Learning in Continuous Spaces
IEEE Conference on Decision and Control (CDC), 2018   [Link]
Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization
Advances in Neural Information Processing Systems (NeurIPS), 2018   [arXiv]
Provable Gaussian Embedding with One Observation
Advances in Neural Information Processing Systems (NeurIPS), 2018   [arXiv]
Contrastive Learning from Pairwise Measurements
Advances in Neural Information Processing Systems (NeurIPS), 2018   [Link]
Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents
International Conference on Machine Learning (ICML), 2018   [arXiv]
The Edge Density Barrier: Computational-Statistical Tradeoffs in Combinatorial Inference
International Conference on Machine Learning (ICML), 2018   [Link]
Nonlinear Structured Signal Estimation in High Dimensions via Iterative Hard Thresholding
International Conference on Artificial Intelligence and Statistics (AISTATS), 2018   [Link]
Estimating High-dimensional Non-Gaussian Multiple Index Models via Stein’s Lemma
Advances in Neural Information Processing Systems (NeurIPS), 2017   [Link]   [arXiv, Long Version]
High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation
International Conference on Machine Learning (ICML), 2017   [Link]
More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning
Advances in Neural Information Processing Systems (NeurIPS), 2016   [Link]
Sparse Nonlinear Regression: Parameter Estimation and Asymptotic Inference
International Conference on Machine Learning (ICML), 2016   [arXiv]
Human Memory Search as Initial-Visit Emitting Random Walk
Advances in Neural Information Processing Systems (NeurIPS), 2015   [Link]
Preprints
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
[arXiv]
A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning
[arXiv]
Fast Multi-Agent Temporal-Difference Learning via Homotopy Stochastic Primal-Dual Optimization
[arXiv]
Finite-Sample Analyses for Fully Decentralized Multi-Agent Reinforcement Learning
[arXiv]
Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space
[arXiv]