Markov Decision Processes and Reinforcement Learning

A mathematical programming approach towards efficient reinforcement learning.
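As a minimal illustration of the planning problem underlying this line of work, here is a textbook value-iteration sketch for a toy discounted MDP; the two-state instance, discount factor, and tolerance are illustrative assumptions, not taken from any of the papers below:

```python
import numpy as np

def value_iteration(P, r, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Classical value iteration: repeat the Bellman update until convergence."""
    n = P[0].shape[0]
    v = np.zeros(n)
    for _ in range(max_iter):
        # Bellman update: v(s) <- max_a [ r_a(s) + gamma * sum_s' P_a(s, s') v(s') ]
        q = np.stack([r[a] + gamma * P[a] @ v for a in range(len(P))])
        v_new = q.max(axis=0)
        if np.max(np.abs(v_new - v)) < tol:
            v = v_new
            break
        v = v_new
    return v, q.argmax(axis=0)

# Hypothetical two-state, two-action MDP: P[a] is the transition matrix
# under action a, and r[a] is the reward vector.
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.1, 0.9], [0.7, 0.3]])]
r = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
v, policy = value_iteration(P, r)
```

The papers below study how to improve on such exact sweeps: variance-reduced, primal-dual, and randomized linear programming methods replace the full Bellman update with sampled estimates to obtain sublinear or nearly-linear running times.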


  • Y. Chen, L. Li, M. Wang. Scalable Bilinear Pi Learning Using State and Action Features. ICML, 2018. [link]

  • S. Kakade, M. Wang, L. Yang. Variance Reduction Methods for Sublinear Reinforcement Learning. Submitted. [link]

  • M. Wang. Primal-Dual Pi Learning: Sample Complexity and Sublinear Run Time for Ergodic Markov Decision Problems. Submitted. [link]

  • Y. Chen and M. Wang. Lower Bound On the Computational Complexity of Discounted Markov Decision Problems. Submitted in May 2017. [link]

  • A. Sidford, M. Wang, C. Wu, Y. Ye. Variance Reduced Value Iteration and Faster Algorithms for Solving Markov Decision Processes. SODA, 2018. [link]

  • M. Wang. Randomized Linear Programming Solves the Discounted Markov Decision Problem In Nearly-Linear Running Time. Submitted in April 2017. [link]

  • Y. Chen and M. Wang. Stochastic Primal-Dual Methods and Sample Complexity of Reinforcement Learning. Submitted in December 2016. [link]

  • Y. Chen, S. Liu, H. Liu and M. Wang. Finite-Horizon Primal-Dual Reinforcement Learning. Submitted in October 2016.

  • M. Wang and Y. Chen. Online Value-Policy Iteration for Discounted Markov Decision Process. Proceedings of IEEE Conference on Decision and Control, 2016.

Nonconvex Optimization and Statistical Learning

Nonconvex optimization permeates data analysis, from sparse learning to principal component analysis. The goal is to understand its fundamental complexity and to develop statistically efficient solutions.
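For instance, the online principal component estimation studied in several papers below can be illustrated with Oja's classical stochastic approximation rule; the planted data model and step-size schedule here are illustrative assumptions:

```python
import numpy as np

# Oja's rule for streaming estimation of the top principal component.
rng = np.random.default_rng(0)
d = 5
u_true = np.zeros(d)
u_true[0] = 1.0  # planted dominant direction

w = rng.standard_normal(d)
w /= np.linalg.norm(w)
for t in range(1, 20_001):
    # One streaming sample: strong signal along u_true plus small isotropic noise.
    x = 2.0 * rng.standard_normal() * u_true + 0.1 * rng.standard_normal(d)
    eta = 1.0 / (100.0 + t)   # diminishing step size
    w += eta * x * (x @ w)    # stochastic power-iteration step
    w /= np.linalg.norm(w)    # renormalize onto the unit sphere

alignment = abs(w @ u_true)   # |cosine| between the estimate and the truth
```

The nonconvexity comes from the unit-sphere constraint; the online PCA papers below analyze exactly this kind of update and establish near-optimal convergence rates.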

  • X. Li, M. Wang, A. Zhang. Estimation of Markov Chain via Rank-Constrained Likelihood. ICML, 2018. [link]

  • A. Zhang, M. Wang. State Compression of Markov Processes via Empirical Low-Rank Estimation. Preprint, Feb 2017. [link]

  • J. Li, M. Wang, T. Zhang. A Weak Convergence Theory for Online Principal Component Analysis. NIPS, 2017. [link] (NIPS Oral Presentation)

  • J. Ge, Z. Wang, M. Wang, H. Liu. Minimax-Optimal Privacy-Preserving Sparse PCA in Distributed Systems. AISTATS, 2018. [link]

  • L. Yang, V. Braverman, T. Zhao, M. Wang. Dynamic Factorization and Partition of Complex Networks. NIPS Workshop on Optimization for Machine Learning, 2017. [link]

  • X. Li*, J. Ge*, T. Zhang, M. Wang, H. Liu, T. Zhao. The "PICASSO" Package for High Dimensional Nonconvex Sparse Learning in R. Submitted. [PDF] [Software] [Site] (2016 ASA Best Student Paper Award on Statistical Computing)

  • J. Ge, M. Hong, M. Wang, H. Liu, T. Zhao. Homotopy Active Set Proximal Newton Algorithm for Sparse Learning. Submitted.

  • M. Wang. Vanishing Price of Anarchy in Large Coordinative Nonconvex Optimization. SIAM Journal on Optimization, 27(3):1977-2009, 2017. [PDF]

  • Y. Chen, D. Ge, M. Wang, Z. Wang, Y. Ye, H. Yin. Strong NP-Hardness for Sparse Optimization with Concave Penalty Functions. ICML, 2017. [link]

  • Y. Chen, Y. Ye and M. Wang. Hardness of Approximation for A Class of Sparse Optimization Problems. Journal of Machine Learning Research, 2018. [PDF]

  • J. Li, M. Wang, H. Liu, T. Zhang. Near-Optimal Stochastic Approximation for Online Principal Component Estimation. Mathematical Programming, doi:10.1007/s10107-017-1182-z, 2017. [PDF]

  • X. Fang, H. Liu, M. Wang. Blessing of Massive Scale: Spatial Graphical Model Inference with a Total Cardinality Constraint. Submitted in November 2015. [PDF]

  • M. Wang, Y. Xu, and Y. Gu. Multi-Task Nonconvex Optimization with Joint Constraints: A Distributed Algorithm Using Monte Carlo Estimates. International Conference on Digital Signal Processing (DSP), Hong Kong, China, 2014.

Stochastic Nested Composition Optimization

A rich class of stochastic optimization problems involving two or more nested expectations. It models decision problems that involve nonlinear functionals of the data distribution, with important potential applications in risk management, big data analysis, machine learning, and real-time intelligent systems. We study its theoretical complexity and develop efficient algorithms.

For a brief introduction, see the [slides].
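The basic two-timescale scheme, stochastic compositional gradient descent for min_x f(E[g(x; xi)]), can be sketched on a toy problem; the problem instance and step-size exponents below are illustrative assumptions:

```python
import numpy as np

# Two-timescale SCGD sketch for min_x f(E[g(x; xi)]).
# Toy instance: g(x; xi) = x + xi with zero-mean noise, f(y) = ||y||^2 / 2,
# so the objective is ||x||^2 / 2 with minimizer x* = 0.
rng = np.random.default_rng(1)
d = 3
x = rng.standard_normal(d)
y = np.zeros(d)                          # auxiliary estimate of E[g(x; xi)]
for k in range(1, 20_001):
    alpha = 1.0 / k ** 0.75              # slow timescale: x update
    beta = 1.0 / k ** 0.5                # fast timescale: inner-expectation tracking
    xi = 0.1 * rng.standard_normal(d)
    y = (1.0 - beta) * y + beta * (x + xi)   # track E[g(x; xi)] from noisy samples
    # Chain-rule direction: grad_g(x)^T grad_f(y); here grad_g = I and grad_f(y) = y.
    x = x - alpha * y
```

The key point is that a single sample of g cannot give an unbiased gradient of the composition, so the inner expectation is tracked on a faster timescale; the papers below analyze the complexity of this and of accelerated and variance-reduced variants.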

  • S. Ghadimi, M. Wang. Approximation Methods for Bilevel Programming. Submitted, Jan 2018. [link]

  • S. Yang, M. Wang, E. Fang. Multi-Level Stochastic Gradient Methods for Nested Composition Optimization. Submitted, Jan 2018. [link]

  • X. Lian, M. Wang, J. Liu. Finite-Sum Composition Optimization via Variance Reduced Gradient Descent. Proceedings of Artificial Intelligence and Statistics Conference (AISTATS), 2017. [link]

  • M. Wang*, J. Liu*, X. Fang. Accelerating Stochastic Composition Optimization. A short version at NIPS, 2016. Journal of Machine Learning Research, 18(105):1-23, 2017. [link]

  • M. Wang and J. Liu. A Stochastic Compositional Subgradient Method Using Markov Samples. Proceedings of Winter Simulation Conference, 2016.

  • M. Wang, X. Fang, and H. Liu. Stochastic Compositional Gradient Descent: Algorithms for Minimizing Compositions of Expected-Value Functions. Mathematical Programming, 161(1):419-449, 2016. (Best Paper Prize in Continuous Optimization, ICCOPT 2016) [PDF]

Stochastic Methods for Optimization and Beyond

Stochastic methods for data-driven optimization, where the data points are retrieved from big data sets or online streams. The research goal is to develop scalable optimization algorithms that interact with data randomization oracles and achieve optimal or near-optimal theoretical guarantees.
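One recurring idea below is random constraint projection: alternate a stochastic gradient step on the objective with a Euclidean projection onto a single randomly sampled constraint, rather than the full (possibly huge) constraint set. A minimal sketch on a hypothetical toy problem, with illustrative step sizes:

```python
import numpy as np

# Gradient step + projection onto one random constraint per iteration.
# Toy instance: min ||x - c||^2 / 2  s.t.  a_i^T x <= b_i for m halfspaces.
rng = np.random.default_rng(2)
d, m = 2, 50
c = np.array([3.0, 3.0])
A = rng.standard_normal((m, d))
A /= np.linalg.norm(A, axis=1, keepdims=True)   # unit-norm constraint rows
b = np.ones(m)                                  # feasible region contains the origin

def project_halfspace(x, a, bi):
    """Euclidean projection onto the halfspace {x : a^T x <= bi}."""
    viol = a @ x - bi
    return x - max(viol, 0.0) * a / (a @ a)

x = np.zeros(d)
for k in range(1, 5_001):
    alpha = 1.0 / k
    x = x - alpha * (x - c)        # stochastic gradient step (exact here, for clarity)
    i = rng.integers(m)            # sample ONE constraint uniformly at random
    x = project_halfspace(x, A[i], b[i])

max_violation = np.max(A @ x - b)  # should shrink toward 0 as k grows
```

Each iteration touches only one constraint, so the per-step cost is independent of m; the papers below establish convergence and rates for this scheme and its extensions to variational inequalities.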

  • J. Liu, Y. Gu, M. Wang. Averaging Random Projection: A Fast Online Solution for Large-Scale Constrained Stochastic Optimization. International Conference on Digital Signal Processing (DSP), Hong Kong, China, 2014. [PDF]

  • M. Wang, Y. Chen. Random Multi-Constraint Projection: Stochastic Gradient Methods for Convex Optimization with Many Constraints. In revision with Mathematics of Operations Research. [PDF]

  • X. Wang, M. Wang, Y. Gu. A Distributed Tracking Algorithm for Reconstruction of Graph Signals. IEEE Journal of Selected Topics in Signal Processing, 9(4), June 2015. [PDF]

  • M. Wang and D. P. Bertsekas. Stochastic First-Order Methods with Random Constraint Projection. SIAM Journal on Optimization, 26(1):681–717, 2016. [PDF]

  • M. Wang and D. P. Bertsekas. Incremental Constraint Projection Methods for Variational Inequalities. Mathematical Programming, 150(2):321-363, 2015. [PDF]

  • M. Wang and D. P. Bertsekas. Stabilization of Stochastic Iterative Methods for Singular and Nearly Singular Linear Systems. Mathematics of Operations Research, 39(1):1-30, 2014. [PDF] [poster]

  • M. Wang and D. P. Bertsekas. Convergence of Iterative Simulation-Based Methods for Singular Linear Systems. Stochastic Systems, 3(1):39-96, 2013. [PDF]

  • N. Polydorides, M. Wang, and D. P. Bertsekas. A Quasi Monte Carlo Method for Large-Scale Inverse Problems. Monte Carlo and Quasi-Monte Carlo Methods 2010, pp. 623-637, Springer Proc. in Mathematics and Statistics, 2010. [PDF]

  • Y. Gu and M. Wang. Learning Distributed Jointly Sparse Systems by Collaborative LMS. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2014.

  • M. Wang, N. Polydorides, and D. P. Bertsekas. Approximate Simulation-Based Solution of Large-Scale Least Squares Problems. Report LIDS-P-2819, MIT, 2009. [PDF]