
Samory Kpotufe

E-mail: samory@princeton.edu
Office: Room 327, Operations Research and Financial Engineering, Sherrerd Hall, Princeton University.

I'm an Assistant Professor at ORFE, Princeton University. I am also Associated Faculty in the Computer Science Department, and at the Center for Information Technology and Policy at Princeton University.

Until recently I was Assistant Research Professor at Toyota Technological Institute Chicago. Prior to this I was a researcher at the Max Planck Institute for Intelligent Systems. At the MPI I worked in the department of Bernhard Schoelkopf, in the learning theory group of Ulrike von Luxburg. I did my PhD (Sept 2010) in Computer Science at the University of California, San Diego, advised by Sanjoy Dasgupta.


ORF 245/ENG 245 (Spring 14-15, 15-16, 16-17). Fundamentals of Statistics. [Syllabus]

ORF 524 (Fall 15-16, 16-17, 17-18). Statistical Theory and Methods. [Syllabus]

- Princeton Engineering Commendation List for Outstanding Teaching (ORF 524, Fall 16-17).
- NIPS 2016 Workshop on Nonparametrics just got approved. We're looking forward to an exciting array of speakers.
- 2016 Seed grant from Siebel Energy Institute, jointly with Prof. Nick Feamster, to work on ML challenges in Internet-Of-Things (smart home/cities). Initial work will consider anomalous activity detection.
- Lectured at Machine Learning Summer School (MLSS) 2016, Cadiz.
- Princeton Engineering Commendation List for Outstanding Teaching (ORF 524, Fall 15-16).

I work in machine learning, with an emphasis on nonparametric methods and high-dimensional statistics. Generally, I am interested in understanding the inherent difficulty of high-dimensional learning problems (e.g. most modern data mining problems). The nonparametric setting is attractive in that it captures settings where we have little domain knowledge, and thus allows for a degree of abstraction in dealing with difficult high-dimensional learning.

More specifically, I'd like to understand quantities/structures that characterize the complexity of high-dimensional problems (e.g. intrinsic dimension, sparsity, clusters, smoothness, and so forth), where complexity is stated in terms of the resources required to learn (e.g. amount of data, computation time). Some recent highlights are in showing that many common predictors (e.g. kNN, certain regression trees) can automatically benefit from structured data (e.g. manifold data, sparse data) without any a priori knowledge of the inherent structure of the data.

My main practical aim is to derive deployable adaptive procedures, i.e. practical procedures that can self-tune to unknown structure in data, while at the same time meeting the various real-world constraints of modern applications. Examples are time complexity constraints, space constraints with sequential data, costly data, changes and inhomogeneity in data distribution, etc. An emerging picture is that many such constraints are more easily met (i.e. better tradeoffs) when data is structured. A more in-depth discussion can be found in my research statement. Also, here is a talk I've given a few times on the subject of nonparametric regression in high-dimensional spaces, and adaptivity to important problem parameters.



Samory Kpotufe, Nakul Verma. Time-Accuracy Tradeoffs in Kernel Prediction: Controlling Prediction Quality.
Journal of Machine Learning Research (JMLR) 2017. [ pdf ]

Andrea Locatelli, Alexandra Carpentier, Samory Kpotufe. Adaptivity to Noise Parameters in Nonparametric Active Learning.
Conference on Learning Theory (COLT) 2017. [ pdf ]

Heinrich Jiang, Samory Kpotufe. Modal-set estimation with an application to clustering.
Artificial Intelligence and Statistics (AISTATS) 2017. Selected for Plenary Presentation. [ pdf ]

Samory Kpotufe. Lipschitz Density-Ratios, Structured Data, and Data-driven Tuning.
Artificial Intelligence and Statistics (AISTATS) 2017. [ pdf ]

Samory Kpotufe, Abdeslam Boularias, Thomas Schultz, Kyoungok Kim. Gradients Weights improve Regression and Classification.
Journal of Machine Learning Research (JMLR) 2016. [ pdf ]

Samory Kpotufe, Ruth Urner, Shai Ben-David. Hierarchical label queries with data-dependent partitions.
Conference on Learning Theory (COLT) 2015. [ pdf ]

Sanjoy Dasgupta, Samory Kpotufe. Optimal rates for k-NN density and mode estimation.
Neural Information Processing Systems (NIPS) 2014. [ pdf | slides (CIRM, Luminy)]

Kamalika Chaudhuri, Sanjoy Dasgupta, Samory Kpotufe, Ulrike von Luxburg. Consistent procedures for cluster-tree estimation and pruning.
IEEE Transactions on Information Theory, 60(12):7900-7912, 2014. [ pdf ]

Shubhendu Trivedi, Jialei Wang, Samory Kpotufe, Gregory Shakhnarovich. A Consistent Estimator of the Expected Gradient Outerproduct.
Uncertainty in Artificial Intelligence (UAI) 2014. [ pdf ]

Samory Kpotufe, Eleni Sgouritsa, Dominik Janzing, Bernhard Schoelkopf. Consistency of Causal Inference under the Additive Noise Model.
International Conference on Machine Learning (ICML) 2014. [ pdf ]

Samory Kpotufe, Vikas K. Garg. Adaptivity to Local Smoothness and Dimension in Kernel Regression.
Neural Information Processing Systems (NIPS) 2013. [ pdf ]

Samory Kpotufe, Francesco Orabona. Regression-tree Tuning in a Streaming Setting.
Neural Information Processing Systems (NIPS) 2013. Selected for Spotlight (one of 52/1420 submissions). [ pdf ]

Samory Kpotufe, Abdeslam Boularias. Gradient weights help nonparametric regressors.
Neural Information Processing Systems (NIPS) 2012. Selected for Plenary Presentation (one of 20/1467 submissions). [ pdf ]

Samory Kpotufe. k-NN Regression adapts to local intrinsic dimension.
Neural Information Processing Systems (NIPS) 2011. Selected for Plenary Presentation (one of 20/1400 submissions). [ pdf ]

Samory Kpotufe, Ulrike von Luxburg. Pruning nearest neighbor cluster trees.
International Conference on Machine Learning (ICML) 2011. [ pdf | slides ]

Samory Kpotufe, Sanjoy Dasgupta. A tree-based regressor that adapts to intrinsic dimension.
Invited to Special Issue of the Journal of Computer and Systems Sciences (JCSS) 2011. [ pdf ]

Samory Kpotufe. The curse of dimension in nonparametric regression.
UCSD, PhD Dissertation 2010. [ pdf ]

Eric Flynn, Samory Kpotufe, et al. SHMTools: a new embeddable software package for SHM applications. SPIE 2010.

Samory Kpotufe. Escaping the curse of dimensionality with a tree-based regressor.
Conference on Learning Theory (COLT) 2009. Mark Fulk Best Student Paper. [ pdf | slides ]

Nakul Verma, Samory Kpotufe, Sanjoy Dasgupta. Which spatial partition trees are adaptive to intrinsic dimension?
Uncertainty in Artificial Intelligence (UAI) 2009. [ pdf | poster ]

Samory Kpotufe. Fast, smooth and adaptive regression in metric spaces.
Neural Information Processing Systems (NIPS) 2009. [ pdf ]

Some invited talks

CAL IT2, Information Theory and Applications Workshop. February 2014 + 2013.

Carnegie Mellon, Statistics. October 2013.

ETH (Swiss Federal Institute of Technology) Zurich, ML group. April 2013.

Weierstrass Institute for Applied Analysis and Stochastics. November 2011.

Foundations of Computational Mathematics, Learning Theory Workshop. June 2011.


Professional activities
Senior Committees:

- Editorial Board Member: Journal of Machine Learning Research (2014 to present).
- Area Chair: COLT (2015, 2016), NIPS (2015, 2016), AISTATS (2017).


Journal of Machine Learning Research, IEEE Transactions on Information Theory, IEEE Transactions on Pattern Analysis and Machine Intelligence, Annals of Statistics, ESAIM Probability and Statistics, Neural Information Processing Systems (NIPS), ACM-SIAM Symposium On Discrete Algorithms (SODA), International Conference on Machine Learning (ICML), ...


Biking and basketball. I also like to draw and paint.