Suriya Gunasekar

Senior Researcher

Microsoft Research, Redmond

[email protected]

I am a Senior Researcher at Microsoft Research (MSR) in the Machine Learning Foundations Group. I am interested in building AI systems that can perform reliable perception and reasoning. My current focus areas are (a) algorithmic interventions for continual training and deployment of large neural networks, and (b) data and task curation to enhance the reasoning capabilities of AI models. More broadly, my research centers on understanding the principles of deep learning, particularly the role of optimization algorithms and their implicit regularization. Prior to joining MSR, I was a Research Assistant Professor at Toyota Technological Institute at Chicago. I received my PhD from The University of Texas at Austin.

Mentorship

I have had the privilege of working with some awesome interns in our group.

Publications

(2022) How to Fine-Tune Vision Models with SGD. PDF

arXiv preprint.

(2022) Unveiling Transformers with LEGO: a synthetic reasoning task. PDF Code

arXiv preprint.

(2022) Neural-Sim: Learning to Generate Training Data with NeRF. PDF Code

European Conference on Computer Vision (ECCV).

(2022) Data Augmentation as Feature Manipulation. PDF

International Conference on Machine Learning (ICML).

(2022) Inductive bias of multi-channel linear convolutional networks with bounded weight norm. PDF

Conference on Learning Theory (COLT).

(2021) Methods and Analysis of The First Competition in Predicting Generalization of Deep Learning. PDF Dataset Competition Page

NeurIPS 2020 Competition and Demonstration Track.

(2021) Mirrorless mirror descent: A natural derivation of mirror descent. PDF

International Conference on Artificial Intelligence and Statistics (AISTATS).

(2020) Implicit bias in deep linear classification: Initialization scale vs training accuracy. PDF

Neural Information Processing Systems (NeurIPS).

(2020) Implicit regularization and convergence for weight normalization. PDF

Neural Information Processing Systems (NeurIPS).

(2020) Kernel and Rich Regimes in Overparametrized Models. PDF

Conference on Learning Theory (COLT).

(2019) Theory of deep learning. PDF

Princeton Univ. Princeton, NJ.

(2019) Convergence of gradient descent on separable data. PDF

International Conference on Artificial Intelligence and Statistics (AISTATS).

(2019) Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models. PDF

International Conference on Machine Learning (ICML).

(2018) Implicit bias of gradient descent on linear convolutional networks. PDF

Neural Information Processing Systems (NeurIPS).

(2018) On preserving non-discrimination when combining expert advice. PDF

Neural Information Processing Systems (NeurIPS).

(2018) Characterizing Implicit Bias in Terms of Optimization Geometry. PDF

International Conference on Machine Learning (ICML).

(2018) The Implicit Bias of Gradient Descent on Separable Data. PDF

Journal of Machine Learning Research (JMLR).

(2017) Implicit regularization in matrix factorization. PDF

Neural Information Processing Systems (NeurIPS).

(2017) Learning Non-Discriminatory Predictors. PDF

Conference on Learning Theory (COLT).

(2016) Preference Completion from Partial Rankings. PDF Code

Neural Information Processing Systems (NeurIPS).

(2016) Identifiable phenotyping using constrained non-negative matrix factorization. PDF

Machine Learning for Healthcare Conference (MLHC).

(2016) Phenotyping using Structured Collective Matrix Factorization of Multi-source EHR Data. PDF

arXiv preprint.

(2015) Unified view of matrix completion under general structural constraints. PDF

Neural Information Processing Systems (NeurIPS).

(2015) Consistent collective matrix completion under joint low rank structure. PDF

Artificial Intelligence and Statistics (AISTATS).

(2014) Face detection on distorted images augmented by perceptual quality-aware features. PDF Dataset

IEEE Transactions on Information Forensics and Security.

(2014) Exponential family matrix completion under structural constraints. PDF Errata for conference version

International Conference on Machine Learning (ICML).

(2013) Noisy matrix completion using alternating minimization. PDF

Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD).

(2012) Review quality aware collaborative filtering. PDF

ACM Conference on Recommender Systems (RecSys).