Research:
I work on advancing the fundamental capabilities needed to develop artificial general intelligence and create systems that can effectively interact with and add value to the real world. My current interests include:
- (i) developing full-stack training pipelines for foundation models, with particular focus on distributed systems architecture, scalable optimization algorithms, and principled approaches to data curation and composition.
- (ii) investigating the mathematical and scientific principles that govern large-scale learning systems, with emphasis on understanding emergent capabilities, scaling laws, and fundamental limits of neural architectures.
- (iii) advancing autonomous agent architectures that can reason, plan, and learn from interaction, with focus on bridging the gap between language models and embodied intelligence in complex environments.
Prospective Students:
I am seeking students with diverse backgrounds, including those experienced in applied deep learning, as well as those with strong foundations in optimization, mathematics, and theoretical computer science. I am co-director of the newly established [Kempner Institute], which offers substantial computational resources for cutting-edge research. If you're interested, I encourage you to apply to Harvard!
Recent Blog Posts:
Selected blog posts (also see [publications] and [Deeper Learning]) exploring fundamental questions in AI and advancing technical innovations:
Selected Service:
- [Committee for the ACM Prize in Computing] (active)
- [Committee for the Sloan Research Fellowships] in Computer Science
- Co-organizer for the Simons Symposium on [New Directions in Theoretical Machine Learning], May 2019
- Program chair for the 24th Annual Conference on Learning Theory (COLT 2011)