Ashutosh Saxena

Alfred P. Sloan Fellow and Microsoft Faculty Fellow.

Director, RoboBrain project, Cornell University and Stanford University.
asaxena at cs.stanford dot edu
Research Interests: Machine Learning, Robotics, Computer Vision.

PhD, Machine Learning, Stanford University with Andrew Y. Ng (advisor), Sebastian Thrun and Stephen Boyd.
Department of Computer Science, Stanford, CA 94305.



  • Eight Innovators to Watch in 2015, Smithsonian Institution.
  • World Technology Award, 2015.
  • Co-founder, Cognical Zibby, 2012-14.
  • The 50-years of Shakey at AAAI-RSS Blue Sky Ideas award, 2015.
  • RSS Early Career Award, 2014.
  • NSF CAREER Award, 2013.
  • Microsoft Faculty Fellow, 2012.
  • Alfred P. Sloan Fellow, 2011.
  • Best Cognitive Robotics paper, IROS'14. Best student paper, RSS'13.
  • CUAir at AUVSI'12: First prize, mission performance.
  • Google Faculty Research Award, 2012.


RoboBrain project, Director

An engine for robots to share knowledge learned from the Internet, from videos, through human interaction, and from robots at partner institutions.

The Plan to Build a Massive Online Brain for All the World’s Robots, Wired '14.

Brainy, Yes, but Far From Handy. John Markoff, New York Times '14.

Robot Learning

Prof. Saxena's group developed new machine learning techniques for robots, including RoboBarista and TellMeDave. The 50-years of Shakey at AAAI-RSS Blue Sky Ideas award.

Robot bartender can read your mind, Fox News '13. Also the Daily Show.

Cats, Robot Baristas, Tricorders, and the Future of Deep Learning. Huffington Post '15.

Holopad: TinTin, Chief Scientist

Steven Spielberg introduces the Art of the Adventures of Tintin App, with Holopad. Holopad evolves books into the digital age by re-imagining how we can engage and interact with what was once a static printed book.

TinTin iPad Art Book Blurs The Line Between Books, Movies, And Apps. Techcrunch.

Brain4Cars, Director

Machine Learning for Smart Car Cabins. Deep learning on camera input anticipates the driver's actions before they happen.

New Technology May Prevent Accidents By Reading Drivers' Body Language, Forbes '15.

Brain4Cars: Car Predicts Driving Mistakes Before They Happen. CNN '15.

Predict Effect, Co-founder

Audience Acquisition and Monetization Platform. Our collective intelligence graph captures all user activity (topics, interests, sentiment, etc.) in a deep latent representation space.

Zunavision, Co-founder

We enable content owners to assign ad-spaces inside the physical space of their videos, ready for advertiser branding.

Embed Ads In User-Generated Videos With ZunaVision. New York Times. ABC News.

PhD Students

Yun Jiang, 2014 McMullen Fellow

Hallucinating Humans: Learning Infinite Latent CRFs for 3D Perception and Mobile Manipulation.

Getting 'hallucinating' robots to arrange your room for you. Kurzweil AI.

Hema Koppula, 2015 Google PhD Fellow, best student paper

Modeling humans from multi-modal data, anticipating their future actions, and having robots collaborate with them.

Robot bartender can read your mind, Fox News '13. Also the Daily Show.

Jaeyong Sung

RoboBarista: Learning Manipulation Skills from Videos. The 50-years of Shakey at AAAI-RSS Blue Sky Ideas award, 2015.

Crowdsourcing a coffee-pouring, juice-making robot. Wired '15.

Chenxia Wu

Google Excellence Scholar '12. Unsupervised 3D Scene and Video Understanding.


Changxi Zheng, 2012

Robot Manipulation. Doug James' PhD student in graphics, now assistant professor at Columbia University.

Robot smart enough to clean your room (but not to have excuse to get out of it). NBC News '12.

Dipendra K Misra, MS 

Tell Me Dave: Large-scale crowdsourcing for learning robot language.

New robot learns from plain speech, not computer code. LA Times '14.

Congcong Li, 2012

Large-scale computer vision with Cascaded Classification Models, and aesthetics (with Prof. Tsuhan Chen).

Zhaoyin Jia, 2013, with Tsuhan Chen

3D Scene Understanding with Physics.