I am a PhD student in Computer Science at Stanford University. I am working with Christopher D. Manning in the Stanford Natural Language Processing (NLP) group. I also collaborate with the Interactive Data Lab under Jeffrey Heer. For my Master's thesis, I worked on combining NLP and computer vision with Ray Mooney and Kristen Grauman. Here is my (probably outdated) resume.
Conferences and Workshops
Induced Lexico-Syntactic Patterns Improve Information Extraction from Online Medical Forums.
Sonal Gupta, Diana MacLean, Jeffrey Heer, and Christopher D. Manning.
Journal of the American Medical Informatics Association (JAMIA). 2014.
pdf (email me for a copy if you are not able to access it)
Prescription Opioid Addicts Seek Advice on Opioid Withdrawal from Peers Online.
Diana MacLean, Sonal Gupta, Anna Lembke, Christopher D. Manning, and Jeffrey Heer.
Under Review. 2014
Improved Pattern Learning for Bootstrapped Entity Extraction.
Sonal Gupta and Christopher D. Manning.
In Proceedings of the Eighteenth Conference on Computational Natural Language Learning (CoNLL 2014).
[pdf; Supplementary; bib; slides] Download pattern learning code
SPIED: Stanford Pattern-based Information Extraction and Diagnostics.
Sonal Gupta and Christopher D. Manning.
In Proceedings of the ACL 2014 Workshop on Interactive Language Learning, Visualization, and Interfaces (ACL-ILLVI 2014).
[pdf; bib] Download visualization code
Topic Model Diagnostics: Assessing Domain Relevance via Topical Alignment.
Jason Chuang, Sonal Gupta, Christopher D. Manning, Jeffrey Heer.
In International Conference on Machine Learning (ICML 2013).
Stanford's Distantly-Supervised Slot-Filling System.
Mihai Surdeanu, Sonal Gupta, John Bauer, David McClosky, Angel X. Chang, Valentin I. Spitkovsky, and Christopher D. Manning.
In Proceedings of the Fourth Text Analysis Conference (TAC 2011).
Analyzing the Dynamics of Research by Extracting Key Aspects of Scientific Papers.
Sonal Gupta, Christopher D. Manning.
In Proceedings of the International Joint Conference on Natural Language Processing (IJCNLP 2011).
[pdf; bib; data; seed patterns]
(Earlier version) Identifying Focus, Techniques and Domain of Scientific Papers
Sonal Gupta, Christopher D. Manning. NIPS CSS 2010: Computational Social Science and the Wisdom of Crowds, Whistler, Canada
pdf More Graphs
Activity Retrieval in Closed Captioned Videos
Master's thesis, University of Texas at Austin
Catching the Drift: Learning Broad Matches from Clickthrough Data
Sonal Gupta, Mikhail Bilenko and Matthew Richardson. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2009), Paris, France
pdf ppt abstract
Watch, Listen & Learn: Co-training on Captioned Images and Videos
Sonal Gupta, Joohyun Kim, Kristen Grauman and Raymond Mooney
In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2008), Antwerp, Belgium, September 2008.
I spent the summer of 2012 with the Data Science team at PayPal Inc., working with Junling Hu.
I spent the summer of 2008 at Microsoft Research, working with Misha Bilenko and Matthew Richardson.
I spent the summer of 2006 at the Google R&D Center in Bangalore (now Bengaluru), working with Vineet Gupta.
- Bootstrapped pattern-based entity extraction
- Pattern-based entity extraction visualization
FTD dataset and seed patterns from the paper Analyzing the Dynamics of Research by Extracting Key Aspects of Scientific Papers.
I graduated from the University of Texas at Austin with a Master's (thesis option) in Computer Science. At UT, I worked with Dr. Ray Mooney on combining text and visual perception. Before that, I did my B.Tech. degree in Computer Science and Engineering at the Indian Institute of Technology (IIT) Roorkee, India. IIT Roorkee (formerly University of Roorkee and Thomason College of Civil Engineering) is the oldest engineering college in India. At IIT-R, I worked with Dr. Kumkum Garg and Dr. R. C. Joshi.
In grad school, I have taken the following relevant courses:
(Social and Information) Network Analysis; Probabilistic Graphical Models; Convex Optimization; Machine Learning; Computer Vision; Visual Recognition and Search; Networks, Protocol and Security; Natural Language Processing; Numerical Linear Algebra; Modern Statistical Methods; Introduction to Math Logic
I was born and brought up in a beautiful and quiet city: Bhopal, India. Unfortunately, it is known for the Bhopal Gas Tragedy. Bhopal is sometimes called the 'City of Lakes' as it has many natural and man-made lakes. I did my schooling at Jawahar Lal Nehru School. In 2003, I moved to Roorkee to study Computer Science at IIT Roorkee. Roorkee is a small town near Haridwar, Dehradun and Rishikesh. I enjoyed my four years at Roorkee to the fullest, with frequent trips to nearby hill stations in the foothills of the Himalayas (Rishikesh, Shimla, Manali, Auli, Mussoorie, Dehradun, Haridwar etc.). IIT Roorkee is also a nice place to be for white water rafting on the Ganges, with IITR bearing most of the cost :)
Of late, I have been biking, swimming, and running (my weakest of the three). I hope to do a triathlon some day soon. When I have too much free time (i.e. almost never these days), I like to paint; some of the paintings can be seen here.
One of my main weaknesses is good spicy (mostly vegetarian) food, with the favorites being: Indian, Mexican, Indian Chinese and Thai. I like to learn new cuisines and cook at home.
Asha for Education, Stanford
My undergrad college: IIT Roorkee
Some of the best moments of my life: Information Management Group
My Pics: Picasa Web Album
Some Interesting Links
Indian English, which I find to be quite different from American English! Well, my favorite language is Hinglish. I think it will be interesting to study Hinglish as a language.
Email Address: see top of the page
Postal Address: Computer Science Department
353 Serra Mall, Stanford University
Stanford, CA 94305
Office Location: Gates 233