Georgia Tech Computational Linguistics Lab

The Georgia Tech Computational Linguistics Lab works at the intersection of computer science and linguistics, designing new computational techniques for processing and understanding human language. We are especially focused on machine learning approaches, which leverage large-scale data sets to acquire language processing capabilities from examples. Another focus area is computational sociolinguistics, which uses computational analysis to better understand the rich connections between language and social phenomena.

Recent highlights

  • July 2017: Mimicking Word Embeddings using Subword RNNs (Yuval Pinter, Robert Guthrie, Jacob Eisenstein) is accepted to EMNLP.
  • June 2017: Umashanthi Pavalanathan is interning with Facebook Data Science, Sandeep Soni will be visiting the Max Planck Institute, Yuval Pinter is an intern in Home Depot’s data science team, and James Mullenbach is interning with Zappos.
  • April 2017: Congrats to Umashanthi Pavalanathan for her successful thesis proposal, and to Sandeep Soni and Ian Stewart for passing their quals!
  • March 2017: A Multidimensional Lexicon for Interpersonal Stancetaking (Pavalanathan, Fitzpatrick, Kiesling, and Eisenstein) is accepted to ACL 2017. We computationally formalize interpersonal stancetaking through latent dimensions of (automatically induced) stance markers, and show that it correlates with a range of online phenomena.
  • January 2017: A kernel independence test for geographical language variation (Nguyen and Eisenstein) is accepted to the journal Computational Linguistics! This paper presents a new non-parametric method for detecting geographical language variation, drawing on techniques from reproducing kernel Hilbert spaces.
  • January 2017: Yi Yang completes his PhD dissertation! Yi has moved to a position as a research scientist at Bloomberg.
  • January 2017: Overcoming language variation in sentiment analysis with social attention (by Yang and Eisenstein) is accepted to the journal Transactions of the Association for Computational Linguistics! This paper shows how to overcome language variation in social media texts by exploiting the social network property of homophily.