Georgia Tech at NAACL 2016

Georgia Tech had three papers at this year's meeting of the North American Chapter of the Association for Computational Linguistics (NAACL).

Yangfeng Ji (who is on his way to a postdoc at the University of Washington after successfully defending his thesis in May!) presented a full paper entitled A Latent Variable Recurrent Neural Network for Discourse Relation Language Models, which he co-authored with Gholamreza Haffari and me. This paper described a hybrid neural/latent variable architecture for incorporating shallow (PDTB-style) discourse relations into language models. The method improves both language modeling and discourse relation classification. The paper is based on our work at the JSALT workshop last summer, and we hope to use the method to integrate discourse into multi-sentence generation tasks, such as summarization and machine translation.

Yi Yang presented another full paper, Part-of-Speech Tagging for Historical English. In this paper, we evaluated a number of approaches for accurately tagging the parts of speech of Early Modern English texts. Spelling normalization is the typical approach, but we showed that treating the task as a domain adaptation problem can help further. It was a nice opportunity to showcase our "Feature Embeddings for Multi-Attribute Adaptation" method, which applies the idea of skipgrams to feature templates, thereby exploiting some light domain knowledge in the form of the templates.

Finally, Yuval Pinter, who will be joining us in the fall, presented some work from Yahoo Research. His paper, entitled Syntactic Parsing of Web Queries with Question Intent, showed how to identify dependency treelets in ungrammatical web queries, improving the ability of search engines to link queries to relevant question/answer pages.

We also got a chance to escape the airport hotel island and explore some local cuisine. The tater-tot-filled burritos brought me back to my Pittsburgh days of french fries on salads.