Cornell Movie-Dialogs Corpus (English)

动态语法

Administrator
Staff member
The Cornell Movie-Dialogs Corpus is a large, metadata-rich collection of conversations extracted from movie scripts. The data includes over 220,000 conversational exchanges involving a total of 9000+ characters from 617 movies. Prior uses of this corpus include:

* Cristian Danescu-Niculescu-Mizil, Justin Cheng, Jon Kleinberg and Lillian Lee.
"You had me at hello: How phrasing affects memorability". ACL 2012.

* Tyler Schnoebelen, Feb 2012: how "like" and "I mean" vary across movie genre,
gender, and cast position.
http://corplinguistics.wordpress.com/2012/02/23/like-lets-go-to-the-movies-i-mean/

* Cristian Danescu-Niculescu-Mizil and Lillian Lee, "Chameleons in imagined
conversations: A new approach to understanding coordination of linguistic style
in dialogs", ACL 2011 workshop on Cognitive Modeling and Computational Linguistics.

The download site is:
http://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html

Cristian Danescu-Niculescu-Mizil and Lillian Lee
 
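Re: Cornell Movie-Dialogs Corpus (English)

Thanks for the recommendation. The research results were even reported in Nature; it seems the question a study asks has to be interesting enough.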
 
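Re: Cornell Movie-Dialogs Corpus (English)

A really good free spoken-language corpus resource! Thanks to Professor Tao for providing the link! And a salute to the creators of this corpus!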
 