American National Corpus (ANC)

Haiyang Ai

Staff member
The American National Corpus (ANC) project is creating a massive electronic collection of American English, including texts of all genres and transcripts of spoken data produced from 1990 onward. The ANC will provide the most comprehensive picture of American English ever created, and will serve as a resource for education, linguistic and lexicographic research, and technology development.

When completed, the ANC will contain a core corpus of at least 100 million words, comparable across genres to the British National Corpus (BNC). Beyond this, the corpus will include an additional component of potentially several hundred million words, chosen to provide the broadest and largest selection of texts possible.