The British Academic Spoken English (BASE) corpus


About BASE
The British Academic Spoken English (BASE) corpus was developed at the Universities of Warwick and Reading, under the directorship of Hilary Nesi, with Paul Thompson. Natalie Snodgrass and Sarah Creer were employed as research assistants and Tim Kelly was video director for the project. Lou Burnard (Oxford University) and Adam Kilgarriff (Lexicography MasterClass Ltd) acted as consultants.

The BASE corpus consists of 160 lectures and 39 seminars recorded in a variety of university departments. It contains 1,644,942 tokens in total (lectures and seminars). Holdings are distributed across four broad disciplinary groups, each represented by 40 lectures and 10 seminars. These groups are:

* Arts and Humanities
* Social Studies and Sciences
* Physical Sciences
* Life and Medical Sciences

The project enhances the BASE corpus, which functions as a companion to the Michigan Corpus of Academic Spoken English (MICASE), a record of North American academic speech.

The corpus enables, amongst other things, the investigation of:
(i) the frequency and range of academic lexis;
(ii) the meaning and use of individual words and multi-word units;
(iii) the information structure and thematic structure of academic lectures;
(iv) the pace, density and delivery styles of academic lectures;
(v) patterns of interaction, including turn-taking and topic selection in seminars;
(vi) the representation of ideas and the expression of attitudes;
(vii) variation between British and American academic speech.
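
As an illustration of item (i), the sketch below counts how often each word occurs (frequency) and in how many texts it occurs (range). It is a minimal sketch, not the project's own tooling: the function names, the simple regular-expression tokenizer, and the three short sample texts are all invented for illustration and are not taken from BASE.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_and_range(transcripts):
    """Return (frequency, range) counters for a dict of {text_id: transcript}.

    frequency[w] = total occurrences of w across all texts
    range[w]     = number of texts in which w occurs at least once
    """
    frequency = Counter()
    text_range = Counter()
    for text in transcripts.values():
        tokens = tokenize(text)
        frequency.update(tokens)        # every occurrence counts
        text_range.update(set(tokens))  # each text counts once per word
    return frequency, text_range


# Hypothetical sample texts, invented for this example (not BASE data).
toy_transcripts = {
    "lecture_a": "The data suggest that the hypothesis holds.",
    "lecture_b": "We analyse the data and test the hypothesis.",
    "seminar_a": "So what does the evidence suggest?",
}

frequency, text_range = frequency_and_range(toy_transcripts)
for word in ("data", "hypothesis", "evidence"):
    print(word, frequency[word], text_range[word])
```

A word with high frequency but low range is concentrated in a few texts, whereas a word with high range is spread across many; studies of academic lexis typically consider both measures together.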

BASE will remain a record of British spoken academic discourse at the turn of the twenty-first century, and may also be compared with corpora compiled in the future to investigate diachronic change in academic language use.