Chomsky changed the direction of linguistics away from empiricism and towards rationalism in a remarkably short space of time. In doing so he apparently invalidated the corpus as a source of evidence in linguistic enquiry. Chomsky suggested that the corpus could never be a useful tool for the linguist, as the linguist must seek to model language competence rather than performance.
Competence is best described as our tacit, internalised knowledge of a language.
Performance is external evidence of language competence, and is usage on particular occasions when, crucially, factors other than our linguistic competence may affect its form.
Competence both explains and characterises a speaker's knowledge of a language. Performance, however, is a poor mirror of competence. For example, factors as diverse as short-term memory limitations or whether or not we have been drinking can alter how we speak on any particular occasion. This brings us to the nub of Chomsky's initial criticism: a corpus is by its very nature a collection of externalised utterances - it is performance data and is therefore a poor guide to modelling linguistic competence.
Furthermore, if we are unable to measure linguistic competence, how do we determine from any given utterance which performance phenomena are linguistically relevant? This is a crucial question, for without an answer to it we cannot be sure that what we are discovering is directly relevant to linguistics. We may easily be commenting on the effects of drink on speech production without knowing it.
However, this was not the only criticism that Chomsky had of the early corpus linguistics approach.