Question / help request: looking for the BNC-spoken-CG texts

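How can I get hold of the conversation texts from the spoken part of the BNC?
(Mainly the "context-governed" part.)
I need the text itself for a paper, but I cannot find the corpus texts in the online version; all I have found so far are their file IDs, via the BNC Web Indexer. Any help or guidance would be much appreciated. Thank you!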
 
The BNC is a commercially distributed corpus, so for copyright reasons it is not advisable to share it on this forum. The corpus is also large: annotated with extensive metadata and POS information in XML format, it comes to about 4.35 gigabytes, which makes uploading and downloading another obstacle. What is more, even if you already know the names of the files containing spoken material, it would still be difficult to extract them and collect them in a single folder without a purpose-built tool.
 
Re: Question / help request: looking for the BNC-spoken-CG texts

Thank you for your reply! I am still wondering how to purchase the XML texts, so that I can process a small part of the XML edition. I have read some papers that selected a number of texts (e.g. 30 demographic texts) from the spoken part of the BNC. Could you tell me how that is done? I look forward to any reply. Thanks!
 
Re: Question / help request: looking for the BNC-spoken-CG texts

Thank you very much, Dr Xu!
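But my basic problem now is that I do not know how to find the spoken texts. I have only seen the transcription sampler of about 30 texts, and I still need some of the original corpus material...
http://www.natcorp.ox.ac.uk/docs/URG.xml?ID=cdifsp says "The spoken material transcribed for the BNC is also organized into ‘texts’", but these texts seem very hard to find!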
 
Re: Question / help request: looking for the BNC-spoken-CG texts

http://ucrel.lancs.ac.uk/bncindex/form.html
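It is shown very clearly there: make your selection and you will get the names of the files you want. If you have the BNC texts, you can use Sub-corpus_creator to separate out the spoken texts you need.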
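
For anyone who does hold a licensed copy, the "collect the files by name" step can be roughly approximated with a short script. This is only a sketch and not how Sub-corpus_creator itself works. It assumes each text is stored as one file named after its file ID with an `.xml` extension somewhere under the corpus folder; the function name `collect_texts` and all paths and IDs used with it are hypothetical.

```python
import shutil
from pathlib import Path


def collect_texts(corpus_root, file_ids, out_dir, suffix=".xml"):
    """Copy files whose name (minus extension) is in file_ids into out_dir.

    Assumption: one file per text, named <file ID><suffix>, anywhere
    below corpus_root. Keep out_dir outside corpus_root.
    Returns the sorted list of IDs for which no file was found.
    """
    # Normalise the IDs so that "kb0" and "KB0 " both match KB0.xml.
    wanted = {fid.strip().upper() for fid in file_ids if fid.strip()}
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    found = set()
    for path in Path(corpus_root).rglob("*" + suffix):
        stem = path.stem.upper()
        if stem in wanted:
            shutil.copy2(path, out / path.name)  # copy, leave original in place
            found.add(stem)
    return sorted(wanted - found)
```

With the file IDs from the index page saved one per line in a text file, a call such as `collect_texts("path/to/BNC", open("ids.txt").read().split(), "spoken_subset")` would gather the matching files into one folder and return any IDs it could not find.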

If you do not have the whole BNC, then there is nothing to be done.
 
Re: Question / help request: looking for the BNC-spoken-CG texts

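Thank you, Professor Xu. Sorry, I may not have made myself clear at the start: the problem is precisely that I do not have the whole BNC, only a small part of it. The full corpus seems to have to be purchased, but I do not know where it can be bought.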
 