How do I strip the POS tags from BNC texts and save them as .txt? Program available for download!

chrisyang

Regular member
The question is as stated in the title.

[This post was edited by the author on 2005-12-29 at 16:58:54]

[This post was edited by xujiajin on 2005-12-30 at 10:15:56]
 
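Try the find-and-replace function in PowerGREP or in any of the many other programs that offer one.
WST4 and MonoConc can both hide the tags, and doing so does not affect word frequency counts or concordance searches.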
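
For anyone less familiar with regular expressions, the find-and-replace idea can be sketched in a few lines of Python. This is only an illustration. It assumes SGML-style markup in which each word is preceded by a tag such as `<w NN1>` and each punctuation mark by `<c PUN>`. The function name `strip_pos_tags` and the sample line are made up for the example and are not taken from an actual BNC file.

```python
import re

# Any SGML-style tag: "<", then one or more characters that are not ">", then ">".
TAG = re.compile(r"<[^>]+>")

def strip_pos_tags(line: str) -> str:
    """Remove every <...> tag and tidy up the spacing left behind."""
    text = TAG.sub("", line)
    # Collapse runs of whitespace into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    # Drop the space that would otherwise sit before punctuation.
    text = re.sub(r"\s+([.,;:?!])", r"\1", text)
    return text

# Hypothetical sample line in the assumed tag format.
sample = "<s n=001><w AT0>The <w NN1>cat <w VVD>sat <c PUN>."
print(strip_pos_tags(sample))
```

The core pattern, `<[^>]+>` replaced with nothing, should carry over to most regex-capable find-and-replace tools, though the exact syntax they accept may differ.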
 
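Many thanks! I am working with spoken data and only want to remove the POS tags so that I can re-annotate the texts as needed, and I have found this quite difficult. I tried EditPlus and PowerGREP earlier, but since I am not very familiar with regex, it was not very efficient.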
 
How can I reliably strip the POS tags from the selected BNC texts and convert them to .txt format?

Here is the program I used to detag the BNC before retagging it using the C7 tagset. It removes everything other than the original texts and transcripts. You will need to install Perl, which is free, in order to use the program. Then follow the steps below:

1) Make a new directory on the machine;
2) COPY the selected files to the dir;
3) Unzip the perl script into the same dir;
4) Double click the program file

A new file will be created for each BNC file, ending in .txt. These new files are what you want.

Warning: This program only works with BNC files.
http://www.corpus4u.org/upload/forum/2005122923225827.zip
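
The script itself is in the zip above and is not reproduced here. Purely as an illustration of the same workflow (one new .txt file for each input file in a directory), here is a rough Python sketch. It is not the Perl program. It assumes SGML-style files whose header ends with `</header>` or `</teiHeader>`. The names `detag_text` and `detag_directory`, the latin-1 encoding, and the list of skipped file extensions are all assumptions of the sketch.

```python
import re
from pathlib import Path

TAG = re.compile(r"<[^>]+>")  # any SGML/XML-style tag
# Assumed end-of-header markers; adjust to match the files actually used.
HEADER_END = re.compile(r"</(?:teiHeader|header)>", re.IGNORECASE)
# Assumed: files with these extensions are scripts, archives or earlier output.
SKIP_SUFFIXES = {".txt", ".py", ".pl", ".zip"}

def detag_text(raw: str) -> str:
    """Drop the header (if one is found) and all tags, keeping only the words."""
    match = HEADER_END.search(raw)
    body = raw[match.end():] if match else raw
    lines = (re.sub(r"\s+", " ", TAG.sub("", ln)).strip()
             for ln in body.splitlines())
    # Keep non-empty lines only; punctuation stays spaced as in the source.
    return "\n".join(ln for ln in lines if ln) + "\n"

def detag_directory(folder, encoding="latin-1"):
    """Write NAME.txt next to each input file in folder; return the new paths."""
    written = []
    # sorted() builds the full list first, so new .txt files are not revisited.
    for src in sorted(Path(folder).iterdir()):
        if not src.is_file() or src.suffix.lower() in SKIP_SUFFIXES:
            continue
        out = src.with_name(src.name + ".txt")
        out.write_text(detag_text(src.read_text(encoding=encoding)),
                       encoding=encoding)
        written.append(out)
    return written
```

Calling `detag_directory("my_bnc_files")` on a folder containing a file named `D8Y` would write `D8Y.txt` beside it. The folder name here is hypothetical.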
 
Re: How do I strip the POS tags from BNC texts and save them as .txt? Program available for download!

Many thanks, Richard! I managed to do it by following your steps, and it runs at an incredible speed! Attached are one spoken text, labelled D8Y in the BNC, and its .txt counterpart. Anyone interested in this topic may want to give it a try. http://www.corpus4u.org/upload/forum/2005123011331369.rar
 
Re: How do I strip the POS tags from BNC texts and save them as .txt? Program available for download!

Quoting the post by xiaoz, 2005-12-29 23:23:05:
Here is the program I used to detag the BNC before retagging it using the C7 tagset. It removes everything other than the original texts and transcripts. You will need to install Perl, which is free, in order to use the program. Then follow the steps below:

1) Make a new directory on the machine;
2) COPY the selected files to the dir;
3) Unzip the perl script into the same dir;
4) Double click the program file

A new file will be created for each BNC file, ending in .txt. These new files are what you want.

Warning: This program only works with BNC files.
http://www.corpus4u.org/upload/forum/2005122923225827.zip


Excuse me, where can I find the Perl script?
I've downloaded Perl but don't know how to carry out these steps.
Thank you!!!
 