How can I quickly annotate the errors in a corpus?

清风出袖

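Senior Member
Is there dedicated software for automatic annotation? If not, could someone point me to a successful example of manual annotation? Also, what are the main things to watch out for when annotating by hand? Thanks for any replies!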
 
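What type of annotation do you have in mind?
Part-of-speech (POS) tagging for English and Chinese can now largely be done automatically with fairly high accuracy.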
 
It is not clear what type of errors you mean: learner errors in learner corpora, orthographic typos in the original texts of native corpora, or errors introduced by automatic processing?
 
First, do a pilot annotation on a small sample of your corpus and try to identify the major types of errors. Remember to make your tags easy to read and consistent, and then apply them to the whole corpus. The 62 categories of errors of CLEC can be a reference for your annotation.
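One way to keep tags consistent during the pilot is to check them mechanically. The following is only a minimal sketch: it assumes errors are marked inline with a bracketed tag right after the error, and the tag set (`SP`, `VT`, `AGR`, `WC`) and the function name `count_error_tags` are made up for illustration. They are not the CLEC scheme.

```python
import re
from collections import Counter

# Hypothetical tag set for illustration only (not the CLEC categories).
ALLOWED_TAGS = {"SP", "VT", "AGR", "WC"}

# Assumed convention: a tag in square brackets follows the error, e.g. "recieve [SP]".
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def count_error_tags(text):
    """Return (counts of allowed tags, sorted list of tags outside the scheme)."""
    counts = Counter()
    unknown = set()
    for raw in TAG_PATTERN.findall(text):
        tag = raw.strip()
        if tag in ALLOWED_TAGS:
            counts[tag] += 1
        else:
            # Case variants or misspelled tags end up here for manual review.
            unknown.add(tag)
    return counts, sorted(unknown)


sample = "He go [AGR] to school yesterday [VT] and recieve [SP] a prize [sp]."
counts, unknown = count_error_tags(sample)
print(counts)   # frequency of each allowed tag in the sample
print(unknown)  # tags that do not match the scheme
```

Running this over the pilot sample gives a rough picture of which error types dominate, and the list of unknown tags catches slips such as `[sp]` for `[SP]` before the scheme is applied to the whole corpus.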

You can follow the directions in the post 用Word 制作机助附码工具:不会编程也能做 ("Building a computer-assisted tagging tool with Word: no programming required") http://www.corpus4u.com/forum_view.asp?forum_id=7&view_id=678 to do your annotation, but do not expect there to be such a thing as "faster" in manual annotation, because the speed depends on your interpretation of the errors.
 
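I just want to mark the language errors in students' original compositions and classify them. Is there a good way to do this that is simple and fast? Thanks! Looks like my Chinese is getting rusty too, haha!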
 
There is no reliable way to detect learner errors automatically. Error tagging is currently done by hand, but attention must be paid to consistency in human annotation.
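A common way to put a number on consistency is to have two annotators tag the same errors and compute Cohen's kappa, which corrects raw agreement for agreement expected by chance. This is a sketch under assumptions: the function name `cohen_kappa`, the tag labels, and the sample data are invented for illustration, and it presumes both annotators labelled the same list of error positions.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items.")
    n = len(labels_a)
    # Observed agreement: share of items given the same tag by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: based on how often each annotator uses each tag.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[tag] * freq_b[tag] for tag in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one and the same tag throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical tags assigned by two annotators to the same six errors.
annotator_1 = ["SP", "VT", "AGR", "SP", "WC", "VT"]
annotator_2 = ["SP", "VT", "SP", "SP", "WC", "AGR"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Values near 1 indicate close agreement and values near 0 indicate agreement no better than chance. A low value on a pilot sample usually means the tag definitions need tightening before the whole corpus is annotated.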
 
As error tags are interpretive in nature, you will come up with many research questions during the time-consuming annotation process. In this sense, annotation is analysis.
 
Annotation is analysis, indeed! Analysis takes time, so building and tagging a corpus really is back-breaking and nerve-racking. Thanks a lot!
 
Basically, I am not implying that annotation is time- and energy-consuming when I refer to it as analysis. I mean that it is a rewarding linguistic investigation.
 