What exactly is the difference between CLAWS and TreeTagger tagging?

Posted by 我是水 on 2016-02-05 in the "Corpus Annotation" forum

Poll: What exactly is the difference between CLAWS and TreeTagger tagging?

  1. annotation

    0 votes
    0.0%
  2. claws

    0 votes
    0.0%
  3. tree tagger

    1 vote
    100.0%
Multiple choices allowed.
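  1. What exactly is the difference between CLAWS and TreeTagger tagging, and what are the strengths and weaknesses of each? I see that CLAWS mainly works with ASCII text files; can it also be used with UTF-8 text files? TreeTagger seems to accept UTF-8; is my understanding correct? Any help would be appreciated. Thank you.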
     
  2. xujiajin

    xujiajin Administrator, Staff Member

    CLAWS has a more fine-grained tagset, which can assign over 100 PoS codes to English words. Some find these distinctions very helpful, while others criticise so many codes as unnecessary.
    TreeTagger has a much simpler tagset of 30-something codes.
    Both tools reportedly achieve tagging accuracy of 95% or higher, which means either can be used for our PoS tagging tasks.

    CLAWS is commercial, whereas TreeTagger is free.

    If your texts are not ASCII-encoded, they can be converted with existing tools without any difficulty; plenty of tools and scripts are available. So encoding is not a deciding factor when choosing between the two taggers.
     
    Last edited: 2016-02-06
  3. Thank you, Professor Xu. Even over the New Year holiday you are still busy answering my questions. My sincere thanks.
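
The encoding conversion mentioned in xujiajin's reply can be sketched in a few lines of Python. This is only an illustration, not part of CLAWS or TreeTagger: the names `PUNCTUATION_MAP`, `to_ascii`, and `convert_file` are hypothetical, and the approach assumes English text in which the non-ASCII characters are mostly accented letters and typographic punctuation.

```python
import unicodedata

# Hypothetical replacement table for typographic characters that should map
# to plain ASCII punctuation; extend it to suit your own corpus.
PUNCTUATION_MAP = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en dash, em dash
    "\u2026": "...",                # horizontal ellipsis
}


def to_ascii(text: str) -> str:
    """Return an ASCII-only approximation of `text` (lossy)."""
    # Replace typographic punctuation with ASCII equivalents first.
    text = text.translate(str.maketrans(PUNCTUATION_MAP))
    # NFKD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop anything that still has no ASCII form (e.g. the combining marks).
    return decomposed.encode("ascii", errors="ignore").decode("ascii")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read a UTF-8 file and write an ASCII-only copy."""
    with open(src_path, encoding="utf-8") as src:
        converted = to_ascii(src.read())
    with open(dst_path, "w", encoding="ascii") as dst:
        dst.write(converted)
```

Note that the conversion is lossy: characters with no ASCII counterpart, such as Chinese characters, are silently dropped, so it only makes sense for English-language texts. Keeping the original UTF-8 files alongside the converted copies is a sensible precaution.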