Search results

  1. I

    BFSU Sentence Collector 1.0: a corpus-based tool for extracting English example sentences

    Re: BFSU Sentence Collector 1.0: a corpus-based tool for extracting English example sentences It is possible to use our own corpus in Sentence Collector 1.0, with a little tweaking of sentence segmentation, sorting, new-word marking, file format conversion, index configuration, etc. ... But I am not quite sure whether that is what the...
  2. I

    Dr Zhang Le's software for NLP

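    Dr Zhang Le graduated from Northeastern University (东北大学), wrote a Linux version of ICTCLAS long ago, and now works at the University of Edinburgh. He has developed many natural language processing tools; Morphix-NLP (published in 2003) is especially worth a look. Morphix-NLP Package List Tokenization Qtoken - a portable tokeniser MXTERMINATOR ICTCLAS Part-of-Speech Tagger Brill's TBL Tagger MXPOST...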
  3. I

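    The Japanese corpus is now up

    Re: The Japanese corpus is now up Thanks, now there is one more place to go for learning Japanese, and the interface has a very Japanese aesthetic. I caught a bug: in Firefox, the search results are all shifted to the right.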
  4. I

    Yacsi: Another ICTCLAS 2012 GUI

    Re: Yacsi: Another ICTCLAS 2012 GUI Hi all, Yacsi 0.96 is out with two features added and one bug fixed. The previous Yacsi 0.95+ turned out to be a premature release and ineffective in reporting problems correctly (sorry for that, guys). So if you are a perfectionist, please consider...
  5. I

    Yacsi: Another ICTCLAS 2012 GUI

    Re: Yacsi: Another ICTCLAS 2012 GUI Yacsi 0.95+ One bug fixed: the program froze when segmenting large quantities of files (no real harm in terms of linguistic analysis, just a poor user experience)
  6. I

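    Let's talk: what exactly has gone wrong with specialized corpora?

    Re: Let's talk: what exactly has gone wrong with specialized corpora? The loose connection between foreign-language programs and other disciplines is probably the main reason; there may also be questions of academic leadership between disciplines, as well as copyright issues.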
  7. I

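    What is the relationship between Youdao and corpora?

    Re: What is the relationship between Youdao and corpora? Youdao Dictionary has example sentences with original audio; technically, the video and the text are aligned at the sentence level. I did not expect multimodal corpora to be commercialized so quickly. I have been out of touch for years and am behind the times. :-)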
  8. I

    Yacsi: Another ICTCLAS 2012 GUI

    Re: Yacsi: Another ICTCLAS 2012 GUI Hi all, Two features are added in Yacsi 0.95. Thanks for your interest and suggestions. I also updated Yacsi 0.93 by fixing the "查看结果" ("View results") problem (now termed Yacsi 0.93+) for those people who always see beauty in simplicity. As always, suggestions are...
  9. I

    Yacsi: Another ICTCLAS 2012 GUI

    Re: Yacsi: Another ICTCLAS 2012 GUI Hi all, One bug fixed and one feature added in Yacsi 0.94. Thanks for your suggestions. Regards, iCasino History: 2012/03/05 Yacsi 0.9 2012/03/06 Yacsi 0.92 2012/03/11 Yacsi 0.93 2012/03/14 Yacsi 0.94
  10. I

    Yacsi: Another ICTCLAS 2012 GUI

    Re: Yacsi: Another ICTCLAS 2012 GUI Hi all, Two bugs are fixed and one feature added in Yacsi 0.93. Please consider the newer version for better use. Regards, iCasino History: 2012/03/05 Yacsi 0.9 2012/03/06 Yacsi 0.92 2012/03/11 Yacsi 0.93 Bugs fixed in 0.93...
  11. I

    Apache OpenNLP

    Re: Apache OpenNLP Great stuff for building language-based web applications. Thanks.
  12. I

    Yacsi: Another ICTCLAS 2012 GUI

    Re: [Download] Yacsi: An ICTCLAS2012 GUI Dr Zhang Huaping said recently (2010?) that the improvement of newer ICTCLAS lies in speed rather than accuracy (around 98%). That's an inherent limitation of statistical POS tagging, plus some difficulty of tagging Chinese even for human experts, I...
  13. I

    Yacsi: Another ICTCLAS 2012 GUI

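    Re: [Download] Yacsi: An ICTCLAS2012 GUI Dr Zhang said: "For users' convenience, starting from this release (ICTCLAS2012-SDK-u0106.rar), the DLL to call is always named ICTCLAS2011.dll and will no longer change. Ordinary users only need to replace the DLL and the corresponding .user license file to stay compatible with new versions of the segmenter, with no need to recompile their own programs." So I guess we still need to update the dll and .user file periodically to keep up with the pace of ICTCLAS...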
  14. I

    Yacsi: Another ICTCLAS 2012 GUI

    [Download] Yacsi: An ICTCLAS2012 GUI The link contains all previous versions of the YACSI word segmentation tool. http://ishare.iask.sina.com.cn/f/24241229.html Hi all, I've developed another graphical user interface to ICTCLAS (version 2012), named Yacsi. For this program to start up, you need to download ICTCLAS2012-SDK-0101.rar from...
  15. I

    How to scrape a corpus from web pages

    Re: How to scrape a corpus from web pages You might try Nutch if you are comfortable with Java. http://nutch.apache.org/
  16. I

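    For collocation, a z-score of 2 or above is generally taken as significant; if MI is used as the measure instead, what value is generally accepted as significant?

    Re: For collocation, a z-score of 2 or above is generally taken as significant; if MI is used as the measure instead, what value is generally accepted as significant? I would like to add to that remark about Fisher. I mainly drew on Andy Field's discussion of the p<0.05 value in "Discovering Statistics Using SPSS" (3rd edition, page 51). He spends roughly 500 words on the question; the gist is that Fisher, constrained by paper space and computational effort, listed only a few reference values (0.05, 0.02, 0.01), and Fisher's "Statistical methods for research...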
  17. I

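    For collocation, a z-score of 2 or above is generally taken as significant; if MI is used as the measure instead, what value is generally accepted as significant?

    Re: For collocation, a z-score of 2 or above is generally taken as significant; if MI is used as the measure instead, what value is generally accepted as significant? Thanks to Dr. Xu for understanding, and thanks to Dr. Xu (and many other colleagues) for providing this platform for learning. I wish Corpus4U ever greater success.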
  18. I

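    How do you calculate the MI value of a phrase and its collocate?

    Re: How do you calculate the MI value of a phrase and its collocate? Haha, are you trying to start World War III? (just a joke) http://www.corpus4u.org/showthread.php?t=5827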
  19. I

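    For collocation, a z-score of 2 or above is generally taken as significant; if MI is used as the measure instead, what value is generally accepted as significant?

    Re: For collocation, a z-score of 2 or above is generally taken as significant; if MI is used as the measure instead, what value is generally accepted as significant? These are two questions of a different nature. What applied linguistics researchers observe is not something concrete and directly computable, so nobody disputes the cutoff; collocations in a corpus, by contrast, can be judged by eye from experience, which makes things more troublesome. Moreover, it was Church & Hanks (1990) who proposed that MI is suitable for lexical research, and they did not explicitly state an MI cutoff; they only said: author = {Kenneth Ward Church and Patrick Hanks}, title = {Word Association Norms, Mutual Information, and...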
  20. I

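    For collocation, a z-score of 2 or above is generally taken as significant; if MI is used as the measure instead, what value is generally accepted as significant?

    Re: For collocation, a z-score of 2 or above is generally taken as significant; if MI is used as the measure instead, what value is generally accepted as significant? That seems unlikely. No statistician would give an absolute value for this, because the critical value of this statistic varies from field to field. Only domain experts who use the statistic would offer a recommended value (based on experience). If we consider Hunston and McEnery not expert enough, then I'm afraid there is nothing more to be done. In practice, even this recommended cutoff does not apply to every corpus; it also depends on the corpus size and on how each individual word is distributed. As Dr. Xu himself pointed out, sometimes it is handier just to look at the ranking.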
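
For readers following the MI threads above, here is a minimal sketch of the score being discussed, assuming the usual Church & Hanks (1990) form MI = log2(P(x,y) / (P(x) P(y))), estimated from raw frequencies as log2(f(x,y) * N / (f(x) * f(y))). The function names and the toy counts are made up for illustration; they do not come from any of the posts or tools listed. The sketch ignores window-size adjustments that some concordancers apply, and it deliberately hard-codes no cutoff: it only ranks candidates, in line with the remark that looking at the ranking is sometimes handier.

```python
import math


def mutual_information(pair_freq, node_freq, collocate_freq, corpus_size):
    """Return MI in bits: log2 of observed over expected co-occurrence."""
    counts = (pair_freq, node_freq, collocate_freq, corpus_size)
    if any(c <= 0 for c in counts):
        raise ValueError("all frequencies must be positive")
    # Co-occurrences expected if node and collocate were independent.
    expected = node_freq * collocate_freq / corpus_size
    return math.log2(pair_freq / expected)


def rank_by_mi(node_freq, corpus_size, candidates):
    """Sort collocates by MI, highest first.

    candidates maps collocate -> (pair_freq, collocate_freq).
    """
    scored = [
        (word, mutual_information(pair, node_freq, freq, corpus_size))
        for word, (pair, freq) in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to keep the arithmetic easy to check.
    toy = {"strong": (8, 128), "the": (32, 65536)}
    for word, mi in rank_by_mi(64, 1048576, toy):
        print(f"{word}\t{mi:.2f}")
```

In this toy data the rarer collocate scores higher even though it co-occurs less often, which is one reason a single fixed cutoff is hard to defend across corpora of different sizes.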