What software can analyse collocations of Chinese words?
rainbow
2007-01-11, 10:24 AM
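A question for the Chinese corpus-retrieval experts here:
What software can analyse Chinese key word in context (KWIC) the way WordSmith does, retrieving collocations and counting how often they occur? Thanks in advance.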
armstrong
2007-01-11, 10:52 AM
WordSmith 4.0 can do it, provided the text has been word-segmented first. AntConc and MonoConc Pro also work.
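To illustrate what these tools compute once the text is segmented, here is a minimal Python sketch of a KWIC listing and a collocate frequency count. It is not how WordSmith, AntConc or MonoConc Pro work internally; the function names `kwic` and `collocates`, the window size `span` and the sample sentence are all made up for illustration, and the text is assumed to be already segmented with spaces between words.

```python
from collections import Counter


def kwic(tokens, node, span=3):
    """Return (left context, node, right context) for every hit of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            lines.append((" ".join(left), tok, " ".join(right)))
    return lines


def collocates(tokens, node, span=3):
    """Count every word that occurs within `span` tokens of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts


# Hypothetical sample: already segmented, one space between words.
segmented = "我 喜欢 喝 绿茶 他 也 喜欢 喝 咖啡"
tokens = segmented.split()

for left, node, right in kwic(tokens, "喝", span=2):
    print(f"{left}  [{node}]  {right}")

print(collocates(tokens, "喝", span=2).most_common(3))
```

Raw co-occurrence counts like these are only the starting point; the dedicated tools add sorting, filtering and association measures on top.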
rainbow
2007-01-11, 10:59 AM
Thanks, armstrong. I put a space before every character, but WordSmith 4 says it cannot find any entry. Could you show me exactly how to use it?
laohong
2007-01-11, 11:34 AM
Please read the following threads and their replies:
Using AntConc to process Chinese: concordance, wordlist, N-gram
http://forum.corpus4u.org/showthread.php?t=1714&page=11&highlight=antconc
AntConc 3.2 adds a file-based concordancing feature
http://forum.corpus4u.org/showthread.php?t=2345&highlight=antconc
xujiajin
2007-01-11, 03:50 PM
Chinese text has to be saved as Unicode before it is loaded into WordSmith.
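If the files are not Unicode yet, the re-saving step can be scripted. The sketch below rests on two assumptions that the thread does not confirm: that the source files are in a GB encoding (GB18030 is used because it also covers GB2312 and GBK), and that "Unicode" here means UTF-16 with a byte-order mark, which is what Windows programs usually mean by that label. The function name `to_utf16` and the file names are hypothetical.

```python
def to_utf16(src_path, dst_path, src_encoding="gb18030"):
    """Re-save a text file as UTF-16 (Python's codec writes a BOM first).

    src_encoding is an assumption; change it if the files use another encoding.
    """
    # newline="" keeps the original line endings untouched in both directions.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-16", newline="") as dst:
        dst.write(text)
```

A call would look like `to_utf16("corpus_gb.txt", "corpus_utf16.txt")`. Saving from a text editor with the encoding set to Unicode should have the same effect.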
rainbow
2007-01-11, 07:36 PM
Thanks again, laohong and Dr. Xu.
xiaoz
2007-01-11, 07:46 PM
...and Xaira, which is free but requires the text to be marked up in XML.
rainbow
2007-01-11, 08:13 PM
Thanks, xiaoz, but I would like to analyse plain .txt files. Is there any way to do that if I use Xaira for the research?
xiaoz
2007-01-11, 08:33 PM
.txt is a file format (for plain text), while XML is a mark-up format. An XML document can also be saved as a .txt file.
To analyse collocation in Chinese text using Xaira, you must first of all tokenise/segment the text using software such as ICTCLAS; the result would look like: word1/tag1 word2/tag2... Then you can convert the word-tag pairs from the slash style to XML style - I uploaded some Perl scripts for such jobs to the site last year (just search the site). It is easy to make the whole file XML-compliant: just add a tag at the beginning and end of the file, either by hand or using Preprocessing in the Xaira Indexer tool.
After indexing your corpus, you can then use the Xaira client with your corpus.
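For anyone who would rather not hunt for the Perl scripts, the slash-to-XML conversion can be sketched in Python as follows. This is not the uploaded script. The element names `<text>`, `<s>` and `<w>`, the `pos` attribute, and the choice to treat each input line as one `<s>` unit are all assumptions made for illustration, so check what your Xaira indexing setup actually expects before using the output.

```python
from xml.sax.saxutils import escape, quoteattr


def tagged_to_xml(tagged_text, root="text"):
    """Turn 'word1/tag1 word2/tag2' lines into a small well-formed XML document."""
    out = [f"<{root}>"]
    for line in tagged_text.splitlines():
        pairs = line.split()
        if not pairs:
            continue  # skip blank lines
        words = []
        for pair in pairs:
            # Split on the LAST slash so a word that itself contains '/' survives.
            word, sep, tag = pair.rpartition("/")
            if not sep:
                word, tag = pair, ""  # token without a tag
            # escape/quoteattr protect &, < and > so the output stays well-formed.
            words.append(f"<w pos={quoteattr(tag)}>{escape(word)}</w>")
        out.append("<s>" + "".join(words) + "</s>")
    out.append(f"</{root}>")
    return "\n".join(out)


# Hypothetical segmenter output in word/tag form.
print(tagged_to_xml("我/r 喜欢/v 喝/v 绿茶/n"))
```

The opening and closing root element here plays the role of the tag added at the beginning and end of the file.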
rainbow
2007-01-12, 10:22 PM
:) many thanks for xiaoz's patient explanation.
yuliao
2007-06-05, 04:06 PM
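Have a look at this site:
中文助手 (Chinese Helper)
www.chinesehelper.cn (http://www.chinesehelper.cn)
The collocation-strength figures shown under each word's definition are collocate frequency data computed with language-processing software.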
laohong
2007-06-05, 05:43 PM
Sketch Engine
http://www.sketchengine.co.uk/