Some free online Chinese corpora


xiaoz
2005-06-16, 11:13 PM
Academia Sinica Balanced Corpus of Modern Chinese
http://www.sinica.edu.tw/SinicaCorpus/

Peking University Modern Chinese Corpus
http://ccl.pku.edu.cn/ccl_corpus/xiandaihanyu/

Xiamen University corpora (registration required but free)
http://xmuoec.com/gb/hanyu/hanyu/data/corpus/index.htm

Beijing Language and Culture University corpus
http://202.112.195.8

Lancaster Corpus of Mandarin Chinese
http://bowland-files.lancs.ac.uk/corplang/cgi-bin/conc.pl

Leeds Chinese corpus
http://corpus.leeds.ac.uk/query-zh.html

PFR People's Daily corpus (01/1998)
http://bowland-files.lancs.ac.uk/corplang/pdcorpus/pdcorpus.htm

PH corpus (Xinhua newswire data 1990-1991)
http://bowland-files.lancs.ac.uk/corplang/phcorpus/phcorpus.htm

People's Daily 2000 corpus
http://bowland-files.lancs.ac.uk/corplang/pdc2000/default.htm

Peking University Ancient Chinese Corpus
http://ccl.pku.edu.cn/ccl_corpus/jsearch/index.jsp?dir=gudai

Sinica corpus of early Chinese
http://www.sinica.edu.tw/Early_Mandarin/

Sheffield Corpus of Chinese for Diachronic Linguistic Study
http://www.shef.ac.uk/scc/

xujiajin
2005-06-17, 01:25 AM
This is the most complete list of (freely available) Chinese corpora I have ever seen. Thanks a lot, Richard.

xujiajin
2005-06-17, 01:55 AM
Requesting that this thread be marked as a featured post.

xiaoz
2005-06-17, 05:46 AM
Done.

清风出袖
2005-06-27, 07:42 PM
Great! Yet we are really in need of a powerful concordancer for Chinese corpora, aren't we?

动态语法
2005-07-01, 04:19 AM
Quoting 清风出袖, posted at 2005-6-27 19:42:51:
Great! Yet we are really in need of a powerful concordancer for Chinese corpora, aren't we?

Have you tried Concordance by Rob Watt for the Windows PC platform? Is that powerful enough?

xujiajin
2005-07-01, 09:41 AM
That demo concordancer is very user-friendly, but it doesn't support Chinese, and it is just a demo version, right? Any good ideas for getting a fully functional, Chinese-compatible concordancer?

xujiajin
2005-07-01, 09:43 AM
Here I quote Xiaoz: "Concordance (by Rob Watt, available at http://www.concordancesoftware.co.uk/) can also work with Chinese and other East Asian languages - claims to work with any language, but have not tried that."

动态语法
2005-07-12, 12:27 PM
Quoting xujiajin, posted at 2005-7-1 9:41:43:
That demo concordancer is very user-friendly, but it doesn't support Chinese, and it is just a demo version, right? Any good ideas for getting a fully functional, Chinese-compatible concordancer?

The demo version IS the full version with the only restriction being that it is limited to one month's free use.

It does support Chinese and many other (Asian) languages since it is Unicode-based. Marjorie Chan has a user guide for working with Chinese texts with Concordance, which I believe was linked to somewhere on this site.

xiaoz
2005-07-12, 10:58 PM
See http://www.corpus4u.org/showthread.php?t=256

xujiajin
2005-07-12, 11:01 PM
That post is pasted here for your reference:

Concordancers and Concordances: Tools for Chinese Language Teaching and Research

Marjorie K.M. Chan

In Journal of the Chinese Language Teachers Association, Volume 37:2. May 2002.

This paper presents an introduction to concordancers, and to the concordancing of Chinese e-texts in particular. Demonstrations are given of searches using spaced and non-spaced source e-texts, with the concordance results presented in Keyword-in-Context (KWIC) display format. There are illustrations to accompany discussions of full-text concordances, and of concordances targeting specific words or phrases. The writer suggests how concordancers might be used in language-teaching and in conducting research on various linguistic phenomena of the Chinese language. An appendix compares several concordancing programs capable of handling Chinese e-texts.

She is Chinese. Her Chinese name is 陈洁雯. Her home page:
http://people.cohums.ohio-state.edu/chan9/
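The Keyword-in-Context (KWIC) display mentioned in the abstract, for both spaced and non-spaced source texts, can be illustrated roughly as follows. This is only a minimal sketch and not the method of any program compared in the paper: the function names `kwic_chars` and `kwic_tokens`, the `width` parameter, and the sample sentence are all assumptions made up for illustration.

```python
def kwic_chars(text, keyword, width=5):
    """KWIC over non-spaced text: context is counted in characters."""
    if not keyword:
        return []
    hits = []
    start = text.find(keyword)
    while start != -1:
        end = start + len(keyword)
        left = text[max(0, start - width):start]
        right = text[end:end + width]
        hits.append((left, keyword, right))
        # Continue searching one character further on.
        start = text.find(keyword, start + 1)
    return hits


def kwic_tokens(text, keyword, width=3):
    """KWIC over space-segmented text: context is counted in words."""
    tokens = text.split()
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword:  # exact match on a whole token only
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, keyword, right))
    return hits


if __name__ == "__main__":
    # Hypothetical sample sentence, not taken from any corpus listed here.
    raw = "我喜欢语料库语言学,他也喜欢语料库。"
    for left, key, right in kwic_chars(raw, "语料库", width=4):
        print(f"{left} [{key}] {right}")
```

With non-spaced text the search is a plain substring match, so 语料库 is also found inside 语料库语言学; with spaced (word-segmented) text only whole tokens match, which is one reason the two kinds of source text can give different concordances.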

xujiajin
2005-07-12, 11:02 PM
http://people.cohums.ohio-state.edu/chan9/articles/Chan_JCLTA-2002.pdf

cncorpus
2006-11-07, 12:01 AM
Great! Yet we are really in need of a powerful concordancer for Chinese corpora, aren't we?

A small online concordancer:

http://lingua.mtsu.edu/chinese-computing/concord/concordancer.php

xujiajin
2006-11-07, 08:18 AM
What kind of data are included in the online system?

mayerniu
2007-05-09, 11:13 PM
See http://www.corpus4u.com/forum_view.asp?forum_id=54&view_id=256

The corpora offered are still very useful to me today, so I am quite grateful to Dr. Xiaozi, Dr. Xu and Dr. Tao. Thank you all!

waynereed
2007-12-27, 03:56 PM
The URL of Xiamen University corpora has been changed:
http://www.xmuoec.com/gb/hanyu/hanyu/data/corpus/index.htm
You can also visit the author's personal website at:
http://www.luweixmu.com/

waynereed
2007-12-27, 03:58 PM
Academia Sinica Balanced Corpus of Modern Chinese
http://www.sinica.edu.tw/SinicaCorpus/

Peking University Modern Chinese Corpus
http://ccl.pku.edu.cn/ccl_corpus/xiandaihanyu/

Xiamen University corpora (registration required but free)
http://www.xmuoec.com/gb/hanyu/hanyu/data/corpus/index.htm

Beijing Language and Culture University corpus
http://202.112.195.8:8089/ccir_login?input=*

Lancaster Corpus of Mandarin Chinese
http://bowland-files.lancs.ac.uk/corplang/cgi-bin/conc.pl

Leeds Chinese corpus
http://corpus.leeds.ac.uk/query-zh.html

PFR People's Daily corpus (01/1998)
http://bowland-files.lancs.ac.uk/corplang/pdcorpus/pdcorpus.htm

PH corpus (Xinhua newswire data 1990-1991)
http://bowland-files.lancs.ac.uk/corplang/phcorpus/phcorpus.htm

People's Daily 2000 corpus
http://bowland-files.lancs.ac.uk/corplang/pdc2000/default.htm

Peking University Ancient Chinese Corpus
http://ccl.pku.edu.cn/ccl_corpus/jsearch/index.jsp?dir=gudai

Sinica corpus of early Chinese
http://www.sinica.edu.tw/Early_Mandarin/

Sheffield Corpus of Chinese for Diachronic Linguistic Study
http://www.shef.ac.uk/scc/

waynereed
2007-12-27, 04:04 PM
The URL of Xiamen University corpora has been changed to:
http://www.xmuoec.com/gb/hanyu/hanyu/data/corpus/index.htm
It's available at the author's personal website:
http://www.luweixmu.com/

armstrong
2007-12-27, 08:05 PM
The URL of Xiamen University corpora has been changed to:
http://www.xmuoec.com/gb/hanyu/hanyu/data/corpus/index.htm
It's available at the author's personal website:
http://www.luweixmu.com/


Thanks a lot, waynereed.

jjn_001
2008-01-12, 09:20 PM
Thank you so much! I need them urgently for my thesis proposal report!

qhlonline
2008-04-29, 11:37 AM
Is there a corpus of Chinese SMS text messages?

lvkangpeng
2008-05-08, 11:28 PM
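The means of two samples are known. Which method should I use to find out whether the difference between the two values is statistically significant? Is it a one-sample t-test? Please advise!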
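
For reference, a one-sample t-test compares a single sample mean against a fixed reference value, so with two independent samples a two-sample test is the more usual choice. The means alone are not enough for that: each sample's standard deviation and size are needed as well. A minimal sketch of Welch's unequal-variance t statistic from such summary figures follows. The function name `welch_t` and the numbers in the usage example are made up for illustration, and the sketch assumes the two samples are independent.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's two-sample t-test from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical summary figures: mean, standard deviation, sample size.
    t, df = welch_t(10.0, 2.0, 20, 8.0, 2.0, 20)
    print(f"t = {t:.3f}, df = {df:.1f}")
```

The resulting t is then compared with the critical value of the t distribution for the computed degrees of freedom. If SciPy is available, `scipy.stats.ttest_ind_from_stats` with `equal_var=False` returns the p-value directly from the same summary figures.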

xierqionger
2008-06-15, 05:16 PM
Very useful! Thanks! But why can't I open the Peking University corpus?