# HH Clark's statistical method

#### Attachment

• 555.1 KB Views: 55

#### qhdjason

uh uhm
0 1
1 0
0 0
1 1
1 0

0 means no delay; 1 means a delay. Computed this way, the mean is also a percentage. For the uh column:

(0+1+0+1+1) / 5 = 0.6 = 60%
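
The same arithmetic can be checked in R. This is only a minimal sketch: the vector names `uh` and `uhm` are illustrative labels for the two columns of the table above.

```r
# Delay indicators copied from the table above: 0 = no delay, 1 = delay
uh  <- c(0, 1, 0, 1, 1)
uhm <- c(1, 0, 0, 1, 0)

# The mean of a 0/1 vector is the proportion of 1s
mean(uh)   # 0.6, i.e. 60% of the uh cases have a delay
mean(uhm)  # 0.4, i.e. 40% of the uhm cases have a delay
```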


#### qhdjason

The independent variable is the filler. It is a categorical variable with two levels: Level 1: uh; Level 2: um.
The dependent variable is the pause duration. It is a continuous variable.

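When the independent variable is categorical and the dependent variable is continuous, use analysis of variance (ANOVA). With only one independent variable, use one-way ANOVA. Note that what is compared is the difference between the means of the dependent variable (as shown in Figure 2 on page 83).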

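You can try this out in the statistical software R. Put the data from the attachment into My Documents, open R, and enter the commands below (they assume the file has already been read into a data frame named `data`):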
attach(data)
oneway.test(LENGTH ~ FILLER)

This gives the following result:
F = 0.1245, num df = 2.000, denom df = 637.164, p-value = 0.883

In a paper you need to report:
(1) the F value: 0.1245
(2) the numerator and denominator degrees of freedom: 2; 637 (the filler in my data has three levels, so the numerator df is 2; the paper you gave has two levels, so there it is 1)
(3) the significance level (p-value): 0.883
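
The three values can also be read directly from the object that `oneway.test` returns, rather than copied from the printed output. The sketch below uses simulated data in place of the attachment: the data frame `d`, the level names, and the numbers it produces are all hypothetical; only the column names `LENGTH` and `FILLER` come from the commands above.

```r
# Hypothetical stand-in for the attached data: one filler factor with three levels
set.seed(1)
d <- data.frame(
  FILLER = factor(rep(c("silence", "uh", "uhm"), each = 50)),
  LENGTH = rnorm(150, mean = 900, sd = 200)
)

res <- oneway.test(LENGTH ~ FILLER, data = d)

res$statistic  # (1) the F value
res$parameter  # (2) numerator and denominator degrees of freedom
res$p.value    # (3) the p-value

# Assemble the pieces into one report string
report <- sprintf(
  "F(%.0f, %.1f) = %.2f, p = %.3f",
  res$parameter[["num df"]],
  res$parameter[["denom df"]],
  unname(res$statistic),
  res$p.value
)
report
```

With three levels the numerator df is 3 - 1 = 2, as in point (2) above.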

The data come from Gries's book: Statistics for Linguistics with R

#### Attachment

• 27.1 KB Views: 10

#### miaohy

Could you share Statistics for Linguistics with R? I need to cram on statistics.

#### Haiyang Ai

Staff member

There is a lot of discussion about R at http://cos.name.

#### miaohy

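Does denom df mean the denominator degrees of freedom? Degrees of freedom are always integers, aren't they? But the result shows a fractional denom df.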

#### qhdjason

http://en.wikipedia.org/wiki/Degrees_of_freedom_(statistics)

"In some complicated settings, such as unbalanced split-plot designs, the sums-of-squares no longer have scaled chi-squared distributions. Comparison of sum-of-squares with degrees-of-freedom is no longer meaningful, and software may report certain fractional 'degrees of freedom' in these cases. Such numbers have no genuine degrees-of-freedom interpretation, but are simply providing an approximate chi-squared distribution for the corresponding sum-of-squares. The details of such approximations are beyond the scope of this page."

I think the mathematical foundations of many statistical methods are very complex, and it takes a systematic course to get a grasp of them. I asked a mathematics teacher at my school. He said statistics is based on calculus and linear algebra, so if you really want to know all the details you had better read some very technical books; otherwise, just make do with the results ...

You can't find the complete mathematical interpretation in the "introductory" kind of books. That's why most of the books about "linguistics and statistics" won't answer your question.
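
For the R output earlier in this thread there is a more specific likely reason: `oneway.test` does not assume equal variances by default (`var.equal = FALSE`) and then uses a Welch-type approximation for the denominator df, which is generally not an integer. A minimal sketch on simulated data (the data frame `d`, its level names, and its group sizes are hypothetical):

```r
# Hypothetical data: three groups with unequal sizes and unequal variances
set.seed(42)
d <- data.frame(
  FILLER = factor(rep(c("a", "b", "c"), times = c(40, 60, 80))),
  LENGTH = c(rnorm(40, 900, 100), rnorm(60, 900, 200), rnorm(80, 900, 300))
)

# Default call: Welch-type test, equal variances not assumed
welch <- oneway.test(LENGTH ~ FILLER, data = d)

# Classical one-way ANOVA: equal variances assumed
classic <- oneway.test(LENGTH ~ FILLER, data = d, var.equal = TRUE)

welch$parameter    # denom df is an approximation, generally fractional
classic$parameter  # denom df is N - k = 180 - 3 = 177
```

So the integer df that introductory books describe belong to the classical test; the fractional value comes from the approximation.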


#### qhdjason

In the attached paper, the degrees of freedom of the t-test (p. 129) are fractional. Maybe we need to spell out the software and algorithm used in our own papers.
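
Fractional df for a t-test usually point to Welch's t-test, which is also what R's `t.test` runs by default. Whether the attached paper used that test is only a guess. The sketch below uses simulated data (`x` and `y` are hypothetical samples) to show the difference:

```r
# Hypothetical samples with different sizes and different spreads
set.seed(7)
x <- rnorm(30, mean = 10, sd = 1)
y <- rnorm(45, mean = 10, sd = 3)

welch   <- t.test(x, y)                    # default: var.equal = FALSE (Welch)
student <- t.test(x, y, var.equal = TRUE)  # classical two-sample t-test

welch$parameter    # approximate df, generally fractional
student$parameter  # df = 30 + 45 - 2 = 73
```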

#### Attachment

• 304.6 KB Views: 2