# Help needed with the chi-squared test

#### qhdjason

For your first question: yes, you can use a chi-squared test.

You may want to organize your data in the following way.

| AgeGroup | WordFrequency | OtherWordsFrequency |
|----------|---------------|---------------------|
| ag1      | a             | b                   |
| ag2      | c             | d                   |
| ag3      | e             | f                   |

This is a 3 x 2 contingency table. You can feed these counts into your favorite statistical package and run a chi-squared test.

The null hypothesis of this test is that the occurrence of the specific word is independent of age group, or, formulated mathematically:

P(i, j) = P(i) P(j)

Here i = 1, 2, 3 indexes the rows (age groups) and j = 1, 2 indexes the columns (the specific word vs. all other words).
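To make the independence formula concrete, here is a minimal sketch in R of the expected counts it implies. The counts are invented purely for illustration (they are not from the original question), and the names `counts` and `expected` are just placeholders.

```r
# Hypothetical counts, invented for illustration only:
# rows = age groups, columns = (target word, all other words)
counts <- matrix(c(30, 970,
                   50, 950,
                   80, 920),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(AgeGroup = c("ag1", "ag2", "ag3"),
                                 Word = c("target", "other")))

n <- sum(counts)

# Under independence, P(i, j) = P(i) P(j), so the expected count in
# cell (i, j) is n * P(i) * P(j) = (row total * column total) / n.
expected <- outer(rowSums(counts), colSums(counts)) / n
expected

# chisq.test() stores its own expected counts; compare them with ours.
fit <- chisq.test(counts)
all.equal(unname(expected), unname(fit$expected))
```

The chi-squared statistic then measures how far the observed counts are from these expected counts.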

To interpret the chi-squared test better, you can compute the residuals for each cell of the table.

Take my favorite statistical package, R, as an example.
First, enter the data like this (replace a to f with your actual counts):
`data <- matrix(c(a, b, c, d, e, f), nrow = 3, byrow = TRUE)`
Then run the chi-squared test:
`fit <- chisq.test(data)`
Finally, get the standardized residual for each cell:
`fit$stdres`

If any standardized residual is greater than 1.96 or less than -1.96, that cell differs from what the null hypothesis predicts at roughly the 5% significance level.
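Here is a small end-to-end sketch of the steps above. The counts are made up for illustration only, and `counts`, `fit` and `flagged` are just names I picked; substitute your own table.

```r
# Hypothetical counts, invented for illustration only:
# rows = age groups, columns = (target word, all other words)
counts <- matrix(c(30, 970,
                   50, 950,
                   80, 920),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(AgeGroup = c("ag1", "ag2", "ag3"),
                                 Word = c("target", "other")))

# Overall test of independence between age group and word use
fit <- chisq.test(counts)
fit$statistic   # chi-squared statistic
fit$parameter   # degrees of freedom: (3 - 1) * (2 - 1) = 2
fit$p.value

# Standardized residuals, one per cell
fit$stdres

# Cells whose residual lies outside +/- 1.96 (qnorm(0.975) is about 1.96)
flagged <- abs(fit$stdres) > qnorm(0.975)
which(flagged, arr.ind = TRUE)
```

In a table with only two columns, the two residuals in each row are equal in size with opposite signs, so it is enough to read the first column.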

#### lisang

>prop.test()
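A minimal sketch of what the `prop.test()` route could look like, assuming the same 3 x 2 layout as in the post above. The counts are invented for illustration only, and `successes` and `totals` are hypothetical names.

```r
# Hypothetical counts, invented for illustration only:
# rows = age groups, columns = (target word, all other words)
counts <- matrix(c(30, 970,
                   50, 950,
                   80, 920),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(AgeGroup = c("ag1", "ag2", "ag3"),
                                 Word = c("target", "other")))

successes <- counts[, "target"]   # occurrences of the word per age group
totals    <- rowSums(counts)      # all words per age group

# Test whether the word's proportion is the same in all three groups
pt <- prop.test(x = successes, n = totals)
pt$estimate   # observed proportion in each group
pt$p.value

# For more than two groups this is the same Pearson statistic
# that chisq.test() computes on the 3 x 2 table.
ct <- chisq.test(counts)
all.equal(unname(pt$statistic), unname(ct$statistic))
```

So for this kind of table the two functions answer the same question; `prop.test()` just reports the per-group proportions directly.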


#### zwq763


#### qhdjason

90 times, and you want to know whether the proportion this word takes up differs across the corpora.

(100/10,000) / (9900/10,000) = 1/99
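My reading of the arithmetic above, which is an assumption because the post is cut off: a word occurs 100 times in a corpus of 10,000 words, and the ratio is the odds of that word against all other words. A short R sketch of that reading, with hypothetical variable names:

```r
# Numbers taken from the arithmetic above; their interpretation
# (word occurrences vs. corpus size) is an assumption.
word_count <- 100
corpus_size <- 10000
other_count <- corpus_size - word_count   # 9900

p_word  <- word_count / corpus_size       # proportion of the word
p_other <- other_count / corpus_size      # proportion of all other words

# Odds of the word against all other words
odds <- p_word / p_other
all.equal(odds, 1 / 99)
```

Note that the odds (1/99) and the proportion (100/10,000 = 1/100) are close here only because the word is rare.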
