chisq.test(stats)
chisq.test() belongs to R package: stats
Pearson's Chi-squared Test for Count Data
Translator: 生物统计家园网, robot LoveR
----------Description----------
chisq.test performs chi-squared contingency table tests and goodness-of-fit tests.
----------Usage----------
chisq.test(x, y = NULL, correct = TRUE,
p = rep(1/length(x), length(x)), rescale.p = FALSE,
simulate.p.value = FALSE, B = 2000)
----------Arguments----------
Argument: x
a numeric vector or matrix. x and y can also both be factors.
Argument: y
a numeric vector; ignored if x is a matrix. If x is a factor, y should be a factor of the same length.
Argument: correct
a logical indicating whether to apply continuity correction when computing the test statistic for 2 by 2 tables: one half is subtracted from all |O - E| differences. No correction is done if simulate.p.value = TRUE.
Argument: p
a vector of probabilities of the same length as x. An error is given if any entry of p is negative.
Argument: rescale.p
a logical scalar; if TRUE then p is rescaled (if necessary) to sum to 1. If rescale.p is FALSE, and p does not sum to 1, an error is given.
Argument: simulate.p.value
a logical indicating whether to compute p-values by Monte Carlo simulation.
Argument: B
an integer specifying the number of replicates used in the Monte Carlo test.
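The effect of the correct argument can be checked by hand. The following is a minimal sketch, not the internal implementation: the 2-by-2 table is illustrative data, and the sketch assumes every |O - E| is at least one half (true for these counts), so that subtracting 0.5 never overshoots zero.

```r
# Illustrative 2-by-2 table of counts
tab <- matrix(c(12, 5, 7, 7), ncol = 2)

# Expected counts under independence: (row total * column total) / n
E <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Continuity-corrected statistic: subtract one half from every |O - E|
# (assumes |O - E| >= 0.5 in every cell, which holds for these data)
stat_corrected <- sum((abs(tab - E) - 0.5)^2 / E)

# Uncorrected Pearson statistic
stat_plain <- sum((tab - E)^2 / E)

fit_corrected <- chisq.test(tab)                 # correct = TRUE is the default
fit_plain     <- chisq.test(tab, correct = FALSE)

# Both comparisons should return TRUE
all.equal(unname(fit_corrected$statistic), stat_corrected)
all.equal(unname(fit_plain$statistic), stat_plain)
```

The corrected statistic is always the smaller of the two, so the corrected test is the more conservative one.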
----------Details----------
If x is a matrix with one row or column, or if x is a vector and y is not given, then a goodness-of-fit test is performed (x is treated as a one-dimensional contingency table). The entries of x must be non-negative integers. In this case, the hypothesis tested is whether the population probabilities equal those in p, or are all equal if p is not given.
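As a rough illustration of the goodness-of-fit case, the statistic and asymptotic p-value can be reproduced by hand. This is a sketch with illustrative counts and the default equal probabilities; the variable names are chosen here for the example.

```r
# Illustrative counts in three categories
x <- c(A = 20, B = 15, C = 25)

n <- sum(x)
p <- rep(1 / length(x), length(x))   # default: all categories equally likely
E <- n * p                           # expected counts under the null

# Pearson statistic: sum over categories of (O - E)^2 / E
stat <- sum((x - E)^2 / E)           # 2.5 for these counts

# Asymptotic p-value from a chi-squared distribution with k - 1 df
df   <- length(x) - 1
pval <- pchisq(stat, df, lower.tail = FALSE)

fit <- chisq.test(x)
all.equal(unname(fit$statistic), stat)   # should be TRUE
all.equal(fit$p.value, pval)             # should be TRUE
```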
If x is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table: the entries of x must be non-negative integers. Otherwise, x and y must be vectors or factors of the same length; cases with missing values are removed, the objects are coerced to factors, and the contingency table is computed from these. Then Pearson's chi-squared test is performed of the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals.
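The "product of the row and column marginals" can be made concrete with a short hand computation. This sketch uses the gender by party table from Agresti (2007) that also appears in the Examples; the table has more than two columns, so no continuity correction is involved.

```r
M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))
dimnames(M) <- list(gender = c("M", "F"),
                    party  = c("Democrat", "Independent", "Republican"))

# Expected counts under independence: outer product of the margins over n
E <- outer(rowSums(M), colSums(M)) / sum(M)

# Pearson statistic and its degrees of freedom, (rows - 1) * (cols - 1)
stat <- sum((M - E)^2 / E)
df   <- (nrow(M) - 1) * (ncol(M) - 1)
pval <- pchisq(stat, df, lower.tail = FALSE)

fit <- chisq.test(M)
all.equal(unname(fit$statistic), stat)               # should be TRUE
all.equal(as.vector(fit$expected), as.vector(E))     # should be TRUE
```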
If simulate.p.value is FALSE, the p-value is computed from the asymptotic chi-squared distribution of the test statistic; continuity correction is only used in the 2-by-2 case (if correct is TRUE, the default). Otherwise the p-value is computed for a Monte Carlo test (Hope, 1968) with B replicates.
In the contingency table case, simulation is done by random sampling from the set of all contingency tables with given marginals, and works only if the marginals are strictly positive. (A C translation of the algorithm of Patefield (1981) is used.) Continuity correction is never used, and the statistic is quoted without it. Note that this is not the usual sampling situation assumed for the chi-squared test but rather that for Fisher's exact test.
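The idea behind the contingency-table simulation can be sketched with r2dtable() from the stats package, which draws random tables with fixed margins. This is only an approximation of what chisq.test does internally: the add-one p-value formula and the direct >= comparison are assumptions of this sketch, and the data are illustrative.

```r
set.seed(1)                                   # for reproducibility of the sketch
tab <- matrix(c(12, 5, 7, 7), ncol = 2)       # illustrative 2-by-2 table
B   <- 2000                                   # number of Monte Carlo replicates

# Expected counts depend only on the margins, so they are the same
# for the observed table and for every simulated table
E <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Observed statistic, without continuity correction
stat_obs <- sum((tab - E)^2 / E)

# Draw B random tables with the observed row and column totals
sims <- r2dtable(B, rowSums(tab), colSums(tab))
stat_sim <- vapply(sims, function(s) sum((s - E)^2 / E), numeric(1))

# Monte Carlo p-value: share of simulated statistics at least as large
# as the observed one, counting the observed table itself (an assumption)
p_mc <- (1 + sum(stat_sim >= stat_obs)) / (B + 1)
p_mc
```

Because the margins are held fixed, the reference distribution here is the one used by Fisher's exact test rather than the usual chi-squared sampling model, which is why the simulated p-value can differ noticeably from the asymptotic one.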
In the goodness-of-fit case, simulation is done by random sampling from the discrete distribution specified by p, each sample being of size n = sum(x). This simulation is done in R and may be slow.
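The goodness-of-fit simulation can be imitated with rmultinom(). The sketch below uses the counts and probabilities from the Examples; the add-one p-value formula is an assumption of this sketch rather than a statement about the internal code.

```r
set.seed(1)                                  # for reproducibility of the sketch
x <- c(89, 37, 30, 28, 2)
p <- c(0.40, 0.20, 0.20, 0.19, 0.01)
n <- sum(x)
B <- 2000                                    # number of Monte Carlo replicates

E <- n * p                                   # expected counts under the null
stat_obs <- sum((x - E)^2 / E)

# Each column of sims is one multinomial sample of size n drawn from p
sims <- rmultinom(B, size = n, prob = p)

# E has one entry per row of sims, so it recycles correctly down each column
stat_sim <- colSums((sims - E)^2 / E)

# Monte Carlo p-value (add-one form, an assumption of this sketch)
p_mc <- (1 + sum(stat_sim >= stat_obs)) / (B + 1)
p_mc
```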
----------Value----------
A list with class "htest" containing the following components:
Component: statistic
the value of the chi-squared test statistic.
Component: parameter
the degrees of freedom of the approximate chi-squared distribution of the test statistic, NA if the p-value is computed by Monte Carlo simulation.
Component: p.value
the p-value for the test.
Component: method
a character string indicating the type of test performed, and whether Monte Carlo simulation or continuity correction was used.
Component: data.name
a character string giving the name(s) of the data.
Component: observed
the observed counts.
Component: expected
the expected counts under the null hypothesis.
Component: residuals
the Pearson residuals, (observed - expected) / sqrt(expected).
Component: stdres
standardized residuals, (observed - expected) / sqrt(V), where V is the residual cell variance (see Agresti, 2007, section 2.4.5 for the case where x is a matrix; V is n * p * (1 - p) otherwise).
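The two kinds of residuals can be recomputed from the observed and expected counts. This is a sketch for the matrix case only; it assumes the residual cell variance takes the form V = E * (1 - row proportion) * (1 - column proportion), which is my reading of the Agresti (2007) reference cited above.

```r
M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))

n       <- sum(M)
row_tot <- rowSums(M)
col_tot <- colSums(M)
E       <- outer(row_tot, col_tot) / n       # expected counts

# Pearson residuals
pearson <- (M - E) / sqrt(E)

# Standardized residuals, with the assumed cell variance
# V = E * (1 - row proportion) * (1 - column proportion)
V   <- E * outer(1 - row_tot / n, 1 - col_tot / n)
std <- (M - E) / sqrt(V)

fit <- chisq.test(M)
all.equal(as.vector(fit$residuals), as.vector(pearson))   # should be TRUE
all.equal(as.vector(fit$stdres), as.vector(std))          # should be TRUE
```

Standardized residuals are approximately standard normal under the null hypothesis, so they are easier to compare across cells than Pearson residuals.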
----------References----------
Hope, A. C. A. (1968). A simplified Monte Carlo significance test procedure. J. Roy. Statist. Soc. B 30, 582–598.
Patefield, W. M. (1981). Algorithm AS159. An efficient method of generating r x c tables with given row and column totals. Applied Statistics 30, 91–97.
Agresti, A. (2007). An Introduction to Categorical Data Analysis, 2nd ed., New York: John Wiley & Sons. Page 38.
----------See Also----------
For goodness-of-fit testing, notably of continuous distributions, ks.test.
----------Examples----------
## From Agresti (2007), p. 39
M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))
dimnames(M) <- list(gender = c("M", "F"),
                    party = c("Democrat", "Independent", "Republican"))
(Xsq <- chisq.test(M))  # Prints test summary
Xsq$observed   # observed counts (same as M)
Xsq$expected   # expected counts under the null
Xsq$residuals  # Pearson residuals
Xsq$stdres     # standardized residuals
## Effect of simulating p-values
x <- matrix(c(12, 5, 7, 7), ncol = 2)
chisq.test(x)$p.value           # 0.4233
chisq.test(x, simulate.p.value = TRUE, B = 10000)$p.value
                                # around 0.29!
## Testing for population probabilities
## Case A. Tabulated data
x <- c(A = 20, B = 15, C = 25)
chisq.test(x)
chisq.test(as.table(x))         # the same
x <- c(89, 37, 30, 28, 2)
p <- c(40, 20, 20, 15, 5)
try(
chisq.test(x, p = p)            # gives an error: p does not sum to 1
)
chisq.test(x, p = p, rescale.p = TRUE)
                                # works
p <- c(0.40, 0.20, 0.20, 0.19, 0.01)
                                # Expected count in category 5
                                # is 1.86 < 5 ==> chi square approx.
chisq.test(x, p = p)            # maybe doubtful, but is ok!
chisq.test(x, p = p, simulate.p.value = TRUE)
## Case B. Raw data
x <- trunc(5 * runif(100))
chisq.test(table(x))            # NOT 'chisq.test(x)'!
When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).