R language: help documentation for the cor() function

loveR · posted 2012-2-17 10:16:03

cor(stats)
cor() belongs to the R package: stats

Correlation, Variance and Covariance (Matrices)

Translator: LoveR, the 生物统计家园网 bot

----------Description----------

var, cov and cor compute the variance of x and the covariance or correlation of x and y if these are vectors. If x and y are matrices then the covariances (or correlations) between the columns of x and the columns of y are computed.

cov2cor scales a covariance matrix into the corresponding correlation matrix efficiently.
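The vector and matrix cases described above can be sketched as follows. The data and the column names a, b, u, v are made up for illustration and do not come from the help page.

```r
# Two vectors: a single covariance / correlation value
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
cov(x, y)
cor(x, y)

# Two matrices: one entry per (column of X, column of Y) pair
X <- cbind(a = x, b = x^2)
Y <- cbind(u = y, v = rev(y))
C <- cor(X, Y)
dim(C)        # rows follow the columns of X, columns follow the columns of Y
C["a", "u"]   # the same value as cor(x, y)
```

Each entry of C is the correlation of one column of X with one column of Y, so a single call replaces a loop over column pairs.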

----------Usage----------

var(x, y = NULL, na.rm = FALSE, use)

cov(x, y = NULL, use = "everything",
   method = c("pearson", "kendall", "spearman"))

cor(x, y = NULL, use = "everything",
   method = c("pearson", "kendall", "spearman"))

cov2cor(V)

----------Arguments----------

Argument: x
a numeric vector, matrix or data frame.

Argument: y
NULL (default) or a vector, matrix or data frame with compatible dimensions to x. The default is equivalent to y = x (but more efficient).

Argument: na.rm
logical. Should missing values be removed?

Argument: use
an optional character string giving a method for computing covariances in the presence of missing values. This must be (an abbreviation of) one of the strings "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs".

Argument: method
a character string indicating which correlation coefficient (or covariance) is to be computed. One of "pearson" (default), "kendall", or "spearman"; it can be abbreviated.

Argument: V
symmetric numeric matrix, usually positive definite such as a covariance matrix.
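As a small illustration of the abbreviation rule for use and method, the sketch below compares the full strings with unambiguous prefixes. The vectors are invented for the example.

```r
# Illustrative data with one missing value in each vector
x <- c(1, 2, NA, 4, 5, 6)
y <- c(2, 1, 4, NA, 6, 5)

full  <- cor(x, y, use = "pairwise.complete.obs", method = "spearman")
short <- cor(x, y, use = "pair", method = "spear")  # unambiguous prefixes

identical(full, short)
```

Spelling the options out in full is usually easier to read in scripts; the short forms are mainly convenient at the console.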

----------Details----------

For cov and cor one must either give a matrix or data frame for x or give both x and y.

The inputs must be numeric (as determined by is.numeric; logical values are also allowed for historical compatibility). The "kendall" and "spearman" methods make sense for ordered inputs, but xtfrm can be used to find a suitable prior transformation to numbers.
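A minimal sketch of the xtfrm route for ordered inputs follows. The ordered factor sev and the vector dose are hypothetical data invented for this example.

```r
# Hypothetical ordered ratings (illustrative data, not from the help page)
sev <- factor(c("low", "high", "medium", "low", "high", "medium"),
              levels = c("low", "medium", "high"), ordered = TRUE)
dose <- c(1, 9, 5, 2, 8, 4)

# A factor is not numeric, so passing it to cor() directly fails
res <- try(cor(sev, dose, method = "kendall"), silent = TRUE)
inherits(res, "try-error")

# xtfrm() maps the ordered levels to numbers that sort the same way
sev_num <- xtfrm(sev)
cor(sev_num, dose, method = "kendall")
cor(sev_num, dose, method = "spearman")
```

Because both rank-based methods depend only on ordering, any order-preserving numeric coding of the levels gives the same result.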

var is just another interface to cov, where na.rm is used to determine the default for use when that is unspecified. If na.rm is TRUE then the complete observations (rows) are used (use = "na.or.complete") to compute the variance. Otherwise, by default use = "everything".
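The relationship between na.rm in var and use in cov can be checked on a small vector; the values are made up for illustration.

```r
x <- c(2, 4, NA, 8, 10)

v_default <- var(x)                             # default use = "everything"
v_narm    <- var(x, na.rm = TRUE)               # drops the missing value first
v_cov     <- cov(x, x, use = "na.or.complete")  # the same computation via cov()

is.na(v_default)
all.equal(v_narm, v_cov)
```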

If use is "everything", NAs will propagate conceptually, i.e., a resulting value will be NA whenever one of its contributing observations is NA.

If use is "all.obs", then the presence of missing observations will produce an error. If use is "complete.obs" then missing values are handled by casewise deletion (and if there are no complete cases, that gives an error).

"na.or.complete" is the same as "complete.obs" unless there are no complete cases, in which case it gives NA. Finally, if use has the value "pairwise.complete.obs" then the correlation or covariance between each pair of variables is computed using all complete pairs of observations on those variables. This can result in covariance or correlation matrices which are not positive semi-definite, as well as NA entries if there are no complete pairs for that pair of variables. For cov and var, "pairwise.complete.obs" only works with the "pearson" method. Note that (the equivalent of) var(double(0), use=*) gives NA for use = "everything" and "na.or.complete", and gives an error in the other cases.
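The five use settings can be compared side by side on a small data frame. The data frames d and e below are invented for illustration; only the behaviour described in the text is assumed.

```r
d <- data.frame(a = c(1, 2, 3, 4, NA),
                b = c(2, 1, 4, 3, 5),
                c = c(5, NA, 3, 2, 1))

r_every <- cor(d)                                 # "everything": NA propagates
r_all   <- try(cor(d, use = "all.obs"), silent = TRUE)  # error: missing values
r_compl <- cor(d, use = "complete.obs")           # rows 1, 3 and 4 only
r_pair  <- cor(d, use = "pairwise.complete.obs")  # each pair uses its own rows

# No complete row at all: "complete.obs" fails, "na.or.complete" gives NA
e <- data.frame(a = c(1, NA), b = c(NA, 2))
e_compl <- try(cor(e, use = "complete.obs"), silent = TRUE)
e_naor  <- cor(e, use = "na.or.complete")

r_pair
```

In r_pair the a-b entry is based on rows 1 to 4, while the entries involving c use different subsets of rows, which is why a pairwise matrix is not guaranteed to be positive semi-definite.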

The denominator n - 1 is used, which gives an unbiased estimator of the (co)variance for i.i.d. observations. These functions return NA when there is only one observation (whereas S-PLUS has been returning NaN), and fail if x has length zero.
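The n - 1 denominator can be verified by hand. This is a sketch on made-up numbers, with manual_cov and manual_var as illustrative names.

```r
x <- c(4, 7, 13, 16)
y <- c(1, 3, 2, 6)
n <- length(x)

# Sample covariance and variance with the n - 1 denominator
manual_cov <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)
manual_var <- sum((x - mean(x))^2) / (n - 1)

all.equal(manual_cov, cov(x, y))
all.equal(manual_var, var(x))

single <- var(42)  # one observation leaves n - 1 = 0, so the result is NA
single
```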

For cor(), if method is "kendall" or "spearman", Kendall's tau or Spearman's rho statistic is used to estimate a rank-based measure of association. These are more robust and have been recommended if the data do not necessarily come from a bivariate normal distribution.

For cov(), a non-Pearson method is unusual but available for the sake of completeness. Note that "spearman" basically computes cor(R(x), R(y)) (or cov(.,.)) where R(u) := rank(u, na.last="keep"). In the case of missing values, the ranks are calculated depending on the value of use, either based on complete observations, or based on pairwise completeness with reranking for each pair.
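For data without missing values, the rank identity stated above can be checked directly. The vectors are illustrative, and R is defined exactly as in the text.

```r
x <- c(10, 20, 30, 40, 50, 60)
y <- c(1.2, 0.9, 3.5, 3.1, 8.0, 7.7)

R <- function(u) rank(u, na.last = "keep")  # the ranking used in the text

rho_direct <- cor(x, y, method = "spearman")
rho_ranks  <- cor(R(x), R(y))               # Pearson correlation of the ranks
all.equal(rho_direct, rho_ranks)

# The same identity for the covariance
all.equal(cov(x, y, method = "spearman"), cov(R(x), R(y)))
```

With missing values the ranks are recomputed on the observations that remain under the chosen use setting, so this shortcut applies as written only to complete data.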

Scaling a covariance matrix into a correlation one can be achieved in many ways, mathematically most appealing by multiplication with a diagonal matrix from left and right, or more efficiently by using sweep(.., FUN = "/") twice. The cov2cor function is even a bit more efficient, and provided mostly for didactical reasons.
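The three scaling routes mentioned above can be written out as a sketch on the built-in longley data. The names d, D_inv, r1, r2 and r3 are chosen here for illustration.

```r
V <- cov(longley)
d <- sqrt(diag(V))   # standard deviations of the columns

# 1. Multiply by a diagonal matrix from the left and from the right
D_inv <- diag(1 / d)
r1 <- D_inv %*% V %*% D_inv
dimnames(r1) <- dimnames(V)

# 2. Divide the rows, then the columns, with sweep(.., FUN = "/")
r2 <- sweep(sweep(V, 1, d, FUN = "/"), 2, d, FUN = "/")

# 3. Use cov2cor()
r3 <- cov2cor(V)

all.equal(r1, r3)
all.equal(r2, r3)
all.equal(r3, cor(longley))
```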

----------Value----------

For r <- cor(*, use = "all.obs"), it is now guaranteed that all(r <= 1).

----------See Also----------

cor.test for confidence intervals (and tests).

cov.wt for weighted covariance computation.

sd for standard deviation (vectors).
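As a rough guide to how the related functions listed above connect to cor, cov and var, the following sketch uses two columns of the built-in longley data; the variable names are illustrative.

```r
x <- longley$GNP
y <- longley$Employed

ct <- cor.test(x, y)          # Pearson correlation test by default
est <- unname(ct$estimate)    # the same value as cor(x, y)
ct$conf.int                   # confidence interval for the correlation
all.equal(est, cor(x, y))

# cov.wt() with its default equal weights reproduces cov()
w_cov <- cov.wt(longley)$cov
all.equal(w_cov, cov(longley))

# sd() is the square root of var() for a vector
all.equal(sd(x), sqrt(var(x)))
```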

----------Examples----------

var(1:10) # 9.166667

var(1:5, 1:5) # 2.5

## Two simple vectors
cor(1:10, 2:11) # == 1

## Correlation Matrix of Multivariate sample:
(Cl <- cor(longley))
## Graphical Correlation Matrix:
symnum(Cl) # highly correlated

## Spearman's rho  and  Kendall's tau
symnum(clS <- cor(longley, method = "spearman"))
symnum(clK <- cor(longley, method = "kendall"))
## How much do they differ?
i <- lower.tri(Cl)
cor(cbind(P = Cl[i], S = clS[i], K = clK[i]))

## cov2cor() scales a covariance matrix by its diagonal
##           to become the correlation matrix.
cov2cor # see the function definition {and learn ..}
stopifnot(all.equal(Cl, cov2cor(cov(longley))),
          all.equal(cor(longley, method = "kendall"),
                    cov2cor(cov(longley, method = "kendall"))))

##--- Missing value treatment:
C1 <- cov(swiss)
range(eigen(C1, only.values = TRUE)$values) # 6.19       1921
swM <- swiss
swM[1,2] <- swM[7,3] <- swM[25,5] <- NA # create 3 "missing"
## the default use = "everything" returns NA entries rather than an error,
## so "all.obs" is requested explicitly here
try(cov(swM, use = "all.obs")) # Error: missing obs...
C2 <- cov(swM, use = "complete")
range(eigen(C2, only.values = TRUE)$values) # 6.46       1930
C3 <- cov(swM, use = "pairwise")
range(eigen(C3, only.values = TRUE)$values) # 6.19       1938

symnum(cor(swM, method = "kendall", use = "complete"))
## Kendall's tau doesn't change much:
symnum(cor(swiss, method = "kendall"))

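When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the 生物统计家园网 bot. It is intended only as a personal reference for learning R; 生物统计家园 retains the copyright.
Note 2: Because the translation was produced automatically by a bot, some inaccuracies are inevitable. Comparing the Chinese and English text carefully and repeatedly can help when learning R.
Note 3: If you find an inaccuracy, please reply below this post and we will revise it over time.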
