R language robustHD package: help documentation for the corHuber() function

loveR · posted on 2012-9-27 22:22:32

corHuber(robustHD)
corHuber() belongs to the R package: robustHD

Robust correlation based on winsorization.

Translator: LoveR, the 生物统计家园网 (biostatistic.net) robot

----------Description----------

Compute a robust correlation estimate based on winsorization, i.e., by shrinking outlying observations to the border of the main part of the data.

----------Usage----------

corHuber(x, y,
         type = c("bivariate", "adjusted", "univariate"),
         standardized = FALSE, centerFun = median,
         scaleFun = mad, const = 2, prob = 0.95,
         tol = .Machine$double.eps^0.5, ...)

----------Arguments----------

Argument: x
a numeric vector.

Argument: y
a numeric vector.

Argument: type
a character string specifying the type of winsorization to be used.  Possible values are "univariate" for univariate winsorization, "adjusted" for adjusted univariate winsorization, or "bivariate" for bivariate winsorization.

Argument: standardized
a logical indicating whether the data are already robustly standardized.

Argument: centerFun
a function to compute a robust estimate for the center to be used for robust standardization (defaults to median).  Ignored if standardized is TRUE.

Argument: scaleFun
a function to compute a robust estimate for the scale to be used for robust standardization (defaults to mad).  Ignored if standardized is TRUE.

Argument: const
numeric; tuning constant to be used in univariate or adjusted univariate winsorization (defaults to 2).

Argument: prob
numeric; probability for the quantile of the chi-squared distribution to be used in bivariate winsorization (defaults to 0.95).

Argument: tol
a small positive numeric value.  This is used in bivariate winsorization to determine whether the initial estimate from adjusted univariate winsorization is close to 1 in absolute value.  In that case, bivariate winsorization would fail because the points form almost a straight line, so the initial estimate is returned instead.

Argument: ...
additional arguments to be passed to robStandardize.
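
To make the roles of centerFun, scaleFun and const concrete, here is a minimal base-R sketch of robust standardization followed by univariate winsorization. The helpers rob_standardize() and winsorize_uni() are hypothetical names invented for this illustration; they are not functions from robustHD and may differ from what the package does internally.

```r
# Hypothetical helper (not part of robustHD): robustly standardize a vector
# with a center and a scale function, mirroring centerFun and scaleFun.
rob_standardize <- function(x, centerFun = median, scaleFun = mad) {
  (x - centerFun(x)) / scaleFun(x)
}

# Hypothetical helper: univariate winsorization, clamping standardized
# values to the interval [-const, const].
winsorize_uni <- function(z, const = 2) {
  pmin(pmax(z, -const), const)
}

x <- c(-1.2, -0.4, 0.1, 0.3, 0.8, 1.1, 25)  # the last value is an outlier
z <- rob_standardize(x)                     # median/mad standardization
w <- winsorize_uni(z, const = 2)            # outlier is pulled back to the border

print(round(z, 2))
print(round(w, 2))
```

Because the median and the MAD are barely affected by the single extreme value, the outlier ends up far outside +/-const after standardization and is clamped to the border, while the remaining points are left unchanged.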

----------Details----------

The borders of the main part of the data are defined on the scale of the robustly standardized data.  In univariate winsorization, the borders for each variable are given by +/-const, thus a symmetric distribution is assumed.  In adjusted univariate winsorization, the borders for the two diagonally opposing quadrants containing the minority of the data are shrunk by a factor that depends on the ratio between the number of observations in the major and minor quadrants.  It is thus possible to better account for the bivariate structure of the data while maintaining fast computation.  In bivariate winsorization, a bivariate normal distribution is assumed and the data are shrunk towards the boundary of a tolerance ellipse with coverage probability prob.  The boundary of this ellipse is given by all points whose squared Mahalanobis distance equals the quantile of the chi-squared distribution given by prob.  Furthermore, the initial correlation matrix required for the Mahalanobis distances is computed based on adjusted univariate winsorization.
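
The bivariate step described above can be sketched in base R as follows. This is an illustration under stated assumptions, not robustHD's code: shrink_to_ellipse() is a hypothetical helper, the data are assumed to be robustly standardized already, and the initial correlation r0 is simply supplied by hand rather than computed by adjusted univariate winsorization.

```r
# Hypothetical helper (not robustHD's implementation): shrink the rows of a
# standardized n x 2 matrix Z onto a tolerance ellipse with coverage prob.
shrink_to_ellipse <- function(Z, r0, prob = 0.95) {
  R <- matrix(c(1, r0, r0, 1), 2, 2)               # initial correlation matrix
  d2 <- mahalanobis(Z, center = c(0, 0), cov = R)  # squared Mahalanobis distances
  cutoff <- qchisq(prob, df = 2)                   # chi-squared quantile, 2 df
  shrink <- pmin(sqrt(cutoff / d2), 1)             # factor 1 leaves inliers alone
  Z * shrink                                       # scale each row toward the origin
}

set.seed(1)
z1 <- rnorm(50)
z2 <- 0.6 * z1 + sqrt(1 - 0.6^2) * rnorm(50)
Z <- cbind(z1, z2)
Z[1, ] <- c(8, -6)                                 # plant one outlier

W <- shrink_to_ellipse(Z, r0 = 0.6)                # r0 = 0.6 is an assumed value
R0 <- matrix(c(1, 0.6, 0.6, 1), 2, 2)
d2_after <- mahalanobis(W, c(0, 0), R0)

print(cor(Z[, 1], Z[, 2]))  # distorted by the outlier
print(cor(W[, 1], W[, 2]))  # correlation of the winsorized data
```

After shrinking, every point that was outside the ellipse lies exactly on its boundary, so its squared Mahalanobis distance equals qchisq(prob, df = 2). Points inside the ellipse are left untouched.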

----------Value----------

The robust correlation estimate.

----------Author(s)----------

Andreas Alfons, based on code by Jafar A. Khan, Stefan Van Aelst and Ruben H. Zamar

----------References----------

linear model selection based on least angle regression. Journal of the American Statistical Association, 102(480), 1289–1299.

----------See Also----------

winsorize

----------Examples----------

## load packages
library("robustHD")  # provides corHuber()
library("mvtnorm")   # provides rmvnorm()

## generate data
set.seed(1234)  # for reproducibility
Sigma <- matrix(c(1, 0.6, 0.6, 1), 2, 2)
xy <- rmvnorm(100, sigma=Sigma)
x <- xy[, 1]
y <- xy[, 2]

## introduce outlier
x[1] <- x[1] * 10
y[1] <- y[1] * (-5)

## compute correlation
cor(x, y)       # classical estimate, affected by the outlier
corHuber(x, y)  # robust estimate based on winsorization
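
To see how the choice of type affects the estimate, the three values listed in the Usage section can be compared on the same contaminated data. This is a sketch that assumes the robustHD package is installed; the comparison is skipped if it is not. The data are generated with base R only, so they will not match the mvtnorm-based example above.

```r
# Simulate correlated data with base R (true correlation 0.6) and one outlier.
set.seed(1234)
x <- rnorm(100)
y <- 0.6 * x + sqrt(1 - 0.6^2) * rnorm(100)
x[1] <- x[1] * 10
y[1] <- y[1] * (-5)

# Classical Pearson correlation for reference.
print(cor(x, y))

# Robust estimates for each winsorization type (requires robustHD).
if (requireNamespace("robustHD", quietly = TRUE)) {
  types <- c("bivariate", "adjusted", "univariate")
  est <- sapply(types, function(tp) robustHD::corHuber(x, y, type = tp))
  print(est)
}
```

The first value in the type vector, "bivariate", is the default, so corHuber(x, y) with no type argument corresponds to the first entry of the result.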

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).
