R language robustbase package: covMcd() function help documentation

loveR · posted 2012-9-27 22:07:58

covMcd(robustbase)
covMcd() belongs to R package: robustbase

Robust Location and Scatter Estimation via MCD

Translator: LoveR, the translation robot of 生物统计家园网

----------Description----------

Compute a robust multivariate location and scatter estimate with a high breakdown point, using the "Fast MCD" (Minimum Covariance Determinant) estimator.

----------Usage----------

covMcd(x, cor = FALSE, alpha = 1/2, nsamp = 500, nmini = 300, seed = NULL,
   trace = FALSE, use.correction = TRUE, control = rrcov.control())

----------Arguments----------

Argument: x
a matrix or data frame.

Argument: cor
should the returned result include a correlation matrix? Default is cor = FALSE.

Argument: alpha
numeric parameter controlling the size of the subsets over which the determinant is minimized, i.e., alpha*n observations are used for computing the determinant.  Allowed values are between 0.5 and 1 and the default is 0.5.

Argument: nsamp
number of subsets used for initial estimates, or one of the strings "best" or "exact".  Default is nsamp = 500.  For nsamp = "best", exhaustive enumeration is done as long as the number of trials does not exceed 100'000 (= nLarge).  For "exact", exhaustive enumeration is attempted however many samples are needed; in this case a warning message may be displayed saying that the computation can take a very long time.

Argument: nmini
for large n, the algorithm splits the data into at most krep = 5 subsets of size nmini.  The original algorithm had nmini = 300 hard-coded.

Argument: seed
initial seed for the random generator, see rrcov.control.

Argument: trace
logical (or integer) indicating if intermediate results should be printed; defaults to FALSE; values >= 2 also produce printed output from the internal (Fortran) code.

Argument: use.correction
whether to use finite sample correction factors; defaults to TRUE.

Argument: control
a list with estimation options, including those listed above in the function signature; see rrcov.control for the defaults.  If control is supplied, the parameters from it will be used.  If parameters are also passed directly in the call, they override the corresponding elements of the control object.
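
The interplay between alpha and control can be tried out with a short sketch. It assumes the hbk data set shipped with robustbase; the variable names (fit_default and so on) are purely illustrative.

```r
library(robustbase)

data(hbk)
x <- data.matrix(hbk[, 1:3])   # 75 observations, 3 variables

set.seed(1)                    # the subsampling step is random
fit_default <- covMcd(x)                       # alpha = 1/2
fit_direct  <- covMcd(x, alpha = 0.75)         # alpha given in the call
fit_control <- covMcd(x, control = rrcov.control(alpha = 0.75))
fit_both    <- covMcd(x, alpha = 0.75,         # the call should win over control
                      control = rrcov.control(alpha = 0.95))

# quan is the subset size h that each fit was based on
c(default = fit_default$quan, direct = fit_direct$quan,
  control = fit_control$quan, both = fit_both$quan)
```

A larger alpha gives a larger subset, which trades some robustness for efficiency.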

----------Details----------

The minimum covariance determinant estimator of location and scatter implemented in covMcd() is similar to the R function cov.mcd() in MASS.  The MCD method looks for the h (> n/2) observations (out of n), where h = h(α,n,p) = h.alpha.n(alpha,n,p), whose classical covariance matrix has the lowest possible determinant.
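
To get a feel for how h depends on alpha, the helper h.alpha.n() named above can be called directly. The values n = 75 and p = 3 are chosen here only because they match the hbk example data.

```r
library(robustbase)

n <- 75; p <- 3                        # dimensions of hbk[, 1:3]
alphas <- c(0.5, 0.75, 0.9, 1)

# subset size h for each alpha
h <- sapply(alphas, h.alpha.n, n = n, p = p)
data.frame(alpha = alphas, h = h, fraction = h / n)
```

With alpha = 1 the subset is the whole sample, so nothing is trimmed.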

The raw MCD estimate of location is then the average of these h points, whereas the raw MCD estimate of scatter is their covariance matrix, multiplied by a consistency factor and a finite sample correction factor (to make it consistent at the normal model and unbiased at small samples).
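
The description of the raw estimates can be checked informally against a fitted object. The sketch below assumes the installed version follows the description literally (mean of the best subset, and its classical covariance times both factors). If the implementation used a different covariance divisor, all.equal() would report a small relative difference instead of TRUE.

```r
library(robustbase)

data(hbk)
x <- data.matrix(hbk[, 1:3])
set.seed(1)
fit <- covMcd(x)

sub <- x[fit$best, ]          # the h observations in the best subset
nrow(sub) == fit$quan

# raw location compared with the plain mean of the subset
all.equal(unname(fit$raw.center), unname(colMeans(sub)))

# raw scatter compared with the rescaled classical covariance of the subset
all.equal(unname(fit$raw.cov), unname(cov(sub) * prod(fit$raw.cnp2)))
```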

The implementation of covMcd uses the Fast MCD algorithm of Rousseeuw and Van Driessen (1999) to approximate the minimum covariance determinant estimator.

Both rescaling factors (consistency and finite sample) are also returned in the vector raw.cnp2 of length 2.  Based on these raw MCD estimates, a reweighting step is performed which increases the finite-sample efficiency considerably; see Pison et al. (2002).  The rescaling factors for the reweighted estimates are returned in the vector cnp2 of length 2.  Details for the computation of the finite sample correction factors can be found in Pison et al. (2002).
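
A quick way to see the raw and reweighted stages side by side is to print the two factor vectors and both location estimates. This is only an inspection sketch on the hbk data, not a statement about which values to expect.

```r
library(robustbase)

data(hbk)
x <- data.matrix(hbk[, 1:3])
set.seed(1)
fit <- covMcd(x)

fit$raw.cnp2   # consistency and finite-sample factor of the raw estimate
fit$cnp2       # the same two factors for the reweighted estimate

# observations that received weight 0 after the raw step
which(fit$raw.weights == 0)

# raw and reweighted location estimates
rbind(raw = fit$raw.center, reweighted = fit$center)
```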

The finite sample corrections can be suppressed by setting use.correction = FALSE.
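
The effect of the switch can be seen in the second element of raw.cnp2 and cnp2. The sketch assumes that a suppressed correction is stored as a factor of 1, which is the natural reading of the text but is not stated explicitly there.

```r
library(robustbase)

data(hbk)
x <- data.matrix(hbk[, 1:3])

set.seed(1)
fit_corr   <- covMcd(x)                          # use.correction = TRUE
set.seed(1)
fit_nocorr <- covMcd(x, use.correction = FALSE)

# columns: consistency factor, finite-sample factor
rbind(corrected = fit_corr$raw.cnp2, uncorrected = fit_nocorr$raw.cnp2)
rbind(corrected = fit_corr$cnp2,     uncorrected = fit_nocorr$cnp2)
```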

----------Value----------

An object of class "mcd", which is basically a list with the following components:

Component: center
the final estimate of location.

Component: cov
the final estimate of scatter.

Component: cor
the (final) estimate of the correlation matrix (only if cor = TRUE).

Component: crit
the value of the criterion, i.e. the determinant.

Component: best
the best subset found and used for computing the raw estimates, with length(best) == quan = h.alpha.n(alpha,n,p).

Component: mah
Mahalanobis distances of the observations using the final estimate of the location and scatter.

Component: mcd.wt
weights of the observations using the final estimate of the location and scatter.

Component: cnp2
a vector of length two containing the consistency correction factor and the finite sample correction factor of the final estimate of the covariance matrix.

Component: raw.center
the raw (not reweighted) estimate of location.

Component: raw.cov
the raw (not reweighted) estimate of scatter.

Component: raw.mah
Mahalanobis distances of the observations based on the raw estimate of the location and scatter.

Component: raw.weights
weights of the observations based on the raw estimate of the location and scatter.

Component: raw.cnp2
a vector of length two containing the consistency correction factor and the finite sample correction factor of the raw estimate of the covariance matrix.

Component: X
the input data as a numeric matrix, without NAs.

Component: n.obs
total number of observations.

Component: alpha
the parameter controlling the size of the subsets over which the determinant is minimized (with the default alpha = 1/2, the subset size is the integer part of (n+p+1)/2).

Component: quan
the number of observations, h, on which the MCD is based.  If quan equals n.obs, the MCD is the classical covariance matrix.

Component: method
character string naming the method (Minimum Covariance Determinant).

Component: call
the call used (see match.call).
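
The components listed above can be inspected on a fitted object. The sketch below uses the hbk data and treats observations with mcd.wt equal to 0 as flagged outliers. The last comparison assumes that mah is on the squared scale returned by stats::mahalanobis(), which the text does not spell out.

```r
library(robustbase)

data(hbk)
x <- data.matrix(hbk[, 1:3])
set.seed(1)
fit <- covMcd(x, cor = TRUE)

class(fit)        # "mcd"
fit$center        # final location estimate
fit$cov           # final scatter estimate
fit$cor           # only present because cor = TRUE

# observations given weight 0 by the final estimate
flagged <- which(fit$mcd.wt == 0)
flagged

# assumption: mah holds squared distances as computed by mahalanobis()
all.equal(unname(fit$mah), unname(mahalanobis(x, fit$center, fit$cov)))
```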

----------Author(s)----------

Valentin Todorov <valentin.todorov@chello.at>, based on
work written for S-plus by Peter Rousseeuw and Katrien van Driessen
from University of Antwerp.

----------References----------

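Rousseeuw, P. J. and Leroy, A. M. (1987) Robust Regression and Outlier Detection. Wiley.
Rousseeuw, P. J. and van Driessen, K. (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics 41, 212-223.
Pison, G., Van Aelst, S., and Willems, G. (2002) Small Sample Corrections for LTS and MCD, Metrika, 55, 111-123.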

----------See Also----------

cov.mcd from package MASS; covOGK as a cheaper alternative for larger dimensions.
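
For a rough side-by-side comparison with covOGK, the following sketch fits both estimators to the hbk data. The choice sigmamu = scaleTau2 is an assumption made for this illustration; covOGK accepts other scale functions as well.

```r
library(robustbase)

data(hbk)
x <- data.matrix(hbk[, 1:3])

set.seed(1)
mcd <- covMcd(x)
ogk <- covOGK(x, sigmamu = scaleTau2)   # pairwise, hence cheaper in high dimensions

# reweighted location estimates from both methods
rbind(covMcd = mcd$center, covOGK = ogk$wcenter)
```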

----------Examples----------

library(robustbase)

data(hbk)
hbk.x <- data.matrix(hbk[, 1:3])
covMcd(hbk.x)

## the following three statements are equivalent
c1 <- covMcd(hbk.x, alpha = 0.75)
c2 <- covMcd(hbk.x, control = rrcov.control(alpha = 0.75))
## direct specification overrides the one in control:
c3 <- covMcd(hbk.x, alpha = 0.75,
             control = rrcov.control(alpha = 0.95))
c1

Please credit the source when reposting: 生物统计家园网 (http://www.biostatistic.net).

