robustPca(pcaMethods)
robustPca() belongs to the R package pcaMethods
PCA implementation based on robustSvd
Translator: LoveR, the translation robot of 生物统计家园网
----------Description----------
This is a PCA implementation that is robust to outliers in a data set. It can also handle missing values; it is, however, NOT intended to be used for missing value estimation. Because it is based on robustSvd(), it gives an accurate estimate of the loadings even for incomplete data or for data with outliers. The returned scores are, however, affected by the outliers, because they are calculated as the matrix product inputData x loadings. This also implies that you should treat the returned R2/R2cum values with caution. If the data contain missing values, scores are calculated by simply setting all NA values to zero, which is not expected to produce accurate results. Please also have a look at the manual page for robustSvd. This method should therefore mainly be seen as an attempt to integrate robustSvd() into the framework of this package. Use one of the other methods that come with this package (such as PPCA or BPCA) if you want to do missing value estimation. It is not recommended to use this function directly; use the pca() wrapper instead.
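The score calculation described above (set every NA to zero, then multiply the input data by the loadings) can be sketched in base R. This is only an illustration under stated assumptions: the loadings here come from an ordinary svd() of the complete rows as a stand-in, whereas the package is described as obtaining them from robustSvd(); the variable names are made up for the example.

```r
# Sketch only: base R stand-in, not the pcaMethods implementation.
set.seed(1)
X <- matrix(rnorm(40), nrow = 10, ncol = 4)   # 10 observations, 4 variables
X[2, 3] <- NA                                 # introduce one missing value

# Stand-in loadings (assumption): ordinary SVD of the complete rows.
# The package is described as taking its loadings from robustSvd() instead.
complete <- X[complete.cases(X), , drop = FALSE]
loadings <- svd(complete)$v[, 1:2, drop = FALSE]

# Scores as described: replace every NA by zero, then multiply by the loadings.
X0 <- X
X0[is.na(X0)] <- 0
scores <- X0 %*% loadings

dim(scores)   # one row per observation, one column per component
```

Because the missing entry is replaced by zero, it simply contributes nothing to the scores of its row, which is why the scores for incomplete rows should not be expected to be accurate.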
----------Usage----------
robustPca(Matrix, nPcs=2, verbose=interactive(), ...)
----------Arguments----------
Argument: Matrix
matrix – Data containing the variables in columns and observations in rows. The data may contain missing values, denoted as NA.
Argument: nPcs
numeric – Number of components to estimate. The preciseness of the missing value estimation depends on the number of components, which should resemble the internal structure of the data.
Argument: verbose
boolean – Print some output to the command line if TRUE.
Argument: ...
Reserved for future use. Currently no further parameters are used.
----------Details----------
The method is very similar to the standard prcomp() function. The main difference is that robustSvd() is used instead of the conventional svd() method.
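To make the relation to prcomp() concrete, here is a minimal base R sketch of a PCA computed by hand from a singular value decomposition and compared with prcomp(). It uses the ordinary svd() throughout; the comment marks the step where, according to the description above, robustPca() uses robustSvd(). The data and variable names are invented for the illustration.

```r
# Sketch only: shows the step at which a robust SVD would be substituted.
set.seed(42)
X <- matrix(rnorm(60), nrow = 20, ncol = 3)
Xc <- scale(X, center = TRUE, scale = FALSE)   # column-centre the data

# PCA by hand: the right singular vectors are the loadings.
dec <- svd(Xc)          # robustPca() is described as using robustSvd() here
loadings <- dec$v
scores <- Xc %*% loadings

# Reference result from prcomp(), which also works via an SVD of the centred data.
ref <- prcomp(X, center = TRUE, scale. = FALSE)

# The loadings agree up to the sign of each column.
max(abs(abs(loadings) - abs(ref$rotation)))
```

The comparison uses absolute values because the sign of each component is arbitrary in any SVD-based PCA.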
----------Value----------
Standard PCA result object used by all PCA-based methods of this package. Contains scores, loadings, the data mean, and more.
----------Author(s)----------
Wolfram Stacklies
----------See Also----------
robustSvd, svd, prcomp
----------Examples----------
library(pcaMethods)
data(metaboliteDataComplete)
mdc <- scale(metaboliteDataComplete, center=TRUE, scale=FALSE)
## Now create 5% of outliers.
cond <- runif(length(mdc)) < 0.05
mdcOut <- mdc
mdcOut[cond] <- 10
## Now we do a conventional PCA and robustPca on the original data and on the
## data with outliers.
## We use center=FALSE here because the large artificial outliers would
## affect the means and would not allow an objective comparison of the results.
resSvd <- pca(mdc, method = "svd", nPcs = 10, center = FALSE)
resSvdOut <- pca(mdcOut, method = "svd", nPcs = 10, center = FALSE)
resRobPca <- pca(mdcOut, method = "robustPca", nPcs = 10, center = FALSE)
## Now we plot the loadings for the original data against those for the data
## with outliers. First the conventional PCA on the data with outliers:
plot(loadings(resSvd)[,1], loadings(resSvdOut)[,1])
## Then robustPca on the data with outliers, which should be hardly affected
## by the outliers:
plot(loadings(resSvd)[,1], loadings(resRobPca)[,1])
When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).