CSimca(rrcovHD)
CSimca() belongs to the R package rrcovHD
Classification in high dimensions based on the (classical) SIMCA method
Description
CSimca performs the (classical) SIMCA method. This method classifies a data matrix x with a known group structure. To reduce the dimension, a PCA is performed on each group separately. Afterwards, a classification rule is developed to determine the assignment of new observations.
Usage
CSimca(x, ...)
## Default S3 method:
CSimca(x, grouping, prior=proportions, k, kmax = ncol(x),
tol = 1.0e-4, trace=FALSE, ...)
## S3 method for class 'formula'
CSimca(formula, data = NULL, ..., subset, na.action)
Arguments
Argument: formula
a formula of the form y~x, it describes the response and the predictors. The formula can be more complicated, such as y~log(x)+z etc (see formula for more details). The response should be a factor representing the response variable, or any vector that can be coerced to such (such as a logical variable).
Argument: data
an optional data frame (or similar: see model.frame) containing the variables in the formula formula.
Argument: subset
an optional vector used to select rows (observations) of the data matrix x.
Argument: na.action
a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The "factory-fresh" default is na.omit.
Argument: x
a matrix or data frame containing the explanatory variables (training set).
Argument: grouping
grouping variable: a factor specifying the class for each observation.
Argument: prior
prior probabilities; defaults to the class proportions of the training set.
Argument: tol
tolerance
Argument: k
number of principal components to compute. If k is missing, or k = 0, the algorithm itself determines the number of components by finding a k such that l_k/l_1 >= 10.E-3 and Σ_{j=1}^k l_j/Σ_{j=1}^r l_j >= 0.8, where l_1 >= l_2 >= ... >= l_r are the eigenvalues of the PCA. It is preferable to inspect the scree plot, choose the number of components from it, and then run the function again with that k. Default is k=0.
Argument: kmax
maximal number of principal components to compute. Default is kmax=10 (the usage above, however, shows kmax = ncol(x)). If k is provided, kmax does not need to be specified, unless k is larger than 10.
Argument: trace
whether to print intermediate results. Default is trace = FALSE.
Argument: ...
arguments passed to or from other methods.
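The automatic choice of k described above can be illustrated in base R. This is only a sketch of one plausible reading of the two conditions (eigenvalue ratio and cumulative proportion of variance); choose_k and its arguments ratio_tol and var_target are hypothetical names, not part of rrcovHD, and the logic inside the package may differ.

```r
# Hypothetical helper, NOT part of rrcovHD: pick the number of components
# from a vector of eigenvalues, following one reading of the documented rule.
choose_k <- function(eig, kmax = length(eig), ratio_tol = 10.E-3, var_target = 0.8) {
  eig <- sort(eig, decreasing = TRUE)
  # indices of eigenvalues that are not negligible relative to the largest one
  keep <- which(eig / eig[1] >= ratio_tol)
  # cumulative proportion of variance explained by the first k components
  cumvar <- cumsum(eig) / sum(eig)
  # smallest k that reaches the target proportion
  k_var <- which(cumvar >= var_target)[1]
  # never exceed the non-negligible components or kmax
  min(k_var, max(keep), kmax)
}

# toy eigenvalues: the first two already explain more than 80% of the variance
k_toy <- choose_k(c(5, 3, 1, 0.5, 0.001))
print(k_toy)

# one PCA per group, as in SIMCA: the selected k may differ between groups
groups <- split(iris[, 1:4], iris$Species)
k_by_group <- sapply(groups, function(g) choose_k(prcomp(g)$sdev^2))
print(k_by_group)
```

In practice the scree plot remains the recommended guide; a rule like this only gives a starting value.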
Details
CSimca, serving as a constructor for objects of class CSimca-class, is a generic function with "formula" and "default" methods.
SIMCA is a two-phase procedure: PCA is performed on each group separately for dimension reduction, and classification rules are then built in the lower-dimensional space (note that the dimension can differ between groups). In the original SIMCA, new observations are classified by means of their deviations from the different PCA models. Here (and also in the robust versions implemented in this package) the classification rules are obtained using two popular distances arising from PCA: orthogonal distances (OD) and score distances (SD). For the definition of these distances, the definition of the cutoff values and the standardization of the distances, see Vanden Branden and Hubert (2005) and Todorov and Filzmoser (2009).
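To make the two distances concrete, the following base R sketch computes score and orthogonal distances of observations with respect to the PCA model of a single group, using the usual definitions (SD is the distance within the PCA subspace, scaled by the eigenvalues; OD is the distance to that subspace). The helper pca_distances is hypothetical and not part of rrcovHD; it uses classical PCA via prcomp and leaves out the cutoff values and the standardization discussed in the references.

```r
# Hypothetical helper, NOT part of rrcovHD: score distances (SD) and
# orthogonal distances (OD) of `newdata` relative to a k-component
# classical PCA model fitted on `train`.
pca_distances <- function(train, newdata, k) {
  train <- as.matrix(train)
  newdata <- as.matrix(newdata)
  pc <- prcomp(train, center = TRUE, scale. = FALSE)
  P <- pc$rotation[, seq_len(k), drop = FALSE]  # loadings of the first k components
  l <- pc$sdev[seq_len(k)]^2                    # corresponding eigenvalues
  xc <- sweep(newdata, 2, pc$center)            # center with the training mean
  scores <- xc %*% P                            # coordinates in the PCA subspace
  # SD: distance inside the subspace, each score scaled by its eigenvalue
  sd <- sqrt(rowSums(sweep(scores^2, 2, l, "/")))
  # OD: Euclidean distance between an observation and its projection
  od <- sqrt(rowSums((xc - scores %*% t(P))^2))
  list(sd = sd, od = od)
}

setosa <- iris[iris$Species == "setosa", 1:4]
virginica <- iris[iris$Species == "virginica", 1:4]

# distances to the setosa model: its own observations vs. another group
d_own <- pca_distances(setosa, setosa, k = 2)
d_other <- pca_distances(setosa, virginica, k = 2)
print(summary(d_own$od))
print(summary(d_other$od))
```

Repeating this for every group's PCA model gives, for each new observation, one (SD, OD) pair per group, which is the input a SIMCA-type classification rule works with.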
Value
An S4 object of class CSimca-class, which is a subclass of the virtual class Simca-class.
Author(s)
Valentin Todorov valentin.todorov@chello.at
References
Vanden Branden K, Hubert M (2005) Robust classification in high dimensions based on the SIMCA method. Chemometrics and Intelligent Laboratory Systems 79:10–21
Todorov V, Filzmoser P (2009) An Object Oriented Framework for Robust Multivariate Analysis. Journal of Statistical Software, 32(3), 1–47. URL http://www.jstatsoft.org/v32/i03/.
Examples
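library(rrcovHD)   # load the package that provides CSimca
data(pottery)
# classical SIMCA: origin is the class label, all other variables are predictors
cs <- CSimca(origin~., data=pottery)
cs                 # print the fitted object
summary(cs)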