R CMA package: help documentation for the evaluation() function

loveR · posted 2012-2-25 15:19:46

evaluation(CMA)
evaluation() belongs to the R package: CMA

                                    Evaluation of classifiers

                                       Translator: LoveR, the 生物统计家园网 bot

----------Description----------

The performance of classifiers can be evaluated by eight different measures and three different schemes, which are described more precisely below. For S4 method information, see evaluation-methods.

----------Usage----------

evaluation(clresult, cltrain = NULL, cost = NULL, y = NULL,
           measure = c("misclassification", "sensitivity", "specificity",
                       "average probability", "brier score", "auc",
                       "0.632", "0.632+"),
           scheme = c("iterationwise", "observationwise", "classwise"))

----------Arguments----------

Argument: clresult
A list of objects of class cloutput or clvarseloutput.

Argument: cltrain
An object of class cloutput in which the whole dataset was used as learning set. Only used if measure = "0.632" or measure = "0.632+", in order to obtain an estimate of the resubstitution error rate.
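
To illustrate why a resubstitution error is needed, here is a minimal base-R sketch of the 0.632 and 0.632+ estimators as defined by Efron and Tibshirani. It does not call CMA. The helper names err632 and err632plus, the no-information rate gamma, and all numbers are hypothetical, and CMA's internal computation may differ in detail.

```r
# Hypothetical helpers (not part of CMA).
# 0.632 estimator: weighted average of resubstitution and bootstrap error.
err632 <- function(err_resub, err_boot) {
  0.368 * err_resub + 0.632 * err_boot
}

# 0.632+ estimator: the weight on the bootstrap error grows with the
# relative overfitting rate R, where gamma is the no-information error rate.
err632plus <- function(err_resub, err_boot, gamma) {
  err_boot <- min(err_boot, gamma)
  R <- if (err_boot > err_resub && gamma > err_resub) {
    (err_boot - err_resub) / (gamma - err_resub)
  } else {
    0
  }
  w <- 0.632 / (1 - 0.368 * R)
  (1 - w) * err_resub + w * err_boot
}

# Made-up error rates for illustration only.
err_resub <- 0.05  # error when the whole dataset is the learning set
err_boot  <- 0.20  # mean error on observations left out of the bootstrap samples
gamma     <- 0.50  # assumed no-information error rate

est      <- err632(err_resub, err_boot)
est_plus <- err632plus(err_resub, err_boot, gamma)
print(c(est, est_plus))
```

In this sketch the 0.632+ value always lies between the 0.632 value and the bootstrap error, because the weight w ranges from 0.632 to 1.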

Argument: cost
An optional cost matrix, used if measure = "misclassification". If it is not specified (default), the cost is the usual indicator loss. Otherwise, entry i,j of cost quantifies the loss when the true class is class i-1 and the predicted class is j-1, provided the conventional coding 0,...,K-1 is used in the case of K classes. Usually, the matrix contains only non-negative entries with zeros on the diagonal, but this is not obligatory. Make sure that the dimension of the matrix matches the number of classes.
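
The indexing convention for cost can be sketched in base R as follows. This is only an illustration of how entry i,j maps to true class i-1 and predicted class j-1. The labels and cost values are made up, and the code does not reproduce CMA's internals.

```r
# True and predicted labels with the conventional coding 0, ..., K-1 (K = 2).
y_true <- c(0, 0, 1, 1, 1)
y_pred <- c(0, 1, 1, 0, 1)

# Hypothetical cost matrix: rows index the true class, columns the predicted
# class. A false negative (true 1, predicted 0) costs 5, a false positive 1.
cost <- matrix(c(0, 1,
                 5, 0), nrow = 2, byrow = TRUE)

# Entry (i, j) applies when the true class is i-1 and the predicted class
# is j-1, so the 0-based labels are shifted by one to index the matrix.
loss <- cost[cbind(y_true + 1, y_pred + 1)]
mean_loss <- mean(loss)

# The default indicator loss corresponds to ones off the diagonal,
# which gives the plain misclassification rate.
indicator <- 1 - diag(2)
misclass_rate <- mean(indicator[cbind(y_true + 1, y_pred + 1)])

print(c(mean_loss, misclass_rate))
```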

Argument: y
A vector containing the true class labels. Only needed if scheme = "classwise".

Argument: measure
Performance measure to be used:

"misclassification": The misclassification rate.

"sensitivity": The sensitivity, or 1 - false negative rate. Can only be computed for binary classification.

"specificity": The specificity, or 1 - false positive rate. Can only be computed for binary classification.

"average probability": The average probability assigned to the correct class. This requires that the classifier used provides probability estimates. The optimum performance is 1.

"brier score": The Brier score is generally defined as $\sum_{i} \sum_{k} (I(y_i = k) - P(k))^2$, where the sums run over all observations $i$ and all classes $k$, $I()$ denotes the indicator function, and $P(k)$ is the estimated probability for class $k$. The optimum performance is 0.

"auc": The area under the curve (AUC) of the empirical ROC curve computed from the estimated probabilities and the true class labels. Can only be computed for binary classification and if scheme = "iterationwise"; see below. See also roc,cloutput-method.

"0.632": The 0.632 estimator (see References) for the misclassification rate (applied iteration- or observationwise), if bootstrap learning sets have been used. Note that cltrain must be provided.

"0.632+": The 0.632+ estimator (see References) for the misclassification rate (applied iteration- or observationwise), if bootstrap learning sets have been used. Note that cltrain must be provided.

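The measures listed above can be sketched in base R on a small made-up binary example. This is an illustration of the definitions only, not CMA's implementation. The labels and probabilities are invented, and the AUC is computed through the rank-sum (Mann-Whitney) form, which equals the area under the empirical ROC curve.

```r
# Made-up binary example: true labels (0/1) and estimated class probabilities.
y_true <- c(0, 0, 1, 1, 1)
prob <- matrix(c(0.8, 0.2,
                 0.4, 0.6,
                 0.3, 0.7,
                 0.6, 0.4,
                 0.1, 0.9), ncol = 2, byrow = TRUE)  # columns: class 0, class 1
y_pred <- max.col(prob) - 1  # predicted class = most probable class

misclassification <- mean(y_pred != y_true)
sensitivity <- sum(y_pred == 1 & y_true == 1) / sum(y_true == 1)
specificity <- sum(y_pred == 0 & y_true == 0) / sum(y_true == 0)

# Probability assigned to the correct class of each observation.
p_correct <- prob[cbind(seq_along(y_true), y_true + 1)]
average_probability <- mean(p_correct)

# Brier score: sum over observations i and classes k of (I(y_i = k) - P(k))^2.
indicator <- outer(y_true, 0:1, "==") * 1
brier <- sum((indicator - prob)^2)

# AUC from the ranks of the class-1 probabilities (Mann-Whitney form).
score <- prob[, 2]
n1 <- sum(y_true == 1)
n0 <- sum(y_true == 0)
auc <- (sum(rank(score)[y_true == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

print(c(misclassification, sensitivity, specificity,
        average_probability, brier, auc))
```

The Brier score here is the plain double sum from the definition; whether an implementation additionally averages over observations is not stated in the text above.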
Argument: scheme

"iterationwise": The performance measures listed above are computed for each iteration, i.e. for each learning set.

"observationwise": The performance measures listed above (except for "auc") are computed separately for each observation, which is classified one or several times depending on the learning set scheme.

"classwise": The performance measures (exceptions: "auc", "0.632", "0.632+") are computed separately for each class, averaged over both iterations and observations.

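The difference between the three schemes can be pictured as three ways of aggregating the same table of losses. The following base-R sketch assumes a hypothetical matrix of 0/1 losses with one row per iteration and one column per observation. It is not how CMA stores its results, and the exact averaging used by CMA for "classwise" may differ.

```r
# Hypothetical 0/1 losses: rows = iterations (learning sets),
# columns = observations. NA marks observations that were not in the
# test set of that iteration.
loss <- rbind(c(0,  1,  NA, 0),
              c(NA, 1,  0,  0),
              c(1,  NA, 0,  NA))
y <- c(0, 0, 1, 1)  # true class of each observation

# iterationwise: one value per iteration (row).
iterationwise <- rowMeans(loss, na.rm = TRUE)

# observationwise: one value per observation (column).
observationwise <- colMeans(loss, na.rm = TRUE)

# classwise: pool all losses of the observations belonging to each class,
# which is why the true labels y are needed for this scheme.
classwise <- tapply(seq_along(y), y,
                    function(idx) mean(loss[, idx], na.rm = TRUE))

print(iterationwise)
print(observationwise)
print(classwise)
```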
----------Value----------

An object of class evaloutput.

----------Author(s)----------

Martin Slawski <ms@cs.uni-sb.de>

Anne-Laure Boulesteix <boulesteix@ibe.med.uni-muenchen.de>

Christoph Bernau <bernau@ibe.med.uni-muenchen.de>

----------References----------

Efron, B. and Tibshirani, R. (1997). Improvements on cross-validation: The .632+ bootstrap method. Journal of the American Statistical Association, 92, 548-560.
Slawski, M., Daumer, M. and Boulesteix, A.-L. (2008). CMA - A comprehensive Bioconductor package for supervised classification with high dimensional data. BMC Bioinformatics 9: 439.

----------See Also----------

evaloutput, classification, compare

----------Examples----------

### simple linear discriminant analysis example using bootstrap datasets:
library(CMA)
### datasets:
data(golub)
golubY <- golub[,1]
### extract gene expression from first 10 genes
golubX <- as.matrix(golub[,2:11])
### generate 10 bootstrap datasets (niter = 10)
set.seed(333)
bootds <- GenerateLearningsets(y = golubY, method = "bootstrap", ntrain = 30, niter = 10, strat = TRUE)
### run classification()
ldalist <- classification(X = golubX, y = golubY, learningsets = bootds, classifier = ldaCMA)
### Evaluation:
eval_iter <- evaluation(ldalist, scheme = "iter")
eval_obs <- evaluation(ldalist, scheme = "obs")
show(eval_iter)
show(eval_obs)
summary(eval_iter)
summary(eval_obs)
### auc with boxplot
eval_auc <- evaluation(ldalist, scheme = "iter", measure = "auc")
boxplot(eval_auc)
### which observations have often been misclassified?
obsinfo(eval_obs, threshold = 0.75)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

Note: This document was translated by LoveR, the 生物统计家园网 bot, and is intended only as a personal reference for learning R; 生物统计家园 retains the copyright.
