compare(CMA)
compare() belongs to R package: CMA
Compare different classifiers
Translator: LoveR, the 生物统计家园网 robot
----------Description----------
Classifiers can be evaluated individually using the method evaluation. Normally, several classifiers are applied to the same dataset and their performance is compared. This method carries out that comparison.
----------Usage----------
compare(clresultlist,
        measure = c("misclassification", "sensitivity", "specificity",
                    "average probability", "brier score", "auc"),
        aggfun = meanrm, plot = FALSE, ...)
----------Arguments----------
Argument: clresultlist
A list of lists (!) of objects of class cloutput or clvarseloutput. Each inner list is usually returned by classification. The elements of the outer list should have been created by different classifiers; see also the example below.
Argument: measure
A character vector containing one or more of the elements listed below. By default, all measures are computed, using evaluation with scheme = "iterationwise". Note that "sensitivity", "specificity" and "auc" cannot be computed in the multiclass case.
"misclassification": The misclassification rate.
"sensitivity": The sensitivity, or 1 - false negative rate. Can only be computed for binary classification.
"specificity": The specificity, or 1 - false positive rate. Can only be computed for binary classification.
"average probability": The average probability assigned to the correct class. This requires that the classifier provides probability estimates. The optimum performance is 1.
"brier score": The Brier score is generally defined as $\sum_i \sum_k (I(y_i = k) - P(k))^2$, where the sums run over all observations i and all classes k, I() denotes the indicator function, and P(k) is the estimated probability for class k. The optimum performance is 0.
"auc": The area under the curve (AUC) of the empirical ROC curve computed from the estimated probabilities and the true class labels. Can only be computed for binary classification and if scheme = "iterationwise". See also roc,cloutput-method.
Argument: aggfun
Function that determines how performance across different iterations is aggregated. Default is meanrm, which computes the mean using na.rm = TRUE. Other possible choices are quantiles.
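As a rough illustration of how iteration-wise measures and their aggregation fit together, the following base-R sketch computes the misclassification rate and the Brier score per iteration and then aggregates them. It does not call CMA: the toy data and the helper names (misclassification, brier, mean_narm) are invented for this example, and mean_narm only mimics what the text above says meanrm does.

```r
# Toy data (invented): two resampling iterations of a binary problem.
# y holds the true class (0/1), p1 the estimated probability of class 1.
iters <- list(
  list(y = c(0, 0, 1, 1), p1 = c(0.2, 0.6, 0.7, 0.9)),
  list(y = c(0, 1, 1, 0), p1 = c(0.1, 0.4, 0.8, 0.7))
)

# Misclassification rate: share of observations whose predicted class
# (probability thresholded at 0.5) differs from the true class.
misclassification <- function(y, p1) mean((p1 > 0.5) != (y == 1))

# Brier score following the formula in the text: squared differences between
# class indicators and estimated probabilities, summed over i and k.
brier <- function(y, p1) {
  prob <- cbind(1 - p1, p1)          # columns: P(class 0), P(class 1)
  ind <- cbind(y == 0, y == 1) * 1   # indicator matrix I(y_i = k)
  sum((ind - prob)^2)
}

# Iteration-wise evaluation: one value per iteration and measure.
misc_iter <- sapply(iters, function(it) misclassification(it$y, it$p1))
brier_iter <- sapply(iters, function(it) brier(it$y, it$p1))

# Hypothetical stand-in for the default aggregation: mean with NAs removed.
mean_narm <- function(x) mean(x, na.rm = TRUE)

# An NA (e.g. from an iteration where a measure is undefined) is ignored.
misc_with_gap <- c(misc_iter, NA)
misc_mean <- mean_narm(misc_with_gap)

# A quantile (here the median) as an alternative aggregation.
misc_median <- quantile(misc_with_gap, probs = 0.5, na.rm = TRUE)

print(c(mean = misc_mean, median = unname(misc_median)))
```

Removing missing values before aggregating keeps a single undefined iteration from turning the whole summary into NA.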
Argument: plot
Should the performance of the different classifiers be visualized by a joint boxplot? Default is FALSE.
Argument: ...
Further arguments passed to boxplot if plot = TRUE.
----------Value----------
A data.frame with rows corresponding to the compared classifiers
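To give a feel for a result with one row per compared classifier, here is a small base-R sketch that aggregates invented per-iteration values into a data.frame. The column layout (one column per measure), the column names and all numbers are assumptions made for illustration; the text above only states that rows correspond to classifiers.

```r
# Hypothetical per-iteration performance values for two classifiers.
perf <- list(
  DLDA = list(misclassification = c(0.10, 0.20, NA),
              brier.score       = c(0.30, 0.50, 0.40)),
  LDA  = list(misclassification = c(0.05, 0.15, 0.10),
              brier.score       = c(0.20, 0.20, 0.50))
)

# Aggregation over iterations: mean with missing values removed.
agg <- function(x) mean(x, na.rm = TRUE)

# The inner sapply aggregates each measure; the outer sapply loops over
# classifiers and yields a measures-by-classifiers matrix, which is then
# transposed so that each row is a classifier.
comparison <- as.data.frame(t(sapply(perf, function(m) sapply(m, agg))))

print(comparison)
```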
----------Note----------
If more than one measure is computed and plot = TRUE, one separate plot is created for each measure.
----------Author(s)----------
Martin Slawski (ms@cs.uni-sb.de)
Anne-Laure Boulesteix (boulesteix@ibe.med.uni-muenchen.de)
Christoph Bernau (bernau@ibe.med.uni-muenchen.de)
----------References----------
Comparison of discrimination methods for the classification of tumors using gene expression data. Journal of the American Statistical Association 97, 77-87
CMA - A comprehensive Bioconductor package for supervised classification with high dimensional data. BMC Bioinformatics 9: 439
----------See Also----------
classification, evaluation
----------Examples----------
## Not run:
### compare the performance of several discriminant analysis methods
### for the Khan dataset:
data(khan)
khanX <- as.matrix(khan[,-1])
khanY <- khan[,1]
set.seed(27611)
fiveCV10iter <- GenerateLearningsets(y=khanY, method = "CV", fold = 5, niter = 2, strat = TRUE)
### candidate methods: DLDA, LDA, QDA, pls_LDA, sclda
class_dlda <- classification(X = khanX, y=khanY, learningsets = fiveCV10iter, classifier = dldaCMA)
### perform GeneSelection for LDA, FDA, QDA (using F-tests):
genesel_da <- GeneSelection(X=khanX, y=khanY, learningsets = fiveCV10iter, method = "f.test")
###
class_lda <- classification(X = khanX, y=khanY, learningsets = fiveCV10iter, classifier = ldaCMA, genesel= genesel_da, nbgene = 10)
class_qda <- classification(X = khanX, y=khanY, learningsets = fiveCV10iter, classifier = qdaCMA, genesel = genesel_da, nbgene = 2)
### We now compare the performance (several measures):
### first, collect the results in a list:
dalike <- list(class_dlda, class_lda, class_qda)
### use the pre-defined compare function:
comparison <- compare(dalike, plot = TRUE, measure = c("misclassification", "brier score", "average probability"))
print(comparison)
## End(Not run)