sda.ranking(sda)
sda.ranking() belongs to R package: sda
Shrinkage Discriminant Analysis 1: Predictor Ranking
Description
sda.ranking determines a ranking of predictors by computing CAT scores (correlation-adjusted t-scores) between the group centroids and the pooled mean.
plot.sda.ranking provides a graphical visualization of the top-ranking features.
Usage
sda.ranking(Xtrain, L, diagonal=FALSE, fdr=TRUE, plot.fdr=FALSE, verbose=TRUE)
## S3 method for class 'sda.ranking'
plot(x, top=40, ...)
Arguments
Xtrain
A matrix containing the training data set. Note that the rows correspond to observations and the columns to variables.
L
A factor with the class labels of the training samples.
diagonal
Chooses between LDA (default, diagonal=FALSE) and DDA (diagonal=TRUE).
fdr
Compute FDR values and HC scores for each feature.
plot.fdr
Show a plot of the estimated FDR values.
verbose
Print progress information while computing.
x
An "sda.ranking" object, as produced by the sda.ranking() function.
top
The number of top-ranking features shown in the plot (default: 40).
...
Additional arguments passed to the generic plot function.
Details
For each predictor variable and each centroid, a shrinkage CAT score of the group mean versus the pooled mean is computed. The overall ranking of a feature is determined by the sum of the squared CAT scores across all centroids. In the diagonal case (DDA) the (shrinkage) CAT score reduces to the (shrinkage) t-score. Thus, in the two-class diagonal case the features are simply ranked according to their (shrinkage) t-scores.
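To make the scoring rule concrete, the following is a minimal base-R sketch of the unshrunk, diagonal-case analogue: a plain t-score of each group centroid versus the pooled mean, summed in squares over groups. It is an illustration only and not the package's implementation; sda.ranking additionally applies shrinkage estimation and, for diagonal=FALSE, decorrelation. The helper name naive_centroid_scores and the use of the built-in iris data are assumptions made for this example.

```r
# Illustration only: unshrunk, diagonal-case analogue of the ranking score.
# `naive_centroid_scores` is a hypothetical helper, not part of the sda package.
naive_centroid_scores <- function(X, L) {
  L <- factor(L)
  n <- nrow(X)
  groups <- levels(L)
  pooled_mean <- colMeans(X)

  # pooled within-group variance of each feature
  centered <- X
  for (g in groups) {
    rows <- L == g
    centered[rows, ] <- sweep(X[rows, , drop = FALSE], 2,
                              colMeans(X[rows, , drop = FALSE]))
  }
  v_pooled <- colSums(centered^2) / (n - length(groups))

  # t-score of each group centroid versus the pooled mean;
  # Var(group mean - pooled mean) = variance * (1/n_g - 1/n)
  tscores <- vapply(groups, function(g) {
    rows <- L == g
    n_g <- sum(rows)
    (colMeans(X[rows, , drop = FALSE]) - pooled_mean) /
      sqrt((1 / n_g - 1 / n) * v_pooled)
  }, numeric(ncol(X)))
  tscores <- matrix(tscores, nrow = ncol(X),
                    dimnames = list(colnames(X), paste0("t.", groups)))

  # overall score: sum of squared t-scores across groups
  score <- rowSums(tscores^2)
  ord <- order(score, decreasing = TRUE)
  cbind(idx = ord, score = score[ord], tscores[ord, , drop = FALSE])
}

X <- as.matrix(iris[, 1:4])
naive_ranking <- naive_centroid_scores(X, iris$Species)
naive_ranking
```

With two groups, the two per-group t-scores are equal in magnitude and opposite in sign, so ranking by the summed squares is the same as ranking by the absolute two-sample t-score.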
Calling sda.ranking is step 1 in a classification analysis with the sda package. Steps 2 and 3 are sda and predict.sda.
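The three steps can be sketched end to end as follows. This is a hedged illustration rather than a recommended protocol: the hold-out split, the random seed, and the choice of the top 100 features are arbitrary assumptions made for this example.

```r
library("sda")
data(singh2002)
X <- singh2002$x
Y <- singh2002$y

# hypothetical hold-out split, for illustration only
set.seed(1)
test_rows <- sample(nrow(X), 20)
Xtrain <- X[-test_rows, ]
Ytrain <- Y[-test_rows]
Xtest <- X[test_rows, ]
Ytest <- Y[test_rows]

# Step 1: rank the features on the training data
ranking <- sda.ranking(Xtrain, Ytrain, diagonal = FALSE, fdr = FALSE,
                       verbose = FALSE)

# keep the 100 top-ranked features (arbitrary cutoff, assumed for this sketch)
selected <- ranking[1:100, "idx"]

# Step 2: fit the classifier on the selected features
fit <- sda(Xtrain[, selected], Ytrain, diagonal = FALSE, verbose = FALSE)

# Step 3: predict the class labels of the held-out samples
pred <- predict(fit, Xtest[, selected], verbose = FALSE)

# proportion of held-out samples classified correctly
mean(as.character(pred$class) == as.character(Ytest))
```

Ranking on the training rows only keeps the held-out samples out of the feature selection, so the accuracy estimate is not biased by it.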
See Ahdesmäki and Strimmer (2010) for details on multi-class CAT scores, and Zuber and Strimmer (2009) for CAT scores in general. For shrinkage t-scores see Opgen-Rhein and Strimmer (2007).
Value
sda.ranking returns a matrix with the following columns:
idx
original feature number
score
sum of the squared CAT scores across groups; this determines the overall ranking of a feature
cat
for each group and feature, the CAT score of the centroid versus the pooled mean
If fdr=TRUE, local false discovery rate (FDR) values and higher criticism (HC) scores are additionally computed for each feature (using fdrtool) and returned in the columns "lfdr" and "HC".
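As a short sketch of how the returned matrix might be inspected, the code below lists the column names and pulls out the original indices of the features below a local FDR threshold. The threshold of 0.8 follows the Examples section; treating every column other than "idx", "score", "lfdr" and "HC" as a per-group score column is an assumption made for this illustration.

```r
library("sda")
data(singh2002)

ranking <- sda.ranking(singh2002$x, singh2002$y, diagonal = FALSE,
                       fdr = TRUE, verbose = FALSE)

# column names of the result
colnames(ranking)

# assumption: the remaining columns hold the per-group CAT scores
group_cols <- setdiff(colnames(ranking), c("idx", "score", "lfdr", "HC"))
head(unclass(ranking)[, group_cols, drop = FALSE])

# original column indices of the features with local FDR < 0.8
keep <- ranking[ranking[, "lfdr"] < 0.8, "idx"]
length(keep)

# the corresponding columns of the data matrix
Xselected <- singh2002$x[, keep, drop = FALSE]
dim(Xselected)
```

Because "idx" stores the original feature number, it can be used directly to subset the columns of the training matrix.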
Author(s)
Miika Ahdesmäki, Verena Zuber and Korbinian Strimmer (http://strimmerlab.org).
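References
Ahdesmäki, M., and K. Strimmer. 2010. Feature selection in omics prediction problems using cat scores and false non-discovery rate control. Ann. Appl. Stat. 4: 503-519. Preprint available from http://arxiv.org/abs/0903.2003.
Opgen-Rhein, R., and K. Strimmer. 2007. Accurate ranking of differentially expressed genes by a distribution-free shrinkage approach. Statist. Appl. Genet. Mol. Biol. 6:9.
Zuber, V., and K. Strimmer. 2009. Gene ranking and biomarker discovery under correlation. Bioinformatics 25: 2700-2707.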
See Also
catscore, sda, predict.sda.
Examples
# load sda library
library("sda")
#################
# training data #
#################
# prostate cancer set
data(singh2002)
# training data
Xtrain = singh2002$x
Ytrain = singh2002$y
#########################################
# feature ranking (diagonal covariance) #
#########################################
# ranking using t-scores (DDA)
ranking.DDA = sda.ranking(Xtrain, Ytrain, diagonal=TRUE)
ranking.DDA[1:10,]
# plot t-scores for the top 40 genes
plot(ranking.DDA, top=40)
# number of features with local FDR < 0.8
# (i.e. features useful for prediction)
sum(ranking.DDA[,"lfdr"] < 0.8)
# number of features with local FDR < 0.2
# (i.e. significant non-null features)
sum(ranking.DDA[,"lfdr"] < 0.2)
# optimal feature set according to HC score
plot(ranking.DDA[,"HC"], type="l")
which.max( ranking.DDA[1:1000,"HC"] )
#####################################
# feature ranking (full covariance) #
#####################################
# ranking using CAT scores (LDA)
ranking.LDA = sda.ranking(Xtrain, Ytrain, diagonal=FALSE)
ranking.LDA[1:10,]
# plot CAT scores for the top 40 genes
plot(ranking.LDA, top=40)
# number of features with local FDR < 0.8
# (i.e. features useful for prediction)
sum(ranking.LDA[,"lfdr"] < 0.8)
# number of features with local FDR < 0.2
# (i.e. significant non-null features)
sum(ranking.LDA[,"lfdr"] < 0.2)
# optimal feature set according to HC score
plot(ranking.LDA[,"HC"], type="l")
which.max( ranking.LDA[1:1000,"HC"] )