R language, sda package: sda.ranking() function help documentation

loveR · posted on 2012-9-29 23:16:22

sda.ranking(sda)
sda.ranking() belongs to the R package: sda

Shrinkage Discriminant Analysis 1: Predictor Ranking

Translator: LoveR, the Biostatistic.net robot

----------Description----------

sda.ranking determines a ranking of predictors by computing CAT scores (correlation-adjusted t-scores) between the group centroids and the pooled mean.

plot.sda.ranking provides a graphical visualization of the top-ranking features.

----------Usage----------

sda.ranking(Xtrain, L, diagonal=FALSE, fdr=TRUE, plot.fdr=FALSE, verbose=TRUE)
## S3 method for class 'sda.ranking'
plot(x, top=40, ...)

----------Arguments----------

Argument: Xtrain
A matrix containing the training data set. Note that the rows correspond to observations and the columns to variables.

Argument: L
A factor with the class labels of the training samples.

Argument: diagonal
Chooses between LDA (default, diagonal=FALSE) and DDA (diagonal=TRUE).

Argument: fdr
Compute local FDR values and HC scores for each feature.

Argument: plot.fdr
Show a plot of the estimated FDR values.

Argument: verbose
Print progress information while computing.

Argument: x
An "sda.ranking" object, as produced by the sda.ranking() function.

Argument: top
The number of top-ranking features shown in the plot (default: 40).

Argument: ...
Additional arguments passed to the generic plot function.
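
The input requirements above (a matrix with observations in rows, plus a factor of class labels) can be checked before calling sda.ranking. The following is only a minimal sketch in base R; the data frame `df`, its column names, and the class labels are hypothetical toy values, not part of the sda package.

```r
# Hypothetical toy data: 6 samples (rows) and 3 variables, plus a label column.
set.seed(1)
df <- data.frame(
  g1 = rnorm(6), g2 = rnorm(6), g3 = rnorm(6),
  label = c("tumor", "tumor", "tumor", "healthy", "healthy", "healthy")
)

# Xtrain must be a numeric matrix: observations in rows, variables in columns.
Xtrain <- as.matrix(df[, c("g1", "g2", "g3")])

# L must be a factor holding one class label per row of Xtrain.
L <- factor(df$label)

# Basic sanity checks before calling sda.ranking(Xtrain, L, ...).
stopifnot(is.matrix(Xtrain), is.numeric(Xtrain))
stopifnot(is.factor(L), length(L) == nrow(Xtrain))
```

If the data arrive with variables in rows (common for expression data), transposing with t() before these checks is one option.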

----------Details----------

For each predictor variable and centroid, a shrinkage CAT score of the group mean versus the pooled mean is computed. The overall ranking of a feature is determined by the sum of the squared CAT scores across all centroids. For the diagonal case (DDA), the (shrinkage) CAT score reduces to the (shrinkage) t-score. Thus, in the two-class diagonal case, the features are simply ranked according to their (shrinkage) t-scores.
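
To make the ranking rule concrete, here is a small base-R sketch of how an overall score could be derived from per-centroid CAT scores. The matrix `cat_scores` and its values are hypothetical; the sketch does not compute real shrinkage CAT scores, it only illustrates the "sum of squared CAT scores across centroids" step.

```r
# Hypothetical CAT scores: 4 features (rows) x 3 group centroids (columns).
cat_scores <- matrix(
  c( 2.0, -1.0, -1.0,
     0.5,  0.5, -1.0,
    -3.0,  1.0,  2.0,
     0.1, -0.1,  0.0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("feature", 1:4), paste0("cat.group", 1:3))
)

# Overall score of a feature: sum of its squared CAT scores across centroids.
score <- rowSums(cat_scores^2)

# Rank features from highest to lowest score.
idx <- order(score, decreasing = TRUE)

# Assemble a table laid out like the description: index, score, CAT scores.
ranking <- cbind(idx = idx, score = score[idx], cat_scores[idx, , drop = FALSE])
ranking
```

In this toy example feature3 has the largest summed squared score and is therefore ranked first.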

Calling sda.ranking is step 1 in a classification analysis with the sda package. Steps 2 and 3 are sda and predict.sda.
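
The three steps might be chained as in the sketch below. It assumes the sda package is installed and uses its singh2002 data set; the cutoff of 50 features is an arbitrary illustrative choice, and predicting on the training data is done only to keep the example short (it gives an optimistic picture of accuracy).

```r
if (requireNamespace("sda", quietly = TRUE)) {
  library("sda")
  data(singh2002)
  Xtrain <- singh2002$x
  Ytrain <- singh2002$y

  # Step 1: rank the predictors (fdr = FALSE skips the FDR/HC computation).
  ranking <- sda.ranking(Xtrain, Ytrain, diagonal = TRUE,
                         fdr = FALSE, verbose = FALSE)

  # Keep the 50 top-ranked features (arbitrary cutoff, for illustration only).
  selected <- ranking[1:50, "idx"]

  # Step 2: fit the classifier using only the selected features.
  fit <- sda(Xtrain[, selected], Ytrain, diagonal = TRUE, verbose = FALSE)

  # Step 3: predict class labels (here on the training data itself).
  pred <- predict(fit, Xtrain[, selected], verbose = FALSE)
  table(predicted = pred$class, actual = Ytrain)
}
```

For an honest error estimate, the ranking and fitting steps would need to be repeated inside a cross-validation loop rather than evaluated on the training samples.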

See Ahdesmäki and Strimmer (2010) for details on multi-class CAT scores, and Zuber and Strimmer (2009) for CAT scores in general. For shrinkage t-scores see Opgen-Rhein and Strimmer (2007).

----------Value----------

sda.ranking returns a matrix with the following columns:

Column: idx
Original feature number.

Column: score
Sum of the squared CAT scores across groups. This determines the overall ranking of a feature.

Column: cat
For each group and feature, the CAT score of the centroid versus the pooled mean.

If fdr=TRUE, local false discovery rate (FDR) values and higher criticism (HC) scores are additionally computed for each feature (using fdrtool).
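
As a rough illustration of how the returned columns can be used for feature selection, the sketch below builds a small stand-in matrix with the column names idx, score, lfdr and HC (the names used in the examples of this page). All numbers are hypothetical and were not produced by sda.ranking.

```r
# Hypothetical ranking matrix with the same column names as the real output.
ranking <- cbind(
  idx   = c(12, 3, 7, 1, 9),
  score = c(30.2, 18.5, 6.1, 1.4, 0.3),
  lfdr  = c(0.01, 0.05, 0.40, 0.85, 0.99),
  HC    = c(2.1, 3.4, 2.8, 1.0, 0.2)
)

# Original column indices of features with local FDR below 0.2.
sig <- ranking[ranking[, "lfdr"] < 0.2, "idx"]

# Original column indices of features with local FDR below 0.8.
useful <- ranking[ranking[, "lfdr"] < 0.8, "idx"]

# Feature set up to the position where the HC score peaks.
k <- which.max(ranking[, "HC"])
hc_set <- ranking[seq_len(k), "idx"]
```

Because rows are already sorted by score, the idx column is what maps a selected row back to a column of the original training matrix.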

----------Author(s)----------

Miika Ahdesmäki, Verena Zuber and Korbinian Strimmer (http://strimmerlab.org).

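----------References----------

Ahdesmäki, M., and K. Strimmer. 2010. Feature selection in omics prediction problems using cat scores and false non-discovery rate control. Ann. Appl. Stat. 4: 503-519. Preprint available from http://arxiv.org/abs/0903.2003.
Opgen-Rhein, R., and K. Strimmer. 2007. Accurate ranking of differentially expressed genes by a distribution-free shrinkage approach. Statist. Appl. Genet. Mol. Biol. 6:9.
Zuber, V., and K. Strimmer. 2009. Gene ranking and biomarker discovery under correlation. Bioinformatics 25: 2700-2707.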

----------See Also----------

catscore, sda, predict.sda.

----------Examples----------

# load sda library
library("sda")

#################
# training data #
#################

# prostate cancer set
data(singh2002)

# training data
Xtrain = singh2002$x
Ytrain = singh2002$y

#########################################
# feature ranking (diagonal covariance) #
#########################################

# ranking using t-scores (DDA)
ranking.DDA = sda.ranking(Xtrain, Ytrain, diagonal=TRUE)
ranking.DDA[1:10,]

# plot t-scores for the top 40 genes
plot(ranking.DDA, top=40)

# number of features with local FDR < 0.8
# (i.e. features useful for prediction)
sum(ranking.DDA[,"lfdr"] < 0.8)

# number of features with local FDR < 0.2
# (i.e. significant non-null features)
sum(ranking.DDA[,"lfdr"] < 0.2)

# optimal feature set according to HC score
plot(ranking.DDA[,"HC"], type="l")
which.max( ranking.DDA[1:1000,"HC"] )

#####################################
# feature ranking (full covariance) #
#####################################

# ranking using CAT scores (LDA)
ranking.LDA = sda.ranking(Xtrain, Ytrain, diagonal=FALSE)
ranking.LDA[1:10,]

# plot CAT scores for the top 40 genes
plot(ranking.LDA, top=40)

# number of features with local FDR < 0.8
# (i.e. features useful for prediction)
sum(ranking.LDA[,"lfdr"] < 0.8)

# number of features with local FDR < 0.2
# (i.e. significant non-null features)
sum(ranking.LDA[,"lfdr"] < 0.2)

# optimal feature set according to HC score
plot(ranking.LDA[,"HC"], type="l")
which.max( ranking.LDA[1:1000,"HC"] )

When reposting, please credit the source: Biostatistic.net (http://www.biostatistic.net).

