R samr package: samr.assess.samplesize() function help documentation

loveR · posted 2012-9-29 22:11:10

samr.assess.samplesize(samr)
samr.assess.samplesize() belongs to R package: samr

Assess the sample size for a SAM analysis

Translator: Biostatistic.net robot LoveR

----------Description----------

Estimate the false discovery rate, false negative rate, power and type I error for a SAM analysis. Currently implemented only for two-class (unpaired or paired), one-sample and survival problems.

----------Usage----------

samr.assess.samplesize(samr.obj, data, dif, samplesize.factors=c(1,2,3,5),
min.genes = 10, max.genes = nrow(data$x)/2)

----------Arguments----------

Argument: samr.obj
Object returned from a call to samr

Argument: data
Data list, the same as that passed to samr

Argument: dif
Change in gene expression between groups 1 and 2, for genes that are differentially expressed. For log base 2 data, a value of 1 means a 2-fold change. For one-sample problems, dif is the number of units away from zero for differentially expressed genes. For survival data, dif is the numerator of the Cox score statistic (this information is provided in the output of samr).
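As a small illustration of the log base 2 convention described for dif, the sketch below converts between fold change and dif using base R only. The variable names (fold_change, dif, back) are chosen here for illustration and are not part of samr.

```r
# Fold change -> dif on the log base 2 scale
fold_change <- c(1.5, 2, 4)
dif <- log2(fold_change)   # a 2-fold change gives dif = 1, a 4-fold change gives dif = 2

# dif -> fold change (the inverse mapping)
back <- 2^dif

print(data.frame(fold_change = fold_change, dif = round(dif, 3)))
```

So a 1.5-fold difference, as used in the Examples section, corresponds to passing log2(1.5) as dif.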

Argument: samplesize.factors
Integer vector of length 4, indicating the sample sizes to be examined. The values are factors that multiply the original sample size. So the value 1 means a sample size of ncol(data$x), 2 means a sample size of 2*ncol(data$x), etc.
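To make the multiplication concrete, here is a minimal base R sketch of the total sample sizes implied by the default factors. The matrix x is a stand-in with 20 columns (hypothetical data, matching the 20 samples used in the Examples section), and sample_sizes is a name introduced only for this illustration.

```r
# Stand-in expression matrix: 1000 genes x 20 samples (hypothetical data)
x <- matrix(0, nrow = 1000, ncol = 20)
samplesize.factors <- c(1, 2, 3, 5)   # the documented default

# Total sample sizes examined: each factor times the original ncol(x)
sample_sizes <- samplesize.factors * ncol(x)
names(sample_sizes) <- paste0("factor_", samplesize.factors)
print(sample_sizes)
```

With 20 original samples, the factors 1, 2, 3 and 5 correspond to 20, 40, 60 and 100 samples in total.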

Argument: min.genes
Minimum number of genes that are assumed to be truly changed in the population

Argument: max.genes
Maximum number of genes that are assumed to be truly changed in the population

----------Details----------

Estimates the false discovery rate, false negative rate, power and type I error for a SAM analysis. The argument samplesize.factors allows the user to assess the effect of varying the sample size (total number of samples). A detailed description of this calculation is given in the SAM manual.

----------Value----------

A list with components:

Results
A matrix with columns:
  number of genes: both the number of differentially expressed genes in the population and the number called significant
  cutpoint: the threshold used for the absolute SAM score d
  FDR, 1-power: the median false discovery rate, also equal to 1 minus the power for each gene
  FDR-90perc: the upper 90th percentile of the FDR
  FNR, Type 1 error: the false negative rate, also equal to the type I error for each gene
  FNR-90perc: the upper 90th percentile of the FNR

dif.call
Change in gene expression between groups 1 and 2 that was provided in the call to samr.assess.samplesize

difm
The average difference in SAM score d for the genes differentially expressed vs. unexpressed

samplesize.factor
The samplesize.factors argument that was passed to samr.assess.samplesize

n
Number of samples in input data (i.e. ncol of the x component in data)
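The Results columns pair FDR with 1-power and FNR with type I error. A short base R sketch may show why those pairs coincide when the number of genes called significant equals the number truly changed, as the "number of genes" column describes. All counts below are hypothetical, and the sketch assumes FNR means the fraction of uncalled genes that are truly changed; it does not call samr.

```r
# Hypothetical counts, chosen only for illustration
m  <- 1000   # total genes
k  <- 100    # genes truly changed; also the number called significant
tp <- 80     # truly changed genes that are called significant

fp <- k - tp         # called significant but not truly changed
fn <- k - tp         # truly changed but not called
tn <- (m - k) - fp   # neither changed nor called

fdr         <- fp / (tp + fp)   # false discoveries among called genes
power       <- tp / (tp + fn)   # truly changed genes that are detected
fnr         <- fn / (fn + tn)   # missed genes among those not called
type1_error <- fp / (fp + tn)   # unchanged genes that are called

print(c(FDR = fdr, one_minus_power = 1 - power,
        FNR = fnr, type1_error = type1_error))
```

Because the same k appears as both the number truly changed and the number called, fp equals fn, which is what makes each pair of quantities equal in this setting.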

----------Author(s)----------

Jun Li, Balasubramanian Narasimhan and Robert Tibshirani

----------References----------

Tusher, V., Tibshirani, R. and Chu, G. (2001). "Significance analysis of microarrays applied to the ionizing radiation response". PNAS 2001 98: 5116-5121 (Apr 24). http://www-stat.stanford.edu/~tibs/sam
Taylor, J., Tibshirani, R. and Efron, B. (2005). The “Miss rate” for the analysis of gene expression data. Biostatistics 2005 6(1):111-117.
A more complete description is given in the SAM manual at http://www-stat.stanford.edu/~tibs/SAM

----------Examples----------

library(samr)

# generate some example data: 1000 genes, 20 samples
set.seed(100)
x <- matrix(rnorm(1000*20), ncol=20)
dd <- sample(1:1000, size=100)

# shift 100 randomly chosen genes in samples 11 to 20
u <- matrix(2*rnorm(100), ncol=10, nrow=100)
x[dd,11:20] <- x[dd,11:20] + u

# class labels: samples 1 to 10 are group 1, samples 11 to 20 are group 2
y <- c(rep(1,10), rep(2,10))

data <- list(x=x, y=y, geneid=as.character(1:nrow(x)),
genenames=paste("g", as.character(1:nrow(x)), sep=""), logged2=TRUE)

log2 <- function(x){log(x)/log(2)}

# run SAM first
samr.obj <- samr(data, resp.type="Two class unpaired", nperms=100)

# assess current sample size (20), assuming 1.5fold difference on log base 2 scale

samr.assess.samplesize.obj <- samr.assess.samplesize(samr.obj, data, log2(1.5))

# assess the effect of doubling the sample size (factor 2 in samplesize.factors;
# the default c(1,2,3,5) is written out here to make that explicit)

samr.assess.samplesize.obj2 <- samr.assess.samplesize(samr.obj, data, log2(1.5),
samplesize.factors=c(1,2,3,5))

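If reposting, please credit the source: Biostatistic.net (http://www.biostatistic.net).

Notes:
Note 1: For the convenience of learners, this document was translated by the Biostatistic.net robot LoveR. It is for personal R study and reference only; Biostatistic.net retains the copyright.
Note 2: Because the translation was done automatically by a robot, some inaccuracies are unavoidable; when using it, compare the Chinese and English content carefully, which can also help with learning R.
Note 3: If you find an inaccuracy, please reply at the end of this post and we will revise it over time.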
