R limma package: roast() function help documentation

loveR · Posted 2012-2-25 23:26:08

roast(limma)
R package containing roast(): limma

                                    Rotation Gene Set Tests

                                       Translator: LoveR, the 生物统计家园网 bot

----------Description----------

Rotation gene set testing for linear models.

----------Usage----------

roast(iset=NULL, y, design, contrast=ncol(design), set.statistic="mean",
   gene.weights=NULL, array.weights=NULL, block=NULL, correlation,
   var.prior=NULL, df.prior=NULL, trend.var=FALSE, nrot=999)
mroast(iset=NULL, y, design, contrast=ncol(design), set.statistic="mean",
   gene.weights=NULL, array.weights=NULL, block=NULL, correlation,
   var.prior=NULL, df.prior=NULL, trend.var=FALSE, nrot=999, adjust.method="BH", midp=TRUE)

----------Arguments----------

Argument: iset
index vector specifying which rows (probes) of y are in the test set. This can be a vector of indices, a logical vector of length nrow(y), or any vector such that y[iset,] contains the values for the gene set to be tested. For mroast, iset is a list of index vectors.
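
As a rough illustration of the equivalent ways to specify iset, the sketch below selects the same rows of a toy matrix by integer index, by logical vector and by row name. The matrix, the gene names and all variable names are made up for the example, and only base R subsetting is used.

```r
# Toy expression matrix: 6 probes (rows) by 4 samples (columns); names are hypothetical
y <- matrix(seq_len(24), nrow = 6, ncol = 4,
            dimnames = list(paste0("gene", 1:6), paste0("s", 1:4)))

iset_int <- c(2, 5)                          # integer row indices
iset_lgl <- seq_len(nrow(y)) %in% c(2, 5)    # logical vector of length nrow(y)
iset_chr <- c("gene2", "gene5")              # row names

# All three pick out the same rows of y
set_int <- y[iset_int, ]
set_lgl <- y[iset_lgl, ]
set_chr <- y[iset_chr, ]

# For mroast, several such index vectors would be collected in a list
sets <- list(setA = iset_int, setB = c(1, 3, 4))
```

Any of these forms is acceptable as long as y[iset,] returns the rows of the intended gene set.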

Argument: y
numeric matrix giving log-expression or log-ratio values for a series of microarrays, or any object that can be coerced to a matrix, including ExpressionSet, MAList, EList or PLMSet objects. Rows correspond to probes and columns to samples. If either var.prior or df.prior is NULL, then y should contain values for all genes on the arrays. If both prior parameters are given, then only y values for the test set are required.

Argument: design
design matrix.

Argument: contrast
contrast for which the test is required. Can be an integer specifying a column of design, or else a contrast vector of length equal to the number of columns of design.
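
The two forms of contrast can be sketched as follows. The three-group design, the group labels and the simulated gene are assumptions made for illustration; the check uses an ordinary least-squares fit from base R rather than limma itself.

```r
# Hypothetical design with one column per group (no intercept)
group <- factor(rep(c("A", "B", "C"), each = 2))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# An integer contrast picks one coefficient; the equivalent vector form is an indicator
contrast_int <- 2
contrast_vec <- as.numeric(seq_len(ncol(design)) == contrast_int)   # c(0, 1, 0)

# A general contrast vector: group B minus group A
b_minus_a <- c(-1, 1, 0)

# Check on one simulated gene using an ordinary least-squares fit
set.seed(1)
expr <- rnorm(6)
coefs <- lm.fit(design, expr)$coefficients
estimate <- sum(b_minus_a * coefs)   # equals mean(B samples) - mean(A samples)
```

A contrast vector is needed whenever the comparison of interest is not a single coefficient of the design matrix.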

Argument: set.statistic
summary set statistic. Possibilities are "mean", "floormean", "mean50" or "msq".

Argument: gene.weights
optional numeric vector of weights for genes in the set. Can be positive or negative. For mroast, this vector must have length equal to nrow(y). For roast, it can be of length nrow(y) or of length equal to the number of genes in the test set.

Argument: array.weights
optional numeric vector of array weights.

Argument: block
optional vector of blocks.

Argument: correlation
correlation between blocks.

Argument: var.prior
prior value for residual variances. If not provided, this is estimated from all the data using squeezeVar.

Argument: df.prior
prior degrees of freedom for residual variances. If not provided, this is estimated using squeezeVar.

Argument: trend.var
logical, should a trend be estimated for var.prior? See eBayes for details. Only used if var.prior or df.prior is NULL.

Argument: nrot
number of rotations used to estimate the p-values.

Argument: adjust.method
method used to adjust the p-values for multiple testing. See p.adjust for possible values.

Argument: midp
logical, should mid-p-values be used instead of ordinary p-values when adjusting for multiple testing?

----------Details----------

This function implements the ROAST gene set test from Wu et al (2010). It tests whether any of the genes in the set are differentially expressed. The function can be used for any microarray experiment that can be represented by a linear model. The design matrix for the experiment is specified as for the lmFit function, and the contrast of interest is specified as for the contrasts.fit function. This allows users to focus on differential expression for any coefficient or contrast in a linear model. If contrast is not specified, the last coefficient in the linear model will be tested. The arguments array.weights, block and correlation have the same meaning as they do for the lmFit function.

The arguments df.prior and var.prior have the same meaning as in the output of the eBayes function. If these arguments are not supplied, they are estimated exactly as is done by eBayes.

The argument gene.weights allows directions or weights to be set for individual genes in the set.

The gene set statistics "mean", "floormean", "mean50" and "msq" are defined by Wu et al (2010). The different gene set statistics have different sensitivities to small numbers of differentially expressed genes. If set.statistic="mean" then the set will be statistically significant only when the majority of the genes are differentially expressed. "floormean" and "mean50" will detect sets in which as few as 25% of the genes are differentially expressed. "msq" is sensitive to even smaller proportions of differentially expressed genes, if the effects are reasonably large.
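
To give a feel for why the statistics differ in sensitivity, here is a minimal sketch of three of them for the non-directional case. The formulas are a loose paraphrase (mean of absolute values, mean of the top half of absolute values, mean of squares) and are assumptions rather than limma's code; "floormean" is left out, and the per-gene statistics z are invented.

```r
# Hypothetical per-gene statistics for a set of 20 genes: only 3 have large effects
z <- c(4.5, -5.0, 4.0, rep(c(0.3, -0.3), length.out = 17))

# "mean"-like summary: average absolute statistic over the whole set
stat_mean <- mean(abs(z))

# "mean50"-like summary: average of the top half of the absolute statistics
top_half <- sort(abs(z), decreasing = TRUE)[seq_len(ceiling(length(z) / 2))]
stat_mean50 <- mean(top_half)

# "msq"-like summary: mean of squared statistics, dominated by the few large genes
stat_msq <- mean(z^2)
```

The raw values are not comparable across statistics, since each is judged against its own rotation distribution. The point is only that squaring, or restricting to the top half, gives more weight to a handful of strongly changed genes.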

The output gives p-values for three possible alternative hypotheses: "Up" to test whether the genes in the set tend to be up-regulated, with positive t-statistics; "Down" to test whether the genes in the set tend to be down-regulated, with negative t-statistics; and "Mixed" to test whether the genes in the set tend to be differentially expressed, without regard for direction.

roast estimates p-values by simulation, specifically by random rotations of the orthogonalized residuals (Langsrud, 2005), so p-values will vary slightly from run to run. To get more precise p-values, increase the number of rotations nrot. The p-value is computed as (b+1)/(nrot+1) where b is the number of rotations giving a more extreme statistic than that observed (Phipson and Smyth, 2010). This means that the smallest possible p-value is 1/(nrot+1).
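
The rotation idea can be sketched in base R under strong simplifications: ordinary (unmoderated) t-statistics, a plain mean of t-statistics as the set statistic, the "Up" alternative only, no gene or array weights, and the contrast taken to be the last design column. The data, the sample size and every variable name are invented for the illustration, and this is not limma's implementation.

```r
set.seed(1)
n_genes <- 100; n_samples <- 10
y <- matrix(rnorm(n_genes * n_samples), n_genes, n_samples)
design <- cbind(Intercept = 1, Group = rep(c(0, 1), each = 5))
iset <- 1:5
y[iset, 6:10] <- y[iset, 6:10] + 3          # make the set genuinely up-regulated

p <- ncol(design)                           # contrast = last coefficient
d <- n_samples - p                          # residual degrees of freedom
qrd <- qr(design)

# Orthogonalized effects: row p is the contrast effect, rows (p+1):n are residual effects
effects <- qr.qty(qrd, t(y[iset, , drop = FALSE]))
U <- t(effects[p:n_samples, , drop = FALSE])        # genes x (d + 1)
if (qr.R(qrd)[p, p] < 0) U[, 1] <- -U[, 1]          # fix the sign convention of the QR

total_ss <- rowSums(U^2)
set_stat <- function(contrast_effect) {
  s2 <- (total_ss - contrast_effect^2) / d          # residual variance per gene
  mean(contrast_effect / sqrt(s2))                  # mean t-statistic over the set
}

obs <- set_stat(U[, 1])

nrot <- 999
rot <- replicate(nrot, {
  r <- rnorm(d + 1)
  r <- r / sqrt(sum(r^2))                           # random direction on the unit sphere
  set_stat(drop(U %*% r))                           # same rotation applied to every gene
})

b <- sum(rot >= obs)                                # rotations at least as extreme
p_up <- (b + 1) / (nrot + 1)
```

Because the rotations are random, rerunning without set.seed gives slightly different values of p_up, and no value can fall below 1/(nrot+1).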

mroast does roast tests for multiple sets, including adjustment for multiple testing. By default, mroast reports ordinary p-values but uses mid-p-values at the multiple testing stage. Mid-p-values are probably a good choice when using false discovery rates (adjust.method="BH") but not when controlling the family-wise type I error rate (adjust.method="holm").
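
The multiple testing step can be illustrated with base R's p.adjust. The rotation counts below are hypothetical, and the mid-p form (b+0.5)/(nrot+1) is an assumption made for this sketch; it is not stated in the documentation above.

```r
nrot <- 999
b <- c(set1 = 0, set2 = 19, set3 = 499)    # hypothetical counts of more extreme rotations

p_ordinary <- (b + 1) / (nrot + 1)         # 0.001, 0.020, 0.500
p_mid <- (b + 0.5) / (nrot + 1)            # assumed mid-p form, slightly smaller

# False discovery rate control, applied to the mid-p-values
adj_bh_mid <- p.adjust(p_mid, method = "BH")

# Family-wise error rate control, applied to the ordinary p-values
adj_holm <- p.adjust(p_ordinary, method = "holm")
```

With few rotations the ordinary p-values are coarse and conservative, which is the motivation for using mid-p-values in the false discovery rate setting.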

----------Value----------

roast produces an object of class "Roast". This consists of a list with the following components:

Component: p.value
data.frame with columns Active.Prop and P.Value, giving the proportion of genes in the set contributing meaningfully to significance and the estimated p-values, respectively. Rows correspond to the alternative hypotheses mixed, up or down.

Component: var.prior
prior value for residual variances.

Component: df.prior
prior degrees of freedom for residual variances.

mroast produces a list of three matrices, each with a row for each set:

Component: P.Value
unadjusted p-values for the mixed, up and down alternative hypotheses.

Component: Adj.P.Value
adjusted p-values for each set and each hypothesis.

Component: Active.Proportion
proportion of active genes for each set and each hypothesis.

----------Note----------

The default setting for the set statistic was changed in limma 3.5.9 (3 June 2010) from "msq" to "mean".

----------Author(s)----------

Gordon Smyth and Di Wu

----------References----------

Goeman and Buhlmann (2007). Analyzing gene expression data in terms of gene sets: methodological issues. Bioinformatics 23, 980-987.
Langsrud (2005). Rotation tests. Statistics and Computing 15, 53-60.
Phipson and Smyth (2010). Permutation P-values should never be zero: calculating exact P-values when permutations are randomly drawn. Statistical Applications in Genetics and Molecular Biology, Volume 9, Article 39.
Routledge (1994). Practicing safe statistics with the mid-p. Canadian Journal of Statistics 22, 103-110.
Wu et al (2010). ROAST: rotation gene set tests for complex microarray experiments. Bioinformatics 26, 2176-2182. http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btq401?

----------See Also----------

roast performs a self-contained test in the sense defined by Goeman and Buhlmann (2007). For a competitive gene set test, see wilcoxGST. For a competitive gene set enrichment analysis using a database of gene sets, see romer.

An overview of tests in limma is given in 08.Tests.

----------Examples----------

# Simulate 100 genes on 4 arrays: two control samples, then two Group samples
y <- matrix(rnorm(100*4),100,4)
design <- cbind(Intercept=1,Group=c(0,0,1,1))

# First set of 5 genes: all 5 are shifted upward in the Group samples, so they are genuinely differentially expressed
iset1 <- 1:5
y[iset1,3:4] <- y[iset1,3:4]+3

# Second set of 5 genes contains none that are differentially expressed
iset2 <- 6:10

# Test the Group coefficient (column 2 of design) for one set, then for both sets
roast(iset1,y,design,contrast=2)
mroast(list(set1=iset1,set2=iset2),y,design,contrast=2)

Please credit the source when reposting: 生物统计家园网 (http://www.biostatistic.net).
