geneSetTest(limma)
geneSetTest() belongs to the R package: limma
Mean-rank Gene Set Test
Translator: 生物统计家园网 robot LoveR
----------Description----------
Test whether a set of genes is highly ranked relative to other genes in terms of a given statistic. Genes are assumed to be independent.
----------Usage----------
geneSetTest(selected, statistics, alternative="mixed", type="auto", ranks.only=TRUE, nsim=9999)
wilcoxGST(selected, statistics, ...)
barcodeplot(statistics,selected,selected2=NULL,labels=c("Up","Down"),
quantiles=c(-1,1),col.bars=NULL,offset.bars=!is.null(selected2), ...)
----------Arguments----------
Argument: selected
index vector for the gene set. This can be a vector of indices, or a logical vector of the same length as statistics or, in general, any vector such that statistics[selected] gives the statistic values for the gene set to be tested.
Argument: selected2
index vector for a second gene set. Usually used to specify down-regulated genes when selected is used for up-regulated genes.
Argument: statistics
numeric vector giving the values of the test statistic for every gene or probe in the reference set, usually every probe on the microarray.
Argument: alternative
character string specifying the alternative hypothesis, must be one of "mixed", "either", "up" or "down". "two.sided", "greater" and "less" are also permitted as synonyms for "either", "up" and "down" respectively.
Argument: type
character string specifying whether the statistics are signed (t-like, "t") or unsigned (F-like, "f") or whether the function should make an educated guess ("auto"). If the statistics are unsigned, then it is assumed that larger statistics are more significant.
Argument: ranks.only
logical, if TRUE only the ranks of the statistics are used.
Argument: nsim
number of random samples to take in computing the p-value. Not used if ranks.only=TRUE.
Argument: labels
character vector of length 2 of labels associated with large and small statistics respectively. Displayed at the ends of the barcodeplot.
Argument: quantiles
numeric vector of length 2, giving cutoff values for statistics considered small or large respectively. Used to color the rectangle of the barcodeplot.
Argument: col.bars
character vector giving colors for the bars on the barcodeplot. Defaults to "black" for one set or c("red","blue") for two sets.
Argument: offset.bars
logical. When there are two sets, bars are normally offset up and down from the rectangle of the barcodeplot.
Argument: ...
other arguments are passed to geneSetTest (wilcoxGST) or to plot (barcodeplot).
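The forms that the selected argument can take, as described above, can be illustrated with plain base R subsetting. This is only a sketch: the statistic values and gene names below are made up for illustration, and no limma function is called.

```r
# Hypothetical named statistics for six genes (values invented for illustration).
statistics <- c(g1 = 2.1, g2 = -0.3, g3 = 1.7, g4 = 0.2, g5 = -1.9, g6 = 0.8)

# Three equivalent ways of describing the same gene set.
sel_int  <- c(1, 3, 6)                                # vector of positions
sel_lgl  <- c(TRUE, FALSE, TRUE, FALSE, FALSE, TRUE)  # logical, same length as statistics
sel_name <- c("g1", "g3", "g6")                       # names (requires a named vector)

# Each form extracts the same statistic values via statistics[selected].
set_int  <- statistics[sel_int]
set_lgl  <- statistics[sel_lgl]
set_name <- statistics[sel_name]
```

Whichever form is used, what matters is that statistics[selected] returns the statistics for the genes in the set.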
----------Details----------
wilcoxGST is a synonym for geneSetTest with ranks.only=TRUE. This test procedure was developed by Michaud et al (2008), who called it mean-rank gene-set enrichment.
These functions compute a p-value to test the hypothesis that the selected set of genes tends to be more highly ranked in terms of some test statistic compared to randomly selected genes. The statistic might be any statistic of interest, for example a t-statistic or F-statistic for differential expression.
These functions perform competitive tests in the sense that genes in the test set are compared to other genes (Goeman and Buhlmann, 2007). By contrast, a self-contained gene set test such as roast tests for differential expression in the test set only, without regard to other genes on the array. Like all gene set tests, these functions can be used to detect differential expression for a group of genes, even when the effects are too small or there is too little data to detect the genes individually. They also provide a means to compare the results between different experiments.
Because it is based on permuting genes, geneSetTest assumes that the different genes (or probes) are independent. (Strictly speaking, it assumes that the genes in the set are no more correlated on average than randomly selected genes.) This assumption may be reasonable if the gene set is relatively small and if there is relatively little genotypic variation in the data, for example if the data is obtained from genetically identical inbred mice. The independence assumption may be misleading if the gene set is large or if the data contains a lot of genotypic variation, for example for human cancer samples. These assumptions, when valid, permit a much quicker and more powerful significance test to be conducted.
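The effect of violating the independence assumption can be seen in a small simulation. The sketch below is not taken from limma: it uses base R's wilcox.test as a stand-in for the rank-based test, and the set size, correlation value and number of replicates are arbitrary choices made for illustration. No gene is truly differentially expressed in either scenario, so every rejection is a false positive.

```r
set.seed(1)
n_genes <- 1000   # size of the reference set (illustrative)
set_idx <- 1:50   # positions of the genes in the tested set (illustrative)
n_rep   <- 200    # number of simulated experiments

# Fraction of null experiments in which the rank test rejects at the 5% level.
# Genes in the set share a common random factor giving pairwise correlation rho.
reject_rate <- function(rho) {
  rejected <- replicate(n_rep, {
    stat <- rnorm(n_genes)
    shared <- rnorm(1)
    stat[set_idx] <- sqrt(rho) * shared +
      sqrt(1 - rho) * rnorm(length(set_idx))
    wilcox.test(stat[set_idx], stat[-set_idx])$p.value < 0.05
  })
  mean(rejected)
}

rate_independent <- reject_rate(0)    # genes in the set are independent
rate_correlated  <- reject_rate(0.5)  # genes in the set are correlated
```

Under these assumptions the false positive rate stays near the nominal level for independent genes but rises sharply when the genes in the set are correlated with each other, which is why the test can be misleading for large sets or genetically heterogeneous samples.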
The statistics are usually a set of probe-wise statistics arising from some comparison in a microarray experiment. They may be t-statistics, meaning that the genewise null hypotheses would be rejected for large positive or negative values, or they may be F-statistics, meaning that only large values are significant. Any set of signed statistics, such as log-ratios, M-values or moderated t-statistics, is treated as t-like. Any set of unsigned statistics, such as F-statistics, posterior probabilities or chi-square statistics, is treated as F-like. If type="auto" then the statistics will be taken to be t-like if they take both positive and negative values and will be taken to be F-like if they are all of the same sign.
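The type="auto" rule described above can be written down in a few lines. The helper guess_type below is hypothetical and is not a limma function; it is only a sketch of the stated rule, assuming that missing values should be ignored.

```r
# Hypothetical helper (not part of limma): apply the "auto" rule from the text.
# Statistics with both positive and negative values are treated as t-like,
# statistics that are all of one sign as F-like.
guess_type <- function(statistics) {
  has_positive <- any(statistics > 0, na.rm = TRUE)
  has_negative <- any(statistics < 0, na.rm = TRUE)
  if (has_positive && has_negative) "t" else "f"
}

t_like <- guess_type(c(-2.3, 0.4, 1.8))   # mixed signs, e.g. moderated t-statistics
f_like <- guess_type(c(0.2, 5.1, 12.7))   # one sign only, e.g. F-statistics
```

One consequence of such a rule is that a signed statistic whose values all happen to share one sign would be guessed as F-like, so setting type explicitly is safer in that situation.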
There are four possible alternatives to test for. alternative=="up" means the genes in the set tend to be up-regulated, with positive t-statistics. alternative=="down" means the genes in the set tend to be down-regulated, with negative t-statistics. alternative=="either" means the set is either up or down-regulated as a whole. alternative=="mixed" tests whether the genes in the set tend to be differentially expressed, without regard for direction. In this case, the test will be significant if the set contains mostly large test statistics, even if some are positive and some are negative.
The first three alternatives are appropriate if you have a prior expectation that all the genes in the set will react in the same direction. The "mixed" alternative is appropriate if you know only that the genes are involved in the relevant pathways, possibly in different directions. The "mixed" alternative is the only meaningful alternative with F-like statistics.
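One way to picture the four alternatives is to map each of them onto a two-sample Wilcoxon test between the genes in the set and the remaining genes. The function mean_rank_test below is a hypothetical sketch built on base R's wilcox.test, not limma's implementation; it assumes that selected is a vector of integer positions and that "mixed" corresponds to testing the absolute values of the statistics.

```r
# Hypothetical sketch (not limma code): rank-based gene set test via wilcox.test.
# Assumes `selected` holds integer positions into `statistics`.
mean_rank_test <- function(selected, statistics,
                           alternative = c("mixed", "either", "up", "down")) {
  alternative <- match.arg(alternative)
  if (alternative == "mixed") {
    # Direction is ignored: large absolute values count as evidence.
    statistics <- abs(statistics)
    wilcox_alt <- "greater"
  } else {
    wilcox_alt <- c(either = "two.sided", up = "greater",
                    down = "less")[[alternative]]
  }
  wilcox.test(statistics[selected], statistics[-selected],
              alternative = wilcox_alt)$p.value
}

set.seed(2)
# A set that is consistently up-regulated.
stat_up <- rnorm(100)
stat_up[1:10] <- stat_up[1:10] + 2
p_up   <- mean_rank_test(1:10, stat_up, "up")
p_down <- mean_rank_test(1:10, stat_up, "down")

# A set whose genes move strongly but in opposite directions.
stat_mixed <- rnorm(100)
stat_mixed[1:10] <- stat_mixed[1:10] + rep(c(3, -3), each = 5)
p_mixed <- mean_rank_test(1:10, stat_mixed, "mixed")
```

In this simulated data the consistently up-regulated set gives a small p-value for "up" and a large one for "down", while the set with opposing directions is picked up by "mixed".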
The test statistic used for the gene set test is the mean of the statistics in the set. If ranks.only is TRUE then only the ranks of the statistics are used. In this case the p-value is obtained from a Wilcoxon test. If ranks.only is FALSE, then the p-value is obtained by simulation using nsim randomly selected sets of genes.
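The simulation approach used when ranks.only is FALSE can be sketched as follows. The function sim_gene_set_p is hypothetical and covers only an "up"-style alternative; the exact p-value formula, including the added 1 in numerator and denominator, is an assumption chosen so that the estimate is never exactly zero, and may differ from what limma computes.

```r
# Hypothetical sketch (not limma code): simulation p-value for the set mean.
# Compares the observed mean statistic with the means of `nsim` random gene
# sets of the same size drawn from all statistics.
sim_gene_set_p <- function(selected, statistics, nsim = 9999) {
  set_stats <- statistics[selected]
  observed  <- mean(set_stats)
  set_size  <- length(set_stats)
  simulated <- replicate(nsim, mean(sample(statistics, set_size)))
  # Add-one correction (assumed): smallest possible value is 1 / (nsim + 1).
  (sum(simulated >= observed) + 1) / (nsim + 1)
}

set.seed(42)
stat <- rnorm(200)
stat[1:15] <- stat[1:15] + 1.5          # shift the set upwards
p_sim <- sim_gene_set_p(1:15, stat, nsim = 999)
```

Because the p-value is estimated from nsim random sets, its resolution is limited to multiples of 1/(nsim+1), which is why larger values of nsim give finer p-values at the cost of more computation.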
barcodeplot is a graphical representation of the Wilcoxon gene set test using ranks. It can be used for one set, or to display directional sets, when there are separate sets of genes expected to go up and down respectively.
----------Value----------
geneSetTest and wilcoxGST return a numeric value giving the estimated p-value.
barcodeplot and barcodeplot2 return no value but produce a plot as a side effect.
----------Author(s)----------
Gordon Smyth and Di Wu
----------References----------
Goeman and Buhlmann (2007). Analyzing gene expression data in terms of gene sets: methodological issues. Bioinformatics 23, 980-987.
Michaud et al. (2008). Integrative analysis of RUNX1 downstream pathways and target genes. BMC Genomics 9, 363. http://www.biomedcentral.com/1471-2164/9/363
----------See Also----------
roast, romer, wilcox.test
An overview of tests in limma is given in 08.Tests.
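----------Examples----------
library(limma)
# Simulated statistics for 100 genes; the first 10 form the gene set
stat <- rnorm(100)
sel <- 1:10
# Mean-rank (Wilcoxon) gene set test
wilcoxGST(sel,stat)
# Barcode plot for a single gene set
barcodeplot(stat,sel)
# Barcode plot for two directional gene sets
sel2 <- 11:20
barcodeplot(stat,sel,sel2)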