R language gage package: help documentation for the esset.grp() function

loveR · posted on 2012-2-25 18:22:47

esset.grp(gage)
esset.grp() belongs to the R package: gage

                                       The non-redundant significant gene set list

                                       Translator: LoveR, the 生物统计家园网 robot

----------Description----------

This function extracts a non-redundant significant gene set list, groups of redundant gene sets, and related data from gage results. Redundant gene sets are those that overlap heavily in their effective member gene lists, or core genes.

----------Usage----------

esset.grp(setp, exprs, gsets, ref = NULL, samp = NULL, test4up = TRUE,
    same.dir = TRUE, compare = "paired", use.fold = TRUE, cutoff = 0.01,
    use.q = FALSE, pc = 10^-10, output = TRUE, outname = "esset.grp",
    make.plot = FALSE, pdf.size = c(7, 7), core.counts = FALSE,
    get.essets = TRUE, bins = 10, bsize = 1, cex = 0.5,
    layoutType = "circo", name.str = c(10, 100), ...)

----------Arguments----------

Argument: setp
a numeric matrix, the result p-value matrix returned by the gage function. See the gage help page for details.

Argument: exprs
an expression matrix or matrix-like data structure, with genes as rows and samples as columns.

Argument: gsets
a named list in which each element is a gene set, given as a character vector of gene IDs or symbols. For an example, type head(kegg.gs). A gene set can also be an "smc" object as defined in the PGSEA package. Make sure that gsets and exprs use the same gene ID system.

Argument: ref
a numeric vector of column numbers for the reference condition or phenotype (i.e. the control group) in the exprs data matrix. The default, ref = NULL, treats all columns as target experiments.

Argument: samp
a numeric vector of column numbers for the target condition or phenotype (i.e. the experiment group) in the exprs data matrix. The default, samp = NULL, treats all columns other than ref as target experiments.

Argument: test4up
boolean, whether the input gage result or significant gene sets are test results for up-regulated gene sets. This information is needed to select the core member genes that contribute to the overall significance of a gene set.

Argument: same.dir
boolean, whether the input gage result tests for changes in a gene set in a single direction (all genes up- or down-regulated) or for changes in both directions simultaneously.

Argument: compare
character, the comparison scheme to use: 'paired', 'unpaired', '1ongroup' or 'as.group'. 'paired' is the default: ref and samp are of equal length and paired one-on-one by the original experimental design. 'as.group': group-on-group comparison between ref and samp. 'unpaired' (formerly '1on1'): one-on-one comparison between all possible ref and samp combinations, even though the original experimental design may not be one-on-one paired. '1ongroup': comparison of one samp column at a time against the average of all ref columns.
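
The four schemes differ only in which columns are matched against which. The base R sketch below illustrates that matching on a toy matrix; it is not gage's internal code, and the matrix values, the log2 scale, and the variable names are all assumptions made for illustration.

```r
# Toy log2 expression matrix: 3 genes x 4 samples (hypothetical values).
exprs <- matrix(c(1, 2, 3,    # ref1
                  2, 2, 5,    # ref2
                  3, 2, 4,    # samp1
                  6, 2, 5),   # samp2
                nrow = 3,
                dimnames = list(c("g1", "g2", "g3"),
                                c("ref1", "ref2", "samp1", "samp2")))
ref  <- 1:2
samp <- 3:4

# 'paired': samp[i] is compared with ref[i] only.
paired <- exprs[, samp] - exprs[, ref]

# 'unpaired': every samp column against every ref column.
pairs <- expand.grid(ref = ref, samp = samp)
unpaired <- exprs[, pairs$samp] - exprs[, pairs$ref]

# '1ongroup': each samp column against the mean of all ref columns.
one_on_group <- exprs[, samp] - rowMeans(exprs[, ref])

# 'as.group': mean of samp against mean of ref, one value per gene.
as_group <- rowMeans(exprs[, samp]) - rowMeans(exprs[, ref])
```

With two ref and two samp columns, 'paired' and '1ongroup' each yield two comparisons, 'unpaired' yields four, and 'as.group' yields one.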

Argument: use.fold
boolean, whether the input gage results used fold changes or t-test statistics as the per-gene statistics. Default use.fold = TRUE.

Argument: cutoff
numeric, the p- or q-value cutoff, between 0 and 1. Default 0.01 (for p-values). When q-values are used, the recommended cutoff is 0.1.

Argument: use.q
boolean, whether to use the q-value for the pre-selection of the significant gene set list. Defaults to FALSE, i.e. the p-value is used instead.
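
The interplay of cutoff and use.q can be sketched in base R as follows. This is only an illustration of the pre-selection idea: the toy matrix, the column names "p.val" and "q.val", the strict "<" comparison, and the helper select_sets() are assumptions, not gage's actual implementation.

```r
# Hypothetical stand-in for a gage result matrix; the column names "p.val"
# and "q.val" are assumed here for illustration.
setp <- cbind(p.val = c(0.0005, 0.004, 0.03, 0.2),
              q.val = c(0.002, 0.008, 0.04, 0.2))
rownames(setp) <- c("setA", "setB", "setC", "setD")

# Hypothetical helper: names of the sets that pass the chosen cutoff.
select_sets <- function(setp, cutoff = 0.01, use.q = FALSE) {
  col <- if (use.q) "q.val" else "p.val"
  keep <- !is.na(setp[, col]) & setp[, col] < cutoff
  rownames(setp)[keep]
}

by_p <- select_sets(setp)                              # p-value < 0.01
by_q <- select_sets(setp, cutoff = 0.1, use.q = TRUE)  # q-value < 0.1
```

Here the p-value cutoff keeps setA and setB, while the looser q-value cutoff of 0.1 also keeps setC.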

Argument: pc
numeric, the cutoff p-value for the overlap between gene sets to be called 'redundant'. Defaults to 10^-10 (that is, 1e-10, as given in Usage); finding the best value may take some trial and error.
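
When passing pc explicitly, note that R reads the literals 10^-10 and 10e-10 as different numbers. A quick check in base R (the variable names are only for illustration):

```r
pc_power   <- 10^-10   # ten to the power of minus ten, i.e. 1e-10
pc_literal <- 10e-10   # scientific notation for 10 x 10^-10, i.e. 1e-9

# The second value is about ten times larger than the first.
ratio <- pc_literal / pc_power
```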

Argument: output
boolean, whether to write the non-redundant gene set list to a tab-delimited text file. Defaults to TRUE.

Argument: outname
character, the prefix used to label the output file names when output = TRUE.

Argument: make.plot
boolean, whether to generate a network graph that visualizes the redundancy (overlap in core genes) between significant gene sets. Currently the only feasible option is FALSE.

Argument: pdf.size
numeric vector of length 2, specifies the PDF file size for the network graph output. Currently unsupported.

Argument: core.counts
Currently unsupported.

Argument: get.essets
Currently unsupported.

Argument: bins
Currently unsupported.

Argument: bsize
Currently unsupported.

Argument: cex
Currently unsupported.

Argument: layoutType
Currently unsupported.

Argument: name.str
numeric vector of length 2, specifies the substring range of the gene set name to show in the network graph. Currently unsupported.

Argument: ...
extra arguments to be passed to the internal function make.graph. Currently unsupported.

----------Details----------

Redundant gene sets are defined as those that overlap heavily in their effective member gene lists, or core genes. Core genes are the member genes that actually contribute to the significance of the gene set in GAGE analysis in the direction(s) of interest. The argument pc sets the p-value cutoff below which an overlap is called "redundant". The redundancy between gene sets is then represented by an undirected graph/network, and groups of redundant gene sets are derived as the connected components of that graph.
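
The idea of linking gene sets by core-gene overlap and then reading groups off the connected components can be sketched in base R. This is an illustration under stated assumptions, not gage's own code: the core gene lists are made up, the overlap p-value is taken from a hypergeometric test (an assumption about how the overlap is scored), the universe size n_genes is arbitrary, and pc is set far looser than the default so that the toy data produce edges.

```r
# Hypothetical core gene lists for four significant gene sets.
core <- list(setA = c("g1", "g2", "g3", "g4"),
             setB = c("g2", "g3", "g4", "g5"),
             setC = c("g3", "g4", "g5", "g6"),
             setD = c("g20", "g21", "g22"))
n_genes <- 1000   # assumed size of the gene universe
pc <- 1e-5        # illustrative cutoff, much looser than the default

sets <- names(core)
k <- length(sets)
adj <- matrix(FALSE, k, k, dimnames = list(sets, sets))
for (i in seq_len(k - 1)) {
  for (j in (i + 1):k) {
    ov <- length(intersect(core[[i]], core[[j]]))
    # P(overlap >= ov) under a hypergeometric null (an assumption here).
    p <- phyper(ov - 1, length(core[[i]]), n_genes - length(core[[i]]),
                length(core[[j]]), lower.tail = FALSE)
    adj[i, j] <- adj[j, i] <- p < pc   # edge = "redundant" pair
  }
}

# Connected components by propagating the smallest label along edges.
label <- seq_len(k)
repeat {
  new <- vapply(seq_len(k),
                function(i) min(label[c(i, which(adj[i, ]))]),
                integer(1))
  if (all(new == label)) break
  label <- new
}
groups <- unname(split(sets, label))
```

In this toy case setA and setC are not linked directly, but both are linked to setB, so all three fall into one group while setD forms a group of its own.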

The selection criterion for gene sets here is the p-value rather than the commonly used q-value. This is because, for extracting a non-redundant list of significant gene sets, the p-value is relatively stable, whereas the q-value changes whenever the total number of gene sets under consideration changes. Of course, the q-value is also a sensible selection criterion when one takes this step as a further refinement of the list of significant gene sets.
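
The stability argument can be checked with a small base R experiment. The sketch below uses the Benjamini-Hochberg adjustment from p.adjust() as a stand-in for q-values (an assumption about how q-values are obtained), with made-up p-values.

```r
# Three gene sets with fixed p-values (hypothetical values).
p_sig <- c(setA = 0.0005, setB = 0.004, setC = 0.009)

# The same three sets tested alongside 7 or 97 clearly non-significant sets.
p_small <- c(p_sig, seq(0.2, 0.9, length.out = 7))    # 10 sets in total
p_large <- c(p_sig, seq(0.2, 0.9, length.out = 97))   # 100 sets in total

q_small <- p.adjust(p_small, method = "BH")[names(p_sig)]
q_large <- p.adjust(p_large, method = "BH")[names(p_sig)]

# The p-values are identical in both runs; the adjusted values are not.
rbind(p = p_sig, q_small = q_small, q_large = q_large)
```

With a q-value cutoff of 0.1, all three sets pass in the 10-set run but only setA passes in the 100-set run, although their p-values have not changed.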

----------Value----------

The value returned by esset.grp is a list of 7 elements:

Element: essentialSets
character vector, the non-redundant significant gene set list.

Element: setGroups
list, each element is a character vector of a group of redundant gene sets.

Element: allSets
character vector, the full list of significant gene sets.

Element: connectedComponent
list, each element is a character vector of a connected component in the redundancy graph representation of the gene sets.

Element: overlapCounts
numeric matrix, the overlap core gene counts between the significant gene sets.

Element: overlapPvals
numeric matrix, the significance (as p-values) of the overlap core gene counts between the significant gene sets.

Element: coreGeneSets
list, each element is a character vector of the core genes in a significant gene set.
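
As a rough sketch of how such a result might be explored, the base R snippet below works on a hand-made list that only mimics three of the elements described above. All values are hypothetical, and it is assumed that each essential set appears in exactly one group.

```r
# Hypothetical result shaped like the list described above (toy values).
res <- list(
  essentialSets = c("setA", "setD"),
  setGroups = list(c("setA", "setB", "setC"), "setD"),
  coreGeneSets = list(setA = c("g1", "g2"), setB = c("g2", "g3"),
                      setC = c("g3"), setD = c("g20", "g21"))
)

# For each essential set, list the redundant sets grouped with it.
redundant_with <- lapply(res$essentialSets, function(s) {
  grp <- Filter(function(g) s %in% g, res$setGroups)[[1]]
  setdiff(grp, s)
})
names(redundant_with) <- res$essentialSets

# Number of core genes per significant set.
core_sizes <- lengths(res$coreGeneSets)
```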

----------Author(s)----------

Weijun Luo <luo_weijun@yahoo.com>

----------References----------

Generally Applicable Gene Set Enrichment for Pathways Analysis. BMC Bioinformatics 2009, 10:161

----------See Also----------

gage, the main function for GAGE analysis; sigGeneSet, significant gene sets from GAGE analysis; essGene, essential member genes in a gene set.

----------Examples----------

library(gage)
data(gse16873)
cn <- colnames(gse16873)
hn <- grep('HN', cn, ignore.case = TRUE)
dcis <- grep('DCIS', cn, ignore.case = TRUE)
data(kegg.gs)

# KEGG test for 1-directional changes
gse16873.kegg.p <- gage(gse16873, gsets = kegg.gs,
    ref = hn, samp = dcis)
# KEGG test for 2-directional changes
gse16873.kegg.2d.p <- gage(gse16873, gsets = kegg.gs,
    ref = hn, samp = dcis, same.dir = FALSE)

# non-redundant gene sets among the up-regulated results
gse16873.kegg.esg.up <- esset.grp(gse16873.kegg.p$greater,
    gse16873, gsets = kegg.gs, ref = hn, samp = dcis,
    test4up = TRUE, output = TRUE, outname = "gse16873.kegg.up",
    make.plot = FALSE)
# non-redundant gene sets among the down-regulated results
gse16873.kegg.esg.dn <- esset.grp(gse16873.kegg.p$less,
    gse16873, gsets = kegg.gs, ref = hn, samp = dcis,
    test4up = FALSE, output = TRUE, outname = "gse16873.kegg.dn",
    make.plot = FALSE)
# non-redundant gene sets among the 2-directional results
gse16873.kegg.esg.2d <- esset.grp(gse16873.kegg.2d.p$greater,
    gse16873, gsets = kegg.gs, ref = hn, samp = dcis,
    test4up = TRUE, output = TRUE, outname = "gse16873.kegg.2d",
    make.plot = FALSE)

# inspect the returned list
names(gse16873.kegg.esg.up)
head(gse16873.kegg.esg.up$essentialSets, 4)
head(gse16873.kegg.esg.up$setGroups, 4)
head(gse16873.kegg.esg.up$coreGeneSets, 4)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

