R GOstats package: help documentation for the GOHyperG() function (from the Chinese-English bilingual edition)

Posted on 2012-2-25 20:54:20
GOHyperG(GOstats)
R package containing GOHyperG(): GOstats

                                        (DEPRECATED) Hypergeometric Tests for GO

                                         Translator: 生物统计家园网 robot LoveR

----------Description----------

Use hyperGTest instead.

Given a set of unique Entrez Gene identifiers, the name of a microarray annotation data package, and the GO category of interest, this function computes Hypergeometric p-values for overrepresentation of each GO term in the specified category among the GO annotations of the interesting genes (as indicated by the Entrez Gene ids).


----------Usage----------


GOHyperG(x, lib, what="MF", universe=NULL)



----------Arguments----------

Argument: x
A character vector of unique Entrez Gene identifiers.


Argument: lib
The name of the annotation data package for the chip that was used, or "YEAST"; see Details for more information.


Argument: what
One of "MF", "BP", or "CC", indicating which of the GO categories to use for the computation.  In GOKEGGHyperG, what can also be "KEGG".


Argument: universe
A character vector of unique Entrez Gene identifiers, or NULL.  This is the population (the urn) of the Hypergeometric test.  When NULL (the default), the population is all Entrez Gene ids in the annotation package that have a GO term annotation in the specified GO category (see Details).


----------Details----------

The Entrez Gene ids given in x define the selected set of genes.  The universe of Entrez Gene ids is either determined by the chip annotation data package (lib) or specified by the universe argument, which must be a subset of the Entrez Gene ids represented on the chip.  Both the selected genes and the universe are reduced by removing Entrez Gene ids that do not have any annotations in the specified GO category.
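The reduction described above can be sketched in base R. This is only an illustration under stated assumptions: the GO term labels, the Entrez Gene ids, and every variable name below are made up, and the code is not taken from GOHyperG itself.

```r
# Toy GO term -> Entrez Gene id annotations for one GO category (invented data).
go2entrez <- list(
  "GO:A" = c("101", "102", "103"),
  "GO:B" = c("103", "104"),
  "GO:C" = c("105")
)
chip_ids <- c("101", "102", "103", "104", "105", "106", "107")  # ids on the chip
selected <- c("101", "103", "106")                              # plays the role of x

# Ids that carry at least one annotation in the category.
annotated <- unique(unlist(go2entrez))

# Both the universe and the selected set are reduced to annotated ids.
universe_reduced <- intersect(chip_ids, annotated)
selected_reduced <- intersect(selected, universe_reduced)

num_universe <- length(universe_reduced)  # analogous to the numLL component
num_selected <- length(selected_reduced)  # analogous to the numInt component
```

Ids "106" and "107" have no annotation in the toy category, so they drop out of both sets before any counting is done.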

For each GO term in the specified category that has at least one annotation in the selected gene set (x), we determine how many of its Entrez Gene annotations are in the universe set and how many are in the selected set.  With these counts we perform a Hypergeometric test using phyper.  This is equivalent to using Fisher's exact test.
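As a rough illustration of the equivalence mentioned above, the sketch below computes a one-sided overrepresentation p-value for a single GO term with phyper and again with fisher.test. The four counts are invented, and the exact arguments GOHyperG passes to phyper are an assumption here, not something the help page states.

```r
# Invented counts for one GO term.
n_universe <- 100  # Entrez Gene ids in the (reduced) universe
n_term     <- 10   # universe ids annotated at the term
n_selected <- 20   # ids in the (reduced) selected set
n_overlap  <- 5    # selected ids annotated at the term

# P(X >= n_overlap). With lower.tail = FALSE phyper returns P(X > q),
# so one is subtracted from the observed overlap.
p_hyper <- phyper(n_overlap - 1, n_term, n_universe - n_term, n_selected,
                  lower.tail = FALSE)

# The same counts as a 2 x 2 table: rows = selected yes/no, columns = at term yes/no.
tab <- matrix(c(n_overlap,
                n_term - n_overlap,
                n_selected - n_overlap,
                n_universe - n_term - n_selected + n_overlap),
              nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

same <- isTRUE(all.equal(p_hyper, p_fisher))
```

The one-sided ("greater") alternative is what corresponds to testing for overrepresentation; a two-sided Fisher test would generally give a different p-value.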

It is important that the correct chip annotation data package be identified, because it determines the mapping from GO terms to Entrez Gene ids as well as the universe of Entrez Gene ids when the universe argument is omitted.

For S. cerevisiae, if the lib argument is set to "YEAST", comparisons and statistics are computed using common names and are with respect to all genes annotated in the S. cerevisiae genome, not with respect to any microarray chip.  This is not the right thing to do if you are working with a yeast microarray.


----------Value----------

The returned value is a list with components:


Component: pvalues
The ordered p-values.


Component: goCounts
The vector of counts of Entrez Gene ids from the universe at each node.


Component: intCounts
The vector of counts of the supplied Entrez Gene ids annotated at each GO term.


Component: numLL
The number of unique Entrez Gene ids in the universe that are mapped to some term in the specified GO category.


Component: numInt
The number of unique Entrez Gene ids in the selected gene set, x, that are mapped to some term in the specified GO category.


Component: chip
A string identifying the chip annotation data package used.


Component: intLLs
The input vector x.


Component: go2Affy
A list with one element for each GO term tested, containing the Affymetrix identifiers associated with that node, for the whole chip (not just the interesting genes).  This is the same as extracting the tested GO ids from the annotation package's GO2ALLPROBES environment.
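To make the shape of the returned list more concrete, here is a small sketch that works on a hand-built stand-in rather than a real GOHyperG result. The GO ids are placeholders, the numbers are invented, and the idea that the count and p-value vectors are named by GO id is an assumption; only the component names come from the list above.

```r
# Hand-built stand-in for a GOHyperG result (placeholder ids, invented numbers).
xx <- list(
  pvalues   = c("GO:0000001" = 0.002, "GO:0000002" = 0.030, "GO:0000003" = 0.400),
  goCounts  = c("GO:0000001" = 12,    "GO:0000002" = 40,    "GO:0000003" = 150),
  intCounts = c("GO:0000001" = 4,     "GO:0000002" = 3,     "GO:0000003" = 2),
  numLL     = 5000,
  numInt    = 25
)

# Terms below a chosen cutoff.
hits <- names(xx$pvalues)[xx$pvalues < 0.01]

# Observed count next to the count expected by chance:
# (universe ids at the term) * (selected set size) / (universe size).
report <- data.frame(
  term     = hits,
  observed = unname(xx$intCounts[hits]),
  expected = unname(xx$goCounts[hits] * xx$numInt / xx$numLL),
  p        = unname(xx$pvalues[hits])
)
```

Comparing observed with expected counts in this way is a reading aid only; the p-values themselves come from the Hypergeometric test.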


----------Note----------

Typically, one has a set of interesting genes/probes obtained from a microarray experiment and is interested in determining whether these genes are overrepresented at particular GO terms. GOHyperG carries out simple Hypergeometric tests to assess the overrepresentation of GO terms.

Two substantial issues arise.  First, it is not clear how to do any form of p-value correction.  The tests are not independent, and the underlying structure of the GO graph presents certain problems that need to be addressed.  The second substantial issue is that not all probes on a microarray map to a unique Entrez Gene identifier.  In GOHyperG, every attempt has been made to correct appropriately for non-uniqueness of mappings.
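The second issue is the reason x is documented as a vector of unique Entrez Gene identifiers. A minimal sketch of collapsing a probe list to unique gene ids before testing is shown below; the probe names and the mapping are invented for illustration, and this is not a description of what GOHyperG does internally.

```r
# Invented probe -> Entrez Gene id mapping: two probes map to the same gene,
# and one probe has no mapping at all.
probe2entrez <- c(p1 = "7157", p2 = "7157", p3 = "672", p4 = NA, p5 = "1956")
interesting_probes <- c("p1", "p2", "p4", "p5")

# Look up the ids, drop unmapped probes, and keep each gene only once.
ids <- probe2entrez[interesting_probes]
gene_ids <- unique(unname(ids[!is.na(ids)]))
```

Four interesting probes reduce to two genes here, so counting probes instead of genes would overstate the size of the selected set.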


----------Author(s)----------


R. Gentleman



----------See Also----------

hyperGTest, geneKeggHyperGeoTest, phyper


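----------Examples----------


## Not run:
library("GOstats")      # provides the deprecated GOHyperG
library("hgu95av2.db")
library("GO.db")
## all Entrez Gene ids on the chip, without duplicates or unmapped (NA) entries
w1 <- as.list(hgu95av2ENTREZID)
w2 <- unique(unlist(w1))
w2 <- w2[!is.na(w2)]
set.seed(123)
## pick 25 interesting genes
myLL <- sample(w2, 25)
xx <- GOHyperG(myLL, lib="hgu95av2.db", what="CC")
xx$numLL                # universe size after reduction
xx$numInt               # selected set size after reduction
sum(xx$pvalues < 0.01)  # number of GO terms with p < 0.01

## End(Not run)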
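Because the page recommends hyperGTest in place of GOHyperG, a rough equivalent of the example above is sketched here. It assumes the Bioconductor packages GOstats and hgu95av2.db are installed and is skipped otherwise; the parameter values (cutoff 0.01, non-conditional test, overrepresentation) are choices made for this illustration, not settings prescribed by the source.

```r
# Only run when the Bioconductor packages are available.
have_pkgs <- requireNamespace("GOstats", quietly = TRUE) &&
  requireNamespace("hgu95av2.db", quietly = TRUE)

res <- NULL
if (have_pkgs) {
  suppressPackageStartupMessages({
    library("GOstats")
    library("hgu95av2.db")
  })

  # Entrez Gene ids for probes that actually have a mapping.
  mapped   <- mappedkeys(hgu95av2ENTREZID)
  universe <- unique(unlist(as.list(hgu95av2ENTREZID[mapped])))

  set.seed(123)
  myLL <- sample(universe, 25)  # 25 "interesting" genes, picked at random

  params <- new("GOHyperGParams",
                geneIds         = myLL,
                universeGeneIds = universe,
                annotation      = "hgu95av2.db",
                ontology        = "CC",
                pvalueCutoff    = 0.01,
                conditional     = FALSE,
                testDirection   = "over")
  res <- hyperGTest(params)

  # Number of tested GO terms with p < 0.01.
  print(sum(pvalues(res) < 0.01))
}
```

The result is an object with accessor functions such as pvalues() rather than a plain list, so code that indexed the old list components needs to be adapted.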

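Please credit the source when reposting: 生物统计家园网 (http://www.biostatistic.net).


Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the 生物统计家园网 robot. It is intended only as a personal reference for learning R; 生物统计家园 retains the copyright.
Note 2: Because the translation was produced automatically, some inaccuracies are unavoidable; compare it carefully against the English original when in doubt.
Note 3: If you find an inaccuracy, please reply to this post and we will revise the document over time.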