sigora-package(sigora)
sigora-package() belongs to R package: sigora
Signature Overrepresentation Analysis
Translator: 生物统计家园网, robot LoveR
Description
This section gives a brief overview of the most important functions of the pathway analysis package SIGORA. This documentation uses the terminology described in our manuscript submitted to Bioinformatics, 2012. In short: a GPS (gene pair signature) is a (weighted) pair of genes that, as a combination, occurs in only a single pathway within a pathway repository. A query list is a vector containing a gene list of interest (e.g. genes that are differentially expressed in a particular condition). A present GPS is a GPS for which both component genes are in the query list. SIGORA identifies relevant pathways by over-representation analysis of their (present) GPS.
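To make the terminology concrete, the following base-R sketch shows what "present GPS" means on a toy example. The table `gps`, its gene names, pathway names and weights are all invented for illustration; this is not SIGORA's internal data structure or code, only one way to picture the definition given above.

```r
# Toy gene pair signatures (hypothetical genes, pathways and weights).
gps <- data.frame(
  gene1   = c("A", "A", "C", "E"),
  gene2   = c("B", "D", "D", "F"),
  pathway = c("P1", "P2", "P2", "P3"),
  weight  = c(1.0, 0.5, 1.0, 1.0),
  stringsAsFactors = FALSE
)

# Query list: genes of interest, e.g. differentially expressed genes.
query <- c("A", "B", "C", "D")

# A GPS is present when both of its component genes are in the query list.
present <- gps[gps$gene1 %in% query & gps$gene2 %in% query, ]

# Number of present GPS and summed weight per pathway.
n_present  <- table(present$pathway)
sum_weight <- tapply(present$weight, present$pathway, sum)

print(present)
print(sum_weight)
```

In this toy case the pair (E, F) is not present because neither gene is in the query list, so pathway P3 contributes no evidence.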
Details
To install from CRAN:
> install.packages('sigora')
As an alternative, you can download the tarball and install from the local file:
> install.packages('sigora_0.9.2.tar.gz', type='source', repos=NULL)
To load the library:
> library('sigora')
Now it would be a good idea to look at a man page.
> ?sigs
This shows the man page of sigs, SIGORA's most important function.
You will notice the following entry for the parameter samplename:
"A user specified list of genes of interest ('query list'), as a vector of SIGORA IDs. To obtain such a vector, import the list of ENSEMBL/Entrez IDs using scan(..,what='character') and apply the appropriate mapping_function (entrez_converter or ens_converter) to it."
We will get to this shortly, but let's try the example from the 'example' section of ?sigs first.
A few datasets are pre-loaded. For example:
dengue_hoang is a Dengue fever dataset (significantly up-regulated genes from GSE25001, Hoang et al. 2010),
rcc_lenburg is a Renal Cell Carcinoma dataset (differentially expressed genes in GSE781, Lenburg et al. 2003).
You could run the following:
> sigs(dengue_hoang, 'k', 1, level=2)
This runs Signature Over-representation Analysis on the Dengue fever dataset using KEGG_signatures ('k'), includes marker genes, and examines two levels of the hierarchy in the repository (see manuscript). Once sigs finishes, you are presented with the list of pathways that it deems statistically significant for this dataset, along with a few lines on how to access the results and what you might want to do next with them.
The output of the above command should end with something like:
You also have access to all results and calculations:
> ls()
[1] "detailed_results" "summary_results" "padj_number"
detailed_results contains all present GPS (i.e. all gene pair signatures for which both component genes were in the input list).
summary_results gives an overview of the over-representation analysis of the GPS: the number of present GPS per pathway, the sum of their weights, the associated p-values and adjusted p-values, and the parameters of the hypergeometric function that produces the p-values.
padj_number is the number of tests and is used to adjust the p-values for multiple testing. By default, it is set to the number of pathways in the repository for which a possible GPS exists at the specified level of the hierarchy.
> head(summary_results)
> write.table(summary_results, 'myresults.txt')
> ?det_out
> det_out('some_file.txt')
> ?related_genes
related_genes gives the list of genes involved in all present GPS per pathway.
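As a rough illustration of how a hypergeometric p-value and a test count such as padj_number fit together, here is a minimal base-R sketch. All counts are invented, the GPS weights are ignored, and the Bonferroni-style adjustment is an assumption: the text above only says that padj_number is used for multiple-testing adjustment, not which method SIGORA applies.

```r
# Hypothetical counts for one pathway (not taken from any real repository).
total_gps     <- 1000  # all possible GPS at the chosen hierarchy level
pathway_gps   <- 40    # GPS belonging to the pathway of interest
present_total <- 50    # present GPS across all pathways
present_in_pw <- 8     # present GPS belonging to the pathway of interest

# Upper-tail hypergeometric probability of observing at least
# present_in_pw pathway GPS among present_total draws.
p_value <- phyper(present_in_pw - 1,
                  m = pathway_gps,
                  n = total_gps - pathway_gps,
                  k = present_total,
                  lower.tail = FALSE)

# Adjust for multiple testing with an assumed number of tests
# (padj_number_toy stands in for padj_number; Bonferroni is an assumption).
padj_number_toy <- 200
p_adjusted <- p.adjust(p_value, method = "bonferroni", n = padj_number_toy)

cat("p-value:", p_value, " adjusted:", p_adjusted, "\n")
```

Passing the test count through the `n` argument of `p.adjust` mirrors the idea that the correction depends on how many pathways could have been tested, not only on how many appear in the results table.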
As mentioned above, to examine your own datasets you first need to convert the identifiers of your genes of interest to SIGORA IDs. This can be done with ens_converter, which converts human or mouse ENSEMBL IDs to SIGORA IDs, or with entrez_converter, which converts human or mouse Entrez IDs to SIGORA IDs.
As a last resort, genename_converter can be used to map gene names to SIGORA IDs, but whenever possible you should use ENSEMBL IDs.
idb_to_genename and idb_converter can be used to reverse these mappings.
So, if you have a file with human ENSEMBL IDs, you would run:
> tmp <- scan('test_ensemble_data.txt', what='character')
> mysample <- ens_converter(tmp)
## It would be a good idea to make sure that the mapping went OK:
> length(mysample)
## Now you can proceed with sigs as described above.
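The reason for checking the result of an identifier mapping can be seen in a small base-R sketch. The lookup table, the identifiers and the temporary file below are hypothetical stand-ins; the sketch does not call ens_converter and makes no claim about how SIGORA treats identifiers it cannot map.

```r
# Hypothetical lookup table from ENSEMBL-style IDs to internal integer IDs.
lookup <- c(ENSG00000000001 = 1L, ENSG00000000002 = 2L, ENSG00000000003 = 3L)

# Write a small example input file, one identifier per line.
infile <- tempfile(fileext = ".txt")
writeLines(c("ENSG00000000001", "ENSG00000000003", "ENSG99999999999"), infile)

# Read the identifiers as character strings.
ids <- scan(infile, what = "character", quiet = TRUE)

# Map the identifiers; IDs missing from the lookup table become NA.
mapped   <- unname(lookup[ids])
unmapped <- ids[is.na(mapped)]
mapped   <- mapped[!is.na(mapped)]

# Compare input and output lengths to see how many IDs were lost.
cat(length(ids), "read,", length(mapped), "mapped,",
    length(unmapped), "unmapped\n")
```

Comparing the number of identifiers read with the number successfully mapped is a quick way to notice a wrong ID type or species before running the analysis.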
Author(s)
Amir Foroushani
Maintainer: <amir.foroushani@teagasc.ie>
Examples
## Not run:
### sigs is the most important function of sigora.
## A few datasets are already preloaded, for example dengue_hoang is a
## list of differentially up-regulated genes in a Dengue fever dataset.
sigs(dengue_hoang,'k',1,level=2)
### There are various converters for mapping different gene identifier types to
## SIGORA IDs.
tmp<-scan('test_ensemble_data.txt', what='character')
mysample<-ens_converter(tmp)
sigs(mysample,'k',1,level=2)
head(summary_results)
det_out("detailed_evidence.txt")
related_genes()
### Returns list of human marker genes in KEGG as SIGORA IDs:
tmp<-get_marker_genes(archive='keg')
# convert the above results to human readable gene names and ENSEMBL IDs
idb_converter(tmp)
## End(Not run)