getSignatures(RTools4TB)
getSignatures() is part of the R package RTools4TB.
A function to retrieve transcriptional signature IDs from the TranscriptomeBrowser database (TBrowserDB).
----------Description----------
This is one of the main functions of the RTools4TB package. It allows direct access to TBrowserDB (http://tagc.univ-mrs.fr/tbrowser). The getSignatures function can be used to retrieve transcriptional signatures (i) derived from a given experiment or microarray platform, (ii) containing a user-defined list of genes or probes (with or without a boolean query), or (iii) enriched in genes sharing a common annotation term (the user must provide a q-value).
See "Details" section for more information about the syntax.
----------Usage----------
getSignatures(field=c("gene", "probe", "platform", "experiment", "annotation"), value = NULL, qValue = NULL, nbMin = NULL, verbose = TRUE, save = FALSE)
----------Arguments----------
Argument: field
The request type. Should be one of: "gene", "probe", "platform", "experiment", "annotation".
Argument: value
Depends on the field argument:
If field is set to "gene", value must contain HUGO gene symbols (e.g., "CD4 CD3E CD3D"). Logical operators are supported (e.g., "CD4 & CD3E & CD3D"; see the "Details" section).
If field is set to "probe", value must contain a list of probe IDs (e.g., Affymetrix probe IDs). Logical operators are supported.
If field is set to "platform", value must contain one platform ID (e.g., "GPL96").
If field is set to "experiment", value must contain one experiment ID (e.g., "GSE2004").
If field is set to "annotation", value must contain a list of annotation terms separated by logical operators (e.g., "breast cancer" or "18q11.2|18q12.1|18q21.1|18q22-q23").
Argument: qValue
An integer, interpreted as an exponent (10E-"qValue"). Defaults to 0. This q-value is used to select signatures associated with a given annotation term (see the "Examples" section). Used only when field = "annotation".
Argument: nbMin
An integer. Used only when value is a gene list without logical operators. Only signatures containing at least nbMin genes from the list are retrieved (see the "Details" section).
Argument: verbose
If set to TRUE, the function runs verbosely.
Argument: save
If set to TRUE, data are stored on disk.
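The qValue argument is given as an integer exponent rather than as a probability. The short base-R sketch below shows what a few exponents would correspond to, assuming (this is an interpretation of the notation above, not something the package documentation states explicitly) that qValue = k means a q-value cutoff of 10^-k. It does not call RTools4TB.

```r
# Assumption: qValue = k corresponds to a q-value cutoff of 10^-k.
q_exponents <- c(0, 5, 10, 20)
cutoffs <- 10^(-q_exponents)

# Side-by-side view of the exponent and the cutoff it would imply.
cutoff_table <- data.frame(qValue = q_exponents, cutoff = cutoffs)
print(cutoff_table)
```

Under this reading, the default of 0 gives a cutoff of 1, that is, no filtering on the q-value.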
----------Details----------
The value argument to getSignatures may contain logical operators (see the help section of the TBrowser web site for more information: http://tagc.univ-mrs.fr/tbrowser).
& : AND
| : OR
! : NOT (used in conjunction with &)
However, when field = "gene" or field = "probe", the user can submit a request as a list of items separated by blanks (without logical operators). These blanks are interpreted as OR operators, so all signatures containing at least one gene of the list are returned. To select more informative signatures, we suggest using the nbMin argument, which selects signatures containing at least nbMin genes from the list.
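The difference between a plain blank-separated request and the same request with nbMin can be illustrated locally. The following is a minimal base-R sketch on hypothetical toy data (the signature names and gene contents are invented for illustration); it emulates the selection rule described above and does not query TBrowserDB.

```r
# Hypothetical toy data: each signature ID maps to the genes it contains.
toy_signatures <- list(
  TS_A = c("PCNA", "CDC2", "CDC6", "MCM2"),
  TS_B = c("PCNA", "ACTB"),
  TS_C = c("GAPDH", "ACTB"),
  TS_D = c("CDC2", "CDC6")
)

# Split a blank-separated request into individual gene symbols.
request <- "PCNA CDC2 CDC6"
genes <- strsplit(request, " +")[[1]]

# Count how many requested genes each toy signature contains.
hits <- vapply(toy_signatures, function(sig) sum(genes %in% sig), integer(1))

# Blanks act as OR: keep signatures with at least one requested gene.
or_match <- names(hits)[hits >= 1]

# nbMin = 2: keep only signatures with at least two requested genes.
nb_min <- 2
nbmin_match <- names(hits)[hits >= nb_min]

print(or_match)
print(nbmin_match)
```

In this toy case the OR reading keeps three signatures, while nbMin = 2 drops the one that shares only a single gene with the request.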
Moreover, the user may include logical operators in the request, which is a convenient way to create relevant queries. Suppose your field of interest is T-cell activation. You may want to retrieve all transcriptional signatures (TS) that contain the CD4 gene, as they should contain additional T-cell markers. Comparing these TS should help you define a set of frequent CD4 neighbors (very likely related to the TCR signaling cascade). Your request would then be:
res <- getSignatures(field="gene", value="CD4")
This gene is found in 371 TS (with the current database release), and obtaining the associated gene lists would be time-consuming and would not focus on what you are really looking for. Indeed, the CD4 marker is also expressed by macrophages. Another solution is to search for TS containing two T-cell markers (CD4 and CD3E, for instance) and to exclude (using the NOT operator) those containing the CD14 marker (a macrophage marker). The syntax is the following:
res <- getSignatures(field="gene", value="CD4 & CD3E & !CD14")
In the same way, you could try to exclude TS containing B-cell genes by discarding those containing the CD19 or IGHM marker. The resulting query would be the following:
res <- getSignatures(field="gene", value="CD4 & CD3E & !(CD19 | IGHM)")
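When marker lists are held in R vectors, the query strings shown above can be assembled programmatically instead of typed by hand. The sketch below uses a hypothetical helper, build_gene_query, which is not part of RTools4TB; it only uses base R string functions and reproduces the two exclusion queries from this section.

```r
# Hypothetical helper (not part of RTools4TB): builds a boolean query string
# requiring all genes in `include` and excluding any gene in `exclude`.
build_gene_query <- function(include, exclude = character(0)) {
  query <- paste(include, collapse = " & ")
  if (length(exclude) == 1) {
    # A single excluded gene needs no parentheses.
    query <- paste0(query, " & !", exclude)
  } else if (length(exclude) > 1) {
    # Several excluded genes are grouped with OR inside a negated block.
    query <- paste0(query, " & !(", paste(exclude, collapse = " | "), ")")
  }
  query
}

q1 <- build_gene_query(c("CD4", "CD3E"), "CD14")
q2 <- build_gene_query(c("CD4", "CD3E"), c("CD19", "IGHM"))
print(q1)
print(q2)
```

The resulting string would then be passed as the value argument, for example getSignatures(field="gene", value=q2).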
----------Value----------
This function returns a vector containing the names of the transcriptional signatures that satisfy the constraints. Additional information about these signatures (GEO platform ID, GEO experiment ID, organism, number of probes, number of genes, number of biological samples) can be obtained using the getTBInfo function (field = "signatureID").
----------Author(s)----------
Bergon A., Lopez F., Textoris J., Granjeaud S. and Puthier D.
----------References----------
flexible toolbox to explore productively the transcriptional landscape of the Gene Expression Omnibus database. PLoS ONE, 2008;3(12):e4001.
----------See Also----------
Other functions that query the TBrowser database: getTBInfo, getExpressionMatrix
----------Examples----------
## Not run:
# Retrieve transcriptional signatures containing PCNA, CDC2 and CDC6.
res <- getSignatures(field="gene", value="PCNA & CDC2 & CDC6")
# Retrieve transcriptional signatures containing at least two genes from the following list: PCNA, CDC2 and CDC6.
res <- getSignatures(field="gene", value="PCNA CDC2 CDC6", nbMin=2)
# Retrieve transcriptional signatures related to GSE2004.
gse2004TS <- getSignatures(field="experiment", value="GSE2004")
# Retrieve transcriptional signatures related to the platform GPL96.
gpl96TS <- getSignatures(field="platform", value="GPL96")
# Retrieve transcriptional signatures enriched in genes related to the keyword "HSA04110:CELL CYCLE" (KEGG_PATHWAY).
data(annotationList)
attach(annotationList)
table(TableName)
annotationList[Keyword=="HSA04110:CELL CYCLE",]
ccTS20 <- getSignatures(field="annotation", value="HSA04110:CELL CYCLE", qValue=20)
# Retrieve transcriptional signatures enriched in genes located in the 8q region.
query <- paste(grep("^8q", Keyword, value = TRUE), collapse = "|")
query
cc <- getSignatures(field = "annotation", value = query, qValue = 10)
## End(Not run)
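The 8q example builds its query by selecting keywords with grep and joining them with the OR operator. The sketch below reproduces that step offline on a hypothetical toy Keyword vector (a stand-in for the Keyword column of annotationList, whose real contents are not shown here), so the shape of the resulting query can be inspected without loading the package.

```r
# Hypothetical toy stand-in for the Keyword column of annotationList.
Keyword <- c("8q24.3", "8q22.1", "8p11.2", "18q11.2", "HSA04110:CELL CYCLE")

# Keep only keywords that start with "8q" (so "18q11.2" and "8p11.2" are dropped).
bands_8q <- grep("^8q", Keyword, value = TRUE)

# Join the selected cytobands with the OR operator.
query <- paste(bands_8q, collapse = "|")
print(query)
```

Anchoring the pattern with ^ matters here: without it, bands on chromosome 18 such as "18q11.2" would also be matched.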