compareGOProfiles(goProfiles)
compareGOProfiles() belongs to R package: goProfiles
Comparison of lists of genes through their functional profiles
Translator: 生物统计家园网 (biostatistic.net) robot LoveR
----------Description----------
Compare two samples of genes in terms of their GO profiles pn and qm. Both samples may share a common subsample of genes, with GO profile pqn0. 'compareGOProfiles' implements some inferential procedures based on asymptotic properties of the squared Euclidean distance between the contracted versions of pn and qm.
----------Usage----------
compareGOProfiles(pn, qm = NULL, pqn0 = NULL, n = ngenes(pn), m = ngenes(qm), n0 = ngenes(pqn0), method = "lcombChisq",
ab.approx = "asymptotic", confidence = 0.95, nsims = 10000, simplify = T, ...)
----------Arguments----------
Argument: pn
an object of class ExpandedGOProfile representing one or more "sample" expanded GO profiles for a fixed ontology (see the 'Details' section)
Argument: qm
an object of class ExpandedGOProfile representing one or more "sample" expanded GO profiles for a fixed ontology (see the 'Details' section)
Argument: pqn0
an object of class ExpandedGOProfile representing one or more "sample" expanded GO profiles for a fixed ontology (see the 'Details' section)
Argument: n
a numeric vector with the number of genes profiled in each column of pn. This parameter makes it possible to explore the consequences of sample sizes other than the true sample size in pn.
Argument: m
a numeric vector with the number of genes profiled in each column of qm.
Argument: n0
a numeric vector with the number of genes profiled in each column of pqn0.
Argument: method
the method used to approximate the sampling distribution under the null hypothesis that the samples pn and qm come from the same population. See the 'Details' section below.
Argument: confidence
the confidence level of the confidence interval in the result
Argument: ab.approx
the approximation used for computing the 'a' and 'b' coefficients (see the 'Details' section)
Argument: nsims
some inferential methods require a simulation step; this parameter specifies the number of simulation replicates
Argument: simplify
should the result be simplified, if possible? See the 'Value' section.
Argument: ...
other arguments needed
----------Details----------
An object of S3 class 'ExpandedGOProfile' is, essentially, a 'data.frame' object in which each column holds the relative frequencies of all observed node combinations that result from profiling a set of genes for a given, fixed ontology. The row.names attribute encodes the node combinations, and each data.frame column (that is, each profile) has an attribute, 'ngenes', indicating the number of profiled genes. The arguments 'pn', 'qm' and 'pqn0' are compared column by column, recycling columns if necessary, so that max(ncol(pn),ncol(qm),ncol(pqn0)) comparisons are performed (each comparison resulting in an object of class 'GOProfileHtest', a specialization of 'htest'). To be properly compared, these arguments are expanded by row, according to their row names. That is, the data arguments may have different numbers of rows; rows with zero frequencies are added where needed to make them comparable.
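The row expansion and column recycling described above can be sketched in base R. This is only an illustration of the idea on plain data frames, not the package's internal code; the row names ("comb1", ...), the column names and the helper expandRows() are all hypothetical.

```r
# Two hypothetical expanded profiles: rows are node combinations,
# each column is one profile of relative frequencies.
pn <- data.frame(sampleA = c(0.5, 0.3, 0.2),
                 row.names = c("comb1", "comb2", "comb3"))
qm <- data.frame(sampleB = c(0.6, 0.4),
                 row.names = c("comb2", "comb4"))

# Hypothetical helper: expand a profile to a common set of row names,
# filling node combinations that were not observed with zero frequency.
expandRows <- function(prof, allNames) {
  out <- matrix(0, nrow = length(allNames), ncol = ncol(prof),
                dimnames = list(allNames, colnames(prof)))
  out[rownames(prof), ] <- as.matrix(prof)
  as.data.frame(out)
}

allNames <- sort(union(rownames(pn), rownames(qm)))
pnExp <- expandRows(pn, allNames)
qmExp <- expandRows(qm, allNames)

# Column recycling: pair up columns so that max(ncol) comparisons are made.
nComp <- max(ncol(pnExp), ncol(qmExp))
pairs <- data.frame(pnCol = rep(seq_len(ncol(pnExp)), length.out = nComp),
                    qmCol = rep(seq_len(ncol(qmExp)), length.out = nComp))
```

After the expansion both data frames share the same row names in the same order, so their columns can be compared element by element.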
In the i-th comparison (i from 1 to max(ncol(pn),ncol(qm),ncol(pqn0))), the parameters n, m and n0 make it possible to explore the consequences of sample sizes other than the true sample sizes stored as an attribute in pn, qm and pqn0.
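One way to see why the sample sizes matter: the test statistic discussed in this section scales the squared Euclidean distance by nm/(n+m). The base R sketch below tabulates that factor for a few sample sizes. The distance value and the sizes are hypothetical numbers chosen for illustration, and the code does not call the package.

```r
# Hypothetical squared Euclidean distance between two contracted profiles.
d <- 0.01

# Hypothetical sample sizes to explore (equal sizes, n = m).
sizes <- c(50, 100, 500, 1000)

# Scaling factor nm/(n+m) applied to the distance.
scaling <- sizes * sizes / (sizes + sizes)

# The same distance yields a larger scaled statistic as the samples grow.
data.frame(n = sizes, m = sizes, scaling = scaling,
           scaledDistance = scaling * d)
```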
When qm = NULL, the genes profiled in pn are compared with a subsample of them, namely those profiled in pqn0 (for example, to compare a set of genes with a restricted subset, such as those overexpressed under a disease). In this case qm is taken to be pqn0. When pqn0 = NULL, two profiles with no genes in common are compared.
Let Pn and Qm be the contracted functional profiles (the total counts or relative frequencies of hits in each one of the s GO categories being compared) obtained from pn and qm. Let P stand for the "population" profile originating the sample profile Pn[,j], Q for the profile originating Qm[,j], and d(,) for the squared Euclidean distance.
If P != Q, the distribution of sqrt(nm/(n+m))(d(Pn[,j],Qm[,j]) - d(P,Q))/se(d) is approximately standard normal, N(0,1). This provides the basis for the confidence interval in the result field conf.int.
When P = Q, the asymptotic distribution of (nm/(n+m)) d(Pn[,j],Qm[,j]) is that of a linear combination of independent chi-square random variables, each one with one degree of freedom. The sampling distribution under H0: P = Q may be computed directly from this distribution, approximating it by simulation (method="lcombChisq"), or by a chi-square approximation to it, based on two correcting constants a and b (method="chi-square"). These constants are chosen to equate the first two moments of both distributions (the distribution of the linear combination of chi-square random variables and the approximating chi-square distribution).
When method="chi-square", the returned test statistic value is the chi-square approximation (n d(pn[,j],qm[,j]) - b) / a, and the result field 'parameter' is a vector containing the 'a' and 'b' values and the number of degrees of freedom, 'df'. Otherwise, the returned test statistic value is (nm/(n+m)) d(Pn[,j],Qm[,j]) and 'parameter' contains the coefficients of the linear combination of chi-squares.
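The two approaches to the null distribution can be illustrated with a small base R sketch. It does not reproduce the package's computations: the profiles, the sample sizes and the coefficients lambda are hypothetical, the number of degrees of freedom df is an assumption made here only so that the two moment equations determine a and b, and the correction is applied to the scaled statistic (nm/(n+m)) d.

```r
set.seed(1)  # make the simulation reproducible

# Hypothetical contracted profiles over s = 4 GO categories.
Pn <- c(0.40, 0.30, 0.20, 0.10)
Qm <- c(0.35, 0.30, 0.25, 0.10)
n <- 100; m <- 120  # hypothetical numbers of profiled genes

# Squared Euclidean distance and scaled statistic (nm/(n+m)) d(Pn, Qm).
d <- sum((Pn - Qm)^2)
stat <- (n * m / (n + m)) * d

# Hypothetical coefficients of the linear combination of chi-squares.
lambda <- c(0.30, 0.20, 0.10)

# Simulation approach: draw sum(lambda_i * X_i) with X_i independent
# chi-square variables with 1 degree of freedom, then take the proportion
# of simulated values at least as large as the observed statistic.
nsims <- 10000
draws <- matrix(rchisq(nsims * length(lambda), df = 1), nrow = nsims)
sims <- as.vector(draws %*% lambda)
pSim <- mean(sims >= stat)

# Moment matching approach: choose a and b so that a * chisq(df) + b has
# the same mean and variance as the linear combination.
#   mean:     a * df + b     = sum(lambda)
#   variance: 2 * a^2 * df   = 2 * sum(lambda^2)
df <- length(lambda)  # assumption made for this sketch
a <- sqrt(sum(lambda^2) / df)
b <- sum(lambda) - a * df
statChi <- (stat - b) / a
pChi <- pchisq(statChi, df = df, lower.tail = FALSE)

c(simulation = pSim, chiSquare = pChi)
```

With few coefficients the two p-values can differ noticeably; the simulation targets the stated asymptotic distribution directly, at the cost of nsims random draws per coefficient.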
----------Value----------
A list containing max(ncol(pn),ncol(qm),ncol(pqn0)) objects of class 'GOProfileHtest', directly inheriting from 'htest', or a single 'GOProfileHtest' object if max(ncol(pn),ncol(qm),ncol(pqn0))==1 and simplify == T. Each object of class 'GOProfileHtest' has the following fields:
Field: profilePn
the first contracted profile used to compute the squared Euclidean distance
Field: profileQm
the second contracted profile used to compute the squared Euclidean distance
Field: statistic
test statistic; its meaning depends on the value of "method", see the 'Details' section.
Field: parameter
parameters of the sampling distribution of the test statistic, see the 'Details' section.
Field: p.value
p-value associated with the test of the null hypothesis of equality of profiles.
Field: conf.int
asymptotic confidence interval for the squared Euclidean distance. Its attribute "conf.level" contains its nominal confidence level.
Field: estimate
squared Euclidean distance between the contracted profiles. Its attribute "se" contains its standard error estimate.
Field: method
a character string indicating the method used to perform the test.
Field: data.name
a character string giving the names of the data.
Field: alternative
a character string describing the alternative hypothesis (always 'true squared Euclidean distance between the contracted profiles is greater than zero').
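As an illustration of how an interval like conf.int follows from the normal approximation in the 'Details' section, the base R sketch below inverts the pivot sqrt(nm/(n+m))(d - d(P,Q))/se(d). All numbers are hypothetical, and whether the "se" attribute returned by the package already includes the sqrt(nm/(n+m)) scaling is not stated in this page, so treat this as a sketch of the reasoning rather than of the package's output.

```r
# Hypothetical inputs.
dHat <- 0.005        # estimated squared Euclidean distance
seD <- 0.02          # se(d) as it appears in the pivot
n <- 100; m <- 120   # numbers of profiled genes
confidence <- 0.95

# Two-sided normal quantile for the requested confidence level.
z <- qnorm(1 - (1 - confidence) / 2)

# Invert the pivot: d(P,Q) lies within dHat +/- z * se(d) / sqrt(nm/(n+m)).
halfWidth <- z * seD / sqrt(n * m / (n + m))
confInt <- c(dHat - halfWidth, dHat + halfWidth)
attr(confInt, "conf.level") <- confidence
confInt
```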
----------Author(s)----------
Jordi Ocana
----------References----------
----------See Also----------
fitGOProfile, equivalentGOProfiles
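----------Examples----------
library(goProfiles)  # provides expandedProfile() and compareGOProfiles()
data(prostateIds)    # loads the Entrez ID vectors used below
# Expanded molecular function (MF) profiles at level 2 for the first 100 genes of each list
expandedWelsh <- expandedProfile(welsh01EntrezIDs[1:100], onto = "MF",
                                 level = 2, orgPackage = "org.Hs.eg.db")
expandedSingh <- expandedProfile(singh01EntrezIDs[1:100], onto = "MF",
                                 level = 2, orgPackage = "org.Hs.eg.db")
# Genes shared by both samples, and their expanded profile
commonGenes <- intersect(welsh01EntrezIDs[1:100], singh01EntrezIDs[1:100])
commonExpanded <- expandedProfile(commonGenes, onto = "MF", level = 2,
                                  orgPackage = "org.Hs.eg.db")
# Compare both profiles, accounting for the common subsample
comparedMF <- compareGOProfiles(pn = expandedWelsh,
                                qm = expandedSingh,
                                pqn0 = commonExpanded)
print(comparedMF)
# print(compSummary(comparedMF))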
When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).