R goProfiles package: help documentation for the fitGOProfile() function

loveR · posted 2012-2-25 20:49:41

fitGOProfile(goProfiles)
fitGOProfile() belongs to the R package: goProfiles

Does a "sample" GO profile 'pn', observed in a sample of 'n' genes, fit a "population" or "model" profile 'p0'?

Translator: LoveR, the biostatistic.net robot

----------Description----------

'fitGOProfile' implements some inferential procedures to answer the preceding question. These procedures are based on asymptotic properties of the squared Euclidean distance between the contracted versions of pn and p0.

----------Usage----------

fitGOProfile(pn, p0, n = ngenes(pn), method = "lcombChisq", ab.approx = "asymptotic", confidence = 0.95, nsims = 10000, simplify = T)

----------Arguments----------

Argument: pn
an object of class ExpandedGOProfile representing one or more "sample" expanded GO profiles for a fixed ontology (see the 'Details' section).

Argument: p0
an object of class ExpandedGOProfile representing one or more "population" or "theoretical" expanded GO profiles (see also the 'Details' section).

Argument: n
a numeric vector with the number of genes profiled in each column of pn. It defaults to the true sample sizes, ngenes(pn), and is exposed so that the consequences of other sample sizes can be explored.

Argument: method
the method used to approximate the sampling distribution under the null hypothesis "p = p0", where p is the 'true' population profile that gave rise to each column of pn. See the 'Details' section below.

Argument: ab.approx
the method used to compute the constants 'a' and 'b' described in the paper (see References). See the 'Details' section.

Argument: confidence
the confidence level of the confidence interval in the result.

Argument: nsims
some inferential methods require a simulation step; this parameter sets the number of simulation replicates.

Argument: simplify
should the result be simplified, if possible? See the 'Value' section.

----------Details----------

An object of class 'ExpandedGOProfile' is, essentially, a 'data.frame' object in which each column holds the relative frequencies of all observed node combinations that result from profiling a set of genes for a given, fixed ontology. The row.names attribute encodes the node combinations, and each data.frame column (that is, each profile) has an attribute, 'ngenes', giving the number of profiled genes. (The 'ngenes' attribute of each 'p0' column is in fact ignored and treated as if it were infinite, 'Inf'.) The arguments 'pn' and 'p0' are compared column by column, recycling columns if necessary, so that max(ncol(pn),ncol(p0)) comparisons are performed (each comparison results in an object of class 'htest'). The two arguments may have different numbers of rows. To make them comparable, 'pn' and 'p0' are expanded by row according to their row names, adding rows with zero frequencies where a node combination is missing.
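The row expansion and column recycling described above can be sketched in base R. This is only an illustration on plain data frames, not the package's internal code: the helper name `expand_rows`, the toy node labels and the frequencies are all made up for the example.

```r
# Hypothetical helper (not part of goProfiles): align a profile table on a
# common set of row names, filling missing node combinations with zeros.
expand_rows <- function(profile, all_rows) {
  out <- matrix(0, nrow = length(all_rows), ncol = ncol(profile),
                dimnames = list(all_rows, colnames(profile)))
  out[rownames(profile), ] <- as.matrix(profile)
  out
}

# Toy "sample" profile (one column) and "population" profiles (two columns).
pn <- data.frame(s1 = c(0.5, 0.3, 0.2),
                 row.names = c("GO:A", "GO:B", "GO:A.GO:B"))
p0 <- data.frame(m1 = c(0.6, 0.4), m2 = c(0.7, 0.3),
                 row.names = c("GO:A", "GO:C"))

# Expand both tables to the union of their row names.
all_rows <- union(rownames(pn), rownames(p0))
pn_x <- expand_rows(pn, all_rows)
p0_x <- expand_rows(p0, all_rows)

# Recycle columns so that max(ncol(pn), ncol(p0)) comparisons are formed.
n_comp <- max(ncol(pn), ncol(p0))
i_pn <- rep_len(seq_len(ncol(pn)), n_comp)  # column of pn used in each comparison
i_p0 <- rep_len(seq_len(ncol(p0)), n_comp)  # column of p0 used in each comparison

print(cbind(pn_x[, i_pn, drop = FALSE], p0_x[, i_p0, drop = FALSE]))
```

Here the single column of pn is reused for both comparisons, and each table gains zero rows for the node combinations it lacked.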

In the i-th comparison (i from 1 to max(ncol(pn),ncol(p0))), let p stand for the profile that gave rise to the sample profile pn[,i], and d(,) for the squared Euclidean distance. If p != p0[,i], the distribution of sqrt(n) (d(pn[,i],p0[,i]) - d(p,p0[,i])) / se is approximately standard normal, N(0,1). This provides the basis for the confidence interval in the result field conf.int. When p == p0[,i], the asymptotic distribution of n d(pn[,i],p0[,i]) is that of a linear combination of independent chi-square random variables, each with one degree of freedom. This sampling distribution may be computed directly (approximating it by simulation, method="lcombChisq") or approximated by a chi-square distribution, based on two correcting constants a and b (method="chi-square"). These constants are chosen to equate the first two moments of both distributions (the distribution of the linear combination of chi-square variables and the approximating chi-square distribution). When method="chi-square", the returned test statistic is the chi-square approximation (n d(pn,p0) - b) / a, and the result field 'parameter' is a vector containing the 'a' and 'b' values and the number of degrees of freedom, 'df'. Otherwise, the returned test statistic is n d(pn,p0) and 'parameter' contains the coefficients of the linear combination of chi-squares.
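The two approximations to the null distribution can be illustrated with a small base-R sketch. Everything numeric here is invented for the example: the coefficients `lambda`, the observed statistic `stat`, and in particular the choice of `df` as the number of coefficients, which is an assumption made only so that two moment equations determine `a` and `b`; the help page does not say how the package picks `df`.

```r
set.seed(1)

lambda <- c(0.5, 0.3, 0.2)  # made-up coefficients of the linear combination
stat   <- 2.4               # made-up observed value of n * d(pn, p0)
nsims  <- 10000

# Simulation route: draw from sum(lambda_i * chisq_1) and take the
# proportion of simulated values at least as large as the statistic.
draws <- matrix(rchisq(nsims * length(lambda), df = 1), nrow = nsims)
sims  <- as.vector(draws %*% lambda)
p_sim <- mean(sims >= stat)

# Chi-square route: approximate the same variable by a * chisq_df + b,
# matching mean (sum(lambda)) and variance (2 * sum(lambda^2)).
df <- length(lambda)                 # ASSUMPTION, see the text above
a  <- sqrt(sum(lambda^2) / df)
b  <- sum(lambda) - a * df
p_chi <- pchisq((stat - b) / a, df = df, lower.tail = FALSE)

print(c(simulated = p_sim, chi.square = p_chi))
```

With `df` fixed, the mean of `a * chisq_df + b` is `a * df + b` and its variance is `2 * a^2 * df`, which is where the two expressions for `a` and `b` come from.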

----------Value----------

A list containing max(ncol(pn),ncol(p0)) objects of class 'htest', or a single 'htest' object if ncol(pn) == 1, ncol(p0) == 1 and simplify == TRUE. Each 'htest' object has the following fields:

Field: statistic
test statistic; its meaning depends on the value of "method", see the 'Details' section.

Field: parameter
parameters of the sampling distribution of the test statistic, see the 'Details' section.

Field: p.value
p-value for testing the null hypothesis "pn[,i] is a random sample taken from p0[,i]".

Field: conf.int
asymptotic confidence interval for the squared Euclidean distance. Its attribute "conf.level" contains its nominal confidence level.

Field: estimate
squared Euclidean distance between the contracted pn and p0 profiles. Its attribute "se" contains its standard error estimate.

Field: method
a character string indicating the method used to perform the test.

Field: data.name
a character string giving the names of the data.

Field: alternative
a character string describing the alternative hypothesis.
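As an illustration of how the 'estimate' and 'conf.int' fields relate, the normal approximation from the 'Details' section suggests an interval of the form below. This is a sketch under assumptions, not the package's code: the function name `asymptotic_ci` and the input values are hypothetical, and it assumes that `se` is on the scale used in the Details formula (so that it must still be divided by sqrt(n)). Whether the "se" attribute returned by the package already includes that factor is not stated in this page.

```r
# Hypothetical helper: interval for d(p, p0) derived from
# sqrt(n) * (d_hat - d) / se being approximately N(0, 1).
asymptotic_ci <- function(d_hat, se, n, confidence = 0.95) {
  z    <- qnorm(1 - (1 - confidence) / 2)  # two-sided normal quantile
  half <- z * se / sqrt(n)                 # half-width of the interval
  ci   <- c(d_hat - half, d_hat + half)
  attr(ci, "conf.level") <- confidence     # mirrors the 'conf.level' attribute
  ci
}

# Made-up inputs, for illustration only.
ci <- asymptotic_ci(d_hat = 0.04, se = 0.15, n = 100)
print(ci)
```

Passing a different `n` shows the effect of the sample size on the interval width, which is the kind of exploration the 'n' argument is meant to allow.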

----------Author(s)----------

Jordi Ocana

----------References----------

Statistical methods for the analysis of high-throughput data based on functional profiles derived from the gene ontology. Journal of Statistical Planning and Inference, 2007.

----------See Also----------

compareGOProfiles

----------Examples----------

#data(sampleProfiles)
#comparedMF <- fitGOProfile(pn = expandedWelsh01[['MF']],
#                           p0 = expandedSingh01[['MF']])
#print(comparedMF)
#print(compSummary(comparedMF))

When reposting, please credit the source: biostatistic.net (http://www.biostatistic.net).
