R language, edgeR package: help documentation for the bin.dispersion() function

loveR · posted on 2012-2-25 17:03:25

bin.dispersion(edgeR)
R package containing bin.dispersion(): edgeR

Estimate Common Dispersion for Negative Binomial GLMs in Bins of Genes Sorted by Overall Abundance

Translator: LoveR, the 生物统计家园网 robot

----------Description----------

Estimates the common dispersion parameter for each of a number of bins of data for a DGE dataset. Genes are sorted into bins based on overall expression level. For multiple-group (one-way layout) experimental designs, conditional maximum likelihood (CML) methods can be used. For general experimental designs, the binned common dispersions can be estimated using Cox-Reid approximate conditional inference, or the Pearson or deviance estimators, for a negative binomial generalized linear model.

----------Usage----------

binCMLDispersion(y, nbins=50)
binGLMDispersion(y, design, min.n=500, offset=NULL, method="CoxReid", ...)

----------Arguments----------

Argument: y
an object that contains the raw counts for each library (the measure of expression level); it can either be a matrix of counts, or a DGEList object with (at least) the elements counts (table of unadjusted counts) and samples (data frame containing information about experimental group, library size and normalization factor for the library size).

Argument: nbins
scalar, the number of bins for which to compute common dispersions. Default is 50 bins.

Argument: design
numeric matrix giving the design matrix for the GLM that is to be fitted.

Argument: min.n
scalar, the minimum number of genes to be included in each bin. Default is 500.

Argument: offset
(optional) numeric scalar, vector or matrix giving the offset (in addition to the log of the effective library size) that is to be included in the NB GLM for the transcripts. If a scalar, then this value will be used as an offset for all transcripts and libraries. If a vector, it should have length equal to the number of libraries, and the same vector of offsets will be used for each transcript. If a matrix, then each library for each transcript can have a unique offset, if desired. Default is NULL. If NULL, then the offset is log(lib.size) if y is a matrix, or log(y$samples$lib.size * y$samples$norm.factors) if y is a DGEList object.
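
The three accepted shapes of offset can be illustrated with a small base-R sketch. The helper expand_offset below is hypothetical and is not part of edgeR; it only shows how a scalar, a per-library vector or a full matrix could each be turned into a genes-by-libraries matrix, and how the default value described above is computed. The library sizes and normalization factors are made-up values.

```r
# Hypothetical helper (NOT an edgeR function): expand an offset given as a
# scalar, a per-library vector, or a matrix to a genes x libraries matrix.
expand_offset <- function(offset, ngenes, nlibs) {
  if (is.matrix(offset)) {
    # A matrix must already have one entry per gene and library
    stopifnot(nrow(offset) == ngenes, ncol(offset) == nlibs)
    return(offset)
  }
  if (length(offset) == 1L) {
    # A scalar is used for every gene and every library
    return(matrix(offset, nrow = ngenes, ncol = nlibs))
  }
  # A vector must have one entry per library; byrow = TRUE repeats it for each gene
  stopifnot(length(offset) == nlibs)
  matrix(offset, nrow = ngenes, ncol = nlibs, byrow = TRUE)
}

# Default offset for a DGEList-like object: log of the effective library size
lib.size     <- c(1000, 1001, 1002, 1003)   # made-up library sizes
norm.factors <- c(1, 1, 1, 1)               # made-up normalization factors
default.offset <- log(lib.size * norm.factors)

off <- expand_offset(default.offset, ngenes = 3, nlibs = 4)
dim(off)
```

Every row of off holds the same four per-library offsets, which matches the description "the same vector of offsets will be used for each transcript".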

Argument: method
method used to estimate the dispersion. This argument is passed to estimateGLMCommonDisp, which calls the functions that do the computations. Possible values are "CoxReid", "Pearson" or "deviance".

Argument: ...
other arguments are passed to lower-level functions.

----------Details----------

binCMLDispersion estimates the common dispersion in each bin by conditional maximum likelihood (estimateCommonDisp). binGLMDispersion uses one of the Cox-Reid approximate conditional inference (dispCoxReid), deviance (dispDeviance) or Pearson (dispPearson) estimators. Genes are assigned to bins using the cutWithMinN function, which spreads the bins over the abundance range of the genes while ensuring that each bin contains a minimum number of genes, thus permitting reliable estimation of the common dispersion for each bin.
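
The idea of binning genes by abundance with a guaranteed minimum bin size can be sketched in base R. This is a simplified equal-count binning written for illustration only; it is not the cutWithMinN algorithm, and the abundance values are simulated.

```r
set.seed(1)
# Simulated per-gene abundances (assumed to be on a log scale)
abundance <- rnorm(250, mean = 5, sd = 2)

min.n <- 10
# Largest number of bins that still leaves at least min.n genes per bin
nbins <- floor(length(abundance) / min.n)

# Rank the genes by abundance, then split the ranks into nbins equal-count groups
bin <- ceiling(rank(abundance, ties.method = "first") * nbins / length(abundance))

# Number of genes per bin; none is smaller than min.n
table(bin)
```

Equal-count bins guarantee the minimum size, but they can be very narrow where genes are dense and very wide in the tails of the abundance range.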

----------Value----------

Returns a list with two components:

Component: dispersion
numeric vector providing the common dispersion for each bin

Component: abundance
numeric vector providing the average abundance (expression level) of genes in each bin
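
To make the shape of the returned list concrete, here is a base-R sketch that builds a list with the same two components. It swaps in a simple method-of-moments estimator (from the negative binomial variance, var = mu + phi * mu^2) in place of the CML or Cox-Reid estimators, ignores library sizes and the design matrix, and uses the mean log count as its abundance scale. All of these are assumptions made for illustration, not what edgeR computes.

```r
set.seed(42)
# Simulated counts: 1000 genes x 4 libraries, simulated dispersion 1/size = 0.1
y <- matrix(rnbinom(4000, mu = 10, size = 10), ncol = 4)

gene.mean <- rowMeans(y)
gene.var  <- apply(y, 1, var)

# Equal-count bins of genes sorted by mean count (simplified binning)
nbins <- 5
bin <- ceiling(rank(gene.mean, ties.method = "first") * nbins / nrow(y))

# Moment estimator per bin: phi = sum(var - mu) / sum(mu^2), truncated at zero
dispersion <- sapply(split(seq_len(nrow(y)), bin), function(i) {
  max(0, sum(gene.var[i] - gene.mean[i]) / sum(gene.mean[i]^2))
})

# Average abundance per bin, here the mean of log(mean count + 0.5)
abundance <- sapply(split(log(gene.mean + 0.5), bin), mean)

result <- list(dispersion = unname(dispersion), abundance = unname(abundance))
str(result)
```

Both components have one entry per bin, so plot(result$abundance, result$dispersion) gives a quick view of how the estimated dispersion changes with abundance.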

----------Author(s)----------

Gordon Smyth, Davis McCarthy

----------See Also----------

estimateGLMCommonDisp, dispCoxReid, dispPearson, dispDeviance

----------Examples----------

library(edgeR)  # provides DGEList, binCMLDispersion and binGLMDispersion
y <- matrix(rnbinom(1000, mu=10, size=10), ncol=4)  # simulated counts: 250 genes x 4 libraries
d <- DGEList(counts=y, group=c(1,1,2,2), lib.size=1000:1003)
design <- model.matrix(~group, data=d$samples) # Define the design matrix for the full model
bindisp.CML <- binCMLDispersion(d, nbins=50)
bindisp.GLM <- binGLMDispersion(d, design, min.n=10)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

