R edgeR package: help documentation for the glmFit() function

loveR · posted 2012-2-25 17:08:03

glmFit(edgeR)
R package containing glmFit(): edgeR

Fit negative binomial generalized linear model for each transcript

Translator of the original Chinese edition: LoveR, the 生物统计家园网 (biostatistic.net) robot

----------Description----------

Fit a negative binomial generalized linear model (GLM) for each transcript (tag), given the unadjusted counts, a value for the dispersion parameter and, optionally, offsets and weights for different libraries or transcripts.

----------Usage----------

## S3 method for class 'DGEList'
glmFit(y, design=NULL, dispersion=NULL, offset=NULL, weights=NULL, lib.size=NULL, start=NULL, method="auto", ...)
glmLRT(y, glmfit, coef=ncol(glmfit$design), contrast=NULL)

----------Arguments----------

Argument: y
an object that contains the raw counts for each library (the measure of expression level): either a matrix of counts, or a DGEList object with (at least) the elements counts (table of unadjusted counts) and samples (data frame containing information about experimental group, library size and normalization factor for the library size).

Argument: design
numeric matrix giving the design matrix for the GLM that is to be fitted. Must be of full column rank. Defaults to a single column of ones, equivalent to treating the columns of the count matrix as replicate libraries.
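
As a rough illustration of the design argument, the sketch below builds a design matrix with base R's model.matrix() and checks that it has full column rank. The group factor and its levels are made up for the example and are not part of the original help page.

```r
# Hypothetical experiment: 2 conditions with 3 replicate libraries each
group <- factor(c("ctrl", "ctrl", "ctrl", "trt", "trt", "trt"))
design <- model.matrix(~group)

# Full column rank means the matrix rank equals the number of columns
is_full_rank <- qr(design)$rank == ncol(design)
print(is_full_rank)

# The documented default (a single column of ones) treats all
# libraries as replicates of one another
default_design <- matrix(1, nrow = length(group), ncol = 1)
```

A design that fails this check (for example, one with a duplicated column) would need a redundant column removed before it could satisfy the full-rank requirement.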

Argument: dispersion
numeric scalar or vector providing the value of the dispersion parameter used in fitting the GLM for each transcript. Can be a single value common to all tags, or a vector giving a separate dispersion value for each tag. If NULL (default), the dispersion is extracted from y, if possible, with order of precedence: tagwise dispersions, trended dispersions, common dispersion.
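
The order of precedence can be pictured with a small sketch. pick_dispersion() is a hypothetical helper, not edgeR's implementation, and the element names tagwise.dispersion and trended.dispersion are assumed by analogy with common.dispersion; a plain list stands in for a DGEList.

```r
# Hypothetical helper mimicking the documented order of precedence;
# this is NOT edgeR's actual code.
pick_dispersion <- function(y) {
  if (!is.null(y$tagwise.dispersion)) return(y$tagwise.dispersion)
  if (!is.null(y$trended.dispersion)) return(y$trended.dispersion)
  if (!is.null(y$common.dispersion))  return(y$common.dispersion)
  stop("no dispersion found in y")
}

# Plain list standing in for a DGEList; the values are illustrative only
y_toy <- list(common.dispersion = 0.1,
              trended.dispersion = c(0.12, 0.09, 0.11))

# No tagwise values are present, so the trended values are chosen
chosen <- pick_dispersion(y_toy)
```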

Argument: offset
numeric scalar, vector or matrix giving the offset to be included in the NB GLM for the transcripts. Only one of offset and lib.size should be supplied; if both are supplied then offset will be used and lib.size will be ignored. If a scalar, then this value will be used as the offset for all transcripts and libraries. If a vector, it should have length equal to the number of libraries, and the same vector of offsets will be used for each transcript. If a matrix, then each library for each transcript can have its own offset. If NULL (the default), then the log of the effective library size (library size multiplied by normalization factor) will be used as the offset in the GLMs.
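
The three accepted shapes of offset can be thought of as all expanding to one value per transcript and library. The sketch below shows that expansion in base R; expand_offset() is a hypothetical helper written for this illustration and is not a function from edgeR.

```r
# Hypothetical helper: expand an offset to a (tags x libraries) matrix
expand_offset <- function(offset, ntags, nlibs) {
  if (is.matrix(offset)) {
    # matrix: already one offset per tag and library
    stopifnot(nrow(offset) == ntags, ncol(offset) == nlibs)
    return(offset)
  }
  if (length(offset) == 1) {
    # scalar: the same offset for every tag and library
    return(matrix(offset, ntags, nlibs))
  }
  # vector: one offset per library, reused for every tag
  stopifnot(length(offset) == nlibs)
  matrix(offset, ntags, nlibs, byrow = TRUE)
}

lib_offsets <- log(c(1e6, 2e6, 1.5e6))  # illustrative library sizes
off <- expand_offset(lib_offsets, ntags = 4, nlibs = 3)
```

Each row of off is the same vector of per-library offsets, which matches the "same vector of offsets for each transcript" behaviour described for the vector case.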

Argument: weights
optional numeric matrix giving prior weights for the observations (for each library and transcript) to be used in the GLM calculations. Not supported by methods "linesearch" or "levenberg".

Argument: lib.size
optional numeric vector providing the (effective) library size for each library (must have length equal to the number of columns, i.e. libraries, in the matrix of counts). If NULL, a default is used. If y is a DGEList object then the default for lib.size is the product of the library sizes and the normalization factors (in the samples element of the object). If y is a simple matrix of counts, then the default for lib.size is the vector of column sums of y.
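
The defaults for lib.size, and the default offset derived from it, can be reproduced by hand. This is a minimal base-R sketch with a made-up count matrix; the normalization factors are illustrative stand-ins for what calcNormFactors() would return.

```r
# Toy count matrix: 4 tags (rows) x 3 libraries (columns); values are made up
counts <- matrix(c(10, 0, 5, 85,
                   20, 3, 7, 170,
                   15, 1, 4, 130), nrow = 4)

# Default lib.size when y is a plain matrix: the column sums
lib.size <- colSums(counts)

# Default when y is a DGEList: library size times normalization factor
norm.factors <- c(1.05, 0.95, 1.00)  # illustrative values only
effective.lib.size <- lib.size * norm.factors

# Default offset when offset=NULL: log of the effective library size
offset <- log(effective.lib.size)
```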

Argument: start
optional numeric matrix of initial estimates for the fitted coefficients.

Argument: method
which fitting algorithm to use. Possible values are "auto", "linesearch", "levenberg" or "simple".

Argument: ...
other arguments are passed to lower-level functions, for example to mglmLS.

Argument: glmfit
a DGEGLM object, the output from glmFit.

Argument: coef
scalar or vector indicating the column(s) of design that are to be dropped when creating the null model for the likelihood ratio (LR) test. Can be numeric or character. If character, the string(s) provided to coef must match column names of the design matrix in the glmfit object passed to glmLRT. glmLRT fits the null model and then conducts an LR test of the model fit provided in glmfit against the null model defined by the choice of coef. By default, the last column of the design matrix is dropped to form the design matrix for the null model.

Argument: contrast
contrast vector for which the test is required, of length equal to the number of columns of design. If specified, it takes precedence over coef.
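
To make the contrast argument more concrete, the sketch below builds a contrast vector for a hypothetical three-group design. The groups and coefficient values are invented, and the code only illustrates what a contrast vector encodes, not how glmLRT uses it internally.

```r
# Hypothetical three-group design without an intercept
group <- factor(c("A", "A", "B", "B", "C", "C"))
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)

# Contrast comparing group C with group B: one entry per design column
contrast <- c(A = 0, B = -1, C = 1)
stopifnot(length(contrast) == ncol(design))

# Applied to made-up coefficients for one tag, the contrast gives the
# difference between the C and B coefficients
beta <- c(A = 2.0, B = 2.5, C = 4.0)
contrast_value <- sum(contrast * beta)
```

A vector such as this would be passed as contrast in place of a single column index in coef.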

----------Details----------

Given a fixed value for the dispersion parameter, a negative binomial model can be fitted to the counts for each tag/transcript in a dataset. The function glmFit calls the built-in function glm.fit to fit the NB GLM for each tag. Once a fit is available for a given design matrix, glmLRT can be run with a specified coefficient or contrast, and the evidence for differential expression is assessed using a likelihood ratio test. Tags can then be ranked in order of evidence for differential expression, based on the p-value computed for each tag.
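
The procedure described above can be sketched for a single tag using glm.fit directly. This is a minimal illustration under stated assumptions, not edgeR's implementation: it uses the negative.binomial() family from the MASS package (assumed to be installed, as it is with standard R distributions), simulated counts, and equal library sizes chosen for simplicity.

```r
library(MASS)  # provides the negative.binomial() family with fixed theta

set.seed(1)
dispersion <- 0.1
x <- rep(0:2, each = 2)               # covariate for 6 hypothetical libraries
design <- model.matrix(~x)
lib.size <- rep(1e4, length(x))       # equal library sizes, for simplicity
mu <- lib.size * 1e-3 * 2^x           # true means: 10, 20, 40
y <- rnbinom(length(x), mu = mu, size = 1 / dispersion)

# NB family with the dispersion held fixed (theta = 1 / dispersion)
fam <- negative.binomial(theta = 1 / dispersion)

# Full model: intercept plus trend in x, with log library size as offset
fit_full <- glm.fit(design, y, offset = log(lib.size), family = fam)

# Null model: drop the column being tested (column 2)
fit_null <- glm.fit(design[, 1, drop = FALSE], y,
                    offset = log(lib.size), family = fam)

# With the dispersion fixed, the difference in deviance is the LR statistic
LR <- fit_null$deviance - fit_full$deviance
p.value <- pchisq(LR, df = 1, lower.tail = FALSE)
```

The degrees of freedom for the chi-squared reference distribution equal the number of columns dropped from the design matrix, here one.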

----------Value----------

glmFit produces an object of class DGEGLM with the following components:

Component: coefficients
matrix of estimated coefficients from the NB model.

Component: df.residual
vector giving the residual degrees of freedom for each tag. In theory these can differ between tags (if there are missing values), but in practice they will usually be identical for every tag.

Component: deviance
vector giving the deviance from the NB model fit for each tag.

Component: design
design matrix used in the NB model fit for each tag.

Component: offset
scalar, vector or matrix giving the offset used in the NB model for each tag.

Component: samples
data frame providing information about the samples (libraries) in the experiment; taken from the object y.

Component: genes
vector or data frame providing gene information for each tag; taken from the object y.

Component: dispersion
scalar or vector giving the value of the dispersion parameter used in each tag's NB model fit.

Component: lib.size
vector of library sizes used in the model fit.

Component: weights
matrix of final weights used in the NB model fits for each tag.

Component: fitted.values
matrix of fitted values from the NB model for each tag.

Component: abundance
vector of gene/tag abundances (expression level), on the log2 scale, computed from the mean count for each gene/tag after scaling the counts by normalized library size.

glmLRT produces an object of class DGELRT with the following components:

Component: table
data frame (table) containing the abundance of each tag (log-concentration, logConc), the log-fold change of expression between the conditions/contrasts being tested (logFC), the likelihood ratio statistic (LR.statistic) and the p-value from the LR test (p.value), for each tag in the dataset.

Component: coefficients
matrix of coefficients for the full model defined by the design matrix.

Component: dispersion.used
scalar or vector of the dispersion value(s) used in the GLM fits and LR test.

The DGELRT object also contains all the elements of y except for the table of counts (raw data) and the table of pseudo-counts (if applicable).
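
Ranking tags by the p.value column of such a table can be sketched in base R. The data frame below only borrows the column names listed for the table component; the numbers are made up for illustration, and the single degree of freedom assumes one coefficient is being tested.

```r
# Made-up statistics for three tags (illustration only, not real results)
tab <- data.frame(
  logConc = c(-12.1, -10.4, -11.7),
  logFC = c(0.3, 2.4, -1.1),
  LR.statistic = c(0.2, 12.5, 3.1),
  row.names = c("Gene1", "Gene2", "Gene3")
)

# p-value from a chi-squared reference distribution with 1 degree of freedom
tab$p.value <- pchisq(tab$LR.statistic, df = 1, lower.tail = FALSE)

# Rank tags by evidence for differential expression (smallest p-value first)
ranked <- tab[order(tab$p.value), ]
```

In practice topTags, listed under See Also, is the function the help page points to for displaying results from glmLRT.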

----------Author(s)----------

Davis McCarthy and Gordon Smyth

----------See Also----------

estimateGLMCommonDisp, estimateGLMTrendedDisp or estimateGLMTagwiseDisp for estimating the negative binomial dispersion.

topTags for displaying results from glmLRT.

----------Examples----------

library(edgeR)

nlibs <- 3
ntags <- 100
dispersion.true <- 0.1

# Make first transcript respond to covariate x
x <- 0:2
design <- model.matrix(~x)
beta.true <- cbind(Beta1=2,Beta2=c(2,rep(0,ntags-1)))
mu.true <- 2^(beta.true %*% t(design))

# Generate count data (one row per tag, one column per library)
y <- rnbinom(ntags*nlibs,mu=mu.true,size=1/dispersion.true)
y <- matrix(y,ntags,nlibs)
colnames(y) <- c("x0","x1","x2")
rownames(y) <- paste("Gene",1:ntags,sep="")
d <- DGEList(y)

# Normalize
d <- calcNormFactors(d)

# Fit the NB GLMs
fit <- glmFit(d, design, dispersion=dispersion.true)

# Likelihood ratio tests for trend
# Note: this call follows the usage shown above; more recent edgeR releases
# take the fitted object as the first argument, i.e. glmLRT(fit, coef=2)
results <- glmLRT(d, fit, coef=2)
topTags(results)

# Estimate the dispersion (may be unreliable with so few tags)
d <- estimateGLMCommonDisp(d, design)
d$common.dispersion

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

