
R language: help documentation for the glm() function

Posted on 2012-2-16 21:22:30
glm(stats)
R package containing glm(): stats

                                        Fitting Generalized Linear Models

                                         Translator: 生物统计家园网 robot LoveR

----------Description----------

glm is used to fit generalized linear models, specified by giving a symbolic description of the linear predictor and a description of the error distribution.


----------Usage----------


glm(formula, family = gaussian, data, weights, subset,
    na.action, start = NULL, etastart, mustart, offset,
    control = list(...), model = TRUE, method = "glm.fit",
    x = FALSE, y = TRUE, contrasts = NULL, ...)

glm.fit(x, y, weights = rep(1, nobs),
        start = NULL, etastart = NULL, mustart = NULL,
        offset = rep(0, nobs), family = gaussian(),
        control = list(), intercept = TRUE)

## S3 method for class 'glm'
weights(object, type = c("prior", "working"), ...)



----------Arguments----------

Argument: formula
an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under "Details".


Argument: family
a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. (See family for details of family functions.)
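The three ways of giving family can be compared directly. The following is a minimal sketch on simulated data; the variable names and the simulated values are illustrative assumptions, not part of the help page.

```r
# Simulated count data (illustrative only).
set.seed(1)
x <- runif(50)
y <- rpois(50, lambda = exp(0.5 + x))

fit_string   <- glm(y ~ x, family = "poisson")  # character string naming a family function
fit_function <- glm(y ~ x, family = poisson)    # the family function itself
fit_call     <- glm(y ~ x, family = poisson())  # result of calling the family function

# The three specifications should describe the same model.
all.equal(coef(fit_string), coef(fit_function))
all.equal(coef(fit_function), coef(fit_call))
```

The call form is the one to use when a non-default link is wanted, for example poisson(link = "sqrt").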


Argument: data
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which glm is called.


Argument: weights
an optional vector of "prior weights" to be used in the fitting process. Should be NULL or a numeric vector.


Argument: subset
an optional vector specifying a subset of observations to be used in the fitting process.


Argument: na.action
a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The "factory-fresh" default is na.omit. Another possible value is NULL, no action. The value na.exclude can be useful.
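One case where na.exclude is useful is when residuals have to line up with the rows of the original data. The sketch below uses a small made-up data frame with one missing response; the data and object names are assumptions for illustration.

```r
# Illustrative data with one missing response (values are made up).
d <- data.frame(x = 1:6, y = c(2, 3, NA, 7, 9, 12))

fit_omit    <- glm(y ~ x, family = poisson, data = d, na.action = na.omit)
fit_exclude <- glm(y ~ x, family = poisson, data = d, na.action = na.exclude)

# Both fits use the same five complete rows, so the coefficients agree.
all.equal(coef(fit_omit), coef(fit_exclude))

# The difference is in the bookkeeping for the dropped row.
length(residuals(fit_omit))     # one value per row actually used
length(residuals(fit_exclude))  # one value per original row, NA where y was missing
```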


Argument: start
starting values for the parameters in the linear predictor.


Argument: etastart
starting values for the linear predictor.


Argument: mustart
starting values for the vector of means.


Argument: offset
this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. One or more offset terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See model.offset.
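The two ways of supplying an offset can be checked against each other. This is a minimal sketch assuming a Poisson rate model with a simulated exposure variable; exposure, x and y are hypothetical names introduced only for the example.

```r
# Simulated counts whose mean is proportional to a known exposure (illustrative).
set.seed(2)
exposure <- runif(40, min = 1, max = 10)
x <- rnorm(40)
y <- rpois(40, lambda = exposure * exp(0.2 + 0.5 * x))

# Offset written as a term in the formula.
fit_formula <- glm(y ~ x + offset(log(exposure)), family = poisson)

# The same offset passed through the offset argument.
fit_argument <- glm(y ~ x, family = poisson, offset = log(exposure))

all.equal(coef(fit_formula), coef(fit_argument))
```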


Argument: control
a list of parameters for controlling the fitting process. For glm.fit this is passed to glm.control.


Argument: model
a logical value indicating whether the model frame should be included as a component of the returned value.


Argument: method
the method to be used in fitting the model. The default method "glm.fit" uses iteratively reweighted least squares (IWLS); the alternative "model.frame" returns the model frame and does no fitting. A user-supplied fitting function can be given either as a function or as a character string naming a function; in both cases it must take the same arguments as glm.fit. If specified as a character string it is looked up from within the stats namespace.
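The "model.frame" alternative can be tried on any small data set. A minimal sketch, with a made-up data frame d:

```r
# Illustrative data (values are made up).
d <- data.frame(x = 1:5, y = c(1, 3, 2, 6, 8))

# No fitting is done; the model frame that glm would have used is returned.
mf <- glm(y ~ x, family = poisson, data = d, method = "model.frame")

class(mf)
names(mf)
```

This is a convenient way to inspect which rows and variables would enter the fit after subset and na.action have been applied.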


Argument: x, y
For glm: logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value. For glm.fit: x is a design matrix of dimension n * p, and y is a vector of observations of length n.


Argument: contrasts
an optional list. See the contrasts.arg of model.matrix.default.


Argument: intercept
logical. Should an intercept be included in the null model?


Argument: object
an object inheriting from class "glm".


Argument: type
character, partial matching allowed. Type of weights to extract from the fitted model object.


Argument: ...
For glm: arguments to be used to form the default control argument if it is not supplied directly. For weights: further arguments passed to or from other methods.

----------Details----------

A typical predictor has the form response ~ terms where response is the (numeric) response vector and terms is a series of terms which specifies a linear predictor for response. For binomial and quasibinomial families the response can also be specified as a factor (when the first level denotes failure and all others success) or as a two-column matrix with the columns giving the numbers of successes and failures. A terms specification of the form first + second indicates all the terms in first together with all the terms in second with any duplicates removed.
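The factor and two-column forms of a binomial response describe the same data at different levels of aggregation. The sketch below uses made-up dose-response counts; all object names are assumptions for illustration.

```r
# Grouped data: successes and failures at three dose levels (made-up values).
successes <- c(2, 5, 9)
failures  <- c(8, 5, 1)
dose      <- c(1, 2, 3)

# Two-column matrix response: column 1 = successes, column 2 = failures.
fit_matrix <- glm(cbind(successes, failures) ~ dose, family = binomial)

# The same data with one row per trial and a factor response.
# The first factor level ("failure") denotes failure.
long <- data.frame(
  dose = rep(rep(dose, 2), times = c(successes, failures)),
  outcome = factor(
    rep(c("success", "failure"), times = c(sum(successes), sum(failures))),
    levels = c("failure", "success")
  )
)
fit_factor <- glm(outcome ~ dose, family = binomial, data = long)

# Coefficients should agree up to the convergence tolerance of IWLS.
all.equal(coef(fit_matrix), coef(fit_factor), tolerance = 1e-4)
```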

A specification of the form first:second indicates the set of terms obtained by taking the interactions of all terms in first with all terms in second. The specification first*second indicates the cross of first and second. This is the same as first + second + first:second.

The terms in the formula will be re-ordered so that main effects come first, followed by the interactions, all second-order, all third-order and so on. To avoid this, pass a terms object as the formula.
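Both points can be seen by inspecting the term labels of a formula, without fitting anything. In this sketch y, a and b are placeholder variable names, and the use of keep.order = TRUE in terms() is an assumption about how such a terms object would typically be built; the help text only says to pass a terms object.

```r
# The cross a * b expands to main effects plus interaction.
labels_cross <- attr(terms(y ~ a * b), "term.labels")
labels_long  <- attr(terms(y ~ a + b + a:b), "term.labels")
identical(labels_cross, labels_long)

# By default terms are re-ordered so that main effects come first.
labels_reordered <- attr(terms(y ~ a:b + a + b), "term.labels")
labels_reordered

# A terms object built with keep.order = TRUE preserves the written order.
labels_kept <- attr(terms(y ~ a:b + a + b, keep.order = TRUE), "term.labels")
labels_kept
```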

Non-NULL weights can be used to indicate that different observations have different dispersions (with the values in weights being inversely proportional to the dispersions); or equivalently, when the elements of weights are positive integers w_i, that each response y_i is the mean of w_i unit-weight observations. For a binomial GLM prior weights are used to give the number of trials when the response is the proportion of successes; they would rarely be used for a Poisson GLM.
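For the binomial case, a proportion response with the number of trials as prior weights should reproduce the fit from a two-column response. A minimal sketch with made-up counts (object names are illustrative):

```r
# Made-up grouped binomial data.
trials    <- c(10, 10, 10)
successes <- c(2, 5, 9)
dose      <- c(1, 2, 3)

# Response is the proportion of successes; weights give the number of trials.
fit_prop <- glm(successes / trials ~ dose, family = binomial, weights = trials)

# Equivalent two-column specification.
fit_counts <- glm(cbind(successes, trials - successes) ~ dose, family = binomial)

all.equal(coef(fit_prop), coef(fit_counts))
```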

glm.fit is the workhorse function: it is not normally called directly but can be more efficient where the response vector and design matrix have already been calculated.
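A direct call to glm.fit might look as follows. This is a sketch on simulated data; the design matrix is built here with model.matrix, which is one possible way of obtaining it rather than a requirement.

```r
# Simulated data (illustrative only).
set.seed(3)
x <- rnorm(30)
y <- rpois(30, lambda = exp(0.3 + 0.4 * x))

# Design matrix of dimension n * p, including the intercept column.
X <- model.matrix(~ x)

# Call the workhorse directly with the precomputed matrix and response.
fit_direct <- glm.fit(x = X, y = y, family = poisson())

# The usual formula interface for comparison.
fit_glm <- glm(y ~ x, family = poisson)

all.equal(fit_direct$coefficients, coef(fit_glm))
```

Note that the value of glm.fit is a plain list of fit components, so extractor methods for class "glm" such as summary do not apply to it directly.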

If more than one of etastart, start and mustart is specified, the first in the list will be used. It is often advisable to supply starting values for a quasi family, and also for families with unusual links such as gaussian("log").
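As an illustration of supplying starting values with an unusual link, the sketch below fits a gaussian family with log link to simulated data. The true coefficients (1 and 0.8) and the starting values c(1, 1) are arbitrary choices made for this example.

```r
# Simulated positive response with an exponential mean curve (illustrative).
set.seed(4)
x <- seq(0, 2, length.out = 25)
y <- exp(1 + 0.8 * x) + rnorm(25, sd = 0.2)

# start gives initial values for the coefficients of the linear predictor,
# here (intercept, slope) on the log scale.
fit <- glm(y ~ x, family = gaussian("log"), start = c(1, 1))

fit$converged
coef(fit)
```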

All of weights, subset, offset, etastart and mustart are evaluated in the same way as variables in formula, that is, first in data and then in the environment of formula.

For the background to warning messages about "fitted probabilities numerically 0 or 1 occurred" for binomial GLMs, see Venables & Ripley (2002, pp. 197-8).


----------Value----------

glm returns an object of class inheriting from "glm" which inherits from the class "lm". See later in this section. If a non-standard method is used, the object will also inherit from the class (if any) returned by that function.

The function summary (i.e., summary.glm) can be used to obtain or print a summary of the results and the function anova (i.e., anova.glm) to produce an analysis of variance table.

The generic accessor functions coefficients, effects, fitted.values and residuals can be used to extract various useful features of the value returned by glm.

weights extracts a vector of weights, one for each case in the fit (after subsetting and na.action).
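The accessors named above can be tried on the Dobson (1990) data used in the Examples section of this page. The sketch only shows how the calls fit together; the object name fit is an arbitrary choice.

```r
# Dobson (1990) randomized controlled trial data, as in the Examples section.
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
fit <- glm(counts ~ outcome + treatment, family = poisson())

coefficients(fit)   # named vector of coefficients
fitted.values(fit)  # fitted means, one per case
residuals(fit)      # deviance residuals by default

# weights() has two types; partial matching of type is allowed.
weights(fit, type = "prior")    # prior weights (all 1 here, none were supplied)
weights(fit, type = "working")  # weights from the final IWLS iteration
```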

An object of class "glm" is a list containing at least the following components:


Component: coefficients
a named vector of coefficients


Component: residuals
the working residuals, that is the residuals in the final iteration of the IWLS fit. Since cases with zero weights are omitted, their working residuals are NA.


Component: fitted.values
the fitted mean values, obtained by transforming the linear predictors by the inverse of the link function.


Component: rank
the numeric rank of the fitted linear model.


Component: family
the family object used.


Component: linear.predictors
the linear fit on link scale.


Component: deviance
up to a constant, minus twice the maximized log-likelihood. Where sensible, the constant is chosen so that a saturated model has deviance zero.


Component: aic
A version of Akaike's An Information Criterion, minus twice the maximized log-likelihood plus twice the number of parameters, computed by the aic component of the family. For binomial and Poisson families the dispersion is fixed at one and the number of parameters is the number of coefficients. For gaussian, Gamma and inverse gaussian families the dispersion is estimated from the residual deviance, and the number of parameters is the number of coefficients plus one. For a gaussian family the MLE of the dispersion is used so this is a valid value of AIC, but for Gamma and inverse gaussian families it is not. For families fitted by quasi-likelihood the value is NA.
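For a Poisson family the aic component can be reproduced by hand from the definition above. The sketch reuses the Dobson (1990) data from the Examples section; loglik and aic_by_hand are names introduced here for illustration.

```r
# Dobson (1990) data, as in the Examples section.
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
fit <- glm(counts ~ outcome + treatment, family = poisson())

# Maximized Poisson log-likelihood evaluated at the fitted means.
loglik <- sum(dpois(counts, lambda = fitted(fit), log = TRUE))

# Minus twice the log-likelihood plus twice the number of coefficients
# (the dispersion is fixed at one for the Poisson family).
aic_by_hand <- -2 * loglik + 2 * length(coef(fit))

all.equal(aic_by_hand, fit$aic)
all.equal(fit$aic, AIC(fit))
```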


Component: null.deviance
The deviance for the null model, comparable with deviance. The null model will include the offset, and an intercept if there is one in the model. Note that this will be incorrect if the link function depends on the data other than through the fitted mean: specify a zero offset to force a correct calculation.


Component: iter
the number of iterations of IWLS used.


Component: weights
the working weights, that is the weights in the final iteration of the IWLS fit.


Component: prior.weights
the weights initially supplied, a vector of 1s if none were.


Component: df.residual
the residual degrees of freedom.


Component: df.null
the residual degrees of freedom for the null model.


Component: y
if requested (the default) the y vector used. (It is a vector even for a binomial model.)


Component: x
if requested, the model matrix.


Component: model
if requested (the default), the model frame.


Component: converged
logical. Was the IWLS algorithm judged to have converged?


Component: boundary
logical. Is the fitted value on the boundary of the attainable values?


Component: call
the matched call.


Component: formula
the formula supplied.


Component: terms
the terms object used.


Component: data
the data argument.


Component: offset
the offset vector used.


Component: control
the value of the control argument used.


Component: method
the name of the fitter function used, currently always "glm.fit".


Component: contrasts
(where relevant) the contrasts used.


Component: xlevels
(where relevant) a record of the levels of the factors used in fitting.


Component: na.action
(where relevant) information returned by model.frame on the special handling of NAs.

In addition, non-empty fits will have components qr, R and effects relating to the final weighted linear fit.

Objects of class "glm" are normally of class c("glm", "lm"), that is, they inherit from class "lm", and well-designed methods for class "lm" will be applied to the weighted linear model at the final iteration of IWLS. However, care is needed, as extractor functions for class "glm" such as residuals and weights do not just pick out the component of the fit with the same name.
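The difference between the extractor functions and the components of the same name can be seen on the Dobson (1990) data from the Examples section. A minimal sketch (object names are illustrative):

```r
# Dobson (1990) data, as in the Examples section.
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
fit <- glm(counts ~ outcome + treatment, family = poisson())

class(fit)

# residuals() returns deviance residuals by default,
# while the residuals component holds the working residuals.
dev_res  <- residuals(fit)
work_res <- fit$residuals
isTRUE(all.equal(dev_res, work_res))
all.equal(residuals(fit, type = "working"), work_res)

# weights() returns prior weights by default,
# while the weights component holds the working weights.
isTRUE(all.equal(weights(fit), fit$weights))
all.equal(weights(fit, type = "working"), fit$weights)
```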

If a binomial glm model was specified by giving a two-column response, the weights returned by prior.weights are the total numbers of cases (factored by the supplied case weights) and the component y of the result is the proportion of successes.
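The stored components for a two-column response can be inspected directly. A small sketch with made-up counts:

```r
# Made-up grouped binomial data.
successes <- c(2, 5, 9)
failures  <- c(8, 5, 1)
dose      <- c(1, 2, 3)

fit <- glm(cbind(successes, failures) ~ dose, family = binomial)

fit$prior.weights  # total number of cases in each row (successes + failures)
fit$y              # proportion of successes in each row
```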


----------Fitting functions----------

The argument method serves two purposes. One is to allow the model frame to be recreated with no fitting. The other is to allow the default fitting function glm.fit to be replaced by a function which takes the same arguments and uses a different fitting algorithm. If the fitting function is supplied as a character string, the string is used to search for a function of that name, starting in the stats namespace.

The class of the object returned by the fitter (if any) will be prepended to the class returned by glm.


----------Author(s)----------

The original R implementation of glm was written by Simon Davies working for Ross Ihaka at the University of Auckland, but has since been extensively re-written by members of the R Core team.

The design was inspired by the S function of the same name described in Hastie & Pregibon (1992).




----------References----------

Dobson, A. J. (1990) An Introduction to Generalized Linear Models. London: Chapman and Hall.
Hastie, T. J. and Pregibon, D. (1992) Generalized linear models. Chapter 6 of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models. London: Chapman and Hall.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. New York: Springer.

----------See Also----------

anova.glm, summary.glm, etc. for glm methods, and the generic functions anova, summary, effects, fitted.values, and residuals.

lm for non-generalized linear models (which SAS calls GLMs, for "general" linear models).

loglin and loglm (package MASS) for fitting log-linear models (which binomial and Poisson GLMs are) to contingency tables.

bigglm in package biglm for an alternative way to fit GLMs to large datasets (especially those with many cases).

esoph, infert and predict.glm have examples of fitting binomial glms.


----------Examples----------


## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
print(d.AD <- data.frame(treatment, outcome, counts))
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
anova(glm.D93)
summary(glm.D93)

## an example with offsets from Venables & Ripley (2002, p.189)
utils::data(anorexia, package="MASS")

anorex.1 <- glm(Postwt ~ Prewt + Treat + offset(Prewt),
                family = gaussian, data = anorexia)
summary(anorex.1)

# A Gamma example, from McCullagh & Nelder (1989, pp. 300-2)
clotting <- data.frame(
    u = c(5,10,15,20,30,40,60,80,100),
    lot1 = c(118,58,42,35,27,25,21,19,18),
    lot2 = c(69,35,26,21,18,16,13,12,12))
summary(glm(lot1 ~ log(u), data=clotting, family=Gamma))
summary(glm(lot2 ~ log(u), data=clotting, family=Gamma))

## Not run:
## for an example of the use of a terms object as a formula
demo(glm.vr)


When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

