
R: negbin() function help documentation

Posted on 2012-2-16 21:23:51
negbin(mgcv)
negbin() belongs to R package: mgcv

                                        GAM negative binomial family

                                         Translator: 生物统计家园网 robot LoveR

----------Description----------

The gam modelling function is designed to be able to use the negbin family (a modification of the MASS library negative.binomial family by Venables and Ripley), with or without a known theta parameter. Two approaches to estimating the theta parameter are available:

If "performance iteration" is used for smoothing parameter estimation (see gam), then smoothing parameters are chosen by GCV and theta is chosen so that the Pearson estimate of the scale parameter is as close as possible to 1, the value that the scale parameter should have.

If "outer iteration" is used for smoothing parameter selection, and smoothing parameters are chosen by UBRE/AIC (with the scale parameter set to 1), then a search is made for the value of theta that minimizes the AIC of the model. Alternatively, if (RE)ML is used for smoothing parameter estimation, then a search is made for the value of theta that maximizes the (restricted) likelihood.

The second option is much slower than the first, but the first can sometimes fail to converge. To use the first option, set the optimizer argument of gam to "perf".


----------Usage----------


negbin(theta = stop("'theta' must be specified"), link = "log")



----------Arguments----------

Argument: theta
Either i) a single known value of theta, ii) two values of theta specifying the endpoints of an interval over which to search for theta, or iii) an array of values of theta, specifying the set of theta values to search. Option iii) is only available with AIC based theta estimation.


Argument: link
The link function: one of "log", "identity" or "sqrt".


----------Details----------

If a single value of theta is supplied then it is always taken as the known fixed value, and estimation of smoothing parameters is then by UBRE/AIC. If theta is two numbers (theta[2]>theta[1]) then they are taken as specifying the range of values over which to search for the optimal theta. If theta is any other array of numbers then they are taken as the discrete set of values of theta over which to search. The latter option only works with AIC based outer iteration; if performance iteration is used then an array will only be used to define a search range.
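
The rule for interpreting theta can be restated as a short base R sketch. The helper classify_theta below is hypothetical (it is not part of mgcv) and only mirrors the description above for AIC based outer iteration; it does not reproduce how negbin itself parses its argument.

```r
# Hypothetical helper (not part of mgcv): restates how the text above says
# the 'theta' argument is interpreted under AIC based outer iteration.
classify_theta <- function(theta) {
  stopifnot(is.numeric(theta), length(theta) >= 1)
  if (length(theta) == 1) {
    "known"    # single value: treated as the fixed, known theta
  } else if (length(theta) == 2 && theta[2] > theta[1]) {
    "range"    # two increasing values: endpoints of a search interval
  } else {
    "grid"     # any other array: discrete set of candidate values
  }
}

classify_theta(3)          # "known"
classify_theta(c(1, 10))   # "range"
classify_theta(2:10 / 2)   # "grid"
```

The three calls use the same theta specifications that appear in the examples further down.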

If performance iteration is used (see gam argument optimizer) then the method of estimation is to choose theta so that the GCV (Pearson) estimate of the scale parameter is one (since the scale parameter is one for the negative binomial). In this case theta estimation is nested within the IRLS loop used for GAM fitting. After each call to fit an iteratively weighted additive model to the IRLS pseudodata, the theta estimate is updated. This is done by conditioning on all components of the current GCV/Pearson estimator of the scale parameter except theta, and then searching for the theta which equates this conditional estimator to one. The search is a simple bisection search after an initial crude line search to bracket one. The search will terminate at the upper boundary of the search region if a Poisson fit would have yielded an estimated scale parameter <1.
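
The core of this idea, finding the theta at which a Pearson scale estimate equals one, can be sketched in base R. This is a simplified illustration and not the mgcv implementation: it assumes fitted means taken once from an ordinary Poisson glm instead of updating theta inside an IRLS loop, it uses residual degrees of freedom n - p in place of the GCV based estimator, and the names pearson_scale and bisect_theta are made up for the example.

```r
set.seed(1)
n <- 2000
x <- runif(n)
y <- rnbinom(n, size = 3, mu = exp(1 + x))   # simulated data, true theta = 3

# Assumption: hold the mean fixed at the fitted values of a Poisson GLM.
fit <- glm(y ~ x, family = poisson)
mu  <- fitted(fit)
p   <- length(coef(fit))

# Pearson estimate of the scale parameter, conditional on mu, as a function
# of theta, using the negative binomial variance mu + mu^2 / theta.
pearson_scale <- function(theta) {
  sum((y - mu)^2 / (mu + mu^2 / theta)) / (n - p)
}

# Bisection for pearson_scale(theta) = 1 on [lo, hi]. The estimate increases
# with theta, so if it is still below 1 at the upper end (roughly the
# Poisson-like case) the search stops at that boundary.
bisect_theta <- function(lo, hi, tol = 1e-8, max_iter = 200) {
  if (pearson_scale(hi) < 1) return(hi)
  if (pearson_scale(lo) > 1) return(lo)
  for (i in seq_len(max_iter)) {
    mid <- (lo + hi) / 2
    if (pearson_scale(mid) < 1) lo <- mid else hi <- mid
    if (hi - lo < tol) break
  }
  (lo + hi) / 2
}

theta_hat <- bisect_theta(0.1, 100)
theta_hat
pearson_scale(theta_hat)   # close to 1 by construction
```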

If outer iteration is used then theta is estimated by searching for the value yielding the lowest AIC. The search is either over the supplied array of values, or is a grid search over the supplied range, followed by a golden section search. A full fit is required for each trial theta, so the process is slow, but speed is enhanced by making the changes in theta as small as possible from one step to the next, and by using the previous smoothing parameters and fitted values to start the new fit.
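
The grid search followed by golden section search can be sketched in base R. This is only an illustration under strong simplifying assumptions: the model is intercept-only, so the "full fit" for each trial theta reduces to using the sample mean, and the AIC is computed directly from dnbinom. The function golden_min and the other names are hypothetical and are not taken from mgcv.

```r
set.seed(2)
y <- rnbinom(500, size = 3, mu = 5)   # simulated data, true theta = 3
mu_hat <- mean(y)                     # intercept-only fit, same for every theta

# AIC as a function of theta; the penalty term is constant in theta, so it
# does not affect where the minimum lies.
aic <- function(theta) {
  -2 * sum(dnbinom(y, size = theta, mu = mu_hat, log = TRUE)) + 2
}

# Step 1: coarse grid search over the supplied range.
theta_grid <- seq(1, 10, length.out = 10)
aic_grid   <- vapply(theta_grid, aic, numeric(1))
i  <- which.min(aic_grid)
lo <- theta_grid[max(i - 1, 1)]
hi <- theta_grid[min(i + 1, length(theta_grid))]

# Step 2: golden section search inside the bracket found by the grid.
golden_min <- function(f, lo, hi, tol = 1e-6) {
  ratio <- (sqrt(5) - 1) / 2
  x1 <- hi - ratio * (hi - lo)
  x2 <- lo + ratio * (hi - lo)
  f1 <- f(x1); f2 <- f(x2)
  while (hi - lo > tol) {
    if (f1 < f2) {                    # minimum lies in [lo, x2]
      hi <- x2; x2 <- x1; f2 <- f1
      x1 <- hi - ratio * (hi - lo); f1 <- f(x1)
    } else {                          # minimum lies in [x1, hi]
      lo <- x1; x1 <- x2; f1 <- f2
      x2 <- lo + ratio * (hi - lo); f2 <- f(x2)
    }
  }
  (lo + hi) / 2
}

theta_hat <- golden_min(aic, lo, hi)
theta_hat
```

In a real additive model each call to aic would involve refitting the model, which is why the text stresses reusing the previous smoothing parameters and fitted values as starting points.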

In a simulation test based on 800 replicates of the first example data, given below, the GCV based (performance iteration) method yielded models with, on average, 6% better MSE performance than the AIC based (outer iteration) method. The theta estimates from the two methods had a correlation coefficient of 0.86. theta estimates averaged 3.36 with a standard deviation of 0.44 for the AIC based method, and 3.22 with a standard deviation of 0.43 for the GCV based method. However, the GCV based method is less computationally reliable, failing in around 4% of replicates.


----------Value----------

An object inheriting from class family, with additional elements


Element: dvar
the function giving the first derivative of the variance function w.r.t. mu.


Element: d2var
the function giving the second derivative of the variance function w.r.t. mu.


Element: getTheta
A function for retrieving the value(s) of theta. This is also useful for retrieving the estimate of theta after fitting (see example).
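
To make the two derivative elements concrete, here is a standalone base R sketch. It assumes the usual negative binomial variance function V(mu) = mu + mu^2/theta (the parameterization used by negative.binomial in MASS) and re-implements the derivatives by hand; the functions below are illustrative stand-ins and are not extracted from a fitted mgcv family object.

```r
theta <- 3   # assumed known theta for the illustration

# Variance function and its first two derivatives with respect to mu.
variance <- function(mu) mu + mu^2 / theta
dvar     <- function(mu) 1 + 2 * mu / theta
d2var    <- function(mu) rep(2 / theta, length(mu))

# Check the analytic derivatives against central finite differences.
mu <- c(0.5, 2, 10)
h  <- 1e-3
num_dvar  <- (variance(mu + h) - variance(mu - h)) / (2 * h)
num_d2var <- (variance(mu + h) - 2 * variance(mu) + variance(mu - h)) / h^2

all.equal(dvar(mu),  num_dvar,  tolerance = 1e-5)
all.equal(d2var(mu), num_d2var, tolerance = 1e-5)
```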


----------WARNINGS----------

gamm does not support theta estimation.

The negative binomial functions from the MASS library are no longer supported.


----------Author(s)----------


Simon N. Wood simon.wood@r-project.org
modified from Venables and Ripley's negative.binomial family.





----------Examples----------


library(mgcv)
set.seed(3)
n<-400
dat <- gamSim(1,n=n)
g <- exp(dat$f/5)

# negative binomial data
dat$y <- rnbinom(g,size=3,mu=g)
# known theta ...
b <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=negbin(3),data=dat)
plot(b,pages=1)
print(b)

## unknown theta via performance iteration...
b1 <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=negbin(c(1,10)),
          optimizer="perf",data=dat)
plot(b1,pages=1)
print(b1)

## unknown theta via outer iteration and AIC search...
b2<-gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=negbin(c(1,10)),
        data=dat)
plot(b2,pages=1)
print(b2)

## Same again, all by REML...
b2a <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=negbin(c(1,10)),
        data=dat,method="REML")
plot(b2a,pages=1)
print(b2a)


## how to retrieve theta...
b2a$family$getTheta()

## unknown theta via outer iteration and AIC search
## over a discrete set of values...
b3<-gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=negbin(2:10/2),
        data=dat)
plot(b3,pages=1)
print(b3)

## another example...
set.seed(1)
f <- dat$f
f <- f - min(f);g <- f^2
dat$y <- rnbinom(g,size=3,mu=g)
b <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=negbin(1:10,link="sqrt"),
         data=dat)
plot(b,pages=1)
print(b)
rm(dat)

When reposting, please credit: 生物统计家园网 (http://www.biostatistic.net).

