R: summary.gam() function help documentation

Posted on 2012-2-16 19:42:50
summary.gam(mgcv)
R package containing summary.gam(): mgcv

                                        Summary for a GAM fit

----------Description----------

Takes a fitted gam object produced by gam() and produces various useful summaries from it. (See sink to divert output to a file.)


----------Usage----------


## S3 method for class 'gam'
summary(object, dispersion=NULL, freq=FALSE, p.type = 0, ...)

## S3 method for class 'summary.gam'
print(x,digits = max(3, getOption("digits") - 3),
                  signif.stars = getOption("show.signif.stars"),...)



----------Arguments----------

Argument: object
A fitted gam object as produced by gam().


Argument: x
A summary.gam object produced by summary.gam().


Argument: dispersion
A known dispersion parameter. Use NULL to use the estimate or the default (e.g. 1 for Poisson).


Argument: freq
By default, p-values for individual terms are calculated using the Bayesian estimated covariance matrix of the parameter estimators. If this is set to TRUE, the frequentist covariance matrix of the parameters is used instead. See details.


Argument: p.type
Determines how p-values are computed when freq==FALSE. 0 uses a test statistic with distribution determined by the un-rounded edf of the term. 1 uses upwardly biased rounding of the edf, and -1 uses a version of the test statistic with a null distribution that has to be simulated. Other options are poor, generate a warning, and are only of research interest. See details.


Argument: digits
Controls the number of digits printed in the output.


Argument: signif.stars
Whether significance stars should be printed alongside the output.


Argument: ...
Other arguments.
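
As an illustration of the arguments above, the following is a minimal sketch rather than part of the original help page. It reuses the simulated data from the Examples section; the object names (s_default, s_freq, s_known) and the dispersion value 4 are assumptions chosen only for illustration. p.type is left out because it may not be accepted by every mgcv version.

```r
library(mgcv)
set.seed(0)
dat <- gamSim(1, n = 200, scale = 2)     # simulated data, as in the Examples section
b <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat)

s_default <- summary(b)                  # default: Bayesian covariance matrix (freq = FALSE)
s_freq    <- summary(b, freq = TRUE)     # frequentist covariance matrix instead
s_known   <- summary(b, dispersion = 4)  # treat the dispersion as known (illustrative value)

# print method for class 'summary.gam': fewer digits, no significance stars
print(s_default, digits = 3, signif.stars = FALSE)
```

Supplying dispersion switches the reference distributions to the known-dispersion case described in the details, so the p-values of s_known need not match those of s_default.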


----------Details----------

Model degrees of freedom are taken as the trace of the influence (or hat) matrix A for the model fit. Residual degrees of freedom are taken as the number of data minus the model degrees of freedom. Let P_i be the matrix giving the parameters of the ith smooth when applied to the data (or pseudodata in the generalized case) and let X be the design matrix of the model. Then tr(XP_i) is the edf for the ith term. Clearly this definition causes the edf's to add up properly!
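
The degrees-of-freedom bookkeeping described above can be checked numerically. This is a hedged sketch, not part of the original help page: it assumes that the fitted gam object stores per-coefficient edf in b$edf and the diagonal of the influence matrix in b$hat, and that the only parametric coefficient here is the unpenalized intercept.

```r
library(mgcv)
set.seed(0)
dat <- gamSim(1, n = 200, scale = 2)
b <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat)
s <- summary(b)

model_df <- sum(b$edf)   # per-coefficient edf summed over the whole model
trace_A  <- sum(b$hat)   # trace of the influence (hat) matrix A
c(model_df = model_df, trace_A = trace_A)

s$edf                    # edf for each smooth term
# smooth-term edf plus the unpenalized parametric coefficients
# should add up to the model degrees of freedom
smooth_plus_par <- sum(s$edf) + length(s$p.coeff)

# residual degrees of freedom: number of data minus model degrees of freedom
resid_by_hand <- nrow(dat) - model_df
c(reported = s$residual.df, by_hand = resid_by_hand)
```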

print.summary.gam tries to print, in a readable layout, the pieces of summary information that are useful for term selection.

If freq=TRUE then the frequentist approximation for p-values of smooth terms described in section 4.8.5 of Wood (2006) is used. The approximation is not great. If p_i is the parameter vector for the ith smooth term, and this term has estimated covariance matrix V_i, then the statistic is p_i'V_i^{k-}p_i, where V_i^{k-} is the rank k pseudo-inverse of V_i, and k is the estimated rank of V_i. p-values are obtained as follows. In the case of a known dispersion parameter, they are obtained by comparing the chi.sq statistic to the chi-squared distribution with k degrees of freedom, where k is the estimated rank of V_i. If the dispersion parameter is unknown (in which case it will have been estimated) the statistic is compared to an F distribution with k upper d.f. and lower d.f. given by the residual degrees of freedom for the model. Typically the p-values will be somewhat too low.

If freq=FALSE then "Bayesian p-values" are returned for the smooth terms, based on a test statistic motivated by an extension of Nychka's (1988) analysis of the frequentist properties of Bayesian confidence intervals for smooths. These have better frequentist performance (in terms of power and distribution under the null) than the alternative strictly frequentist approximation. Let f denote the vector of values of a smooth term evaluated at the original covariate values and let V_f denote the corresponding Bayesian covariance matrix. Let V*_f denote the rank r pseudoinverse of V_f, where r is the EDF for the term. The statistic used is then

T = f'V*_f f

(this can be calculated efficiently without forming the pseudoinverse explicitly). T is compared to a chi-squared distribution with degrees of freedom given by the EDF for the term, or T is used as a component in an F ratio statistic if the scale parameter has been estimated.

The non-integer rank truncated inverse is constructed to give an approximation varying smoothly between the bounding integer rank approximations, while yielding test statistics with the correct mean and variance under the null. Alternatively (p.type==1) r is obtained by biased rounding of the EDF: values less than 0.05 above the preceding integer are rounded down, while other values are rounded up. Another option (p.type==-1) uses a statistic of formal rank given by the number of coefficients for the smooth, but with its terms weighted by the eigenvalues of the covariance matrix, so that penalized terms are down-weighted, but the null distribution requires simulation. Other options for p.type are 2 (naive rounding), 3 (round up), 4 (numerical rank determination): these are poor options for theoretically known reasons, and will generate a warning.

The resulting p-value also has a Bayesian interpretation: the probability of observing an f less probable than 0, under the approximation for the posterior for f implied by the truncation used in the test statistic.
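
To see the two covariance choices side by side, the sketch below tabulates the smooth-term test statistics and p-values from both calls. It is illustrative only: the object names are assumptions, and depending on the installed mgcv version the freq argument may affect only some of the reported p-values, so the two columns can coincide.

```r
library(mgcv)
set.seed(0)
dat <- gamSim(1, n = 200, scale = 2)
b <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat)

s_bayes <- summary(b)               # freq = FALSE: "Bayesian p-values"
s_freq  <- summary(b, freq = TRUE)  # strictly frequentist approximation

# one row per smooth term: edf, test statistic and approximate p-value
pv <- cbind(edf         = s_bayes$edf,
            chi.sq      = s_bayes$chi.sq,
            bayesian    = s_bayes$s.pv,
            frequentist = s_freq$s.pv)
pv
```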

Note that the distributional approximations for the p-values start to break down below one effective degree of freedom, and p-values are not reported below 0.5 degrees of freedom.

In simulations the p-values have best behaviour under ML smoothness selection, with REML coming second.
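
Since the behaviour of the p-values depends on the smoothness selection method, it can be useful to refit under ML and REML and compare. The following is a sketch under the assumption that gam() accepts method = "ML" and method = "REML" (the first is used in the Examples section); the object names are illustrative.

```r
library(mgcv)
set.seed(0)
dat <- gamSim(1, n = 200, scale = 2)

b_ml   <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "ML")
b_reml <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")

s_ml   <- summary(b_ml)
s_reml <- summary(b_reml)

# smoothness selection criterion used and its minimized value
# (negative log marginal / restricted likelihood for ML / REML)
c(method = s_ml$method,   criterion = format(s_ml$sp.criterion))
c(method = s_reml$method, criterion = format(s_reml$sp.criterion))

# approximate p-values for the smooth terms under each method
cbind(ML = s_ml$s.pv, REML = s_reml$s.pv)
```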


----------Value----------

summary.gam produces a list of summary information for a fitted gam object.


Component: p.coeff
An array of estimates of the strictly parametric model coefficients.


Component: p.t
An array of the p.coeff's divided by their standard errors.


Component: p.pv
An array of p-values for the null hypothesis that the corresponding parameter is zero. Calculated with reference to the t distribution with the estimated residual degrees of freedom for the model fit if the dispersion parameter has been estimated, and the standard normal if not.


Component: m
The number of smooth terms in the model.


Component: chi.sq
An array of test statistics for assessing the significance of model smooth terms. See details.


Component: s.pv
An array of approximate p-values for the null hypotheses that each smooth term is zero. Be warned, these are only approximate.


Component: se
An array of standard error estimates for all parameter estimates.


Component: r.sq
The adjusted r-squared for the model. Defined as the proportion of variance explained, where original variance and residual variance are both estimated using unbiased estimators. This quantity can be negative if your model is worse than a one parameter constant model, and can be higher for the smaller of two nested models! The proportion of null deviance explained is probably more appropriate for non-normal errors. Note that r.sq does not include any offset in the one parameter model.


Component: dev.expl
The proportion of the null deviance explained by the model. The null deviance is computed taking account of any offset, so dev.expl can be substantially lower than r.sq when an offset is present.


Component: edf
An array of estimated degrees of freedom for the model terms.


Component: residual.df
The estimated residual degrees of freedom.


Component: n
The number of data.


Component: method
The smoothing selection criterion used.


Component: sp.criterion
The minimized value of the smoothness selection criterion. Note that for ML and REML methods, what is reported is the negative log marginal likelihood or negative log restricted likelihood.


Component: scale
The estimated (or given) scale parameter.


Component: family
The family used.


Component: formula
The original GAM formula.


Component: dispersion
The scale parameter.


Component: pTerms.df
The degrees of freedom associated with each parametric term (excluding the constant).


Component: pTerms.chi.sq
A Wald statistic for testing the null hypothesis that each parametric term is zero.


Component: pTerms.pv
p-values associated with the tests that each term is zero. For penalized fits these are approximate. The reference distribution is an appropriate chi-squared when the scale parameter is known, and is based on an F when it is not.


Component: cov.unscaled
The estimated covariance matrix of the parameters (or estimators if freq=TRUE), divided by the scale parameter.


Component: cov.scaled
The estimated covariance matrix of the parameters (estimators if freq=TRUE).


Component: p.table
Significance table for parameters.


Component: s.table
Significance table for smooths.


Component: p.Terms
Significance table for parametric model terms.
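
The components listed above can be pulled out of the returned list directly. The sketch below is illustrative rather than authoritative: it assumes an unweighted Gaussian fit with no offset, that the parametric coefficients come first in se, and that the fitted object exposes null.deviance and deviance; the "by hand" formulas follow the definitions of p.t, r.sq and dev.expl given in this section.

```r
library(mgcv)
set.seed(0)
dat <- gamSim(1, n = 200, scale = 2)
b <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat)
s <- summary(b)

s$p.table   # significance table for parametric coefficients
s$s.table   # significance table for smooths
s$m         # number of smooth terms

# p.t: parametric coefficients divided by their standard errors
n_par <- length(s$p.coeff)
t_by_hand <- s$p.coeff / s$se[seq_len(n_par)]

# adjusted r-squared: unbiased residual variance over unbiased variance of y
res <- dat$y - as.numeric(fitted(b))
r_sq_by_hand <- 1 - var(res) * (s$n - 1) / (var(dat$y) * s$residual.df)

# proportion of the null deviance explained by the model
dev_expl_by_hand <- (b$null.deviance - b$deviance) / b$null.deviance

c(r.sq = s$r.sq, by_hand = r_sq_by_hand)
c(dev.expl = s$dev.expl, by_hand = dev_expl_by_hand)
```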


----------WARNING----------

The p-values are approximate: do read the details section.


----------Author(s)----------


Simon N. Wood simon.wood@r-project.org with substantial improvements by Henric Nilsson.



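----------References----------

Nychka (1988) Bayesian Confidence Intervals for Smoothing Splines. Journal of the American Statistical Association 83:1134-1143.
Wood S.N. (2006) Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC Press.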

----------See Also----------

gam, predict.gam


----------Examples----------


library(mgcv)
set.seed(0)
dat <- gamSim(1,n=200,scale=2) ## simulate data

b <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),data=dat)
plot(b,pages=1)
summary(b)

## now check the p-values by using a pure regression spline.....
b.d <- round(summary(b)$edf)+1 ## get edf per smooth
b.d <- pmax(b.d,3) # can't have basis dimension less than 3!
bc<-gam(y~s(x0,k=b.d[1],fx=TRUE)+s(x1,k=b.d[2],fx=TRUE)+
        s(x2,k=b.d[3],fx=TRUE)+s(x3,k=b.d[4],fx=TRUE),data=dat)
plot(bc,pages=1)
summary(bc)

## p-value check - increase k to make this useful!
k <- 20; n <- 200; p <- rep(NA,k)
for (i in 1:k)
{ b <- gam(y~te(x,z),data=data.frame(y=rnorm(n),x=runif(n),z=runif(n)),
         method="ML")
  p[i] <- summary(b)$s.pv[1] ## approximate p-value of the single smooth term
}
plot(((1:k)-0.5)/k,sort(p))
abline(0,1,col=2)
ks.test(p,"punif") ## how close to uniform are the p-values?


When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).


注:
注1:为了方便大家学习,本文档为生物统计家园网机器人LoveR翻译而成,仅供个人R语言学习参考使用,生物统计家园保留版权。
注2:由于是机器人自动翻译,难免有不准确之处,使用时仔细对照中、英文内容进行反复理解,可以帮助R语言的学习。
注3:如遇到不准确之处,请在本贴的后面进行回帖,我们会逐渐进行修订。
回复

使用道具 举报

您需要登录后才可以回帖 登录 | 注册

本版积分规则

手机版|小黑屋|生物统计家园 网站价格

GMT+8, 2025-1-24 20:55 , Processed in 0.024223 second(s), 16 queries .

Powered by Discuz! X3.5

© 2001-2024 Discuz! Team.

快速回复 返回顶部 返回列表