R VGAM package: help documentation for the cao() function

loveR · posted 2012-10-1 15:28:30

cao(VGAM)
R package providing cao(): VGAM

                                       Fitting Constrained Additive Ordination (CAO)

                                       Translator: LoveR, the robot of 生物统计家园网 (Biostatistics Home)

----------Description----------

A constrained additive ordination (CAO) model is fitted using the reduced-rank vector generalized additive model (RR-VGAM) framework.

----------Usage----------

cao(formula, family, data = list(),
weights = NULL, subset = NULL, na.action = na.fail,
etastart = NULL, mustart = NULL, coefstart = NULL,
control = cao.control(...), offset = NULL,
method = "cao.fit", model = FALSE, x.arg = TRUE, y.arg = TRUE,
contrasts = NULL, constraints = NULL,
extra = NULL, qr.arg = FALSE, smart = TRUE, ...)

----------Arguments----------

Argument: formula
a symbolic description of the model to be fitted. The RHS of the formula is used to construct the latent variables, to which the smooths are applied. All the variables in the formula are used to construct the latent variables, except for those specified by the argument Norrr, which is itself a formula. The LHS of the formula contains the response variables, which should form a matrix with each column being a response (species).
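As a minimal sketch of the response-matrix convention just described, the base R code below builds a two-species response with cbind() on the LHS of a formula. The data frame toy and its columns are made up for illustration, and model.frame() is used here only to show the shape of the response; it is not a claim about what cao() does internally.

```r
# Hypothetical toy data: two species (sp1, sp2) and two environmental variables.
set.seed(1)
toy <- data.frame(sp1 = rpois(10, 3), sp2 = rpois(10, 5),
                  x1 = rnorm(10), x2 = rnorm(10))

# cbind() on the LHS yields a response matrix with one column per species.
mf <- model.frame(cbind(sp1, sp2) ~ x1 + x2, data = toy)
resp <- model.response(mf)

dim(resp)        # rows are sites, columns are species
colnames(resp)   # species names taken from cbind()
```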

Argument: family
a function of class "vglmff" (see vglmff-class) describing what statistical model is to be fitted. This is called a "VGAM family function". See CommonVGAMffArguments for general information about many types of arguments found in this type of function. See cqo for a list of those presently implemented.

Argument: data
an optional data frame containing the variables in the model. By default the variables are taken from environment(formula), typically the environment from which cao is called.

Argument: weights
an optional vector or matrix of (prior) weights to be used in the fitting process. For cao, this argument currently should not be used.

Argument: subset
an optional logical vector specifying a subset of observations to be used in the fitting process.

Argument: na.action
a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The "factory-fresh" default is na.omit.
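The difference between the two handlers named above can be seen with base R alone. This is a small sketch on a made-up data frame (toy is an assumption of the example); it shows what na.omit and na.fail do to the data, not how cao() calls them.

```r
toy <- data.frame(y = c(1, 2, NA, 4), x = c(0.5, NA, 1.5, 2.0))

getOption("na.action")        # the session-wide setting consulted by default

cleaned <- na.omit(toy)       # drops every row that contains an NA
nrow(cleaned)

# na.fail() stops with an error as soon as it meets an NA.
failed <- inherits(try(na.fail(toy), silent = TRUE), "try-error")
failed
```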

Argument: etastart
starting values for the linear predictors. It is an M-column matrix. If M=1 then it may be a vector. For cao, this argument currently should not be used.

Argument: mustart
starting values for the fitted values. It can be a vector or a matrix. Some family functions do not make use of this argument. For cao, this argument currently should not be used.

Argument: coefstart
starting values for the coefficient vector. For cao, this argument currently should not be used.

Argument: control
a list of parameters for controlling the fitting process. See cao.control for details.

Argument: offset
a vector or M-column matrix of offset values. These are known a priori and are added to the linear predictors during fitting. For cao, this argument currently should not be used.

Argument: method
the method to be used in fitting the model. The default (and presently only) method cao.fit uses iteratively reweighted least squares (IRLS) within FORTRAN code called from optim.

Argument: model
a logical value indicating whether the model frame should be assigned in the model slot.

Argument: x.arg, y.arg
logical values indicating whether the model matrix and response vector/matrix used in the fitting process should be assigned in the x and y slots. Note that the model matrix is the linear model (LM) matrix.

Argument: contrasts
an optional list. See the contrasts.arg argument of model.matrix.default.
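For readers unfamiliar with contrasts.arg, here is a minimal base R sketch of the kind of list it expects. The factor habitat and its levels are made up; the example calls model.matrix() directly and makes no claim about how cao() forwards the list.

```r
toy <- data.frame(habitat = factor(c("dune", "moss", "herb", "dune")))

# Default treatment contrasts versus sum-to-zero contrasts for one factor.
mm_default <- model.matrix(~ habitat, data = toy)
mm_sum <- model.matrix(~ habitat, data = toy,
                       contrasts.arg = list(habitat = "contr.sum"))

colnames(mm_default)
colnames(mm_sum)
```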

Argument: constraints
an optional list of constraint matrices. For cao, this argument currently should not be used. Each component of the list must be named after the term it corresponds to (the name must match as a character string). Each constraint matrix must have M rows and be of full column rank. By default, constraint matrices are the M by M identity matrix unless arguments in the family function itself override these values. If constraints is used it must contain all the terms; an incomplete list is not accepted.
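The shape requirements stated above (M rows, full column rank, identity by default) can be checked with a few lines of base R. This is only a sketch: M = 3 and the term names are hypothetical, and since the text says the argument should not currently be used with cao, the list is inspected here rather than passed to any fitting function.

```r
M <- 3  # hypothetical number of linear/additive predictors

# Default described above: the M x M identity matrix for each term.
# The list is named by term; "(Intercept)" and "x2" are illustrative names.
cm <- list("(Intercept)" = diag(M), x2 = diag(M))

# Each matrix must have M rows and full column rank.
ok <- vapply(cm, function(H) nrow(H) == M && qr(H)$rank == ncol(H), logical(1))
ok

# A rank-deficient matrix (two identical columns) fails the rank check.
bad <- cbind(c(1, 1, 1), c(1, 1, 1))
qr(bad)$rank == ncol(bad)
```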

Argument: extra
an optional list with any extra information that might be needed by the family function. For cao, this argument currently should not be used.

Argument: qr.arg
For cao, this argument currently should not be used.

Argument: smart
logical value indicating whether smart prediction (smartpred) will be used.

Argument: ...
further arguments passed into cao.control.

----------Details----------

The arguments of cao are a mixture of those from vgam and cqo, with some extras in cao.control. Currently, not all of the arguments work properly.

CAO can loosely be thought of as the result of fitting generalized additive models (GAMs) to several responses (e.g., species) against a very small number of latent variables. Each latent variable is a linear combination of the explanatory variables; the coefficients (called C below) are called constrained coefficients or canonical coefficients, and are interpreted as weights or loadings. The C are estimated by maximum likelihood estimation. It is often a good idea to apply scale to each explanatory variable first.
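To make the idea of a latent variable as a weighted sum of standardized explanatory variables concrete, here is a minimal base R sketch. The data are simulated and the coefficient values in C are made up; in a real fit these coefficients are what cao() estimates.

```r
set.seed(2)
# Hypothetical environmental data: 20 sites, 3 explanatory variables.
X <- matrix(rnorm(60, mean = 10, sd = 4), nrow = 20,
            dimnames = list(NULL, c("WaterCon", "BareSand", "CoveMoss")))

Xs <- scale(X)                 # centre each column and scale it to unit sd

# Made-up coefficients C (p x 1 for a rank-1 model); cao() would estimate these.
C <- matrix(c(0.6, -0.3, 0.2), ncol = 1)

nu <- drop(Xs %*% C)           # one latent-variable value (site score) per site
head(nu)
```

Standardizing first puts the coefficients on a comparable footing, which is what makes reading them as weights or loadings sensible.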

For each response (e.g., species), each latent variable is smoothed by a cubic smoothing spline, thus CAO is data-driven. If each smooth were a quadratic then CAO would simplify to constrained quadratic ordination (CQO; formerly called canonical Gaussian ordination or CGO). If each smooth were linear then CAO would simplify to constrained linear ordination (CLO). CLO can theoretically be fitted with cao by specifying df1.nl=0; however, it is more efficient to use rrvglm.
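The effect of giving a cubic smoothing spline more or less flexibility can be sketched with stats::smooth.spline on simulated data. This is an illustration under stated assumptions, not the fitting procedure of cao(): the latent variable is a stand-in, smooth.spline fits by penalized least squares to the raw counts, and its df argument is the trace of the smoother matrix rather than the nonlinear degrees of freedom that df1.nl refers to.

```r
set.seed(3)
nu <- sort(rnorm(100))                      # stand-in latent variable
y  <- rpois(100, lambda = exp(2 - nu^2))    # simulated unimodal species response

stiff    <- smooth.spline(nu, y, df = 3)    # few degrees of freedom: stiff curve
flexible <- smooth.spline(nu, y, df = 6)    # more degrees of freedom: wigglier curve

# Residual sum of squares of each fit on the original data.
rss <- c(stiff    = sum((y - predict(stiff, nu)$y)^2),
         flexible = sum((y - predict(flexible, nu)$y)^2))
rss
```

The stiffer curve cannot track the data as closely, so its residual sum of squares is at least as large as that of the flexible one.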

Currently, only Rank=1 is implemented, and only Norrr = ~1 models are handled.

With binomial data, the default formula is

    logit(P[Y_s = 1]) = eta_s = f_s(nu)   for each species s,

where x_2 is a vector of environmental variables, and nu=C^T x_2 is an R-vector of latent variables. The eta_s is an additive predictor for species s, and it models the probabilities of presence as an additive model on the logit scale. The matrix C is estimated from the data, as are the smooth functions f_s. The argument Norrr = ~ 1 specifies that the vector x_1, defined for RR-VGLMs and QRR-VGLMs, is simply a 1 for an intercept. Here, the intercept in the model is absorbed into the functions. A cloglog link may be preferable to a logit link.
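The two links mentioned above map the same additive predictor to different presence probabilities. The base R sketch below compares them for a few arbitrary values of eta (chosen only for illustration), using plogis() for the inverse logit and the inverse link stored in binomial(link = "cloglog").

```r
eta <- c(-2, 0, 2)   # example values of the additive predictor

p_logit   <- plogis(eta)                               # inverse logit
p_cloglog <- binomial(link = "cloglog")$linkinv(eta)   # 1 - exp(-exp(eta))

round(cbind(eta, p_logit, p_cloglog), 3)
```

The logit link is symmetric about eta = 0, whereas the cloglog link is not, which is one reason the choice can matter for presence/absence data.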

With Poisson count data, the formula is

    log(E[Y_s]) = eta_s = f_s(nu)   for each species s,

which models the mean response as an additive model on the log scale.

The fitted latent variables (site scores) are scaled to have unit variance. The concept of a tolerance is undefined for CAO models, but the optima and maxima are defined. The generic functions Max and Opt should work for CAO objects, but note that if the maximum occurs at the boundary then Max will return an NA. Inference for CAO models is currently undeveloped.
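The notions of an optimum (where the curve peaks along the latent variable) and a maximum (the height of the peak) can be sketched with a small grid search. The helper grid_opt_max below is hypothetical and is not how the Opt and Max generics are implemented; returning NA for both quantities when the peak sits at the edge of the grid is a choice made for this sketch, echoing the boundary behaviour described for Max.

```r
# Hypothetical helper: optimum and maximum of a curve evaluated on a grid.
# Returns NA for both when the largest value sits at either end of the grid.
grid_opt_max <- function(nu, fitted) {
  i <- which.max(fitted)
  if (i == 1L || i == length(fitted)) {
    return(c(opt = NA_real_, max = NA_real_))
  }
  c(opt = nu[i], max = fitted[i])
}

nu <- seq(-3, 3, by = 0.01)
interior <- grid_opt_max(nu, exp(2 - (nu - 0.5)^2))  # peak inside the range
boundary <- grid_opt_max(nu, exp(0.8 * nu))          # monotone: peak at the edge
interior
boundary
```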

----------Value----------

An object of class "cao" (this may change to "rrvgam" in the future). Several generic functions can be applied to the object, e.g., Coef, ccoef, lvplot, summary.

----------Warning----------

CAO is very costly to compute. With version 0.7-8 it took 28 minutes on a fast machine. I hope to look at ways of speeding things up in the future.

Use set.seed just prior to calling cao() to make your results reproducible. The reason is that finding the optimal CAO model is a difficult optimization problem, partly because the log-likelihood function contains many local solutions. To obtain the (global) solution the user is advised to try many initial values. This can be done by setting Bestof to some appropriate value (see cao.control). Trying many initial values becomes progressively more important as the nonlinear degrees of freedom of the smooths increase.
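The advice above, seeding the random number generator and then trying several starting values, can be illustrated with a generic multi-start optimization in base R. The objective bumpy is a made-up function with several local minima, and n_starts merely plays the role that Bestof plays for cao(); nothing here reproduces the internals of cao.control.

```r
# A deliberately bumpy objective with several local minima (illustrative only).
bumpy <- function(b) sum(b^2) + 3 * sum(1 - cos(3 * b))

set.seed(149)                       # makes the random starting values repeatable
n_starts <- 7                       # plays the role of Bestof in this sketch
fits <- lapply(seq_len(n_starts), function(i) {
  optim(par = runif(2, -4, 4), fn = bumpy, method = "BFGS")
})

values <- vapply(fits, function(f) f$value, numeric(1))
sort(values)                        # spread of the local solutions that were found
best <- fits[[which.min(values)]]   # keep the fit with the smallest objective
best$par
```

Different starts can settle in different local minima, so keeping the best of several runs gives a better chance of reaching the global solution, and the seed makes the whole search repeatable.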

Currently the dispersion parameter for a gaussianff CAO model is estimated slightly differently and may be slightly biased downwards (usually a little too small).

----------Note----------

CAO models are computationally expensive, so setting trace = TRUE is a good idea, as is running the fit on a simple random sample of the data set rather than on the full data.
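Drawing a simple random sample of rows before fitting takes one call to sample(). The data frame big below is a simulated stand-in for a large data set, and the sample size of 500 is an arbitrary choice for illustration.

```r
set.seed(4)
# Stand-in for a large data set: 5000 sites, one explanatory variable.
big <- data.frame(site = seq_len(5000), x = rnorm(5000))

# Simple random sample of 500 rows, drawn without replacement.
keep  <- sample(nrow(big), size = 500)
small <- big[keep, ]
dim(small)
```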

Sometimes the IRLS algorithm does not converge within the FORTRAN code. This results in warnings being issued. In particular, an error code of 3 indicates that the IRLS algorithm has not converged. One possible remedy is to increase or decrease the nonlinear degrees of freedom so that the curves become more or less flexible, respectively.

----------Author(s)----------

T. W. Yee

----------References----------

Yee, T. W. (2006) Constrained additive ordination. Ecology, 87, 203–213.
http://www.stat.auckland.ac.nz/~yee contains further information and examples.

----------See Also----------

cao.control, Coef.cao, cqo, lv, Opt, Max, persp.cao, poissonff, binomialff, negbinomial, gamma2, gaussianff, set.seed, gam.

----------Examples----------

## Not run:
library(VGAM) # Provides cao() and the hspider data set
hspider[,1:6] = scale(hspider[,1:6]) # Standardized environmental vars
set.seed(149) # For reproducible results
ap1 = cao(cbind(Pardlugu, Pardmont, Pardnigr, Pardpull) ~
      WaterCon + BareSand + FallTwig +
      CoveMoss + CoveHerb + ReflLux,
      family = poissonff, data = hspider, Rank = 1,
      df1.nl = c(Pardpull=2.7, 2.5),
      Bestof = 7, Crow1positive = FALSE)
sort(ap1@misc$deviance.Bestof) # A history of all the iterations

Coef(ap1)
ccoef(ap1)

par(mfrow=c(2,2))
plot(ap1) # All the curves are unimodal; some quite symmetric

par(mfrow=c(1,1), las=1)
index = 1:ncol(ap1@y)
lvplot(ap1, lcol=index, pcol=index, y=TRUE)

trplot(ap1, label=TRUE, col=index)
abline(a=0, b=1, lty=2)

trplot(ap1, label=TRUE, col="blue", log="xy", whichSp=c(1,3))
abline(a=0, b=1, lty=2)

persp(ap1, col=index, lwd=2, label=TRUE)
abline(v=Opt(ap1), lty=2, col=index)
abline(h=Max(ap1), lty=2, col=index)

## End(Not run)

When reposting, please credit 生物统计家园网 (Biostatistics Home, http://www.biostatistic.net).
