R package VGAM: help documentation for the multinomial() function

loveR · posted 2012-10-1 15:45:13

multinomial(VGAM)
multinomial() belongs to the R package VGAM

                                       Multinomial Logit Model

                                       Translator: LoveR, the biostatistic.net (生物统计家园网) robot

----------Description----------

Fits a multinomial logit model to a (preferably unordered) factor response.

----------Usage----------

multinomial(zero = NULL, parallel = FALSE, nointercept = NULL,
         refLevel = "last", whitespace = FALSE)

----------Arguments----------

Argument: zero
An integer-valued vector specifying which linear/additive predictors are modelled as intercepts only. All values must come from the set {1,2,...,M}. The default (NULL) means that none are modelled as intercept-only.

Argument: parallel
A logical, or a formula specifying which terms have equal/unequal coefficients.

Arguments: nointercept, whitespace
See CommonVGAMffArguments for more details.

Argument: refLevel
Either a single positive integer or a value of the factor. If an integer, it specifies which column of the response matrix is the reference or baseline level. The default is the last one (the (M+1)th one). When this argument is used, it is often set to 1. If it is given as a value of a factor, beware of levels of the factor that do not occur in the data (drop.unused.levels = TRUE or drop.unused.levels = FALSE). See the example below.
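
The caution about unused levels can be illustrated in base R alone. This is only a sketch of why the position of a level may shift; it does not show how VGAM itself resolves refLevel, and the small factor below is made up for illustration.

```r
# Made-up factor in which the level "Trt1" is never observed
yfactor <- factor(c("Control", "Trt2", "Trt2"),
                  levels = c("Control", "Trt1", "Trt2"))

levels(yfactor)               # three levels; "Trt2" is in position 3
levels(droplevels(yfactor))   # two levels;   "Trt2" is now in position 2

pos_keep <- which(levels(yfactor) == "Trt2")
pos_drop <- which(levels(droplevels(yfactor)) == "Trt2")
c(keep = pos_keep, drop = pos_drop)
```

An integer refLevel chosen with one set of levels in mind may therefore point at a different column once unused levels are dropped.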

----------Details----------

In this help file the response Y is assumed to be a factor with unordered values 1,2,…,M+1, so that M is the number of linear/additive predictors eta_j.

The default model can be written

eta_j = log(P[Y=j] / P[Y=M+1])

where eta_j is the jth linear/additive predictor. Here, j=1,…,M, and eta_{M+1} is 0 by definition. That is, the last level of the factor, or last column of the response matrix, is taken as the reference level or baseline; this is for identifiability of the parameters. The reference or baseline level can be changed with the refLevel argument.
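
Because eta_{M+1} is 0, the class probabilities are proportional to exp(eta_j) for j=1,…,M+1. A minimal base R sketch of this inverse mapping follows. The helper name eta_to_prob and the numeric values are made up for illustration; this is not VGAM's internal code.

```r
# Hypothetical helper (not part of VGAM): map M linear predictors to
# M+1 class probabilities, taking the last level as the reference.
eta_to_prob <- function(eta) {
  eta_full <- c(eta, 0)                 # eta_{M+1} = 0 by definition
  w <- exp(eta_full - max(eta_full))    # subtract the max for numerical stability
  w / sum(w)
}

eta <- c(0.5, -1.2)                     # made-up values, so M = 2
p <- eta_to_prob(eta)
sum(p)                                  # the probabilities sum to 1
log(p[1:2] / p[3])                      # recovers eta
```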

In almost all the literature, the constraint matrices associated with this family of models are known. For example, setting parallel = TRUE will make all constraint matrices (except for the intercept) equal to a vector of M ones. If the constraint matrices are unknown and are to be estimated, this can be achieved by fitting the model as a reduced-rank vector generalized linear model (RR-VGLM; see rrvglm). In particular, a multinomial logit model with unknown constraint matrices is known as a stereotype model (Anderson, 1984), and can be fitted with rrvglm.
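
The effect of a constraint matrix can be sketched with plain matrices. The example assumes that the coefficients of a covariate across the M linear predictors are obtained as the constraint matrix times a vector of free parameters. All numbers are made up and nothing here calls VGAM.

```r
M <- 3
H_parallel <- matrix(1, nrow = M, ncol = 1)   # parallel = TRUE: a vector of M ones
H_free     <- diag(M)                         # no constraint: the identity matrix

beta_star_parallel <- 0.7                     # one free coefficient (made up)
beta_star_free     <- c(0.7, -0.2, 1.1)       # M free coefficients (made up)

# Coefficient of the covariate in each of the M linear predictors
coef_parallel <- drop(H_parallel %*% beta_star_parallel)
coef_free     <- drop(H_free %*% beta_star_free)
coef_parallel                                 # identical across predictors
coef_free                                     # differs across predictors
```

In a stereotype model the entries of such a matrix would be estimated rather than fixed in advance.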

----------Value----------

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, rrvglm and vgam.

----------Warning----------

No check is made to verify that the response is nominal.

See CommonVGAMffArguments for more warnings.

----------Note----------

The response should be either a matrix of counts (with row sums that are all positive), or a factor. In both cases, the y slot returned by vglm/vgam/rrvglm is the matrix of sample proportions.
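
The relation between a matrix of counts and the matrix of sample proportions can be reproduced in base R. This is a sketch with simulated counts, not output taken from a fitted VGAM object.

```r
set.seed(1)
# 5 rows of simulated counts over 3 categories, 20 trials per row
ycounts <- t(rmultinom(5, size = 20, prob = c(0.1, 0.3, 0.6)))

props <- ycounts / rowSums(ycounts)   # divide each row by its total
rowSums(ycounts)                      # row totals (all 20 here), all positive
rowSums(props)                        # every row of proportions sums to 1
```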

The multinomial logit model is more appropriate for a nominal (unordered) factor response than for an ordinal (ordered) factor response. Models better suited to the latter include those based on cumulative probabilities, e.g., cumulative.

multinomial is prone to numerical difficulties if the groups are separable and/or the fitted probabilities are close to 0 or 1. The fitted values returned are estimates of the probabilities P[Y=j] for j=1,…,M+1. See safeBinaryRegression for the logistic regression case.

Here is an example of the usage of the parallel argument. If there are covariates x2, x3 and x4, then parallel = TRUE ~ x2 + x3 - 1 and parallel = FALSE ~ x4 are equivalent. This would constrain the regression coefficients for x2 and x3 to be equal; those of the intercepts and x4 would be different.

In Example 5 below, a conditional logit model is fitted to an artificial data set that explores how cost and travel time affect people's decision about how to travel to work. Walking is the baseline group. The variable Cost.car is the difference between the cost of travelling to work by car and by walking, etc. The variable Time.car is the difference between the travel time to work by car and by walking, etc. For other details about the xij argument see vglm.control and fill.

The multinom function in the nnet package uses the first level of the factor as baseline, whereas the last level of the factor is used here. Consequently the estimated regression coefficients differ.
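
Coefficients under the two baselines can be reconciled by a subtraction, since log(P[Y=j]/P[Y=1]) = eta_j - eta_1 with eta_{M+1} = 0. The sketch below uses a made-up coefficient matrix for M = 2 rather than output from nnet or VGAM.

```r
# Made-up coefficients under the last-level baseline:
# rows are terms, columns are eta_1 and eta_2 (levels 1, 2, 3)
B_last <- matrix(c(0.4, -1.0,    # column for log(P1/P3)
                   1.5,  0.3),   # column for log(P2/P3)
                 nrow = 2,
                 dimnames = list(c("(Intercept)", "x2"),
                                 c("log(P1/P3)", "log(P2/P3)")))

B_full  <- cbind(B_last, 0)               # append the reference column (eta_3 = 0)
B_first <- B_full[, 2:3] - B_full[, 1]    # subtract the new baseline's column
colnames(B_first) <- c("log(P2/P1)", "log(P3/P1)")
B_first
```

The fitted probabilities are the same under either parameterisation; only the labelling of the coefficients changes.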

----------Author(s)----------

Thomas W. Yee

----------References----------

Yee, T. W. (2010) The VGAM package for categorical data analysis. Journal of Statistical Software, 32, 1–34. http://www.jstatsoft.org/v32/i10/.
Yee, T. W. and Hastie, T. J. (2003) Reduced-rank vector generalized linear models. Statistical Modelling, 3, 15–41.
McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models, 2nd ed. London: Chapman & Hall.
Agresti, A. (2002) Categorical Data Analysis, 2nd ed. New York: Wiley.
Hastie, T. J., Tibshirani, R. J. and Friedman, J. H. (2009) The Elements of Statistical Learning: Data Mining, Inference and Prediction, 2nd ed. New York: Springer-Verlag.
Simonoff, J. S. (2003) Analyzing Categorical Data, New York: Springer-Verlag.
Anderson, J. A. (1984) Regression and ordered categorical variables. Journal of the Royal Statistical Society, Series B, Methodological, 46, 1–30.
Further information and examples on categorical data analysis by the VGAM package can be found at http://www.stat.auckland.ac.nz/~yee/VGAM/doc/categorical.pdf.

----------See Also----------

margeff, cumulative, acat, cratio, sratio, dirichlet, dirmultinomial, rrvglm, fill1, Multinomial, iris. The author's homepage has further documentation about categorical data analysis using VGAM.

----------Examples----------

library(VGAM)

# Example 1: fit a multinomial logit model to Edgar Anderson's iris data
data(iris)
## Not run: 
fit = vglm(Species ~ ., multinomial, iris)
coef(fit, matrix = TRUE)
## End(Not run)

# Example 2a: a simple example
# (rmultinom rescales prob internally so that it sums to 1)
ycounts = t(rmultinom(10, size = 20, prob = c(0.1, 0.2, 0.8))) # Counts
fit = vglm(ycounts ~ 1, multinomial)
head(fitted(fit)) # Proportions
fit@prior.weights # NOT recommended for extraction of prior weights
weights(fit, type = "prior", matrix = FALSE) # The better method
depvar(fit)       # Sample proportions; same as fit@y
constraints(fit) # Constraint matrices

# Example 2b: a different reference level used as the baseline
fit2 = vglm(ycounts ~ 1, multinomial(refLevel = 2))
coef(fit2, matrix = TRUE)
coef(fit , matrix = TRUE) # Easy to reconcile this output with fit2

# Example 3: the response is a factor
nn = 10
dframe3 = data.frame(yfactor = gl(3, nn, labels = c("Control", "Trt1", "Trt2")),
                  x2 = runif(3 * nn))
myrefLevel = with(dframe3, yfactor[12])  # Observation 12 has the value "Trt1"
fit3a = vglm(yfactor ~ x2, multinomial(refLevel = myrefLevel), dframe3)
fit3b = vglm(yfactor ~ x2, multinomial(refLevel = 2), dframe3)
coef(fit3a, matrix = TRUE)  # "Trt1" is the reference level
coef(fit3b, matrix = TRUE)  # "Trt1" is the reference level
margeff(fit3b)

# Example 4: fit a rank-1 stereotype model
data(car.all)
fit4 = rrvglm(Country ~ Width + Height + HP, multinomial, car.all)
coef(fit4) # Contains the C matrix
constraints(fit4)$HP    # The A matrix
coef(fit4, matrix = TRUE)  # The B matrix
Coef(fit4)@C             # The C matrix
ccoef(fit4)             # Better to get the C matrix this way
Coef(fit4)@A             # The A matrix
svd(coef(fit4, matrix = TRUE)[-1, ])$d # This has rank 1; = C %*% t(A)

# Example 5: the use of the xij argument (aka conditional logit model)
set.seed(111)
nn = 100  # Number of people who travel to work
M = 3  # There are M+1 modes of transport to go to work
ycounts = matrix(0, nn, M+1)
ycounts[cbind(1:nn, sample(x = M+1, size = nn, replace = TRUE))] = 1
dimnames(ycounts) = list(NULL, c("bus","train","car","walk"))
gotowork = data.frame(cost.bus  = runif(nn), time.bus  = runif(nn),
                  cost.train= runif(nn), time.train= runif(nn),
                  cost.car  = runif(nn), time.car  = runif(nn),
                  cost.walk = runif(nn), time.walk = runif(nn))
gotowork = round(gotowork, digits = 2) # For convenience
gotowork = transform(gotowork,
                  Cost.bus = cost.bus - cost.walk,
                  Cost.car = cost.car - cost.walk,
                  Cost.train = cost.train - cost.walk,
                  Cost    = cost.train - cost.walk, # for labelling
                  Time.bus = time.bus - time.walk,
                  Time.car = time.car - time.walk,
                  Time.train = time.train - time.walk,
                  Time    = time.train - time.walk) # for labelling
fit = vglm(ycounts ~ Cost + Time,
         multinomial(parallel = TRUE ~ Cost + Time - 1),
         xij = list(Cost ~ Cost.bus + Cost.train + Cost.car,
                  Time ~ Time.bus + Time.train + Time.car),
         form2 =  ~ Cost + Cost.bus + Cost.train + Cost.car +
                  Time + Time.bus + Time.train + Time.car,
         data = gotowork, trace = TRUE)
head(model.matrix(fit, type = "lm")) # LM model matrix
head(model.matrix(fit, type = "vlm"))  # Big VLM model matrix
coef(fit)
coef(fit, matrix = TRUE)
constraints(fit)
summary(fit)
max(abs(predict(fit) - predict(fit, newdata = gotowork))) # Should be 0

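When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the 生物统计家园网 robot. It is intended only as a personal reference for learning R; 生物统计家园 reserves the copyright.
Note 2: Because the translation was produced automatically by a robot, inaccuracies are unavoidable. Comparing the Chinese and English text carefully and repeatedly can help with learning R.
Note 3: If you find inaccuracies, please reply below this post and we will revise them over time.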
