loglm(MASS)
loglm() belongs to the R package: MASS
Fit Log-Linear Models by Iterative Proportional Scaling
Translator: LoveR bot, 生物统计家园网
Description
This function provides a front-end to the standard function, loglin, to allow log-linear models to be specified and fitted in a manner similar to that of other fitting functions, such as glm.
Usage
loglm(formula, data, subset, na.action, ...)
Arguments
Argument: formula
A linear model formula specifying the log-linear model. If the left-hand side is empty, the data argument is required and must be a (complete) array of frequencies. In this case the variables on the right-hand side may be the names of the dimnames attribute of the frequency array, or may be the positive integers: 1, 2, 3, ... used as alternative names for the 1st, 2nd, 3rd, ... dimension (classifying factor). If the left-hand side is not empty it specifies a vector of frequencies. In this case the data argument, if present, must be a data frame from which the left-hand side vector and the classifying factors on the right-hand side are (preferentially) obtained. The usual abbreviation of a . to stand for "all other variables in the data frame" is allowed. Any non-factors on the right-hand side of the formula are coerced to factor.
Argument: data
Numeric array or data frame. In the first case it specifies the array of frequencies; in the second it provides the data frame from which the variables occurring in the formula are preferentially obtained in the usual way. This argument may be the result of a call to xtabs.
Argument: subset
Specifies a subset of the rows in the data frame to be used. The default is to take all rows.
Argument: na.action
Specifies a method for handling missing observations. The default is to fail if missing values are present.
Argument: ...
May supply other arguments to the function loglm1.
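As a minimal sketch of the calling conventions described in the arguments above, the following fits the same independence model in three ways and then uses subset. It assumes the HairEyeColor table from R's datasets package, which is not part of the original examples, and all object names are illustrative only.

```r
library(MASS)

# Assumption: HairEyeColor (datasets package) is a 4 x 4 x 2 table of counts
# whose dimnames are named Hair, Eye and Sex.

# Empty left-hand side: 'data' is the frequency array itself.
by_name   <- loglm(~ Hair + Eye + Sex, data = HairEyeColor)
by_number <- loglm(~ 1 + 2 + 3, data = HairEyeColor)  # dimensions by position

# Non-empty left-hand side: frequencies are a column of a data frame.
hec <- as.data.frame(HairEyeColor)   # columns Hair, Eye, Sex, Freq
by_frame <- loglm(Freq ~ Hair + Eye + Sex, data = hec)

# All three describe mutual independence, so the deviances should agree.
c(deviance(by_name), deviance(by_number), deviance(by_frame))

# 'subset' keeps only some rows of the data frame before the table is built.
female_rows  <- loglm(Freq ~ Hair + Eye, data = hec, subset = Sex == "Female")
female_array <- loglm(~ Hair + Eye, data = HairEyeColor[, , "Female"])
c(deviance(female_rows), deviance(female_array))
```

The array and data frame forms are interchangeable here only because every cell of the table appears as a row of the data frame.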
Details
If the left-hand side of the formula is empty the data argument supplies the frequency array and the right-hand side of the formula is used to construct the list of fixed faces as required by loglin. Structural zeros may be specified by giving a start argument with those entries set to zero, as described in the help information for loglin.
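A hedged sketch of the start mechanism mentioned above: a small made-up 3 x 3 table whose diagonal cells are treated as structural zeros. The counts, the dimension names and the object names are hypothetical, chosen only for illustration.

```r
library(MASS)

# Hypothetical counts; the diagonal cells are impossible by design.
tab <- matrix(c(0, 12,  7,
                9,  0,  5,
                4, 11,  0),
              nrow = 3, byrow = TRUE,
              dimnames = list(row = c("a", "b", "c"),
                              col = c("a", "b", "c")))

# Start values: 1 for ordinary cells, 0 for structural zeros.
st <- 1 - diag(3)

# fit = TRUE asks for the fitted table to be kept in the result.
qi <- loglm(~ row + col, data = tab, start = st, fit = TRUE)

fitted(qi)   # the diagonal entries stay at zero
qi$df        # residual degrees of freedom as reported by loglm
```

Because iterative proportional scaling only rescales the start values, cells that start at zero remain zero in the fitted table.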
If the left-hand side is not empty, all variables on the right-hand side are regarded as classifying factors and an array of frequencies is constructed. If some cells in the complete array are not specified they are treated as structural zeros. The right-hand side of the formula is again used to construct the list of faces on which the observed and fitted totals must agree, as required by loglin. Hence terms such as a:b, a*b and a/b are all equivalent.
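To illustrate the last sentence, the sketch below fits three formulas that differ only in the operator joining two factors. It again assumes the HairEyeColor table from the datasets package, which the original page does not use; the object names are illustrative.

```r
library(MASS)

# Only the highest-order margins matter to loglin, so these three
# formulas all fix the Hair x Eye margin and the Sex margin.
fit_colon <- loglm(~ Hair:Eye + Sex, data = HairEyeColor)
fit_star  <- loglm(~ Hair*Eye + Sex, data = HairEyeColor)
fit_slash <- loglm(~ Hair/Eye + Sex, data = HairEyeColor)

sapply(list(colon = fit_colon, star = fit_star, slash = fit_slash), deviance)
```

This differs from glm, where a:b without the main effects would generally define a different model matrix.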
Value
An object of class "loglm" conveying the results of the fitted log-linear model. Methods exist for the generic functions print, summary, deviance, fitted, coef, resid, anova and update, which perform the expected tasks. Only log-likelihood ratio tests are allowed using anova.
The deviance is simply an alternative name for the log-likelihood ratio statistic for testing the current model within a saturated model, in accordance with standard usage in generalized linear models.
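As a rough check of this definition, the likelihood ratio statistic can be recomputed by hand from the observed and fitted tables. The sketch assumes the HairEyeColor table from the datasets package (not used in the original page); the names obs, fv and G2 are illustrative.

```r
library(MASS)

obs <- HairEyeColor
fit <- loglm(~ Hair + Eye + Sex, data = obs, fit = TRUE)  # keep fitted values
fv  <- fitted(fit)

# G^2 = 2 * sum(observed * log(observed / fitted)), skipping empty cells,
# which contribute nothing to the sum.
pos <- obs > 0
G2  <- 2 * sum(obs[pos] * log(obs[pos] / fv[pos]))

c(by_hand = G2, deviance = deviance(fit))
```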
Warning
If structural zeros are present, the calculation of degrees of freedom may not be correct. loglin itself takes no action to allow for structural zeros. loglm deducts one degree of freedom for each structural zero, but cannot make allowance for gains in error degrees of freedom due to loss of dimension in the model space. (This would require checking the rank of the model matrix, but since iterative proportional scaling methods are developed largely to avoid constructing the model matrix explicitly, the computation is at least difficult.)
When structural zeros (or zero fitted values) are present the estimated coefficients will not be available due to infinite estimates. The deviances will normally continue to be correct, though.
References
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
See Also
loglm1, loglin
Examples
# The data frames Cars93, minn38 and quine are available
# in the MASS package.
library(MASS)
# Case 1: frequencies specified as an array.
sapply(minn38, function(x) length(levels(x)))
## hs phs fol sex   f
##  3   4   7   2   0
minn38a <- array(0, c(3,4,7,2), lapply(minn38[, -5], levels))
minn38a[data.matrix(minn38[,-5])] <- minn38$f # f holds the cell frequencies
fm <- loglm(~1 + 2 + 3 + 4, minn38a) # numerals as names.
deviance(fm)
## [1] 3711.9
fm1 <- update(fm, .~.^2)
fm2 <- update(fm, .~.^3, print = TRUE)
## 5 iterations: deviation 0.0750732
anova(fm, fm1, fm2)
## Not run:
## LR tests for hierarchical log-linear models
##
## Model 1:
##  ~ 1 + 2 + 3 + 4
## Model 2:
##  . ~ 1 + 2 + 3 + 4 + 1:2 + 1:3 + 1:4 + 2:3 + 2:4 + 3:4
## Model 3:
##  . ~ 1 + 2 + 3 + 4 + 1:2 + 1:3 + 1:4 + 2:3 + 2:4 + 3:4 +
##      1:2:3 + 1:2:4 + 1:3:4 + 2:3:4
##
##            Deviance  df Delta(Dev) Delta(df) P(> Delta(Dev)
## Model 1    3711.915 155
## Model 2     220.043 108   3491.873        47        0.00000
## Model 3      47.745  36    172.298        72        0.00000
## Saturated     0.000   0     47.745        36        0.09114
## End(Not run)
# Case 1. An array generated with xtabs.
loglm(~ Type + Origin, xtabs(~ Type + Origin, Cars93))
## Not run:
## Call:
## loglm(formula = ~Type + Origin, data = xtabs(~Type + Origin,
##     Cars93))
##
## Statistics:
##                     X^2 df  P(> X^2)
## Likelihood Ratio 18.362  5 0.0025255
## Pearson          14.080  5 0.0151101
## End(Not run)
# Case 2. Frequencies given as a vector in a data frame
names(quine)
## [1] "Eth"  "Sex"  "Age"  "Lrn"  "Days"
fm <- loglm(Days ~ .^2, quine)
gm <- glm(Days ~ .^2, poisson, quine) # check glm.
c(deviance(fm), deviance(gm)) # deviances agree
## [1] 1368.7 1368.7
c(fm$df, gm$df.residual) # resid df do not!
## [1] 127 128
# The loglm residual degrees of freedom is wrong because of
# a non-detectable redundancy in the model matrix.