robcov(rms)
robcov() belongs to R package: rms
Robust Covariance Matrix Estimates
Translator: 生物统计家园网 bot LoveR
----------Description----------
Uses the Huber-White method to adjust the variance-covariance matrix of a fit from maximum likelihood or least squares, correcting for heteroscedasticity and for correlated responses from cluster samples. The method keeps the ordinary estimates of the regression coefficients and other model parameters, but corrects the covariance matrix for model misspecification and sampling design. It is currently implemented for models that provide a residuals(fit,type="score") method, such as lrm, cph, coxph, and ordinary linear models (ols). For certain models the fit must have been made with the x=TRUE and y=TRUE options. Observations in different clusters are assumed to be independent. In the special case where every cluster contains one observation, the corrected covariance matrix returned is the "sandwich" estimator (see Lin and Wei). This is a consistent estimate of the covariance matrix even if the model is misspecified (e.g. heteroscedasticity, underdispersion, wrong covariate form).
For the special case of ols fits, robcov can compute the improved (especially for small samples) Efron estimator that adjusts for natural heterogeneity of residuals (see Long and Ervin (2000) estimator HC3).
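The cluster adjustment described above can be illustrated with a minimal base-R sketch for a linear model. It is not robcov's own code: the simulated data, the cluster variable `cl`, and the names `U`, `Uc`, `bread`, `V.cluster` and `V.indep` are all assumptions made for illustration. The sketch sums the per-observation score contributions within each cluster and places their cross-product between two copies of the inverse of X'X.

```r
# Base-R sketch of a cluster "sandwich" covariance estimate for a linear model.
# All data and names here are hypothetical and only illustrate the idea.
set.seed(2)
n  <- 40
cl <- rep(1:10, each = 4)                    # hypothetical cluster ids, 4 obs each
x  <- rnorm(n)
y  <- 1 + 2 * x + rnorm(10)[cl] + rnorm(n)   # shared within-cluster effect

fit <- lm(y ~ x)
X   <- model.matrix(fit)
U   <- X * resid(fit)            # row i of X scaled by residual i (score contributions)
Uc  <- rowsum(U, cl)             # sum the score contributions within each cluster

bread     <- solve(crossprod(X))                   # (X'X)^{-1}
V.cluster <- bread %*% crossprod(Uc) %*% bread     # clusters treated as the independent units
V.indep   <- bread %*% crossprod(U)  %*% bread     # every observation is its own cluster

# Ratio of variances: values above 1 suggest positive intra-cluster correlation
diag(V.cluster) / diag(V.indep)
```

When every cluster holds a single observation, `Uc` equals `U` and the two estimates coincide, which is the special case the description calls the "sandwich" estimator.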
----------Usage----------
robcov(fit, cluster, method=c('huber','efron'))
----------Arguments----------
Argument: fit
a fit object from the rms series
Argument: cluster
a variable indicating groupings. cluster may be any type of vector (factor, character, integer). NAs are not allowed. Unique values of cluster indicate possibly correlated groupings of observations. Note the data used in the fit and stored in fit$x and fit$y may have had observations containing missing values deleted. It is assumed that if any NAs were removed during the original model fitting, an naresid function exists to restore NAs so that the rows of the score matrix coincide with cluster. If cluster is omitted, it defaults to the integers 1,2,...,n to obtain the "sandwich" robust covariance matrix estimate.
Argument: method
can be set to "efron" for ols fits only. The default is the Huber-White estimator of the covariance matrix.
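To show what a small-sample correction of the HC3 type does, here is a hedged base-R sketch that computes the plain sandwich (HC0) and the HC3 form by hand for a linear model. It assumes the standard HC3 formula, in which each residual is divided by one minus its leverage; it is not taken from robcov's source, and robcov's "efron" output is not guaranteed to match it digit for digit. The simulated data and the names `V.hc0` and `V.hc3` are hypothetical.

```r
# Base-R sketch: HC0 versus HC3 heteroscedasticity-consistent covariance for lm().
# Hypothetical data; the HC3 formula used here is an assumption, not robcov's code.
set.seed(3)
n <- 25
x <- runif(n, 0, 10)
y <- 1 + 0.5 * x + rnorm(n, sd = 0.3 * (1 + x))   # error variance grows with x

fit <- lm(y ~ x)
X   <- model.matrix(fit)
e   <- resid(fit)
h   <- hatvalues(fit)             # leverages, the diagonal of the hat matrix

bread <- solve(crossprod(X))
V.hc0 <- bread %*% crossprod(X * e) %*% bread              # uses e_i^2
V.hc3 <- bread %*% crossprod(X * (e / (1 - h))) %*% bread  # uses e_i^2 / (1 - h_i)^2

# HC3 standard errors relative to HC0; the ratio is at least 1
sqrt(diag(V.hc3)) / sqrt(diag(V.hc0))
```

Because each leverage lies strictly between 0 and 1 for a full-rank model with an intercept, dividing by `1 - h` inflates every residual, so the HC3 variances cannot fall below the HC0 ones. The gap is largest in small samples with high-leverage points.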
----------Value----------
a new fit object with the same class as the original fit, and with the element orig.var added. orig.var is the covariance matrix of the original fit. Also, the original var component is replaced with the new Huberized estimates.
----------Warnings----------
Adjusted ols fits do not have the corrected standard errors printed with print.ols. Use sqrt(diag(adjfit$var)) to get this, where adjfit is the result of robcov.
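A short sketch of how the components described under Value and Warnings might be used together: it pulls the original and adjusted standard errors from the `orig.var` and `var` elements and builds a small comparison table by hand. The simulated data and the names `se.orig` and `se.robust` are assumptions for illustration; only `ols`, `robcov`, `var` and `orig.var` come from the text above.

```r
# Sketch: extract original and robust standard errors from a robcov-adjusted ols fit.
# Hypothetical data; assumes the rms package is installed.
library(rms)
set.seed(4)
n  <- 50
x1 <- rnorm(n)
y  <- 2 + x1 + rnorm(n) * (1 + abs(x1))    # heteroscedastic errors

f      <- ols(y ~ x1, x = TRUE, y = TRUE)  # x=TRUE, y=TRUE as required above
adjfit <- robcov(f)                        # no cluster given: one observation per cluster

se.orig   <- sqrt(diag(adjfit$orig.var))   # from the original covariance matrix
se.robust <- sqrt(diag(adjfit$var))        # from the adjusted covariance matrix

# Side-by-side table with Wald z statistics based on the robust standard errors
cbind(coef = coef(adjfit), se.orig, se.robust, z = coef(adjfit) / se.robust)
```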
----------Author(s)----------
Frank Harrell
Department of Biostatistics
Vanderbilt University
f.harrell@vanderbilt.edu
----------References----------
functions.
----------See Also----------
bootcov, naresid, residuals.cph
----------Examples----------
library(rms)
# For OLS, check robcov against a more manual computation
set.seed(1)
n <- 15
x1 <- 1:n
x2 <- sample(1:n)
y <- round(x1 + x2 + 8*rnorm(n))
f <- ols(y ~ x1 + x2, x=TRUE, y=TRUE)
vcov(f)
vcov(robcov(f))
X <- f$x
G <- diag(resid(f)^2)
solve(t(X) %*% X) %*% (t(X) %*% G %*% X) %*% solve(t(X) %*% X)
# Duplicate the data and adjust for intra-cluster correlation to see that
# the cluster sandwich estimator completely ignores the duplicates
x1 <- c(x1,x1)
x2 <- c(x2,x2)
y <- c(y, y)
g <- ols(y ~ x1 + x2, x=TRUE, y=TRUE)
vcov(robcov(g, c(1:n, 1:n)))
# A dataset contains a variable number of observations per subject,
# and all observations are laid out in separate rows. The responses
# represent whether or not a given segment of the coronary arteries
# is occluded. Segments of arteries may not operate independently
# in the same patient. We assume a "working independence model" to
# get estimates of the coefficients, i.e., that estimates assuming
# independence are reasonably efficient. The job is then to get
# unbiased estimates of variances and covariances of these estimates.
n.subjects <- 30
ages <- rnorm(n.subjects, 50, 15)
sexes <- factor(sample(c('female','male'), n.subjects, TRUE))
logit <- (ages-50)/5
prob <- plogis(logit) # true prob not related to sex
id <- sample(1:n.subjects, 300, TRUE) # subjects sampled multiple times
table(table(id)) # frequencies of number of obs/subject
age <- ages[id]
sex <- sexes[id]
# In truth, observations within subject are independent:
y <- ifelse(runif(300) <= prob[id], 1, 0)
f <- lrm(y ~ lsp(age,50)*sex, x=TRUE, y=TRUE)
g <- robcov(f, id)
diag(g$var)/diag(f$var)
# add ,group=w to re-sample from within each level of w
anova(g) # cluster-adjusted Wald statistics
# fastbw(g) # cluster-adjusted backward elimination
plot(Predict(g, age=30:70, sex='female')) # cluster-adjusted confidence bands
# Get design effects based on inflation of the variances when compared
# with sandwich estimates which ignore clustering
g2 <- robcov(f)
diag(g$var)/diag(g2$var)
# Get design effects based on pooled tests of factors in model
anova(g2)[,1] / anova(g)[,1]
# A dataset contains one observation per subject, but there may be
# heteroscedasticity or other model misspecification. Obtain
# the robust sandwich estimator of the covariance matrix.
# f <- ols(y ~ pol(age,3), x=TRUE, y=TRUE)
# f.adj <- robcov(f)
When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).