validate.lrm(rms)
R package: rms
Resampling Validation of a Logistic Model
Description
When applied to an object created by lrm, the validate function performs resampling validation of a logistic regression model, with or without backward step-down variable deletion. It provides bias-corrected estimates of: Somers' D_{xy} rank correlation; the R-squared index; the intercept and slope of an overall logistic calibration equation; E_{max}, the maximum absolute difference between predicted and calibrated probabilities; the discrimination index D = (model L.R. chi-square - 1)/n; the unreliability index U = (difference in -2 log likelihood between uncalibrated X beta and X beta with overall intercept and slope calibrated to the test sample)/n; the overall quality index (logarithmic probability score) Q = D - U; the Brier or quadratic probability score B (the last 3 are not computed for ordinal models); the g-index; and gp, the g-index on the probability scale. The corrected slope can be thought of as a shrinkage factor that takes overfitting into account.
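To make a few of the indexes above concrete, here is a minimal base-R sketch that computes apparent (uncorrected) values of D_{xy}, D, and the Brier score B for a simulated binary model. It uses glm rather than lrm so that it runs without rms; the simulated data and all variable names are assumptions for illustration, and the code is not the rms implementation.

```r
# Apparent (not bias-corrected) indexes for a simulated logistic model.
set.seed(1)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.8 * x))
fit <- glm(y ~ x, family = binomial)
p <- fitted(fit)                      # predicted probabilities

# Brier (quadratic probability) score: mean squared error of p
brier <- mean((p - y)^2)

# Concordance probability C via the Mann-Whitney rank formula,
# then Somers' Dxy = 2 * (C - 0.5)
n1 <- sum(y == 1)
n0 <- sum(y == 0)
cidx <- (sum(rank(p)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
dxy <- 2 * (cidx - 0.5)

# Discrimination index D = (model L.R. chi-square - 1) / n
lr <- fit$null.deviance - fit$deviance
D <- (lr - 1) / n

round(c(Dxy = dxy, D = D, B = brier), 4)
```

These are the quantities that appear in the original-index column; validate then subtracts an estimate of optimism from each.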
Usage
# fit <- lrm(formula=response ~ terms, x=TRUE, y=TRUE)
## S3 method for class 'lrm'
validate(fit, method="boot", B=40,
bw=FALSE, rule="aic", type="residual", sls=0.05, aics=0,
force=NULL,
pr=FALSE, kint, Dxy.method=if(k==1) 'somers2' else 'lrm',
emax.lim=c(0,1), ...)
Arguments
Argument: fit
a fit derived by lrm. The options x=TRUE and y=TRUE must have been specified.
Argument: method, B, bw, rule, type, sls, aics, force, pr
see validate and predab.resample
Argument: kint
In the case of an ordinal model, specify which intercept to validate. Default is the middle intercept.
Argument: Dxy.method
Set to "lrm" to use lrm's computation of the D_{xy} correlation, which rounds predicted probabilities to the nearest .002. Use Dxy.method="somers2" (the default for binary models) to instead use the more accurate but slower somers2 function. The difference will matter most when the model is extremely predictive. The default is "lrm" for ordinal models, since somers2 only handles binary response variables.
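The effect of rounding predicted probabilities can be illustrated in base R. The sketch below computes D_{xy} once from exact probabilities and once after binning them to the nearest 0.002. The binning via round() and the helper name dxy are assumptions made for illustration; they only mimic the idea and are not lrm's or somers2's actual code.

```r
# Compare Dxy from exact vs. binned predicted probabilities.
set.seed(3)
n <- 400
x <- rnorm(n)
y <- rbinom(n, 1, plogis(2 * x))      # a fairly predictive model
p <- fitted(glm(y ~ x, family = binomial))

# Hypothetical helper: Somers' Dxy from the rank-based concordance
# probability; tied predictions count 1/2 (average ranks).
dxy <- function(p, y) {
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  cidx <- (sum(rank(p)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
  2 * (cidx - 0.5)
}

p_binned <- round(p / 0.002) * 0.002  # round to nearest .002
dxy_exact <- dxy(p, y)
dxy_binned <- dxy(p_binned, y)
c(exact = dxy_exact, binned = dxy_binned)
```

Binning creates ties among predictions that were previously ordered, which is why the two values can differ slightly.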
Argument: emax.lim
range of predicted probabilities over which to compute the maximum error. Default is the entire range.
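As a rough illustration of what restricting the range does, the following base-R sketch evaluates the maximum absolute difference between a predicted probability p and its calibrated value plogis(intercept + slope * qlogis(p)) over a grid inside a chosen range. The function name emax_sketch, the grid search, and the clipping constant are assumptions; rms may compute E_{max} differently.

```r
# Hypothetical helper: maximum calibration error over a probability range,
# given the intercept and slope of a logistic calibration equation.
emax_sketch <- function(intercept, slope, lim = c(0, 1), n_grid = 1001) {
  eps <- 1e-6                          # keep the grid away from 0 and 1
  p <- seq(max(lim[1], eps), min(lim[2], 1 - eps), length.out = n_grid)
  p_cal <- plogis(intercept + slope * qlogis(p))
  max(abs(p - p_cal))
}

e_perfect <- emax_sketch(0, 1)                         # slope 1, intercept 0
e_full    <- emax_sketch(0, 0.8)                       # overfitted: slope < 1
e_middle  <- emax_sketch(0, 0.8, lim = c(0.4, 0.6))    # restricted range
c(perfect = e_perfect, full = e_full, middle = e_middle)
```

With a calibration slope below 1, restricting attention to probabilities near 0.5 gives a smaller maximum error than the full range in this sketch.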
Argument: ...
other arguments to pass to lrm.fit (now only maxit and tol are allowed) and to predab.resample (note especially the group, cluster, and subset parameters)
Details
If the original fit was created using penalized maximum likelihood estimation, the same penalty.matrix used with the original fit is used during validation.
Value
a matrix with rows corresponding to D_{xy}, R^2, Intercept, Slope, E_{max}, D, U, Q, B, g, gp, and columns for the original index, resample estimates, indexes applied to the whole or omitted sample using the model derived from the resample, average optimism, corrected index, and number of successful re-samples.
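The relationship between these columns can be sketched in base R for a single index, the calibration slope: the optimism is the average over resamples of (index on the resample) minus (index on the original sample), and the corrected index is the original index minus that average. The code uses glm and a hand-written bootstrap loop; the data, the helper cal_slope, and the variable names are assumptions for illustration and do not reproduce predab.resample.

```r
# Bootstrap optimism correction for the calibration slope (illustrative).
set.seed(2)
n <- 300
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- rbinom(n, 1, plogis(0.7 * d$x1))   # only x1 truly matters

# Hypothetical helper: slope of y regressed on the linear predictor
cal_slope <- function(fit, data) {
  lp <- predict(fit, newdata = data)      # linear predictor X beta
  unname(coef(glm(data$y ~ lp, family = binomial))[2])
}

full <- glm(y ~ x1 + x2 + x3, family = binomial, data = d)
orig <- cal_slope(full, d)                # about 1 on the fitting data

B <- 40
optimism <- numeric(B)
for (b in seq_len(B)) {
  boot <- d[sample(n, replace = TRUE), ]
  fb <- glm(y ~ x1 + x2 + x3, family = binomial, data = boot)
  # index on the resample minus index on the original sample
  optimism[b] <- cal_slope(fb, boot) - cal_slope(fb, d)
}
corrected <- orig - mean(optimism)
c(original = orig, optimism = mean(optimism), corrected = corrected)
```

A corrected slope below 1 is the shrinkage interpretation mentioned in the Description.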
Side Effects
prints a summary, and optionally prints statistics for each re-fit
Author(s)
Frank Harrell
Department of Biostatistics, Vanderbilt University
f.harrell@vanderbilt.edu
References
Miller ME, Hui SL, Tierney WM (1991): Validation techniques for logistic regression models. Stat in Med 10:1213–1226.
Harrell FE, Lee KL (1985): A comparison of the discrimination of discriminant analysis and logistic regression under multivariate normality. In Biostatistics: Statistics in Biomedical, Public Health, and Environmental Sciences. The Bernard G. Greenberg Volume, ed. PK Sen. New York: North-Holland, p. 333–343.
See Also
predab.resample, fastbw, lrm, rms, rms.trans, calibrate, somers2, cr.setup, gIndex
Examples
library(rms)
n <- 1000    # define sample size
age            <- rnorm(n, 50, 10)
blood.pressure <- rnorm(n, 120, 15)
cholesterol    <- rnorm(n, 200, 25)
sex            <- factor(sample(c('female','male'), n, TRUE))
# Specify population model for log odds that Y=1
L <- .4*(sex=='male') + .045*(age-50) +
  (log(cholesterol - 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male'))
# Simulate binary y to have Prob(y=1) = 1/[1+exp(-L)]
y <- ifelse(runif(n) < plogis(L), 1, 0)
f <- lrm(y ~ sex*rcs(cholesterol)+pol(age,2)+blood.pressure, x=TRUE, y=TRUE)
# Validate full model fit
validate(f, B=10)              # normally B=150
# Two-sample validation: make resamples have the same numbers of
# successes and failures as the original sample
validate(f, B=10, group=y)
# Validate stepwise model with typical (not so good) stopping rule
validate(f, B=10, bw=TRUE, rule="p", sls=.1, type="individual")
## Not run:
# Fit a continuation ratio model and validate it for the predicted
# probability that y=0
u <- cr.setup(y)
Y <- u$y
cohort <- u$cohort
attach(mydataframe[u$subs,])
f <- lrm(Y ~ cohort+rcs(age,4)*sex, penalty=list(interaction=2))
validate(f, cluster=u$subs, subset=cohort=='all')
# see predab.resample for cluster and subset
## End(Not run)