predictrms(rms)
predictrms() belongs to the R package rms
Predicted Values from Model Fit
----------Description----------
The predict function is used to obtain a variety of values or predicted values from either the data used to fit the model (if type="adjto" or "adjto.data.frame" or if x=TRUE or linear.predictors=TRUE were specified to the modeling function), or from a new dataset. Parameters such as knots and factor levels used in creating the design matrix in the original fit are "remembered". See the Function function for another method for computing the linear predictors.
----------Usage----------
## S3 method for class 'bj'
predict(object, newdata,
type=c("lp", "x", "data.frame",
"terms", "cterms", "ccterms", "adjto", "adjto.data.frame", "model.frame"),
se.fit=FALSE, conf.int=FALSE,
conf.type=c('mean','individual','simultaneous'),
incl.non.slopes, non.slopes, kint=1,
na.action=na.keep, expand.na=TRUE,
center.terms=type=="terms", ...) # for bj
## S3 method for class 'cph'
predict(object, newdata,
type=c("lp", "x",
"data.frame", "terms", "cterms", "ccterms", "adjto", "adjto.data.frame",
"model.frame"),
se.fit=FALSE, conf.int=FALSE, conf.type=c('mean','individual','simultaneous'),
incl.non.slopes=NULL,
non.slopes=NULL, kint=1, na.action=na.keep, expand.na=TRUE,
center.terms=type=="terms", ...) # cph
## S3 method for class 'Glm'
predict(object, newdata,
type= c("lp", "x", "data.frame",
"terms", "cterms", "ccterms", "adjto", "adjto.data.frame", "model.frame"),
se.fit=FALSE, conf.int=FALSE, conf.type=c('mean','individual','simultaneous'),
incl.non.slopes,
non.slopes, kint=1, na.action=na.keep, expand.na=TRUE,
center.terms=type=="terms", ...) # Glm
## S3 method for class 'Gls'
predict(object, newdata,
type=c("lp", "x", "data.frame",
"terms", "cterms", "ccterms", "adjto", "adjto.data.frame", "model.frame"),
se.fit=FALSE, conf.int=FALSE, conf.type=c('mean','individual','simultaneous'),
incl.non.slopes,
non.slopes, kint=1, na.action=na.keep, expand.na=TRUE,
center.terms=type=="terms", ...) # Gls
## S3 method for class 'ols'
predict(object, newdata,
type=c("lp", "x", "data.frame",
"terms", "cterms", "ccterms", "adjto", "adjto.data.frame", "model.frame"),
se.fit=FALSE, conf.int=FALSE, conf.type=c('mean','individual','simultaneous'),
incl.non.slopes,
non.slopes, kint=1, na.action=na.keep, expand.na=TRUE,
center.terms=type=="terms", ...) # ols
## S3 method for class 'psm'
predict(object, newdata,
type=c("lp", "x", "data.frame",
"terms", "cterms", "ccterms", "adjto", "adjto.data.frame", "model.frame"),
se.fit=FALSE, conf.int=FALSE, conf.type=c('mean','individual','simultaneous'),
incl.non.slopes,
non.slopes, kint=1, na.action=na.keep, expand.na=TRUE,
center.terms=type=="terms", ...) # psm
----------Arguments----------
Argument: object
A fit object created by an rms fitting function.
Argument: newdata
An S data frame, list or a matrix specifying new data for which predictions are desired. If newdata is a list, it is converted to a matrix first. A matrix is converted to a data frame. For the matrix form, categorical variables (catg or strat) must be coded as integer category numbers corresponding to the order in which value labels were stored. For list or matrix forms, matrx factors must be given a single value. If this single value is the S missing value NA, the adjustment values of matrx (the column medians) will later replace this value. If the single value is not NA, it is propagated throughout the columns of the matrx factor. For factor variables having numeric levels, you can specify the numeric values in newdata without first converting the variables to factors. These numeric values are checked to make sure they match a level, then the variable is converted internally to a factor. It is most typical to use a data frame for newdata, and the S function expand.grid is very handy here. For example, one may specify newdata=expand.grid(age=c(10,20,30), race=c("black","white","other"), chol=seq(100,300,by=25)).
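As a rough illustration of the expand.grid approach, the following base R sketch builds such a grid. The variable names age, race and chol are simply those used in the example above; for a real fit they would have to match the predictors of the model.

```r
# Build every combination of the predictor settings of interest.
newdata <- expand.grid(age  = c(10, 20, 30),
                       race = c("black", "white", "other"),
                       chol = seq(100, 300, by = 25))

dim(newdata)   # 3 ages x 3 races x 9 cholesterol values = 81 rows, 3 columns
str(newdata)   # race is stored as a factor under expand.grid's defaults
head(newdata)
```

A data frame like this could then be passed as the newdata argument, e.g. predict(fit, newdata), assuming fit was estimated with predictors of the same names.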
Argument: type
Type of output desired. The default is "lp" to get the linear predictors - predicted X beta. For Cox models, these predictions are centered. You may specify "x" to get an expanded design matrix at the desired combinations of values, "data.frame" to get an S data frame of the combinations, "model.frame" to get a data frame of the transformed predictors, "terms" to get a matrix with each column being the linear combination of variables making up a factor (with separate terms for interactions), "cterms" ("combined terms") to not create separate terms for interactions but to add all interaction terms involving each predictor to the main terms for each predictor, "ccterms" to combine all related terms (related through interactions) and their interactions into a single column, "adjto" to return a vector of limits[2] (see datadist) in coded form, and "adjto.data.frame" to return a data frame version of these central adjustment values. Use of type="cterms" does not make sense for a strat variable that does not interact with another variable. If newdata is not given, predict will attempt to return information stored with the fit object if the appropriate options were used with the modeling function (e.g., x, y, linear.predictors, se.fit).
Argument: se.fit
Defaults to FALSE. If type="lp" (linear predictors), set se.fit=TRUE to return a list with components linear.predictors and se.fit instead of just a vector of fitted values.
Argument: conf.int
Specify conf.int as a positive fraction to obtain upper and lower confidence intervals (e.g., conf.int=0.95). The t-distribution is used in the calculation for ols fits. Otherwise, the normal critical value is used.
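The difference between the two critical values can be sketched in base R. This only illustrates the arithmetic and is not the package's internal code; the linear predictor, standard error and residual degrees of freedom below are made-up numbers.

```r
conf.int <- 0.95
lp <- 1.2   # hypothetical linear predictor
se <- 0.3   # hypothetical standard error of lp
df <- 20    # hypothetical residual degrees of freedom of an ols fit

# Normal critical value, as described for non-ols fits
z <- qnorm((1 + conf.int) / 2)
# t critical value, as described for ols fits
tcrit <- qt((1 + conf.int) / 2, df)

normal.limits <- c(lower = lp - z * se,     upper = lp + z * se)
t.limits      <- c(lower = lp - tcrit * se, upper = lp + tcrit * se)
normal.limits
t.limits   # slightly wider than the normal-based limits
```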
Argument: conf.type
Specifies the type of confidence interval. The default is for the mean. For ols fits there is the option of obtaining confidence limits for individual predicted values by specifying conf.type="individual".
Argument: incl.non.slopes
Default is TRUE if non.slopes or kint is specified, the model has a scale parameter (e.g., a parametric survival model), or type!="x". Otherwise the default is FALSE. Set to TRUE to use an intercept in the prediction if the model has any intercepts (except for type="terms" which doesn't need intercepts). Set to FALSE to get predicted X beta ignoring intercepts.
Argument: non.slopes
For models such as the ordinal logistic models containing more than one intercept, this specifies dummy variable values to pick off intercept(s) to use in computing predictions. For example, if there are 3 intercepts, use non.slopes=c(0,1,0) to use the second. Default is c(1,0,...,0). You may alternatively specify kint.
Argument: kint
a single integer specifying the number of the intercept to use in multiple-intercept models
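The relationship between non.slopes and kint can be pictured with a small base R sketch. The intercept values are invented, and the code only mimics the idea of a dummy vector selecting one intercept; it is not taken from the rms source.

```r
# Hypothetical intercepts from a model with three intercepts
intercepts <- c(2.1, 0.4, -1.3)

# non.slopes = c(0, 1, 0) picks off the second intercept
non.slopes <- c(0, 1, 0)
sum(non.slopes * intercepts)

# The same choice expressed as an intercept number, in the spirit of kint = 2
kint  <- 2
dummy <- as.numeric(seq_along(intercepts) == kint)
identical(dummy, non.slopes)
sum(dummy * intercepts)
```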
Argument: na.action
Function to handle missing values in newdata. For predictions "in data", the same na.action that was used during model fitting is used to define an naresid function to possibly restore rows of the data matrix that were deleted due to NAs. For predictions "out of data", the default na.action is na.keep, resulting in NA predictions when a row of newdata has an NA. Whatever na.action is in effect at the time for "out of data" predictions, the corresponding naresid is used also.
Argument: expand.na
set to FALSE to keep the naresid from having any effect, i.e., to keep from adding back observations removed because of NAs in the returned object. If expand.na=FALSE, the na.action attribute will be added to the returned object.
Argument: center.terms
set to FALSE to suppress subtracting adjust-to values from columns of the design matrix before computing terms with type="terms".
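What the centering amounts to for a single linear predictor can be sketched in base R. The predictor values, adjust-to value and coefficient are hypothetical, and this is only an illustration of the subtraction described above, not the package's own computation.

```r
x    <- c(40, 50, 60)   # hypothetical predictor values
adj  <- 50              # hypothetical adjust-to value (e.g. the median)
beta <- 0.045           # hypothetical regression coefficient

centered   <- (x - adj) * beta   # in the spirit of center.terms=TRUE
uncentered <- x * beta           # in the spirit of center.terms=FALSE

centered
uncentered
# The two versions differ by the constant adj * beta
all.equal(uncentered - centered, rep(adj * beta, length(x)))
```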
Argument: ...
ignored
----------Details----------
datadist and options(datadist=) should be run before predictrms if using type="adjto", type="adjto.data.frame", or type="terms", or if the fit is a Cox model fit and you are requesting se.fit=TRUE. For these cases, the adjustment values are needed (either for the returned result or for the correct covariance matrix computation).
----------Author(s)----------
Frank Harrell
Department of Biostatistics, Vanderbilt University
f.harrell@vanderbilt.edu
----------See Also----------
plot.Predict, summary.rms, rms, rms.trans, predict.lrm, residuals.cph, datadist, gendata, gIndex, Function.rms, reShape, xYplot, contrast.rms
----------Examples----------
n <- 1000 # define sample size
set.seed(17) # so can reproduce the results
age <- rnorm(n, 50, 10)
blood.pressure <- rnorm(n, 120, 15)
cholesterol <- rnorm(n, 200, 25)
sex <- factor(sample(c('female','male'), n,TRUE))
treat <- factor(sample(c('a','b','c'), n,TRUE))
# Specify population model for log odds that Y=1
L <- .4*(sex=='male') + .045*(age-50) +
(log(cholesterol - 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male')) +
.3*sqrt(blood.pressure-60)-2.3 + 1*(treat=='b')
# Simulate binary y to have Prob(y=1) = 1/[1+exp(-L)]
y <- ifelse(runif(n) < plogis(L), 1, 0)
ddist <- datadist(age, blood.pressure, cholesterol, sex, treat)
options(datadist='ddist')
fit <- lrm(y ~ rcs(blood.pressure,4) +
sex * (age + rcs(cholesterol,4)) + sex*treat*age)
# Use xYplot to display predictions in 9 panels, with error bars,
# with superposition of two treatments
dat <- expand.grid(treat=levels(treat),sex=levels(sex),
age=c(20,40,60),blood.pressure=120,
cholesterol=seq(100,300,length=10))
# Add variables linear.predictors and se.fit to dat
dat <- cbind(dat, predict(fit, dat, se.fit=TRUE))
# This is much easier with Predict
# xYplot in Hmisc extends xyplot to allow error bars
xYplot(Cbind(linear.predictors,linear.predictors-1.96*se.fit,
linear.predictors+1.96*se.fit) ~ cholesterol | sex*age,
groups=treat, data=dat, type='b')
# Since blood.pressure doesn't interact with anything, we can quickly and
# interactively try various transformations of blood.pressure, taking
# the fitted spline function as the gold standard. We are seeking a
# linearizing transformation even though this may lead to falsely
# narrow confidence intervals if we use this data-dredging-based transformation
bp <- 70:160
logit <- predict(fit, expand.grid(treat="a", sex='male', age=median(age),
cholesterol=median(cholesterol),
blood.pressure=bp), type="terms")[,"blood.pressure"]
#Note: if age interacted with anything, this would be the age
# "main effect" ignoring interaction terms
#Could also use Predict(f, age=ag)$yhat
#which allows evaluation of the shape for any level of interacting
#factors. When age does not interact with anything, the result from
#predict(f, ..., type="terms") would equal the result from
#plot if all other terms were ignored
plot(bp^.5, logit) # try square root vs. spline transform.
plot(bp^1.5, logit) # try 1.5 power
plot(sqrt(bp-60), logit)
#Some approaches to making a plot showing how predicted values
#vary with a continuous predictor on the x-axis, with two other
#predictors varying
combos <- gendata(fit, age=seq(10,100,by=10), cholesterol=c(170,200,230),
blood.pressure=c(80,120,160))
#treat, sex not specified -> set to mode
#can also use expand.grid
combos$pred <- predict(fit, combos)
xyplot(pred ~ age | cholesterol*blood.pressure, data=combos, type='l')
xYplot(pred ~ age | cholesterol, groups=blood.pressure, data=combos, type='l')
Key() # Key created by xYplot
xYplot(pred ~ age, groups=interaction(cholesterol,blood.pressure),
data=combos, type='l', lty=1:9)
Key()
# Add upper and lower 0.95 confidence limits for individuals
combos <- cbind(combos, predict(fit, combos, conf.int=.95))
xYplot(Cbind(linear.predictors, lower, upper) ~ age | cholesterol,
groups=blood.pressure, data=combos, type='b')
Key()
# Plot effects of treatments (all pairwise comparisons) vs.
# levels of interacting factors (age, sex)
d <- gendata(fit, treat=levels(treat), sex=levels(sex), age=seq(30,80,by=10))
x <- predict(fit, d, type="x")
betas <- fit$coef
cov <- fit$var
i <- d$treat=="a"; xa <- x[i,]; Sex <- d$sex[i]; Age <- d$age[i]
i <- d$treat=="b"; xb <- x[i,]
i <- d$treat=="c"; xc <- x[i,]
doit <- function(xd, lab) {
xb <- xd%*%betas
se <- apply((xd %*% cov) * xd, 1, sum)^.5
q <- qnorm(1-.01/2) # 0.99 confidence limits
lower <- xb - q * se; upper <- xb + q * se
#Get odds ratios instead of linear effects
xb <- exp(xb); lower <- exp(lower); upper <- exp(upper)
#First elements of these agree with
#summary(fit, age=30, sex='female',conf.int=.99)
for(sx in levels(Sex)) {
j <- Sex==sx
errbar(Age[j], xb[j], upper[j], lower[j], xlab="Age",
ylab=paste(lab,"Odds Ratio"), ylim=c(.1,20), log='y')
title(paste("Sex:",sx))
abline(h=1, lty=2)
}
}
par(mfrow=c(3,2), oma=c(3,0,3,0))
doit(xb - xa, "b:a")
doit(xc - xa, "c:a")
doit(xc - xb, "c:b")
# NOTE: This is much easier to do using contrast.rms
# Demonstrate type="terms", "cterms", "ccterms"
set.seed(1)
n <- 40
x <- 1:n
w <- factor(sample(c('a', 'b'), n, TRUE))
u <- factor(sample(c('A', 'B'), n, TRUE))
y <- .01*x + .2*(w=='b') + .3*(u=='B') + .2*(w=='b' & u=='B') + rnorm(n)/5
ddist <- datadist(x, w, u)
f <- ols(y ~ x*w*u, x=TRUE, y=TRUE)
f
anova(f)
z <- predict(f, type='terms', center.terms=FALSE)
z[1:5,]
k <- coef(f)
## Manually compute combined terms
wb <- w=='b'
uB <- u=='B'
h <- k['x * w=b * u=B']*x*wb*uB
tx <- k['x'] *x + k['x * w=b']*x*wb + k['x * u=B'] *x*uB + h
tw <- k['w=b']*wb + k['x * w=b']*x*wb + k['w=b * u=B']*wb*uB + h
tu <- k['u=B']*uB + k['x * u=B']*x*uB + k['w=b * u=B']*wb*uB + h
h <- z[,'x * w * u'] # highest order term is present in all cterms
tx2 <- z[,'x']+z[,'x * w']+z[,'x * u']+h
tw2 <- z[,'w']+z[,'x * w']+z[,'w * u']+h
tu2 <- z[,'u']+z[,'x * u']+z[,'w * u']+h
ae <- function(a, b) all.equal(a, b, check.attributes=FALSE)
ae(tx, tx2)
ae(tw, tw2)
ae(tu, tu2)
zc <- predict(f, type='cterms')
zc[1:5,]
ae(tx, zc[,'x'])
ae(tw, zc[,'w'])
ae(tu, zc[,'u'])
zc <- predict(f, type='ccterms')
# As all factors are indirectly related, ccterms gives overall linear
# predictor except for the intercept
zc[1:5,]
ae(as.vector(zc + coef(f)[1]), f$linear.predictors)
## Not run:
#A variable state.code has levels "1", "5","13"
#Get predictions with or without converting variable in newdata to factor
predict(fit, data.frame(state.code=c(5,13)))
predict(fit, data.frame(state.code=factor(c(5,13))))
#Use gendata function (gendata.rms) for interactive specification of
#predictor variable settings (for 10 observations)
df <- gendata(fit, nobs=10, viewvals=TRUE)
df$predicted <- predict(fit, df) # add variable to data frame
df
df <- gendata(fit, age=c(10,20,30)) # leave other variables at ref. vals.
predict(fit, df, type="fitted")
# See reShape (in Hmisc) for an example where predictions corresponding to
# values of one of the varying predictors are reformatted into multiple
# columns of a matrix
## End(Not run)
options(datadist=NULL)
When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).