R language rms package: Predict() function help documentation

loveR · posted 2012-9-27 19:13:34

Predict(rms)
Predict() belongs to R package: rms

                                    Compute Predicted Values and Confidence Limits

                                       Translator: LoveR, the 生物统计家园网 (biostatistic.net) robot

----------Description----------

Predict allows the user to easily specify which predictors are to vary. When the vector of values over which a predictor should vary is not specified, the range will be all levels of a categorical predictor or equally spaced points between the datadist "Low:prediction" and "High:prediction" values for the variable (by default, datadist uses the 10th smallest and 10th largest predictor values in the dataset). Predicted values are the linear predictor (X beta), a user-specified transformation of that scale, or the estimated probability of surviving past a single fixed time point given the linear predictor. Predict is usually used for plotting predicted values, but there is also a print method.
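
As a rough illustration of the default range described above, the sketch below fits a small logistic model and compares the grid chosen by Predict with the datadist limits. The simulated data and the object names (dd, fit, lims, p) are assumptions made up for this example; only datadist, lrm and Predict come from rms.

```r
library(rms)

set.seed(17)
n   <- 300
age <- rnorm(n, 50, 10)
sex <- factor(sample(c("female", "male"), n, TRUE))
y   <- rbinom(n, 1, plogis(0.045 * (age - 50) + 0.4 * (sex == "male")))

dd <- datadist(age, sex)      # stores the default ranges and adjustment values
options(datadist = "dd")
fit <- lrm(y ~ age + sex)

# Limits that datadist recorded for plotting the continuous predictor
lims <- dd$limits[c("Low:prediction", "High:prediction"), "age"]

# age is named without an equal sign, so its default range is used;
# np = 5 asks for five equally spaced points
p <- Predict(fit, age, np = 5)
print(lims)
print(p$age)
```

The five age values should run from the "Low:prediction" limit to the "High:prediction" limit in equal steps, with sex held at its adjustment value.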

When the first argument to Predict is a fit object created by bootcov with coef.reps=TRUE, confidence limits come from the stored matrix of bootstrap repetitions of coefficients, using  bootstrap percentile nonparametric confidence limits.  Such confidence intervals do not make distributional assumptions.
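
The idea behind bootstrap percentile limits can be sketched in base R, without rms. This is not the code that bootcov or Predict run internally; it refits a plain lm on each resample instead of reusing stored coefficients, and all names (dat, newx, boot_pred, limits) are hypothetical.

```r
set.seed(1)
n    <- 200
x    <- runif(n)
y    <- 1 + 2 * x + rnorm(n)
dat  <- data.frame(x, y)
newx <- data.frame(x = c(0.25, 0.75))   # points at which to predict

B <- 500
# Each column holds the predictions from one bootstrap refit
boot_pred <- replicate(B, {
  idx <- sample(n, replace = TRUE)
  predict(lm(y ~ x, data = dat[idx, ]), newdata = newx)
})

# Percentile limits: the 2.5% and 97.5% quantiles of the bootstrap
# predictions at each point; no normality assumption is involved
limits <- apply(boot_pred, 1, quantile, probs = c(0.025, 0.975))
point  <- predict(lm(y ~ x, data = dat), newdata = newx)
print(rbind(lower = limits[1, ], estimate = point, upper = limits[2, ]))
```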

There is a plot method for Predict objects that makes it easy to show predicted values and confidence bands.

The rbind method for Predict objects allows you to create separate sets of predictions under different situations and to combine them into one set for feeding to plot.Predict. For example, you might want to plot confidence intervals for means and for individuals using ols, and have the two types of confidence bands superposed on one plot or placed in two panels. Another use for rbind is to combine predictions from quantile regression models that predicted three different quantiles.

If conf.type="simultaneous", simultaneous (over all requested predictions) confidence limits are computed.  See the predictrms function for details.

----------Usage----------

Predict(x, ..., fun,
      type = c("predictions", "model.frame", "x"),
      np = 200, conf.int = 0.95,
      conf.type = c("mean", "individual","simultaneous"),
      adj.zero = FALSE, ref.zero = FALSE,
      non.slopes, time = NULL, loglog = FALSE, digits=4, name, factors=NULL)

## S3 method for class 'Predict'
print(x, ...)

## S3 method for class 'Predict'
rbind(..., rename)

----------Arguments----------

Argument: x
An rms fit object, or for print the result of Predict. options(datadist="d") must have been specified (where d was created by datadist), or it must have been in effect when the model was fitted.

Argument: ...
One or more variables to vary, or single-valued adjustment values. Specify a variable name without an equal sign to use the default display range, or follow the name with an equal sign and any range you choose (e.g. seq(0,100,by=2) or c(2,3,7,14)). If the variable has fewer than 11 unique values, the default list of values at which predictions are made is the list of those unique values. For variables with more than 10 unique values, np equally spaced values in the range are used for plotting if the range is not specified. Variables not specified are set to the default adjustment value limits[2], i.e. the median for continuous variables and a reference category for non-continuous ones. Later variables define adjustment settings. For categorical variables, give the class labels in quotes when specifying variable values. If the levels of a categorical variable are numeric, you may omit the quotes. For variables not described using datadist, you must specify explicit ranges and adjustment settings for predictors that were in the model. If no variables are specified in ..., predictions will be made by separately varying all predictors in the model over their default range, holding the other predictors at their adjustment values. This has the same effect as specifying name as a vector containing all the predictors. For rbind, ... represents a series of results from Predict. If you name the results, these names will be taken as the values of the new .set. variable added to the concatenated data frames. See an example below.
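
The forms of ... described above can be tried side by side. The following is a minimal sketch on simulated data; the model and the object names (dd, fit, p1, p2, p3) are assumptions for illustration only.

```r
library(rms)

set.seed(17)
n   <- 300
age <- rnorm(n, 50, 10)
sex <- factor(sample(c("female", "male"), n, TRUE))
y   <- rbinom(n, 1, plogis(0.045 * (age - 50) + 0.4 * (sex == "male")))

dd <- datadist(age, sex)
options(datadist = "dd")
fit <- lrm(y ~ age + sex)

# Explicit values for age, all levels of sex: 3 ages x 2 sexes
p1 <- Predict(fit, age = c(30, 50, 70), sex)

# Default range for age (np points), sex fixed at one quoted level
p2 <- Predict(fit, age, sex = "male", np = 10)

# Single-valued settings for both predictors: one prediction
p3 <- Predict(fit, age = 40, sex = "male")

print(p1)
```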

Argument: fun
An optional transformation of the linear predictor.
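
For a binary logistic model, a common choice of fun is plogis, which maps the linear predictor to a probability. The sketch below assumes a small simulated dataset; the names dd, fit, p_lp and p_prob are hypothetical.

```r
library(rms)

set.seed(17)
n   <- 300
age <- rnorm(n, 50, 10)
sex <- factor(sample(c("female", "male"), n, TRUE))
y   <- rbinom(n, 1, plogis(0.045 * (age - 50) + 0.4 * (sex == "male")))

dd <- datadist(age, sex)
options(datadist = "dd")
fit <- lrm(y ~ age + sex)

p_lp   <- Predict(fit, age, np = 5)                # log odds scale (X beta)
p_prob <- Predict(fit, age, np = 5, fun = plogis)  # probability scale

print(cbind(age = p_lp$age, log_odds = p_lp$yhat, prob = p_prob$yhat))
```

The transformation is applied to yhat and to the confidence limits alike, so the limits stay on the same scale as the estimate.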

Argument: type
Defaults to providing predictions. Set to "model.frame" to return a data frame of the predictor settings used. Set to "x" to return the corresponding design matrix constructed from the predictor settings.

Argument: np
The number of equally spaced points computed for continuous predictors that vary, i.e., when the specified value is . or NA.

Argument: conf.int
Confidence level. Default is 0.95. Specify FALSE to suppress confidence limits.

Argument: conf.type
Type of confidence interval. Default is "mean", which applies to all models. For models containing a residual variance (e.g., ols), you can specify conf.type="individual" instead, to obtain limits on the predicted value for an individual subject. Specify conf.type="simultaneous" to obtain simultaneous confidence bands for mean predictions with family-wise coverage of conf.int.
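
The difference between "mean" and "individual" limits can be seen by comparing band widths for an ols fit. This is a minimal sketch on simulated data; the names dd, f, p_mean and p_indiv are made up for the example.

```r
library(rms)

set.seed(1)
x1 <- runif(300)
y  <- x1 + rnorm(300)

dd <- datadist(x1)
options(datadist = "dd")
f <- ols(y ~ x1)

p_mean  <- Predict(f, x1, np = 20, conf.type = "mean")        # limits for the mean of y
p_indiv <- Predict(f, x1, np = 20, conf.type = "individual")  # limits for a single new y

# Individual limits also carry the residual variance, so they are wider
width_mean  <- p_mean$upper  - p_mean$lower
width_indiv <- p_indiv$upper - p_indiv$lower
print(summary(width_indiv / width_mean))
```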

Argument: adj.zero
Set to TRUE to adjust all non-plotted variables to 0 (or the reference cell for categorical variables) and to omit intercept(s) from consideration. Default is FALSE.

Argument: ref.zero
Set to TRUE to subtract a constant from X beta before plotting so that the reference value of the x-variable yields y=0. This is done before applying function fun.
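
A small check of the ref.zero behaviour described above: with ref.zero=TRUE, the prediction at the datadist adjustment ("Adjust to") value should be zero on the linear predictor scale. The data and the names dd, fit, ref and p are assumptions for illustration.

```r
library(rms)

set.seed(17)
n   <- 300
age <- rnorm(n, 50, 10)
sex <- factor(sample(c("female", "male"), n, TRUE))
y   <- rbinom(n, 1, plogis(0.045 * (age - 50) + 0.4 * (sex == "male")))

dd <- datadist(age, sex)
options(datadist = "dd")
fit <- lrm(y ~ age + sex)

# Reference (adjustment) value that datadist chose for age
ref <- dd$limits["Adjust to", "age"]

# Predict at the reference age and 10 years above it
p <- Predict(fit, age = c(ref, ref + 10), ref.zero = TRUE)
print(p)
```

Adding fun=exp to such a call would turn the second row into an odds ratio relative to the reference age, since fun is applied after the subtraction.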

Argument: non.slopes
This is only useful in a multiple-intercept model such as the ordinal logistic model. To use the second of three intercepts, for example, specify non.slopes=c(0,1,0). The default is non.slopes=rep(0,k) if adj.zero=TRUE, where k is the number of intercepts in the model. If adj.zero=FALSE, the default is (1,0,0,...,0), i.e., the first intercept is used.

Argument: time
Specify a single time u to cause function survest to be invoked to plot the probability of surviving until time u when the fit is from cph or psm.

Argument: loglog
Specify loglog=TRUE to plot log[-log(survival)] instead of survival, when time is given.

Argument: digits
Controls how "adjust-to" values are plotted. The default is 4 significant digits.

Argument: name
Instead of specifying the variables to vary in the variables (...) list, you can specify one or more variables by passing a character vector of variable names in the name argument. In this mode you cannot specify a list of variable values to use; prediction is done as if you had said, e.g., age without the equal sign. Also, interacting factors can only be set to their reference values using this notation.

Argument: factors
An alternate way of specifying ..., mainly for use by survplot or gendata. This must be a list with one or more values for each variable listed, with NA values for default ranges.

Argument: rename
If you are concatenating predictor sets using rbind and one or more of the variables were renamed for one or more of the sets, but these new names represent different versions of the same predictors (e.g., using or not using imputation), you can specify a named character vector to rename predictors to a central name. For example, specify rename=c(age.imputed='age', corrected.bp='bp') to rename from old names age.imputed, corrected.bp to age, bp. This happens before concatenation of rows.

----------Details----------

When there are no intercepts in the fitted model, plot subtracts adjustment values from each factor while computing variances for confidence limits.

Specifying time will not work for Cox models with time-dependent covariables.  Use survest or survfit for that purpose.

----------Value----------

a data frame containing all model predictors and the computed values yhat, lower, upper, the latter two if confidence intervals were requested.  The data frame has an additional class "Predict".  If name is specified or no predictors are specified in ..., the resulting data frame has an additional variable called .predictor. specifying which predictor is currently being varied. .predictor. is handy for use as a paneling variable in lattice or ggplot2 graphics.
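
To see the structure of the returned object, the sketch below calls Predict with no predictors named, so each predictor is varied in turn and the .predictor. column is added. The simulated data and the names dd, fit and p_all are assumptions for this example.

```r
library(rms)

set.seed(17)
n   <- 300
age <- rnorm(n, 50, 10)
sex <- factor(sample(c("female", "male"), n, TRUE))
y   <- rbinom(n, 1, plogis(0.045 * (age - 50) + 0.4 * (sex == "male")))

dd <- datadist(age, sex)
options(datadist = "dd")
fit <- lrm(y ~ age + sex)

# No predictors named: vary each one separately over its default range
p_all <- Predict(fit, np = 10)

print(class(p_all))               # includes "Predict" and "data.frame"
print(names(p_all))               # predictors, yhat, lower, upper, .predictor.
print(table(p_all$.predictor.))   # which rows belong to which varied predictor
```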

----------Author(s)----------

Frank Harrell
Department of Biostatistics, Vanderbilt University
f.harrell@vanderbilt.edu

----------See Also----------

datadist, predictrms, contrast.rms, summary.rms, rms, rms.trans, survest, survplot, rmsMisc, transace, rbind, bootcov

----------Examples----------

require(rms) # provides lrm, datadist and Predict; label() and units() come from Hmisc
n <- 1000 # define sample size
set.seed(17) # so the results can be reproduced
age          <- rnorm(n, 50, 10)
blood.pressure <- rnorm(n, 120, 15)
cholesterol <- rnorm(n, 200, 25)
sex          <- factor(sample(c('female','male'), n,TRUE))
label(age)          <- 'Age'    # label is in Hmisc
label(cholesterol) <- 'Total Cholesterol'
label(blood.pressure) <- 'Systolic Blood Pressure'
label(sex)          <- 'Sex'
units(cholesterol) <- 'mg/dl' # uses units.default in Hmisc
units(blood.pressure) <- 'mmHg'

# Specify population model for log odds that Y=1
L <- .4*(sex=='male') + .045*(age-50) +
  (log(cholesterol - 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male'))
# Simulate binary y to have Prob(y=1) = 1/[1+exp(-L)]
y <- ifelse(runif(n) < plogis(L), 1, 0)

ddist <- datadist(age, blood.pressure, cholesterol, sex)
options(datadist='ddist')

fit <- lrm(y ~ blood.pressure + sex * (age + rcs(cholesterol,4)))
Predict(fit, age, cholesterol, np=4)
Predict(fit, age=seq(20,80,by=10), sex, conf.int=FALSE)
Predict(fit, age=seq(20,80,by=10), sex='male')  # works if datadist not used
# Get simultaneous confidence limits accounting for making 7 estimates
# Predict(fit, age=seq(20,80,by=10), sex='male', conf.type='simultaneous')
# (this needs the multcomp package)

ddist$limits$age[2] <- 30 # make 30 the reference value for age
# Could also do: ddist$limits["Adjust to","age"] <- 30
fit <- update(fit) # make new reference value take effect
Predict(fit, age, ref.zero=TRUE, fun=exp)

# Make two curves, and plot the predicted curves as two trellis panels
w <- Predict(fit, age, sex)
require(lattice)
xyplot(yhat ~ age | sex, data=w, type='l')
# To add confidence bands we need to use the Hmisc xYplot function in
# place of xyplot
xYplot(Cbind(yhat,lower,upper) ~ age | sex, data=w,
   method='filled bands', type='l', col.fill=gray(.95))
# If non-displayed variables were in the model, add a subtitle to show
# their settings using title(sub=paste('Adjusted to',attr(w,'info')$adjust),adj=0)
# Easier: feed w into plot.Predict
## Not run:
# Predictions from a parametric survival model
require(survival) # provides Surv
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n,
            replace=TRUE, prob=c(.6, .4)))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
t <- -log(runif(n))/h
label(t) <- 'Follow-up Time'
e <- ifelse(t<=cens,1,0)
t <- pmin(t, cens)
units(t) <- "Year"
ddist <- datadist(age, sex)
Srv <- Surv(t,e)

# Fit log-normal survival model and plot median survival time vs. age
f <- psm(Surv(t, e) ~ rcs(age), dist='lognormal')
med <- Quantile(f)    # Creates function to compute quantiles
                     # (median by default)
Predict(f, age, fun=function(x) med(lp=x))
# Note: This works because med() expects the linear predictor (X*beta)
#    as an argument.  Would not work if you use
#    ref.zero=TRUE or adj.zero=TRUE.
# Also, confidence intervals from this method are approximate since
# they don't take into account estimation of the scale parameter

# Fit an ols model to log(y) and plot the relationship between x1
# and the predicted mean(y) on the original scale without assuming
# normality of residuals; use the smearing estimator.  Before doing
# that, show confidence intervals for mean and individual log(y),
# and for the latter, also show bootstrap percentile nonparametric
# pointwise confidence limits
set.seed(1)
x1 <- runif(300)
x2 <- runif(300)
ddist <- datadist(x1,x2); options(datadist='ddist')
y  <- exp(x1+ x2 - 1 + rnorm(300))
f  <- ols(log(y) ~ pol(x1,2) + x2, x=TRUE, y=TRUE)  # x, y needed for bootcov
fb <- bootcov(f, B=100, coef.reps=TRUE)
pb <- Predict(fb, x1, x2=c(.25,.75))
p1 <- Predict(f,  x1, x2=c(.25,.75))
p <- rbind(normal=p1, boot=pb)
plot(p)

p1 <- Predict(f, x1, conf.type='mean')
p2 <- Predict(f, x1, conf.type='individual')
p  <- rbind(mean=p1, individual=p2)
plot(p, label.curve=FALSE) # uses superposition
plot(p, ~x1 | .set.)       # 2 panels

r <- resid(f)
smean <- function(yhat)smearingEst(yhat, exp, res, statistic='mean')
formals(smean) <- list(yhat=numeric(0), res=r[!is.na(r)])
#smean$res <- r[!is.na(r)] # define default res argument to function
Predict(f, x1, fun=smean)

## End(Not run)
options(datadist=NULL)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

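Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the 生物统计家园网 robot. It is intended only as a personal reference for learning R; 生物统计家园 reserves the copyright.
Note 2: Because the translation was produced automatically by a robot, inaccuracies are inevitable. Carefully comparing the Chinese and English text while reading can help when learning R.
Note 3: If you find an inaccuracy, please reply below this post, and we will revise it over time.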
