censboot(boot)
censboot() belongs to R package: boot
Bootstrap for Censored Data
----------Description----------
This function applies types of bootstrap resampling which have been suggested to deal with right-censored data. It can also do model-based resampling using a Cox regression model.
----------Usage----------
censboot(data, statistic, R, F.surv, G.surv, strata = matrix(1,n,2),
sim = "ordinary", cox = NULL, index = c(1, 2), ...,
parallel = c("no", "multicore", "snow"),
ncpus = getOption("boot.ncpus", 1L), cl = NULL)
----------Arguments----------
Argument: data
The data frame or matrix containing the data. It must have at least two columns, one of which contains the times and the other the censoring indicators. It is allowed to have as many other columns as desired (although efficiency is reduced for large numbers of columns) except for sim = "weird" when it should only have two columns - the times and censoring indicators. The columns of data referenced by the components of index are taken to be the times and censoring indicators.
Argument: statistic
A function which operates on the data frame and returns the required statistic. Its first argument must be the data. Any other arguments that it requires can be passed using the ... argument. In the case of sim = "weird", the data passed to statistic only contains the times and censoring indicator regardless of the actual number of columns in data. In all other cases the data passed to statistic will be of the same form as the original data. When sim = "weird", the actual number of observations in the resampled data sets may not be the same as the number in data. For this reason, if sim = "weird" and strata is supplied, statistic should also take a numeric vector indicating the strata. This allows the statistic to depend on the strata if required.
Argument: R
The number of bootstrap replicates.
Argument: F.surv
An object returned from a call to survfit giving the survivor function for the data. This argument is required unless sim = "ordinary", or sim = "model" and cox is missing.
Argument: G.surv
Another object returned from a call to survfit but with the censoring indicators reversed to give the product-limit estimate of the censoring distribution. Note that for consistency the uncensored times should be reduced by a small amount in the call to survfit. This is a required argument whenever sim = "cond" or when sim = "model" and cox is supplied.
Argument: strata
The strata used in the calls to survfit. It can be a vector or a matrix with 2 columns. If it is a vector then it is assumed to be the strata for the survival distribution, and the censoring distribution is assumed to be the same for all observations. If it is a matrix then the first column is the strata for the survival distribution and the second is the strata for the censoring distribution. When sim = "weird" only the strata for the survival distribution are used since the censoring times are considered fixed. When sim = "ordinary", only one set of strata is used to stratify the observations; when strata is a matrix, this is taken to be its first column.
Argument: sim
The simulation type. Possible types are "ordinary" (case resampling), "model" (equivalent to "ordinary" if cox is missing, otherwise it is model-based resampling), "weird" (the weird bootstrap - this cannot be used if cox is supplied), and "cond" (the conditional bootstrap, in which censoring times are resampled from the conditional censoring distribution).
Argument: cox
An object returned from coxph. If it is supplied, then F.surv should have been generated by a call of the form survfit(cox).
Argument: index
A vector of length two giving the positions of the columns in data which correspond to the times and censoring indicators respectively.
Argument: ...
Other named arguments which are passed unchanged to statistic each time it is called. Any such arguments to statistic must follow the arguments which statistic is required to have for the simulation. Beware of partial matching to arguments of censboot listed above, and that arguments named X and FUN cause conflicts in some versions of boot (but not this one).
Argument: parallel, ncpus, cl
See the help for boot.
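As an illustration of how extra arguments reach statistic through ..., the following is a minimal sketch rather than part of the original help page. The function name km.at and its argument at are hypothetical; at was chosen so that it does not partially match any named argument of censboot.

```r
library(boot)
library(survival)
data(aml, package = "boot")

# Hypothetical statistic: Kaplan-Meier survival estimate at time `at`.
# The first argument is the (resampled) data; `at` arrives via `...`.
km.at <- function(data, at) {
  fit <- survfit(Surv(time, cens) ~ 1, data = data)
  summary(fit, times = at, extend = TRUE)$surv
}

set.seed(1)
# `at = 20` is not an argument of censboot, so it is passed on to km.at.
aml.km <- censboot(aml, km.at, R = 99, at = 20)
aml.km$t0
```

Choosing a name such as at, which matches no leading argument of censboot, sidesteps the partial-matching problem mentioned under the ... argument.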
----------Details----------
The various types of resampling are described in Davison and Hinkley (1997) in sections 3.5 and 7.3. The simplest is case resampling which simply resamples with replacement from the observations.
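Case resampling can be sketched by hand in a few lines. This is only an illustration on a made-up data frame (toy is hypothetical), not the code censboot runs internally.

```r
# Hypothetical right-censored data: cens = 1 marks an observed failure,
# cens = 0 a censored observation.
toy <- data.frame(time = c(5, 8, 12, 13, 20, 23),
                  cens = c(1, 0, 1, 1, 0, 1))

set.seed(42)
n <- nrow(toy)
# Draw n row indices with replacement; each resampled row keeps its
# (time, cens) pair intact, which is what "case" resampling means.
idx <- sample.int(n, n, replace = TRUE)
toy.star <- toy[sort(idx), ]
toy.star
```

A statistic evaluated on toy.star gives one bootstrap replicate; repeating the draw R times gives the full set.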
The conditional bootstrap simulates failure times from the estimate of the survival distribution. Then, for each observation, the simulated censoring time equals the observed censoring time if the observation was censored; if the observation was uncensored, it is generated from the estimated censoring distribution conditional on being greater than the observed failure time. If the largest value is censored then it is given a nominal failure time of Inf; conversely, if it is uncensored it is given a nominal censoring time of Inf. This is necessary to allow the largest observation to appear in the resamples.
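The censoring-time step of the conditional bootstrap might be sketched as follows. This is a hand-rolled illustration under stated assumptions, not the internal code of censboot: the data frame toy and the helper sim.cens.time are hypothetical, and any probability mass the product-limit estimate leaves beyond the last observation is placed at Inf, in the spirit of the nominal censoring time described above.

```r
library(survival)

# Hypothetical data: cens = 1 is an observed failure, 0 is censored.
toy <- data.frame(time = c(5, 8, 12, 13, 20, 23),
                  cens = c(1, 0, 1, 1, 0, 1))

# Product-limit estimate of the censoring distribution: indicators are
# reversed and failure times nudged down slightly, as for G.surv.
G <- survfit(Surv(time - 0.001 * cens, 1 - cens) ~ 1, data = toy)

# Simulate one censoring time for an observation (y, d).
sim.cens.time <- function(y, d, G) {
  if (d == 0) return(y)                    # censored: keep observed time
  support <- c(G$time, Inf)
  # Jump sizes of the estimate, plus leftover mass assigned to Inf.
  mass <- c(-diff(c(1, G$surv)), G$surv[length(G$surv)])
  keep <- support > y & mass > 0           # condition on exceeding y
  if (!any(keep)) return(Inf)              # assumption: fall back to Inf
  cand <- support[keep]
  cand[sample.int(length(cand), 1, prob = mass[keep])]
}

set.seed(1)
c.star <- mapply(sim.cens.time, toy$time, toy$cens,
                 MoreArgs = list(G = G))
c.star
```

Censored observations keep their observed censoring times, while every uncensored observation receives a simulated censoring time strictly greater than its failure time.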
If a Cox regression model is fitted to the data and supplied, then the failure times are generated from the survival distribution using that model. In this case the censoring times can either be simulated from the estimated censoring distribution (sim = "model") or from the conditional censoring distribution as in the previous paragraph (sim = "cond").
The weird bootstrap holds both the censored observations and the observed failure times fixed. It then generates the number of events at each failure time using a binomial distribution with mean 1 and denominator the number of failures that could have occurred at that time in the original data set. In our implementation we insist that there is at least one simulated event in each stratum for every bootstrap dataset.
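The event-count step of the weird bootstrap can be illustrated as below. This is a sketch on hypothetical data (toy), not the internal code of censboot; it assumes a single stratum and takes the binomial success probability to be the observed number of failures divided by the number at risk, which gives mean 1 when there are no tied failure times.

```r
# Hypothetical data: cens = 1 is an observed failure, 0 is censored.
toy <- data.frame(time = c(5, 8, 12, 13, 20, 23),
                  cens = c(1, 0, 1, 1, 0, 1))

# Distinct observed failure times, held fixed across resamples.
fail.times <- sort(unique(toy$time[toy$cens == 1]))
# Number at risk just before each failure time (the binomial denominator).
n.risk <- sapply(fail.times, function(tt) sum(toy$time >= tt))
# Observed number of failures at each failure time.
n.event <- sapply(fail.times,
                  function(tt) sum(toy$time == tt & toy$cens == 1))

set.seed(7)
repeat {
  # Simulated event counts: Binomial(n.risk, n.event / n.risk).
  n.star <- rbinom(length(fail.times), size = n.risk,
                   prob = n.event / n.risk)
  if (sum(n.star) >= 1) break   # insist on at least one simulated event
}

# Simulated cumulative hazard increments (Nelson-Aalen form).
cumsum(n.star / n.risk)
```

Because the simulated counts need not sum to the original number of failures, the effective number of observations in a weird-bootstrap data set can differ from the number in data, as noted under the statistic argument.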
When there are strata involved and sim is either "model" or "cond" the situation becomes more difficult. Since the strata for the survival and censoring distributions are not the same, it is possible that for some observations both the simulated failure time and the simulated censoring time are infinite. To see this, consider an observation in stratum 1F for the survival distribution and stratum 1G for the censoring distribution. If the largest value in stratum 1F is censored, it is given a nominal failure time of Inf. If, in addition, the largest value in stratum 1G is uncensored, it is given a nominal censoring time of Inf, and so both the simulated failure and censoring times could be infinite. When this happens the simulated value is considered to be a failure at the time of the largest observed failure time in the stratum for the survival distribution.
When parallel = "snow" and cl is not supplied, library(survival) is run in each of the worker processes.
----------Value----------
An object of class "boot" containing the following components:
Component: t0
The value of statistic when applied to the original data.
Component: t
A matrix of bootstrap replicates of the values of statistic.
Component: R
The number of bootstrap replicates performed.
Component: sim
The simulation type used. This will usually be the input value of sim unless that was "model" but cox was not supplied, in which case it will be "ordinary".
Component: data
The data used for the bootstrap. This will generally be the input value of data unless sim = "weird", in which case it will just be the columns containing the times and the censoring indicators.
Component: seed
The value of .Random.seed when censboot was called.
Component: statistic
The input value of statistic.
Component: strata
The strata used in the resampling. When sim = "ordinary" this will be a vector which stratifies the observations, when sim = "weird" it is the strata for the survival distribution and in all other cases it is a matrix containing the strata for the survival distribution and the censoring distribution.
Component: call
The original call to censboot.
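The components listed above can be inspected directly on the returned object. The sketch below is illustrative only (the statistic km.30 is hypothetical); it also shows the documented fallback in which sim = "model" without cox is reported as "ordinary".

```r
library(boot)
library(survival)
data(aml, package = "boot")

# Hypothetical statistic: Kaplan-Meier survival estimate at time 30.
km.30 <- function(data) {
  fit <- survfit(Surv(time, cens) ~ 1, data = data)
  summary(fit, times = 30, extend = TRUE)$surv
}

set.seed(2)
# sim = "model" is requested, but no cox object is supplied.
out <- censboot(aml, km.30, R = 50, sim = "model")

out$t0          # statistic on the original data
dim(out$t)      # R rows, one column per element of the statistic
out$R           # number of replicates performed
out$sim         # simulation type actually used
sd(out$t[, 1])  # bootstrap standard error of the estimate
```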
----------Author(s)----------
Angelo J. Canty. Parallel extensions by Brian Ripley
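----------References----------
Andersen, P.K., Borgan, O., Gill, R.D. and Keiding, N. (1993) Statistical Models Based on Counting Processes. Springer-Verlag.
Burr, D. (1994) A comparison of certain bootstrap confidence intervals in the Cox model. Journal of the American Statistical Association, 89, 1290–1302.
Davison, A.C. and Hinkley, D.V. (1997) Bootstrap Methods and Their Application. Cambridge University Press.
Efron, B. (1981) Censored data and the bootstrap. Journal of the American Statistical Association, 76, 312–319.
Hjort, N.L. (1985) Bootstrapping Cox's regression model. Technical report NSF-241, Dept. of Statistics, Stanford University.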
----------See Also----------
boot, coxph, survfit
----------Examples----------
library(boot)
library(survival)
# Example 3.9 of Davison and Hinkley (1997) does a bootstrap on some
# remission times for patients with a type of leukaemia. The patients
# were divided into those who received maintenance chemotherapy and
# those who did not. Here we are interested in the median remission
# time for the two groups.
data(aml, package = "boot") # not the version in survival.
aml.fun <- function(data) {
     surv <- survfit(Surv(time, cens) ~ group, data = data)
     out <- NULL
     st <- 1
     for (s in 1:length(surv$strata)) {
          # positions in the survfit output that belong to stratum s
          inds <- st:(st + surv$strata[s] - 1)
          # first time at which the estimated failure probability reaches 0.5
          md <- min(surv$time[inds[1 - surv$surv[inds] >= 0.5]])
          st <- st + surv$strata[s]
          out <- c(out, md)
     }
     out
}
aml.case <- censboot(aml, aml.fun, R = 499, strata = aml$group)
# Now we will look at the same statistic using the conditional
# bootstrap and the weird bootstrap. For the conditional bootstrap
# the survival distribution is stratified but the censoring
# distribution is not.
aml.s1 <- survfit(Surv(time, cens) ~ group, data = aml)
aml.s2 <- survfit(Surv(time-0.001*cens, 1-cens) ~ 1, data = aml)
aml.cond <- censboot(aml, aml.fun, R = 499, strata = aml$group,
F.surv = aml.s1, G.surv = aml.s2, sim = "cond")
# For the weird bootstrap we must redefine our function slightly since
# the data will not contain the group number.
aml.fun1 <- function(data, str) {
     surv <- survfit(Surv(data[, 1], data[, 2]) ~ str)
     out <- NULL
     st <- 1
     for (s in 1:length(surv$strata)) {
          # positions in the survfit output that belong to stratum s
          inds <- st:(st + surv$strata[s] - 1)
          # first time at which the estimated failure probability reaches 0.5
          md <- min(surv$time[inds[1 - surv$surv[inds] >= 0.5]])
          st <- st + surv$strata[s]
          out <- c(out, md)
     }
     out
}
aml.wei <- censboot(cbind(aml$time, aml$cens), aml.fun1, R = 499,
strata = aml$group, F.surv = aml.s1, sim = "weird")
# Now for an example where a Cox regression model has been fitted
# to the data. We will look at the melanoma data of Example 7.6 from
# Davison and Hinkley (1997). The fitted model assumes that there
# is a different survival distribution for the ulcerated and
# non-ulcerated groups but that the thickness of the tumour has a
# common effect. We will also assume that the censoring distribution
# is different in different age groups. The statistic of interest
# is the linear predictor. This is returned as the values at a
# number of equally spaced points in the range of interest.
data(melanoma, package = "boot")
library(splines) # for ns
mel.cox <- coxph(Surv(time, status == 1) ~ ns(thickness, df=4) + strata(ulcer),
data = melanoma)
mel.surv <- survfit(mel.cox)
agec <- cut(melanoma$age, c(0, 39, 49, 59, 69, 100))
mel.cens <- survfit(Surv(time - 0.001*(status == 1), status != 1) ~
strata(agec), data = melanoma)
mel.fun <- function(d) {
t1 <- ns(d$thickness, df=4)
cox <- coxph(Surv(d$time, d$status == 1) ~ t1+strata(d$ulcer))
ind <- !duplicated(d$thickness)
u <- d$thickness[!ind]
eta <- cox$linear.predictors[!ind]
sp <- smooth.spline(u, eta, df=20)
th <- seq(from = 0.25, to = 10, by = 0.25)
predict(sp, th)$y
}
mel.str <- cbind(melanoma$ulcer, agec)
# this is slow!
mel.mod <- censboot(melanoma, mel.fun, R = 499, F.surv = mel.surv,
G.surv = mel.cens, cox = mel.cox, strata = mel.str, sim = "model")
# To plot the original predictor and a 95% pointwise envelope for it
mel.env <- envelope(mel.mod)$point
th <- seq(0.25, 10, by = 0.25)
plot(th, mel.env[1, ], ylim = c(-2, 2),
xlab = "thickness (mm)", ylab = "linear predictor", type = "n")
lines(th, mel.mod$t0, lty = 1)
matlines(th, t(mel.env), lty = 2)