repCV.seqModel(robustHD)
repCV.seqModel() belongs to the R package robustHD
Cross-validation for a sequential regression model
Translator: 生物统计家园网, robot LoveR
----------Description----------
Perform (repeated) K-fold cross-validation to estimate the prediction error of a previously fit sequential regression model such as a robust least angle regression model. In each iteration of cross-validation, the optimal model is thereby selected from the training data and used to make predictions for the test data.
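The procedure described above can be sketched in base R. This is only an illustration of the general K-fold scheme, not the robustHD implementation: an ordinary `lm()` fit stands in for the sequential regression model (so the model-selection step on the training data is not shown), and the simulated data and the names `fold`, `fitK` and `rmspe` are assumptions made for this sketch.

```r
# Minimal base-R sketch of K-fold cross-validation (not the robustHD code).
set.seed(1)
n <- 60
x <- matrix(rnorm(n * 3), n, 3)
y <- c(x %*% c(1, 0.5, 0) + rnorm(n, sd = 0.5))
dat <- data.frame(y = y, x)  # predictor columns are named X1, X2, X3

K <- 5
# randomly assign observations to K folds of (near-)equal size
fold <- sample(rep_len(seq_len(K), n))

yHat <- numeric(n)
for (k in seq_len(K)) {
  test <- fold == k
  # fit on the training data only (lm() is a stand-in for the real model)
  fitK <- lm(y ~ ., data = dat[!test, ])
  # predict the held-out test data
  yHat[test] <- predict(fitK, newdata = dat[test, ])
}

# root mean squared prediction error as the cost
rmspe <- sqrt(mean((y - yHat)^2))
rmspe
```

Every observation is predicted exactly once by a model that never saw it, which is what makes the resulting cost an estimate of out-of-sample prediction error.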
----------Usage----------
## S3 method for class 'seqModel'
repCV(object, cost, K = 5, R = 1,
foldType = c("random", "consecutive", "interleaved"),
folds = NULL, seed = NULL, ...)
----------Arguments----------
Argument: object
the model fit for which to estimate the prediction error.
Argument: cost
a cost function measuring prediction loss. It should expect vectors to be passed as its first two arguments, the first corresponding to the observed values of the response and the second to the predicted values, and must return a non-negative scalar value. The default is to use the root mean squared prediction error for non-robust models and the root trimmed mean squared prediction error for robust models (see cost).
Argument: K
an integer giving the number of groups into which the data should be split (the default is five). Keep in mind that this should be chosen such that all groups are of approximately equal size. Setting K equal to n yields leave-one-out cross-validation.
Argument: R
an integer giving the number of replications for repeated K-fold cross-validation. This is ignored for leave-one-out cross-validation and other non-random splits of the data.
Argument: foldType
a character string specifying the type of folds to be generated. Possible values are "random" (the default), "consecutive" or "interleaved".
Argument: folds
an object of class "cvFolds" giving the folds of the data for cross-validation (as returned by cvFolds). If supplied, this is preferred over K and R.
Argument: seed
optional initial seed for the random number generator (see .Random.seed).
Argument: ...
additional arguments to be passed to the prediction loss function cost.
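To make the three values of foldType concrete, the following base-R sketch assigns observations to folds in each style. The helper `makeFolds()` is hypothetical and written only for this illustration; it is not `cvFolds()`, and the object returned by `cvFolds()` has a different structure.

```r
# Hypothetical helper illustrating the three fold types; not cvFolds() itself.
# Returns, for each of the n observations, the index of the fold it belongs to.
makeFolds <- function(n, K = 5,
                      type = c("random", "consecutive", "interleaved")) {
  type <- match.arg(type)
  switch(type,
    # shuffle the fold labels so membership is random
    random      = sample(rep_len(seq_len(K), n)),
    # contiguous blocks of observations
    consecutive = sort(rep_len(seq_len(K), n)),
    # cycle through the folds: 1, 2, ..., K, 1, 2, ...
    interleaved = rep_len(seq_len(K), n)
  )
}

makeFolds(10, K = 3, type = "consecutive")  # 1 1 1 1 2 2 2 3 3 3
makeFolds(10, K = 3, type = "interleaved")  # 1 2 3 1 2 3 1 2 3 1
set.seed(1234)
makeFolds(10, K = 3, type = "random")       # same fold sizes, shuffled order
```

Only the random type changes from one replication to the next, which is why R has no effect for consecutive or interleaved folds or for leave-one-out cross-validation.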
----------Value----------
An object of class "cv" with the following components:
Component: n
an integer giving the number of observations.
Component: K
an integer giving the number of folds.
Component: R
an integer giving the number of replications.
Component: cv
a numeric value giving the estimated prediction error. For repeated cross-validation, this gives the average value over all replications.
Component: se
a numeric value giving the estimated standard error of the prediction loss.
Component: reps
a numeric matrix with one column that contains the estimated prediction errors from all replications. This is only returned for repeated cross-validation.
Component: seed
the seed of the random number generator before cross-validation was performed.
Component: call
the matched function call.
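The relationship between the cost function and the cv, se and reps components can be illustrated with a base-R sketch of repeated K-fold cross-validation. Everything here is an assumption made for illustration: `rtmspeSketch()` is a hypothetical trimmed cost (with an assumed trimming proportion of 0.25), `lm()` stands in for the fitted model, and taking se as the standard deviation across replications is one plausible choice rather than a statement of how the package computes it.

```r
# Hypothetical cost: root trimmed mean squared prediction error.
# The largest squared residuals are dropped before averaging.
rtmspeSketch <- function(y, yHat, trim = 0.25) {
  r2 <- sort((y - yHat)^2)
  h <- length(r2) - floor(trim * length(r2))  # number of residuals kept
  sqrt(mean(r2[seq_len(h)]))
}

set.seed(2)
n <- 60
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)
y[1:5] <- y[1:5] + 10                     # a few vertical outliers
dat <- data.frame(x = x, y = y)

K <- 5
R <- 10
reps <- matrix(NA_real_, nrow = R, ncol = 1, dimnames = list(NULL, "CV"))
for (r in seq_len(R)) {
  fold <- sample(rep_len(seq_len(K), n))  # new random split per replication
  yHat <- numeric(n)
  for (k in seq_len(K)) {
    test <- fold == k
    fitK <- lm(y ~ x, data = dat[!test, ])
    yHat[test] <- predict(fitK, newdata = dat[test, ])
  }
  reps[r, 1] <- rtmspeSketch(y, yHat)     # one prediction error per replication
}

result <- list(n = n, K = K, R = R,
               cv = mean(reps[, 1]),      # average over all replications
               se = sd(reps[, 1]),        # assumption: spread across replications
               reps = reps)
str(result)
```

Trimming keeps the few outlying observations from dominating the error estimate, which is the motivation for using a trimmed cost with robust models.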
----------Author(s)----------
Andreas Alfons
----------See Also----------
rlars, predict.seqModel, cvFolds, cost
----------Examples----------
library("robustHD")  # provides rlars() and repCV()

## generate data
# example is not high-dimensional to keep computation time low
set.seed(1234)                       # for reproducibility
n <- 100                             # number of observations
p <- 25                              # number of variables
beta <- rep.int(c(1, 0), c(5, p-5))  # coefficients
sigma <- 0.5                         # controls signal-to-noise ratio
epsilon <- 0.1                       # contamination level
x <- replicate(p, rnorm(n))          # predictor matrix
e <- rnorm(n)                        # error terms
i <- 1:ceiling(epsilon*n)            # observations to be contaminated
e[i] <- e[i] + 5                     # vertical outliers
y <- c(x %*% beta + sigma * e)       # response
x[i,] <- x[i,] + 5                   # bad leverage points

## fit and evaluate robust LARS model
fit <- rlars(x, y)
cv <- repCV(fit)
cv