R package sperrorest: help documentation for the sperrorest() function

loveR · posted 2012-9-30 15:04:52

sperrorest(sperrorest)
sperrorest() belongs to the R package: sperrorest

Perform spatial error estimation and variable importance assessment

Translator: LoveR, the 生物统计家园网 (Biostatistic.net) robot

----------Description----------

sperrorest is a flexible interface for multiple types of spatial and non-spatial cross-validation and bootstrap error estimation, and for permutation-based assessment of spatial variable importance.

----------Usage----------

sperrorest(formula, data, coords = c("x", "y"),
    model.fun, model.args = list(), pred.fun = NULL,
    pred.args = list(), smp.fun = partition.loo,
    smp.args = list(), train.fun = NULL,
    train.param = NULL, test.fun = NULL, test.param = NULL,
    err.fun = err.default, err.unpooled = TRUE,
    err.pooled = FALSE, err.train = TRUE,
    imp.variables = NULL, imp.permutations = 1000,
    importance = !is.null(imp.variables), distance = FALSE,
    do.gc = 1, do.try = FALSE, silent = FALSE, ...)

----------Arguments----------

Argument: data
A data.frame with predictor and response variables. Training and test samples will be drawn from this data set by train.fun and test.fun, respectively.

Argument: formula
A formula specifying the variables used by the model. Only simple formulas without interactions or nonlinear terms should be used, e.g. y~x1+x2+x3 but not y~x1*x2+log(x3). Formulas involving interactions or nonlinear terms may work for error estimation, but not for variable importance assessment, and should be used with caution.

Argument: coords
Vector of length 2 naming the variables in data that contain the x and y coordinates of the sample locations.

Argument: model.fun
Function that fits a predictive model, such as glm or rpart. The function must accept at least two arguments, the first being a formula and the second a data.frame with the learning sample.
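As an illustration of the two-argument contract described for model.fun, here is a minimal sketch of a wrapper around glm. The name my_glm and the toy data are hypothetical and not part of sperrorest; only base R is used.

```r
# Hypothetical model function: formula first, learning sample second.
# Extra arguments (what model.args would supply) are passed on via '...'.
my_glm <- function(formula, data, ...) {
  glm(formula, data = data, family = binomial(), ...)
}

# Made-up learning sample, for illustration only
set.seed(1)
d <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
d$y <- factor(d$x1 + rnorm(50) > 0)

fit <- my_glm(y ~ x1 + x2, d)
class(fit)
```

A wrapper like this is mainly useful when the fitting function needs fixed settings (here the binomial family) that should not be repeated in every call.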

Argument: model.args
Arguments to be passed to model.fun (in addition to the formula and data arguments, which are provided by sperrorest).

Argument: pred.fun
Prediction function for a fitted model object created by model.fun. Must accept at least two arguments: the fitted object and a data.frame newdata with the data on which to predict the outcome.
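The following sketch shows what a prediction function with the (object, newdata) contract might look like for a logistic regression. The name my_pred and the toy data are assumptions made for illustration; the function simply forwards to predict() in base R.

```r
# Hypothetical prediction function: fitted object first, newdata second.
# Returns predicted probabilities rather than values on the link scale.
my_pred <- function(object, newdata, ...) {
  predict(object, newdata = newdata, type = "response", ...)
}

# Made-up data, for illustration only
set.seed(1)
d <- data.frame(x1 = rnorm(50))
d$y <- factor(d$x1 + rnorm(50) > 0)
fit <- glm(y ~ x1, data = d, family = binomial())

p <- my_pred(fit, d[1:5, ])
p
```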

Argument: pred.args
(optional) Arguments to pred.fun (in addition to the fitted model object and the newdata argument, which are provided by sperrorest).

Argument: smp.fun
A function for sampling training and test sets from data, e.g. partition.kmeans for spatial cross-validation using spatial k-means clustering.
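To make the idea of spatial k-means partitioning concrete, the sketch below clusters sample coordinates with base R's kmeans and treats each cluster as one test fold. The helper kmeans_folds is hypothetical and is not how partition.kmeans is necessarily implemented; it only illustrates the principle.

```r
# Hypothetical helper: one cross-validation fold per spatial cluster
kmeans_folds <- function(data, coords = c("x", "y"), nfold = 5) {
  cl <- kmeans(data[, coords], centers = nfold)$cluster
  lapply(seq_len(nfold), function(k) {
    list(train = which(cl != k), test = which(cl == k))
  })
}

# Made-up sample locations, for illustration only
set.seed(42)
pts <- data.frame(x = runif(200), y = runif(200))
folds <- kmeans_folds(pts, nfold = 5)

# Number of test samples in each fold
sapply(folds, function(f) length(f$test))
```

Because each test fold is a spatially contiguous cluster, test samples tend to lie farther from the training samples than under random partitioning.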

Argument: smp.args
(optional) Arguments to be passed to smp.fun.

Argument: train.fun
(optional) A function for resampling or subsampling the training sample, for example to obtain uniform sample sizes on all training sets or to maintain a certain ratio of positives and negatives in the training sets. E.g., resample.uniform or resample.strat.uniform.
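A subsampling function of this kind could look roughly like the sketch below. The name subsample_n and its (data, param) signature are assumptions for illustration; check the help pages of resample.uniform and resample.strat.uniform for the interface sperrorest actually expects.

```r
# Hypothetical resampling function: draw at most param$n rows from data
subsample_n <- function(data, param = list(n = 50, replace = FALSE)) {
  n <- min(param$n, nrow(data))
  idx <- sample(nrow(data), size = n, replace = param$replace)
  data[idx, , drop = FALSE]
}

# Made-up training sample, for illustration only
set.seed(1)
d <- data.frame(id = 1:120, x = rnorm(120))
sub <- subsample_n(d, param = list(n = 50, replace = FALSE))
nrow(sub)
```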

Argument: train.param
(optional) Arguments to be passed to train.fun.

Argument: test.fun
(optional) Like train.fun, but for the test set.

Argument: test.param
(optional) Arguments to be passed to test.fun.

Argument: err.fun
A function that calculates selected error measures from the known responses in data and the model predictions delivered by pred.fun, e.g. err.default (the default). See the example and details below.
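A custom error function might be sketched as follows for a numeric response. The name err_regr is hypothetical, and the (obs, pred) argument order and the list return value are assumptions modelled on err.default; consult ?err.default before relying on them.

```r
# Hypothetical error function for a numeric response.
# Assumed signature: observed responses first, predictions second.
err_regr <- function(obs, pred) {
  list(
    bias = mean(pred - obs),            # mean signed error
    rmse = sqrt(mean((pred - obs)^2)),  # root mean squared error
    mae  = mean(abs(pred - obs))        # mean absolute error
  )
}

e <- err_regr(obs = c(1, 2, 3, 4), pred = c(1.5, 2, 2.5, 5))
e$bias  # 0.25
e$mae   # 0.5
```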

Argument: err.unpooled
logical (default: TRUE): calculate error measures on each fold within a resampling repetition.

Argument: err.pooled
logical (default: FALSE): calculate error measures based on the pooled predictions of all folds within a resampling repetition.
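The difference between fold-level (unpooled) and pooled error measures can be illustrated with a small base R sketch. The data, the fold assignment, and the rmse helper are made up for illustration and do not reproduce sperrorest's internals.

```r
# Made-up observations, predictions and fold labels for one repetition
set.seed(1)
obs  <- rnorm(30)
pred <- obs + rnorm(30, sd = 0.5)
fold <- rep(1:3, times = c(5, 10, 15))

rmse <- function(obs, pred) sqrt(mean((pred - obs)^2))

# Unpooled: one error value per fold
fold_rmse <- sapply(split(seq_along(obs), fold),
                    function(i) rmse(obs[i], pred[i]))

# Pooled: a single value computed from the predictions of all folds
pooled_rmse <- rmse(obs, pred)

fold_rmse
pooled_rmse
```

With folds of unequal size, the pooled value generally differs from the plain average of the fold-level values, which is one reason to report both.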

Argument: err.train
logical (default: TRUE): calculate error measures on the training set (in addition to the test set estimation).

Argument: imp.variables
(optional; used if importance=TRUE) Variables for which permutation-based variable importance assessment is performed. If importance=TRUE and imp.variables is NULL, all variables in formula will be used.

Argument: imp.permutations
(optional; used if importance=TRUE) Number of permutations used for variable importance assessment.

Argument: importance
logical: perform permutation-based variable importance assessment?
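The general idea behind permutation-based variable importance can be sketched in base R as follows: permute one predictor in the test set, predict again, and record how much the error increases. This is a simplified illustration on made-up data, not sperrorest's own implementation; perm_importance is a hypothetical helper.

```r
# Made-up data: y depends on x1 only, x2 is unrelated noise
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 2 * d$x1 + rnorm(n, sd = 0.5)
train <- d[1:100, ]
test  <- d[101:200, ]

fit  <- lm(y ~ x1 + x2, data = train)
rmse <- function(obs, pred) sqrt(mean((pred - obs)^2))
base_err <- rmse(test$y, predict(fit, test))

# Mean increase in test RMSE after permuting one predictor
perm_importance <- function(var, nperm = 100) {
  increase <- replicate(nperm, {
    permuted <- test
    permuted[[var]] <- sample(permuted[[var]])
    rmse(test$y, predict(fit, permuted)) - base_err
  })
  mean(increase)
}

imp <- sapply(c("x1", "x2"), perm_importance)
imp
```

The sketch also hints at why simple formulas are recommended: permuting a variable that appears inside an interaction or a transformed term does not isolate its contribution cleanly.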

Argument: ...
Currently not used.

Argument: distance
logical (default: FALSE): if TRUE, calculate mean nearest-neighbour distances from test samples to training samples using add.distance.represampling.
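The quantity described here, the mean nearest-neighbour distance from test samples to training samples, can be computed by hand as in the sketch below. The helper mean_nn_dist and the coordinates are hypothetical; Euclidean distance is assumed, and this is not the code of add.distance.represampling.

```r
# Hypothetical helper: for every test location, find the distance to the
# closest training location, then average these distances.
mean_nn_dist <- function(test_xy, train_xy) {
  nn <- apply(test_xy, 1, function(p) {
    min(sqrt((train_xy[, 1] - p[1])^2 + (train_xy[, 2] - p[2])^2))
  })
  mean(nn)
}

train_xy <- cbind(x = c(0, 0, 10), y = c(0, 10, 0))
test_xy  <- cbind(x = c(3, 0),     y = c(4, 9))

# Nearest-neighbour distances are 5 and 1, so the mean is 3
m <- mean_nn_dist(test_xy, train_xy)
m
```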

Argument: do.gc
numeric (default: 1): defines how often memory garbage collection is run by calling gc; if <1, no garbage collection; if >=1, run gc() after each repetition; if >=2, after each fold.

Argument: do.try
logical (default: FALSE): if TRUE (untested!), use try to make calls to model.fun and err.fun more robust; use with caution!
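Wrapping a model fit in try, in the spirit of do.try, might look like the following sketch. The helper safe_fit is hypothetical and only shows the general pattern with base R's try; how sperrorest handles failed fits internally is not documented here.

```r
# Hypothetical helper: return NULL instead of stopping when a fit fails
safe_fit <- function(model.fun, formula, data, ...) {
  fit <- try(model.fun(formula, data, ...), silent = TRUE)
  if (inherits(fit, "try-error")) NULL else fit
}

d <- data.frame(x = 1:10, y = (1:10) + 0.1)

ok  <- safe_fit(lm, y ~ x, d)             # succeeds
bad <- safe_fit(lm, y ~ not_a_column, d)  # fails: variable does not exist

class(ok)
is.null(bad)
```

Silently skipping failed fits can bias the resulting error estimates, which may be one reason the option is flagged as something to use with caution.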

Argument: silent
logical (default: FALSE): if FALSE, show progress on console (in Windows Rgui, disable 'Buffered output' in the 'Misc' menu).

----------Value----------

A list (object of class sperrorest) with up to the following components:

Value: error
a sperroresterror object containing predictive performances at the fold level

Value: represampling
a represampling object

Value: pooled.error
a sperrorestpoolederror object containing predictive performances at the repetition level

Value: importance
a sperrorestimportance object containing permutation-based variable importances at the fold level

An object of class sperrorest, i.e. a list with components error (of class sperroresterror), represampling (of class represampling), pooled.error (of class sperrorestpoolederror) and importance (of class sperrorestimportance).

----------Note----------

To do: (1) Parallelize the code; (2) Optionally save fitted models, training and test samples in the results object; (3) Optionally save intermediate results in a file, and enable the function to resume an interrupted sperrorest call where it was interrupted; (4) Optionally have sperrorest dump the result of each repetition into a file, and skip repetitions for which a file already exists; (5) Save the sperrorest version number in the results object.


----------See Also----------

ipred

----------Examples----------

library(sperrorest)
data(ecuador) # Muenchow et al. (2012), see ?ecuador
fo = slides ~ dem + slope + hcurv + vcurv +
   log.carea + cslope

# Example of a classification tree fitted to this data:
library(rpart)
ctrl = rpart.control(cp = 0.005) # small cp, to show the effects of overfitting
fit = rpart(fo, data = ecuador, control = ctrl)
par(xpd = TRUE)
plot(fit, compress = TRUE, main = "Stoyan's landslide data set")
text(fit, use.n = TRUE)

# Non-spatial 5-repeated 10-fold cross-validation:
# predict() returns class probabilities; keep the second column
mypred.rpart = function(object, newdata) predict(object, newdata)[,2]
nspres = sperrorest(data = ecuador, formula = fo,
    model.fun = rpart, model.args = list(control = ctrl),
    pred.fun = mypred.rpart,
    smp.fun = partition.cv, smp.args = list(repetition=1:5, nfold=10))
summary(nspres$error)
summary(nspres$represampling)
plot(nspres$represampling, ecuador)

# Spatial 5-repeated 10-fold cross-validation:
spres = sperrorest(data = ecuador, formula = fo,
    model.fun = rpart, model.args = list(control = ctrl),
    pred.fun = mypred.rpart,
    smp.fun = partition.kmeans, smp.args = list(repetition=1:5, nfold=10))
summary(spres$error)
summary(spres$represampling)
plot(spres$represampling, ecuador)

smry = data.frame(
   nonspat.training = unlist(summary(nspres$error,level=1)$train.auroc),
   nonspat.test    = unlist(summary(nspres$error,level=1)$test.auroc),
   spatial.training = unlist(summary(spres$error,level=1)$train.auroc),
   spatial.test    = unlist(summary(spres$error,level=1)$test.auroc))
boxplot(smry, col = c("red","red","red","green"),
   main = "Training vs. test, nonspatial vs. spatial",
   ylab = "Area under the ROC curve")

Please credit the source when reposting: 生物统计家园网 (http://www.biostatistic.net).

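Notes:
Note 1: To make learning easier, this document was translated by LoveR, the 生物统计家园网 robot. It is intended only as a personal reference for learning R; 生物统计家园 reserves the copyright.
Note 2: Because the translation is produced automatically by a robot, inaccuracies are unavoidable. Comparing the Chinese and English text carefully and repeatedly can help in learning R.
Note 3: If you find an inaccuracy, please reply below this post, and we will revise it over time.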
