R language, sperrorest package: help documentation for the partition.cv.strat() function

Posted on 2012-9-30 15:02:58
partition.cv.strat(sperrorest)
partition.cv.strat() belongs to the R package: sperrorest

Partition the data for a stratified (non-spatial) cross-validation

Translator of the Chinese version: 生物统计家园网 robot LoveR

----------Description----------

partition.cv.strat creates a set of sample indices corresponding to cross-validation test and training sets, stratified by a factor variable.


----------Usage----------


  partition.cv.strat(data, coords = c("x", "y"),
    nfold = 10, return.factor = FALSE, repetition = 1,
    seed1 = NULL, strat)



----------Arguments----------

Argument: coords
vector of length 2 naming the variables in data that contain the x and y coordinates of the sample locations


Argument: strat
either a character string naming the column in data that contains the factor variable over which the partitioning should be stratified, or a factor vector of length nrow(data) giving the variable over which to stratify


Argument: data
data.frame containing at least the columns specified by coords


Argument: nfold
number of partitions (folds) in nfold-fold cross-validation partitioning


Argument: return.factor
if FALSE (default), return a represampling object; if TRUE (used internally by other sperrorest functions), return a list containing factor vectors (see Value)
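
The two return forms mentioned for return.factor can be pictured with a small base-R sketch. This is only an illustration of the idea (a factor of fold labels versus per-fold train/test row indices); folds_from_factor() is a hypothetical helper and the exact structure sperrorest returns may differ.

```r
# Illustrative only: a factor of fold labels, one entry per row of the data.
fold <- factor(c(1, 2, 3, 1, 2, 3, 1, 2))

# Hypothetical helper (not part of sperrorest): turn the factor into
# per-fold train/test row indices.
folds_from_factor <- function(fold) {
  lapply(levels(fold), function(k) {
    list(train = which(fold != k), test = which(fold == k))
  })
}

folds <- folds_from_factor(fold)
folds[[1]]$test   # rows held out in fold 1
folds[[1]]$train  # all remaining rows
```

The factor form is compact and easy to combine across strata, while the index form is what a model-fitting loop usually consumes directly.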


Argument: repetition
numeric vector: cross-validation repetitions to be generated. Note that this is not the number of repetitions, but the indices of these repetitions. E.g., use repetition=c(1:100) to obtain (the 'first') 100 repetitions, and repetition=c(101:200) to obtain a different set of 100 repetitions.


Argument: seed1
seed1+i is the random seed that will be used by set.seed in repetition i (i in repetition) to initialize the random number generator before sampling from the data set.
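
The seeding rule described for repetition and seed1 can be sketched in base R as follows. This is a minimal sketch assuming only what the argument descriptions state (repetition i is seeded with seed1 + i); draw_repetition() is a hypothetical helper, not a sperrorest function, and it assigns plain (non-stratified) fold labels for brevity.

```r
# Sketch of the seeding rule: repetition i uses set.seed(seed1 + i).
# draw_repetition() is a hypothetical helper, not a sperrorest function.
draw_repetition <- function(i, n, nfold, seed1 = NULL) {
  if (!is.null(seed1)) set.seed(seed1 + i)
  # Balanced fold labels 1..nfold in random order, one per observation.
  sample(rep(seq_len(nfold), length.out = n))
}

repetition <- c(1:3)
first_run  <- lapply(repetition, draw_repetition, n = 20, nfold = 5, seed1 = 100)
second_run <- lapply(repetition, draw_repetition, n = 20, nfold = 5, seed1 = 100)
identical(first_run, second_run)  # same seeds, so the same partitions

# Repetition 2 yields the same folds whether or not repetition 1 was drawn first.
alone <- draw_repetition(2, n = 20, nfold = 5, seed1 = 100)
identical(alone, first_run[[2]])
```

Because each repetition index maps to its own seed, a later call with repetition=c(101:200) would extend an earlier set of repetitions rather than reproduce it.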


----------Value----------

A represampling object (see also partition.cv). Unlike partition.cv, however, partition.cv.strat stratifies with respect to the variable data[,strat]; i.e., cross-validation partitioning is done within each subset data[data[,strat]==i,] (i in levels(data[,strat])), and the ith folds of all levels are combined into one cross-validation fold.
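
The partition-within-each-level-then-combine idea described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: stratified_folds() is a hypothetical helper that returns one fold label per row.

```r
# Hypothetical helper illustrating stratified fold assignment in base R.
stratified_folds <- function(strat, nfold) {
  strat <- as.factor(strat)
  fold <- integer(length(strat))
  for (lev in levels(strat)) {
    rows <- which(strat == lev)
    # Balanced fold labels 1..nfold, shuffled within this stratum only.
    labels <- rep(seq_len(nfold), length.out = length(rows))
    fold[rows] <- labels[sample.int(length(rows))]
  }
  # Rows sharing a label across all strata form one cross-validation fold.
  fold
}

set.seed(1)
strat <- factor(rep(c("TRUE", "FALSE"), times = c(60, 40)))
fold <- stratified_folds(strat, nfold = 5)
table(strat, fold)  # each fold holds 12 "TRUE" and 8 "FALSE" rows
```

When a stratum's size is not a multiple of nfold, the per-fold counts differ by at most one, so class proportions are preserved only approximately.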


----------See Also----------

sperrorest, as.resampling, resample.strat.uniform


----------Examples----------


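library(sperrorest)  # provides partition.cv.strat(), partition.cv() and the ecuador data
data(ecuador)
# Stratified cross-validation: each fold preserves the class proportions of ecuador$slides
parti <- partition.cv.strat(ecuador, strat = "slides", nfold = 5, repetition = 1)
idx <- parti[["1"]][[1]]$train  # training indices of fold 1 in repetition "1"
mean(ecuador$slides[idx] == "TRUE") / mean(ecuador$slides == "TRUE")
# essentially 1; exactly 1 only when every stratum divides evenly into the folds
# Non-stratified cross-validation:
parti <- partition.cv(ecuador, nfold = 5, repetition = 1)
idx <- parti[["1"]][[1]]$train
mean(ecuador$slides[idx] == "TRUE") / mean(ecuador$slides == "TRUE")
# close to 1 because of large sample size, but with some random variation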

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).


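Notes:
Note 1: To make study easier, the Chinese version of this document was translated by the 生物统计家园网 robot LoveR. It is intended only for personal reference when learning R; 生物统计家园 retains the copyright.
Note 2: Because the translation was produced automatically, inaccuracies are unavoidable. Comparing the Chinese and English text carefully and repeatedly can help when learning R.
Note 3: If you find an inaccuracy, please reply below this post; revisions will be made gradually.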