createCV(SpatioTemporal)
createCV() belongs to R package: SpatioTemporal
Define Cross-Validation Groups
Translator: 生物统计家园网 robot LoveR
----------Description----------
Creates a matrix that specifies cross-validation schemes.
----------Usage----------
createCV(mesa.data.model, groups = 10, min.dist = 0.1,
random = FALSE, subset=NA, option="all")
----------Arguments----------
Argument: mesa.data.model
Data structure holding observations and information regarding the observation locations. See create.data.model and mesa.data.model.
Argument: groups
Number of cross-validation groups; zero gives leave-one-out cross-validation.
Argument: min.dist
Minimum distance between locations for them to end up in separate groups. Points closer than min.dist will be forced into the same group. A high value for min.dist can result in fewer cross-validation groups than specified in groups.
Argument: random
If FALSE, repeated calls to the function will return the same grouping; if TRUE, repeated calls will give different CV-groupings. The default, FALSE, ensures that simulation studies are reproducible.
Argument: subset
A subset of locations for which to define the cross-validation setup. Only sites listed in subset are dropped from one of the cross-validation groups; in other words, sites not in subset are used for estimation and prediction of all cross-validation groups. This option is ignored if option!="all".
Argument: option
For internal MESA Air usage, see Details below.
----------Details----------
The number of observations left out of each group can be rather uneven; the main goal of createCV is to create CV-groups that contain roughly the same number of locations, ignoring the number of observations at each location. If there are large differences in the number of observations at different locations, one could use the subset option to create different CV-groupings for different types of locations. The groups can then be combined as I.final = I.1 | I.2 | I.3.
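The combination rule can be illustrated with a minimal base-R sketch. The three matrices below are small, made-up stand-ins for the results of three createCV calls with different subset arguments (they are not real output), and the sketch assumes all three have the same dimensions, i.e. the same observations and the same number of groups.

```r
## Hypothetical stand-ins for three createCV() results, each covering a
## different type of location (6 observations, 2 CV-groups).
I.1 <- matrix(c(TRUE,  FALSE, FALSE, FALSE, FALSE, FALSE,
                FALSE, TRUE,  FALSE, FALSE, FALSE, FALSE), ncol = 2)
I.2 <- matrix(c(FALSE, FALSE, TRUE,  FALSE, FALSE, FALSE,
                FALSE, FALSE, FALSE, TRUE,  FALSE, FALSE), ncol = 2)
I.3 <- matrix(c(FALSE, FALSE, FALSE, FALSE, TRUE,  FALSE,
                FALSE, FALSE, FALSE, FALSE, FALSE, TRUE), ncol = 2)

## Element-wise OR: an observation is left out of group j if any of the
## three schemes leaves it out of group j.
I.final <- I.1 | I.2 | I.3

## Number of observations left out per group.
colSums(I.final)

## Sanity check: no observation is left out of more than one group.
all(rowSums(I.final) <= 1)
```

Because each scheme only drops sites from its own subset, the element-wise OR keeps the groups disjoint as long as the subsets themselves do not overlap.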
If random=FALSE the function initially sets set.seed(0, kind = "Mersenne-Twister"), and resets the random seed using .Random.seed and set.seed before terminating.
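The seed handling described above can be sketched in base R as follows. This is one possible way to get the same behaviour, not the package's actual code: the helper name fixed_permutation is hypothetical, and the sketch restores the caller's state by reassigning .Random.seed directly (that vector also encodes the generator kind).

```r
## Hypothetical helper: returns the same permutation on every call and
## leaves the caller's random number stream untouched.
fixed_permutation <- function(n) {
  ## Save the caller's RNG state, if one has been initialised.
  has.seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (has.seed) {
    old.seed <- get(".Random.seed", envir = globalenv(), inherits = FALSE)
  }
  ## Restore (or remove) the state when the function exits.
  on.exit({
    if (has.seed) {
      assign(".Random.seed", old.seed, envir = globalenv())
    } else {
      rm(".Random.seed", envir = globalenv())
    }
  })
  ## Fixed seed, as in the description above.
  set.seed(0, kind = "Mersenne-Twister")
  sample(n)
}

## Repeated calls agree, and the caller's stream is not disturbed.
set.seed(42)
u.plain <- runif(1)
set.seed(42)
perm.a <- fixed_permutation(10)
u.after <- runif(1)
perm.b <- fixed_permutation(10)
identical(perm.a, perm.b)
identical(u.plain, u.after)
```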
The option input determines which sites to include in the cross-validation. Possible options are "all", "fixed", "comco", "snapshot" and "home".
"all" Uses all available sites, possibly subset according to subset. The sites will be grouped, with sites separated by less than min.dist being put in the same CV-group.
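To make the min.dist constraint concrete, here is a minimal base-R sketch. It is not createCV's algorithm: it swaps in single-linkage clustering (hclust and cutree from the stats package) to find sites that must stay together, and then deals the clusters out over the groups in turn. The coordinates and site names are invented for illustration.

```r
## Invented site coordinates; s1 and s2 are only 0.05 apart.
coords <- rbind(s1 = c(0, 0), s2 = c(0.05, 0), s3 = c(1, 0),
                s4 = c(2, 0), s5 = c(2, 1),    s6 = c(3, 3))
min.dist <- 0.1
groups <- 3

## Single-linkage clustering cut at min.dist: sites that can be joined by
## a chain of steps no longer than min.dist end up in the same cluster.
cluster <- cutree(hclust(dist(coords), method = "single"), h = min.dist)

## Assign clusters to CV-groups in turn; each site inherits the group of
## its cluster, so close sites can never be split across groups.
cluster.group <- rep_len(seq_len(groups), max(cluster))
site.group <- cluster.group[cluster]
names(site.group) <- rownames(coords)
site.group
```

With a large min.dist the number of clusters can drop below groups, which is why fewer CV-groups than requested may result.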
"fixed" Uses only sites that have mesa.data.model$location$type %in% c("AQS","FIXED"). Given the subsetting, the sites will be grouped as for "all".
"home" Uses only sites that have mesa.data.model$location$type %in% c("HOME"). Given the subsetting, the sites will be grouped as for "all".
"comco", "snapshot" Uses only sites that have mesa.data.model$location$type %in% c("COMCO").
The sites will be grouped together if they are from the same road gradient. The road gradients are identified from the names of the sites. With "?" denoting one or more letters and "#" denoting one or more digits, the names are expected to follow "?-?#?#" for random sites and "?-?#?#?" for the gradients (with all but the last letter being the same for the entire gradient).
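The naming convention can be read as a pair of regular expressions. The sketch below is an interpretation of the description above, not the package's code, and the site names are invented to fit the two patterns; real MESA Air names may look different.

```r
## Invented site names that follow the two documented patterns.
site.names <- c("L-LA01R02",  "L-LA01R03",    # random sites:  ?-?#?#
                "L-LA02G01A", "L-LA02G01B",   # one gradient:  ?-?#?#?
                "L-LA02G02A", "L-LA02G02B")   # another gradient

## "?" = one or more letters, "#" = one or more digits.
random.pattern   <- "^[A-Za-z]+-[A-Za-z]+[0-9]+[A-Za-z]+[0-9]+$"
gradient.pattern <- "^[A-Za-z]+-[A-Za-z]+[0-9]+[A-Za-z]+[0-9]+[A-Za-z]+$"

is.random   <- grepl(random.pattern, site.names)
is.gradient <- grepl(gradient.pattern, site.names)

## Sites on the same gradient share a name once the last letter is
## dropped; random sites keep their full name as their own key.
key <- ifelse(is.gradient, sub("[A-Za-z]$", "", site.names), site.names)

## Sites listed under the same key would go into the same CV-group.
split(site.names, key)
```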
----------Value----------
Returns a (number of observations) - by - (groups) logical matrix. Each column defines a cross-validation set, with the TRUE values marking the observations to be left out.
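As an illustration of this layout, the following base-R sketch builds a matrix of the same shape from a made-up assignment of sites to groups and then uses one column as a mask. The observation IDs and group labels are hypothetical; only the shape and the meaning of TRUE follow the description above.

```r
## Hypothetical data: the site ID of each of 7 observations, and the
## CV-group assigned to each site.
obs.ID <- c("s1", "s1", "s2", "s3", "s3", "s3", "s4")
site.group <- c(s1 = 1, s2 = 2, s3 = 1, s4 = 2)

## Row i, column j is TRUE when observation i comes from a site that is
## left out in CV-group j.
I.cv <- outer(unname(site.group[obs.ID]), seq_len(max(site.group)), "==")
dim(I.cv)

## Use the first column as a mask: observations to predict in group 1
## versus observations available for estimation.
left.out <- obs.ID[I.cv[, 1]]
kept     <- obs.ID[!I.cv[, 1]]
```

Since whole sites are assigned to groups, the columns can contain very different numbers of observations even when the number of sites per group is balanced.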
----------Author(s)----------
Johan Lindström
----------See Also----------
See also estimateCV and predictCV.
For computing CV statistics, see also compute.ltaCV, predictNaive, and for further illustration see plotCV, CVresiduals.qqnorm, and summaryStatsCV.
----------Examples----------
##load the package and the data
library(SpatioTemporal)
data(mesa.data.model)
##create a matrix with the CV-schemes
I.cv <- createCV(mesa.data.model, groups=10)
##number of observations in each CV-group
colSums(I.cv)
##Which sites belong to which groups?
ID.cv <- lapply(apply(I.cv,2,list),function(x)
                unique(mesa.data.model$obs$ID[x[[1]]]))
print(ID.cv)
##Note that the sites with distance 0.084<min.dist
##are grouped together (in group 10).
mesa.data.model$dist[ID.cv[[10]],ID.cv[[10]]]
##Find out which location belongs to which cv group
I.col <- apply(sapply(ID.cv,function(x) mesa.data.model$location$ID
                      %in% x), 1, function(x) if(sum(x)==1) which(x) else 0)
names(I.col) <- mesa.data.model$location$ID
print(I.col)
##Plot the locations, colour coded by CV-grouping
plot(mesa.data.model$location$long, mesa.data.model$location$lat,
     pch=23+floor(I.col/max(I.col)+.5), bg=I.col,
     xlab="Longitude",ylab="Latitude")
When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).