R simFrame package: help documentation for the clusterRunSimulation() function

Posted on 2012-9-30 02:40:46
clusterRunSimulation(simFrame)
clusterRunSimulation() belongs to R package: simFrame

Run a simulation experiment on a cluster

Translator: 生物统计家园网 bot LoveR

----------Description----------

Generic function for running a simulation experiment on a cluster.


----------Usage----------


clusterRunSimulation(cl, x, setup, nrep, control,
                     contControl = NULL, NAControl = NULL,
                     design = character(), fun, ...,
                     SAE = FALSE)



----------Arguments----------

Argument: cl
a cluster as generated by makeCluster.


Argument: x
a data.frame (for design-based simulation or simulation based on real data) or a control object for data generation inheriting from "VirtualDataControl" (for model-based simulation or mixed simulation designs).


Argument: setup
an object of class "SampleSetup", containing previously set up samples, or a control class for setting up samples inheriting from "VirtualSampleControl".


Argument: nrep
a non-negative integer giving the number of repetitions of the simulation experiment (for model-based simulation, mixed simulation designs or simulation based on real data).


Argument: control
a control object of class "SimControl".


Argument: contControl
an object of a class inheriting from "VirtualContControl", controlling contamination in the simulation experiment.


Argument: NAControl
an object of a class inheriting from "VirtualNAControl", controlling the insertion of missing values in the simulation experiment.


Argument: design
a character vector specifying variables (columns) to be used for splitting the data into domains. The simulations, including contamination and the insertion of missing values (unless SAE=TRUE), are then performed on every domain.


Argument: fun
a function to be applied in each simulation run.


Argument: ...
for runSimulation, additional arguments to be passed to fun. For runSim, arguments to be passed to runSimulation.


Argument: SAE
a logical indicating whether small area estimation will be used in the simulation experiment.
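
As a rough illustration of what the design argument described above means, the base R sketch below splits a toy data frame into domains by one column and applies a function to each domain. It is not simFrame's implementation; the data frame toy and the column names group and value are hypothetical.

```r
# Toy data; the column names are made up for illustration.
toy <- data.frame(group = c("a", "a", "b", "b"),
                  value = c(1, 3, 10, 20))

# Function applied to each domain, analogous in spirit to 'fun'.
sim <- function(x) c(mean = mean(x$value))

# Split the rows by the design variable and evaluate 'sim' per domain.
domains <- split(toy, toy$group)
per_domain <- vapply(domains, sim, numeric(1))
print(per_domain)
```

With SAE = TRUE the data are not split this way; see the Details section.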


----------Details----------

Statistical simulation is embarrassingly parallel, hence computational performance can be increased by parallel computing. Since version 0.5.0, parallel computing in simFrame is implemented using the package parallel, which is part of the R base distribution since version 2.14.0 and builds upon work done for the contributed packages multicore and snow. Note that all objects and packages required for the computations (including simFrame) need to be made available on every worker process unless the worker processes are created by forking (see makeCluster).
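
The requirement that objects be made available on every worker can be sketched with the parallel package alone. The example below assumes a two-worker PSOCK cluster; the names scale_factor and contaminate are hypothetical and only serve as stand-ins for the objects a simulation would need.

```r
library(parallel)

cl <- makeCluster(2, type = "PSOCK")

# Objects that exist only in the master session so far (hypothetical names).
scale_factor <- 25
contaminate <- function(x) x * scale_factor

# PSOCK workers start as fresh R sessions, so the function is unknown there.
before <- unlist(clusterEvalQ(cl, exists("contaminate")))

# Copy the objects to every worker, then use them there.
clusterExport(cl, c("scale_factor", "contaminate"), envir = environment())
after <- unlist(clusterEvalQ(cl, contaminate(2)))

stopCluster(cl)
print(before)
print(after)
```

Packages are handled the same way, for instance with clusterEvalQ(cl, library(simFrame)) as in the Examples section.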

In order to prevent problems with random numbers and to ensure reproducibility, random number streams should be used. With parallel, random number streams can be created via the function clusterSetRNGStream().
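
A minimal sketch of the reproducibility point, using only the parallel package: resetting the streams with the same seed should give the same draws on the workers. The helper draw() and the seed value are illustrative choices, not part of simFrame.

```r
library(parallel)

cl <- makeCluster(2, type = "PSOCK")

# Hypothetical helper: reset the worker streams, then draw on each worker.
draw <- function(cl, seed) {
    clusterSetRNGStream(cl, iseed = seed)
    unlist(clusterEvalQ(cl, rnorm(3)))
}

first <- draw(cl, 12345)
second <- draw(cl, 12345)

stopCluster(cl)

# TRUE if both rounds produced the same numbers.
print(identical(first, second))
```

Each worker receives its own stream, so the two workers do not produce the same numbers as each other.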

There are some requirements for slot fun of the control object control. The function must return a numeric vector, or a list with the two components values (a numeric vector) and add (additional results of any class, e.g., statistical models). Note that the latter is computationally slightly more expensive. A data.frame is passed to fun in every simulation run. The corresponding argument must be called x. If comparisons with the original data need to be made, e.g., for evaluating the quality of imputation methods, the function should have an argument called orig. If different domains are used in the simulation, the indices of the current domain can be passed to the function via an argument called domain.
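
The list form of the return value might look like the sketch below. The column names group and value, the toy data, and the choice of an lm fit as the additional result are assumptions made for illustration; only the argument names x and orig and the components values and add come from the description above.

```r
# 'x' is the data of the current run, 'orig' the original data.
# The column names 'group' and 'value' are hypothetical.
sim <- function(x, orig) {
    est <- c(mean = mean(x$value), median = median(x$value))
    list(values = est,                          # numeric vector
         add = lm(value ~ group, data = x))     # additional result of any class
}

orig <- data.frame(group = c(0, 0, 1, 1), value = c(1, 2, 3, 4))
x <- orig
x$value[4] <- 40   # stand-in for a contaminated observation

res <- sim(x, orig)
print(res$values)
print(class(res$add))
```

Here orig is accepted but not used; a function that evaluates imputation quality would compare x against it.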

For small area estimation, the following points have to be kept in mind. The slot design of control for splitting the data must be supplied and the slot SAE must be set to TRUE. However, the data are not actually split into the specified domains. Instead, the whole data set (sample) is passed to fun. Also contamination and missing values are added to the whole data (sample). Last, but not least, the function must have a domain argument so that the current domain can be extracted from the whole data (sample).
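
A function for the small area case could be shaped as follows. The sketch assumes that domain holds row indices usable for subsetting; the data frame toy and its columns region and value are hypothetical.

```r
# 'x' is the whole data set (sample); 'domain' is assumed to hold the
# row indices of the current domain.
sim <- function(x, domain) {
    current <- x[domain, , drop = FALSE]
    c(domainMean = mean(current$value), overallMean = mean(x$value))
}

# Toy data with hypothetical column names.
toy <- data.frame(region = c("n", "n", "s", "s"),
                  value = c(1, 3, 10, 20))
idx <- which(toy$region == "s")

out <- sim(toy, domain = idx)
print(out)
```

Because the whole data set is available, the function can combine information from the current domain with information from all other domains.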

In every simulation run, fun is evaluated using try. Hence no results are lost if computations fail in any of the simulation runs.
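
The effect of try can be illustrated in plain R. The loop below is a sketch of the general idea only and does not reproduce simFrame's internals; the failing third run is contrived.

```r
# Four runs, the third of which fails on purpose.
runs <- lapply(1:4, function(i) {
    try(if (i == 3) stop("computation failed") else i^2, silent = TRUE)
})

# Failed runs are returned as objects of class "try-error".
failed <- vapply(runs, inherits, logical(1), what = "try-error")

# The results of the remaining runs are still available.
ok <- unlist(runs[!failed])
print(ok)
```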


----------Value----------

An object of class "SimResults".


----------Methods----------

control = "missing": convenience wrapper that allows the slots of

control = "SimControl": run a simulation experiment based on real data

nrep = "missing", control = "SimControl": run a design-based simulation

nrep = "numeric", control = "SimControl": run a model-based simulation

control = "SimControl": run a simulation experiment using a mixed


----------Author(s)----------


Andreas Alfons




----------See Also----------

makeCluster, clusterSetRNGStream, runSimulation, "SimControl", "SimResults", simBwplot, simDensityplot, simXyplot


----------Examples----------


## Not run:
## these examples require at least a dual-core processor


## design-based simulation
library(parallel)  # provides makeCluster() and related functions
library(simFrame)
data(eusilcP)  # load data

# start cluster
cl <- makeCluster(2, type = "PSOCK")

# load package and data on workers
clusterEvalQ(cl, {
    library(simFrame)
    data(eusilcP)
})

# set up random number stream
clusterSetRNGStream(cl, iseed = 12345)

# control objects for sampling and contamination
sc <- SampleControl(size = 500, k = 50)
cc <- DARContControl(target = "eqIncome", epsilon = 0.02,
    fun = function(x) x * 25)

# function for simulation runs
sim <- function(x) {
    c(mean = mean(x$eqIncome), trimmed = mean(x$eqIncome, trim = 0.02))
}

# export objects to workers
clusterExport(cl, c("sc", "cc", "sim"))

# run simulation on cluster
results <- clusterRunSimulation(cl, eusilcP,
    sc, contControl = cc, fun = sim)

# stop cluster
stopCluster(cl)

# explore results
head(results)
aggregate(results)
tv <- mean(eusilcP$eqIncome)  # true population mean
plot(results, true = tv)



## model-based simulation

# start cluster
cl <- makeCluster(2, type = "PSOCK")

# load package on workers
clusterEvalQ(cl, library(simFrame))

# set up random number stream
clusterSetRNGStream(cl, iseed = 12345)

# function for generating data
rgnorm <- function(n, means) {
    group <- sample(1:2, n, replace = TRUE)
    data.frame(group = group, value = rnorm(n) + means[group])
}

# control objects for data generation and contamination
means <- c(0, 0.25)
dc <- DataControl(size = 500, distribution = rgnorm,
    dots = list(means = means))
cc <- DCARContControl(target = "value",
    epsilon = 0.02, dots = list(mean = 15))

# function for simulation runs
sim <- function(x) {
    c(mean = mean(x$value),
        trimmed = mean(x$value, trim = 0.02),
        median = median(x$value))
}

# export objects to workers
clusterExport(cl, c("rgnorm", "means", "dc", "cc", "sim"))

# run simulation on cluster
results <- clusterRunSimulation(cl, dc, nrep = 100,
    contControl = cc, design = "group", fun = sim)

# stop cluster
stopCluster(cl)

# explore results
head(results)
aggregate(results)
plot(results, true = means)

## End(Not run)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

