R language runjags package: xgrid.run() function help documentation

Posted on 2012-9-28 23:45:19
xgrid.run(runjags)
xgrid.run() belongs to the R package: runjags

                                        Remote execution of user-specified R functions on Apple Xgrid distributed computing clusters

                                         Translator: 生物统计家园网 robot LoveR

----------Description----------

Allows arbitrary R code to be executed on Apple Xgrid distributed computing clusters and the results returned to the R session of the user.  Jobs can either be run synchronously (the process will wait for the job to complete before returning the results) or asynchronously (the process will terminate on submission of the job and results are retrieved at a later time).  Access to an Xgrid cluster with R (along with all packages required by the function) installed is required.  Because Xgrid software performs the underlying submission and retrieval of jobs, these functions can only be used on machines running Mac OS X.  Further details of the required environmental variables and the optional mgrid script to enable multi-task jobs can be found in the details section.

'xgrid.run' submits jobs to Xgrid that execute the function provided over the number of iterations specified, then intermittently retrieves the status of the job(s) and, once they have finished, retrieves and returns the results as an R list object.

'xgrid.submit' submits the job to Xgrid and returns the name of the started job (this is a convenience wrapper for xgrid.run with submitandstop=TRUE).

'xgrid.results' returns the results of a job started using 'xgrid.submit' in the current working directory.  If the job is not complete, the function will return the status of the job, or the results for completed threads (without deleting the job) if partial.retrieve=TRUE.

'xapply' is a convenience wrapper for 'xgrid.run' which takes arguments akin to lapply.


----------Usage----------



xgrid.run(f=function(iteration){}, niters,
   object.list=list(), file.list=character(0),
   threads=min(niters,100), arguments=as.list(1:niters),
   jobname=NA, wait.interval="10 min",
   xgrid.method=if(threads==1) 'simple' else
   if(!file.exists(Sys.which('mgrid'))) 'separatejobs'
   else 'separatetasks', Rpath='/usr/bin/R', Rbuild='64',
   cleanup = TRUE, submitandstop = FALSE, tempdir=!submitandstop,
   keep.files = FALSE, show.output = TRUE, max.filesize="1GB",
   sub.app=if(!file.exists(Sys.which('mgrid')))
   'xgrid -job submit -in "$indir"'
   else 'mgrid -t $ntasks -i "$indir"', sub.options="",
   sub.command=paste(sub.app, sub.options, '"$cmd"', sep=' '),
   ...)

xgrid.submit(f=function(iteration){}, niters,
   object.list=list(), file.list=character(0),
   threads=min(niters,100), arguments=as.list(1:niters),
   jobname=NA, xgrid.method=if(threads==1) 'simple' else
   if(!file.exists(Sys.which('mgrid'))) 'separatejobs'
   else 'separatetasks', Rpath='/usr/bin/R', Rbuild='64',
   show.output = TRUE, max.filesize="1GB",
   sub.app=if(!file.exists(Sys.which('mgrid')))
   'xgrid -job submit -in "$indir"'
   else 'mgrid -t $ntasks -i "$indir"', sub.options="",
   sub.command=paste(sub.app, sub.options, '"$cmd"', sep=' '),
   ...)

xgrid.results(jobname, partial.retrieve=FALSE,
   cleanup=!partial.retrieve, keep.files=FALSE, show.output=TRUE)
   
xapply(X, FUN, xgrid.options=list(), ...)




----------Arguments----------

Argument: f
the function to be iterated over on Xgrid.  This must take at least 1 argument, the first of which receives the element of the 'arguments' list for that iteration; this is the iteration number unless 'arguments' (or 'X' for xapply) is specified.  Any other arguments to be passed to the function can be supplied as additional arguments to xgrid.run/xgrid.submit/xapply. The value(s) of interest should be returned by this function (an object of any class is permissible).  No default.


Argument: niters
the total number of iterations over which to evaluate the function f.  This can be greater than the number of threads, in which case multiple iterations are evaluated serially as part of the same task.  No default.


Argument: object.list
a named list of objects that will be copied to the global environment on Xgrid and so will be visible inside the function.  Alternatively, this can be a character vector of objects, that will be looked for in the global environment, rather than a named list.  All other objects in the current working directory will not be visible when the function is evaluated. THIS INCLUDES LIBRARIES WHICH MUST BE RE-CALLED WITHIN THE FUNCTION BEFORE USE.  In order to use functions within an R library it is therefore necessary for the required library to be installed on the Xgrid nodes on which the job will be run.  If not all nodes have the required libraries installed, you can use an ART script to ensure the job is sent only to machines that do (see the example provided below), or you can use mgrid to manually request certain nodes using the '-f -h <nodename>' options. Alternatively, text files containing R code can be included in the 'file.list' argument and source()d within the function. Default blank list (no objects copied).


Argument: file.list
a vector of filenames representing files in the current working directory that will be copied to the working directory of the executed function.  This allows R code to be source()d, datasets to be loaded, and compiled code to be dynamically linked within the function, among other things.  Default none.


Argument: threads
the number of threads to generate for the job.  Threads is taken to mean jobs if xgrid.method is 'separatejobs' or tasks if xgrid.method is 'separatetasks'.  Each thread is sent to a separate node for execution, so the more threads there are the faster the job will finish (unless the number of threads exceeds the number of available nodes).  A very large number of threads may cause problems with the Xgrid controller, hence the ability to set fewer threads than iterations.  Functions that return objects of a very large size should use a large number of threads and use the xgrid.method 'separatejobs' to minimise the total size of objects returned by each xgrid job.  Default is equal to the number of iterations if this is less than 100, or 100 otherwise.
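As a rough illustration of how iterations might be shared out when there are fewer threads than iterations, the sketch below groups the iteration numbers into near-equal contiguous blocks, one block per thread. The helper name `split_iterations` and the contiguous grouping are assumptions made for illustration; the package may assign iterations to threads differently.

```r
# Hypothetical helper (not part of runjags): divide iteration numbers
# among threads in contiguous, near-equal blocks.
split_iterations <- function(niters, threads = min(niters, 100)) {
  stopifnot(niters >= 1, threads >= 1, threads <= niters)
  # Give every iteration a thread label, then sort so blocks are contiguous
  thread_id <- sort(rep(seq_len(threads), length.out = niters))
  split(seq_len(niters), thread_id)
}

# 100 iterations over 10 threads: each thread evaluates 10 iterations serially
chunks <- split_iterations(niters = 100, threads = 10)
print(lengths(chunks))
```

With this grouping, each element of `chunks` holds the iteration numbers that one thread would work through in turn.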


Argument: arguments
a list of values to be passed as the first argument to the function, with each element of the list specifying the value at that iteration.  Default is as.list(1:niters) which passes only the iteration number to the function.


Argument: jobname
for all functions except xgrid.results, the jobname can be provided to make identification of the job using Xgrid Admin easier.  If none is provided, then one is generated using a combination of the username and hostname of the submitting machine. If the provided jobname is already used by a file/folder in the working directory, then the name is altered to be unique using new_unique().  For xgrid.results, the jobname must be supplied to match the jobname value returned by xgrid.submit during job submission.


Argument: wait.interval
when running xgrid jobs synchronously, the waiting time between retrieving the status of the job.  If the job is found to be finished on retrieving the status then results are returned, otherwise the function waits for 'wait.interval' before repeating the process.  Time units of seconds, minutes, hours, days or weeks can be specified.  If no units are given the number is assumed to represent minutes.  Default "10 min".
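To make the unit rules concrete, here is a minimal sketch of how an interval string such as "10 min" could be converted to seconds, with a bare number treated as minutes as described above. The function `parse_interval` is hypothetical and is not the parser used by the package; it looks only at the first letter of the unit.

```r
# Hypothetical helper (not part of runjags): convert an interval such as
# "10 min", "15s" or "2" into a number of seconds.
parse_interval <- function(x) {
  x <- trimws(tolower(as.character(x)))
  m <- regmatches(x, regexec("^([0-9.]+)[[:space:]]*([a-z]*)$", x))[[1]]
  if (length(m) != 3) stop("unrecognised interval: ", x)
  value <- as.numeric(m[2])
  unit <- m[3]
  # Seconds, minutes, hours, days and weeks, keyed by first letter
  multipliers <- c(s = 1, m = 60, h = 3600, d = 86400, w = 604800)
  # No unit given: assume minutes, as the argument description states
  key <- if (unit == "") "m" else substr(unit, 1, 1)
  if (!key %in% names(multipliers)) stop("unrecognised unit: ", unit)
  value * multipliers[[key]]
}

print(parse_interval("10 min"))      # the documented default
print(parse_interval("15 seconds"))
print(parse_interval(2))             # no unit, so read as 2 minutes
```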


Argument: xgrid.method
the method of submitting the work to Xgrid - one of 'simple', 'separatejobs' or 'separatetasks'.  'simple' runs all chains on a single node, whereas 'separatejobs' runs all chains as individual xgrid jobs and 'separatetasks' runs all chains as individual tasks within the same job (this makes the job information in Xgrid Admin easier to read).  The 'separatetasks' method requires a submission script that is capable of supporting multi-task jobs, such as the mgrid script included with the runjags package (see the details section for more details and installation instructions).  If each chain is likely to return a large amount of information then 'separatejobs' should be used; this is because jobs are retrieved individually, which reduces the chances of overloading the Xgrid controller. Default 'simple' if threads==1; otherwise 'separatetasks' if mgrid is available or 'separatejobs' if not.
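The default described above can be written out as a small function, which may be easier to read than the nested `if` in the Usage section. The function name `default_xgrid_method` and its `mgrid_available` argument are introduced here for illustration only; the condition itself mirrors the default expression shown in Usage.

```r
# Illustrative restatement of the default for xgrid.method.
# 'mgrid_available' is a hypothetical argument that makes the rule easy to try out.
default_xgrid_method <- function(threads,
                                 mgrid_available = file.exists(Sys.which("mgrid"))) {
  if (threads == 1) {
    "simple"
  } else if (!mgrid_available) {
    "separatejobs"
  } else {
    "separatetasks"
  }
}

print(default_xgrid_method(1, mgrid_available = TRUE))
print(default_xgrid_method(10, mgrid_available = FALSE))
print(default_xgrid_method(10, mgrid_available = TRUE))
```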


Argument: Rpath
the path to the R executable on the xgrid machines. If not all machines on the xgrid cluster have R (or a required package) installed then it is possible to use an ART script to ensure the job is sent to only machines that do - see the examples section for details.  Default '/usr/bin/R' (this is the default install location for R).


Argument: Rbuild
the preferred binary of R to invoke.  '64' results in 'Rpath64' (if it exists), '32' in 'Rpath32' (if it exists) and '' (or either of '32' or '64' if they are not found) results in Rpath. Notice that this indicates a preference, not a certainty - if the indicated build is not available then another will be used.  Also note that specifying '64' may be ignored for PPC nodes depending on what version of R they are running (you can ensure only intel nodes are used with mgrid using sub.options='-c intel').  Default ''.


Argument: partial.retrieve
for xgrid.results, option to retrieve results of partially completed jobs.  Setting this to TRUE makes cleanup default to FALSE.  Default FALSE.


Argument: cleanup
option to delete the job(s) from Xgrid after retrieving results.


Argument: submitandstop
controls whether job should be run synchronously (submitandstop=FALSE), in which case the process will wait for the model to complete before returning the results, or asynchronously (submitandstop=TRUE), in which case the process will terminate on submission of the job and results are retrieved at a later time.  Default for xgrid.run is FALSE.  xgrid.submit is a wrapper to xgrid.run with submitandstop=TRUE.


Argument: tempdir
for xgrid.run, option to use the temporary directory as specified by the system rather than creating files in the working directory.  Any files created in the temporary directory are removed when the function exits.  A temporary directory cannot be used for xgrid.submit.  Default TRUE when running the job synchronously.


Argument: keep.files
option to keep the folder with files needed to run the job rather than deleting it, or to copy the folder to the working directory before exiting if tempdir=TRUE.  This may be useful for debugging failing jobs.  Default FALSE.


Argument: show.output
option to print the output of the function (obtained using cat, writeLines or print for example) at each iteration after retrieving the job(s) from xgrid.  If FALSE, the output is suppressed.  Default TRUE.


Argument: max.filesize
the maximum total size of the objects produced by the function for each thread if xgrid.method='separatejobs', or for the entire job if xgrid.method='separatetasks'.  This is a failsafe designed to prevent attempted transfer of huge files bringing the xgrid controller down.  If the maximum size is exceeded for a thread or job then the results are erased for all iterations within that thread or job, and the job will likely have to be re-submitted.  If each chain is likely to return a large amount of information, then 'separatejobs' should be used because jobs are retrieved individually, which reduces the chances of overloading the Xgrid controller.  The object.list is also checked to ensure it complies with the maximum size, but the file.list and any objects saved to the working directory by the function are NOT automatically checked.  Units can be provided as either "MB" or "GB".  Default "1GB".


Argument: sub.app
the submission application or script to use for job running/submission.  The inbuilt Xgrid application supports most options, but greater functionality is provided by the mgrid script (see the details section for more information and installation instructions). Any other custom script can be used with the requirements that it submit the job provided and print the Xgrid job ID to screen before exiting (as the only numerical value printed), or alternatively the script may submit the job and create a 'jobid.txt' file in the working directory containing the job id.  If xgrid.method is 'separatejobs' then the argument may be of length equal to the number of chains, in which case each job is submitted using a different application/script. Paths with spaces in them must be quoted when the command is passed to the shell (this may mean escaping quotes if necessary).  Default uses mgrid if installed, otherwise 'xgrid -job submit'.


Argument: sub.options
one or more option flags to be passed through to the submission application (as a character string).  Examples include ART scripts, email on job completion, and, when using the mgrid script, many other possibilities (see the details section).  When providing links to files as part of the command, all links must be absolute (i.e. start with / or ~) as xgrid/mgrid will not be called in the working directory, and paths with spaces must be quoted.  If xgrid.method is 'separatejobs' then the argument may be of length equal to the number of chains, in which case each job receives a different set of options.  Some options require the Xgrid controller to be running OS X Leopard (10.5) or later.  Default none.


Argument: sub.command
the actual command to be executed using system() to submit the job.  Changing this results in sub.app and sub.options being ignored, and is probably the best option to use for custom submission scripts (see the sub.app argument for the requirements for custom scripts).  The environmental variables $cmd (the name of the BASH script to be run), $ntasks (the number of tasks), $job (the job number for multiple jobs), and $indir (the input directory) will be available to the script.  For multiple tasks, the custom script should ensure that the task number is supplied as the (only) argument to the BASH script (requires xgrid.method="separatetasks" to function).  If xgrid.method is 'separatejobs' then the argument may be of length equal to the number of chains, in which case each job receives a different command.  Paths with spaces in them must be quoted when the command is passed to the shell (this may mean escaping quotes if necessary).  Default uses the values of sub.app and sub.options.
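The way the default command string is put together can be tried out locally without an Xgrid controller, since it is only a call to paste(). The sketch below reuses the default expressions from the Usage section; the node names are placeholders, and the '-f -h' flags are the mgrid options mentioned in the object.list description. Nothing is submitted here, the strings are only printed.

```r
# Default pieces as shown in the Usage section (case where mgrid is not installed)
sub.app <- 'xgrid -job submit -in "$indir"'
sub.options <- ""
sub.command <- paste(sub.app, sub.options, '"$cmd"', sep = " ")
print(sub.command)

# With xgrid.method = 'separatejobs', a vector of options yields one command per job.
# The node names below are placeholders for illustration.
nodenames <- c("mynode", "guestnode")
mgrid.app <- 'mgrid -t $ntasks -i "$indir"'
per.job <- paste(mgrid.app,
                 paste("-f -h '", nodenames, "'", sep = ""),
                 '"$cmd"', sep = " ")
print(per.job)
```

The `$indir`, `$ntasks` and `$cmd` placeholders are left untouched by R; they are the environmental variables described above and would be expanded by the shell when the command is run.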


Argument: X
for xapply, a vector (atomic or list) over which to apply the function provided.  Equivalent to 'arguments' for xgrid.run, with niters = length(X).


Argument: FUN
for xapply, the function to be passed to xgrid.run as 'f'.


Argument: xgrid.options
for xapply, a list of any arguments to be passed to xgrid.run (with the exception of 'f', 'niters' and 'arguments', which are ignored).


Argument: ...
additional arguments to be passed to the function provided.


----------Details----------

These functions allow JAGS models to be run on Xgrid distributed computing clusters from within R using the same syntax as required to run the models locally.  All the functionality could be replicated by saving all necessary objects to files and using the Xgrid command line utility to submit and retrieve the job manually; these functions merely provide the convenience of not having to do this manually.  Xgrid support is only available on Mac OS X machines.  

The xgrid controller hostname and password must be set as environmental variables. The command line version of R knows about environmental variables set in the .profile file, but unfortunately the GUI version does not and requires them to be set from within R using:

Sys.setenv(XGRID_CONTROLLER_HOSTNAME="<hostname>")

Sys.setenv(XGRID_CONTROLLER_PASSWORD="<password>")

(These lines could be copied into your .Rprofile file for a 'set and forget' solution)
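Before submitting a job it can be worth checking that both variables are actually visible to the current R session, particularly in the GUI. The following is a minimal sketch; the function name `missing_xgrid_env` is an assumption introduced here and is not part of the package.

```r
# Hypothetical helper (not part of runjags): report which of the two
# controller variables are unset or empty in the current R session.
missing_xgrid_env <- function() {
  required <- c("XGRID_CONTROLLER_HOSTNAME", "XGRID_CONTROLLER_PASSWORD")
  values <- Sys.getenv(required, unset = "")
  required[!nzchar(values)]
}

missing_vars <- missing_xgrid_env()
if (length(missing_vars) > 0) {
  message("Not set: ", paste(missing_vars, collapse = ", "))
}
```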

All functions can be run using the built-in xgrid commands; however, some added functionality (including multi-task jobs to enable the 'separatetasks' method) is provided by the 'mgrid.sh' BASH shell script which is included with the runjags package (in the 'inst/xgrid' folder for the package source or the 'xgrid' folder for the installed package). More details about this script are given at the top of the mgrid.sh file. To install (optional), see the install.mgrid function.


----------Value----------

For xgrid.submit, a list containing the jobname (which will be required by xgrid.results to retrieve the job) and the job ID(s) for use with the xgrid command line facilities.  For xgrid.run and xgrid.results, the output of the function over all iterations is returned as a list, with each element of the list representing the results at each iteration.  If the function returned an error, then the error will be held in the list as the return value at the iteration that returned the error.  If the function returns an object that exceeds the 'max.filesize' when combined with the results for other iterations in that job (or greater than max.filesize/threads for multi-task jobs), the results for that thread are replaced with an error message (this is to prevent the xgrid controller crashing due to transferring large files). The xapply function returns the same value as xgrid.run (or as xgrid.submit if xgrid.options=list(submitandstop=TRUE), in which case the results can be retrieved using xgrid.results).
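The shape of the returned list, including errors held in place of results, can be imitated locally. The sketch below is a stand-in that runs everything in the current session with lapply and tryCatch; `local_run` is a hypothetical name, and the "iteration.N" element names are an assumption based on the `$iteration.1` accessor used in the examples.

```r
# Hypothetical local stand-in (not part of runjags): evaluate f over the
# arguments list and keep any error as the value for that iteration.
local_run <- function(f, niters, arguments = as.list(seq_len(niters)), ...) {
  stopifnot(length(arguments) == niters)
  results <- lapply(arguments, function(arg) {
    tryCatch(f(arg, ...), error = function(e) e)
  })
  names(results) <- paste("iteration", seq_len(niters), sep = ".")
  results
}

# A function that fails on its second iteration
f <- function(iteration, scale = 1) {
  if (iteration == 2) stop("iteration 2 failed")
  iteration * scale
}

res <- local_run(f, niters = 3, scale = 10)

# Identify which iterations returned an error rather than a result
failed <- vapply(res, inherits, logical(1), what = "error")
print(names(res)[failed])
```

Checking for elements that inherit from "error" in this way is one option for deciding which iterations would need to be re-run.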


----------Author(s)----------


Matthew Denwood <matthew.denwood@glasgow.ac.uk>



----------See Also----------

xgrid.run.jags for functions to run JAGS models on Xgrid, or run.jags to do so locally.

install.mgrid to install the mgrid script.

mclapply and mcparallel in the multicore package for parallel execution of code over multiple local cores.


----------Examples----------



# A basic example of synchronous running of code over 100 iterations,
# split up between 10 tasks (or 10 jobs if mgrid is not installed):

## Not run:

# The function to evaluate:
f <- function(iteration){
        # All objects supplied to object.list will be visible here, but
        # remember to call all necessary libraries within the function

        cat("Running iteration", iteration, "\n")
        # Some lengthy code evaluation....

        output <- rpois(10, iteration)
        return(output)
}

# Run the function on xgrid for 100 iterations split between 10 machines:
results <- xgrid.run(f, niters=100, threads=10)


## End(Not run)



# A basic example of xapply to calculate the mean of a list of numbers:

## Not run:

# A list of 3 datasets from which to calculate the mean:
datasets <- list(c(1,5,6,NA), c(9,2,NA,0), c(-1,4,10,20))

# Standard lapply syntax:
results1 <- lapply(datasets, mean, na.rm=TRUE)

# Equivalent xapply syntax:
results2 <- xapply(datasets, mean,
        xgrid.options=list(wait.interval='15s'), na.rm=TRUE)

# Or submit the job:
id <- xapply(datasets, mean, xgrid.options=list(submitandstop=TRUE),
        na.rm=TRUE)
# And retrieve the results:
results3 <- xgrid.results(id)


## End(Not run)



# Any packages required by the function need to be installed on the
# nodes the function is run on.  This function retrieves information
# about the available packages on each of the node names provided:

## Not run:

# The name of one or more nodes to get information about:
nodenames <- c("mynode", "guestnode", "othernode")

# Run the job:
results <- xgrid.run(function(i){
                return(installed.packages()[,'Version'])
        },
        niters=length(nodenames), threads=length(nodenames),
        wait.interval="10 seconds", xgrid.method='separatejobs',
        sub.options=paste("-f -h '", nodenames, "'", sep=""),
        show.output=FALSE)
# Make the names match up to the statistics:
names(results) <- nodenames

# Show the available packages and their versions for each node:
results


## End(Not run)


# An example of running an Xgrid job within another Xgrid job, using
# xgrid.submit to submit a job that runs a JAGS model to convergence
# using xgrid.autorun.jags:

## Not run:

# Create an ART script to make sure that (a) R is installed,
# (b) JAGS is installed, and (c) the runjags package is installed
# on the node:
cat('#!/bin/bash

if [ ! -f /usr/bin/R ]; then
echo 0
exit 0
fi
if [ ! -f /usr/local/bin/jags ]; then
echo 0
exit 0
fi
/usr/bin/R --slave -e "suppressMessages(r<-require(runjags,quietly=T));cat(r*1,fill=T)"
exit 0
', file='runjagsART.sh')

# Some data etc. that we will need for the model:
library(runjags)

X <- 1:100
Y <- rnorm(length(X), 2*X + 10, 1)
data <- dump.format(list(X=X, Y=Y, N=length(X)))

# Model in the JAGS format:
model <- "model {
for(i in 1 : N){
Y[i] ~ dnorm(true.y[i], precision);
true.y[i] <- (m * X[i]) + c;
}
m ~ dunif(-1000,1000);
c ~ dunif(-1000,1000);
precision ~ dexp(1);
}"

# Get the Xgrid controller hostname and password to be passed
# to the slave job:
hostname <- Sys.getenv('XGRID_CONTROLLER_HOSTNAME')
password <- Sys.getenv('XGRID_CONTROLLER_PASSWORD')

# The function we are going to call on xgrid:
f <- function(iteration){
        # Make sure the necessary environmental variables are set:
        Sys.setenv(XGRID_CONTROLLER_HOSTNAME=hostname)
        Sys.setenv(XGRID_CONTROLLER_PASSWORD=password)

        # Call the library on the node:
        library(runjags)

        # Use xgrid.autorun.jags to run 2 chains until convergence:
        results <- xgrid.autorun.jags(model=model,
                monitor=c("m", "c", "precision"), data=data, n.chains=2,
                inits=list(list(.RNG.name='base::Wichmann-Hill'),
                list(.RNG.name='base::Marsaglia-Multicarry')),
                plots = FALSE, xgrid.method='separatejobs',
                wait.interval='1 min', jobname='xgridslavejob')
       
        return(results)
}

# Submit the function to xgrid using our ART script to ensure the
# node can handle the job (the ART script path must be specified as
# an absolute link as xgrid won't be called in the current working
# directory, and all paths must be enclosed in quotes to preserve
# spaces):
name <- xgrid.submit(f, object.list=list(X=X, Y=Y, model=model,
        data=data, hostname=hostname, password=password), threads=1,
        niters=1, sub.options=if(!file.exists(Sys.which('mgrid')))
        paste('-art "', getwd(), '/runjagsART.sh"', sep='') else
        paste('-a "', getwd(), '/runjagsART.sh"', sep=''),
        xgrid.method='simple')
# Cleanup (remove runjagsART file):
unlink('runjagsART.sh')

# Get the results once it is finished:
results <- xgrid.results(name)$iteration.1


## End(Not run)



## Not run:

# Submit an xgrid job just to see which packages are installed
# on a particular machine.

# Ensure mgrid is installed:
if(!file.exists(Sys.which('mgrid'))) install.mgrid()

# A function to harvest details of R version and installed packages:
f <- function(i){

        archavail <- any(dimnames(installed.packages())[[2]]=='Archs')

        # To deal with older versions of R:
        if(archavail){
                packagesinst <- installed.packages()[,c('Version', 'Archs', 'Built')]
        }else{
                packagesinst <- installed.packages()[,c('Version', 'OS_type', 'Built')]
        }

        Rinst <- unlist(R.version[c('version.string', 'arch', 'platform')])
        names(Rinst) <- c('Version', 'Archs', 'Built')
        return(rbind(R=Rinst, packagesinst))

}

# Or to get more details about a particular package:
g <- function(i){
        p <- library(help='bayescount')
        return(p$info)
}

# Get the information back from 2 specific machines called 'newnode1'
# and 'newnode2':

results <- xgrid.run(f, niters=2, threads=2,
        sub.options='-h newnode1:newnode2', wait.interval='15 seconds')


## End(Not run)




When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).


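Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the robot of 生物统计家园网. It is intended only for personal reference when learning R; 生物统计家园 retains the copyright.
Note 2: Because the translation was produced automatically, inaccuracies are unavoidable; carefully comparing the Chinese and English text can help with learning R.
Note 3: If you find inaccuracies, please reply to this thread and we will gradually revise them.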