mysqlDBApply(RMySQL)
mysqlDBApply() belongs to the R package RMySQL
Apply R/S-Plus functions to remote groups of DBMS rows (experimental)
----------Description----------
Applies R/S-Plus functions to groups of remote DBMS rows without bringing the entire result set into R all at once. The result set is expected to be sorted by the grouping field.
----------Usage----------
mysqlDBApply(res, INDEX, FUN = stop("must specify FUN"),
    begin = NULL,
    group.begin = NULL,
    new.record = NULL,
    end = NULL,
    batchSize = 100, maxBatch = 1e6,
    ..., simplify = TRUE)
----------Arguments----------
Argument: res
a result set (see dbSendQuery).
Argument: INDEX
a character or integer specifying the field name or field number that defines the various groups.
Argument: FUN
a function to be invoked upon identifying the last row of every group. This function is passed a data frame holding the records of the current group, a character string with the group label, plus any other arguments passed to dbApply as "...".
Argument: begin
a function of no arguments to be invoked just prior to retrieving the first row from the result set.
Argument: end
a function of no arguments to be invoked just after retrieving the last row from the result set.
Argument: group.begin
a function of one argument (the group label) to be invoked upon identifying a row from a new group.
Argument: new.record
a function to be invoked as each individual record is fetched. The first argument to this function is a one-row data.frame holding the new record.
Argument: batchSize
the default number of rows to fetch from the remote result set at a time. If needed, this is automatically extended to hold groups bigger than batchSize.
Argument: maxBatch
the absolute maximum number of rows per group that may be extracted from the result set.
Argument: ...
any additional arguments to be passed to FUN.
Argument: simplify
Not yet implemented.
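As an illustration of the FUN contract described in the arguments above, here is a minimal sketch that runs without a database. The data frame `grp_rows`, the function name `agent_quantiles`, and its `probs` parameter are hypothetical and chosen only for this example; the call at the end imitates how FUN is described as being invoked (data frame first, group label second, then any "..." arguments).

```r
## Hypothetical local stand-in for the rows of one group; no DBMS involved.
grp_rows <- data.frame(Agent = c("a1", "a1", "a1", "a1"),
                       DATA  = c(1, 2, 3, 4))

## FUN-style function: the group's data frame, then the group label,
## then any extra arguments that would travel through "...".
agent_quantiles <- function(x, grp, probs = c(0.25, 0.5, 0.75)) {
  quantile(x$DATA, probs = probs, names = FALSE)
}

## Invoke it the way the FUN argument is described.
res <- agent_quantiles(grp_rows, "a1", probs = 0.5)
print(res)
```

With a real result set, a function of this shape would be supplied as `FUN`, and `probs` would be given as an extra argument in the dbApply call.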
----------Details----------
dbApply is meant to handle large amounts of data from the DBMS somewhat gracefully(?) by bringing manageable chunks into R (about batchSize records at a time, but not more than maxBatch); the idea is that the data from individual groups can be handled by R, but not all the groups at the same time.
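The chunking idea can be sketched in plain R without a database. This is not the RMySQL implementation: `make_fetcher` and `apply_by_group` are hypothetical helpers, the data frame stands in for a sorted remote result set, and the maxBatch limit is not modelled. The sketch only shows why sorted input lets each group be processed as soon as a row from the next group arrives, while an unfinished group is carried over to the next batch.

```r
## Local stand-in for a sorted remote result set (hypothetical data).
rows <- data.frame(Agent = c("a", "a", "a", "b", "b", "c"),
                   DATA  = c(1, 2, 3, 10, 20, 100),
                   stringsAsFactors = FALSE)

## Hypothetical fetcher: returns the next n rows, as a cursor would.
make_fetcher <- function(df) {
  pos <- 0L
  function(n) {
    if (pos >= nrow(df)) return(df[0, , drop = FALSE])
    idx <- seq(pos + 1L, min(pos + n, nrow(df)))
    pos <<- max(idx)
    df[idx, , drop = FALSE]
  }
}

## Apply fun to each group, holding only unfinished rows between batches.
apply_by_group <- function(fetch_next, index, fun, batch_size = 2L) {
  out <- list()
  pending <- NULL                      # rows of the group still being read
  repeat {
    chunk <- fetch_next(batch_size)
    done <- nrow(chunk) == 0L          # an empty chunk means no more rows
    buf <- if (is.null(pending)) chunk else rbind(pending, chunk)
    if (nrow(buf) == 0L) break
    labels <- as.character(buf[[index]])
    last <- labels[length(labels)]
    ## All groups but the last one in the buffer are known to be complete;
    ## the last one is complete only once the result set is exhausted.
    complete <- if (done) unique(labels) else setdiff(unique(labels), last)
    for (g in complete) out[[g]] <- fun(buf[labels == g, , drop = FALSE], g)
    if (done) break
    pending <- buf[labels == last, , drop = FALSE]
  }
  out
}

totals <- apply_by_group(make_fetcher(rows), "Agent",
                         function(x, grp) sum(x$DATA))
print(totals)
```

Here group "a" has three rows while the batch size is two, so its rows accumulate across two batches before `fun` is called, loosely mirroring how batchSize is described as being extended for larger groups.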
The MySQL implementation mysqlDBApply allows us to register R functions that get invoked when certain fetching events occur. These include the “begin” event (no records have yet been fetched), “group.begin” (the record just fetched belongs to a new group), “new.record” (every fetched record generates this event), “group.end” (the record just fetched was the last row of the current group; this is when FUN is invoked), and “end” (the very last record from the result set has been fetched). Awk and Perl programmers will find this paradigm very familiar (although SAP's ABAP language is closer to what we're doing).
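The order in which these events fire can be sketched with a local loop over already-sorted rows. This is an illustration only, not the RMySQL code: `walk_events` and `log_event` are hypothetical, the callback names simply reuse the argument names from the usage section, and the rows are made-up data.

```r
## Hypothetical sorted rows standing in for a fetched result set.
rows <- data.frame(Agent = c("a", "a", "b"), DATA = c(1, 2, 3),
                   stringsAsFactors = FALSE)

## Collect event names in the order they fire.
events <- character(0)
log_event <- function(...) events <<- c(events, paste0(...))

walk_events <- function(df, index, FUN, begin = NULL, group.begin = NULL,
                        new.record = NULL, end = NULL) {
  out <- list()
  if (!is.null(begin)) begin()                 # before the first row
  labels <- as.character(df[[index]])
  n <- nrow(df)
  for (i in seq_len(n)) {
    g <- labels[i]
    if (i == 1L || labels[i - 1L] != g) {      # first row of a new group
      start <- i
      if (!is.null(group.begin)) group.begin(g)
    }
    if (!is.null(new.record)) new.record(df[i, , drop = FALSE])
    if (i == n || labels[i + 1L] != g) {       # last row of the group
      out[[g]] <- FUN(df[start:i, , drop = FALSE], g)
    }
  }
  if (!is.null(end)) end()                     # after the last row
  out
}

out <- walk_events(rows, "Agent",
  FUN = function(x, grp) { log_event("group.end:", grp); mean(x$DATA) },
  begin = function() log_event("begin"),
  group.begin = function(grp) log_event("group.begin:", grp),
  new.record = function(rec) log_event("new.record:", rec$DATA),
  end = function() log_event("end"))
print(events)
```

The recorded sequence starts with "begin", then alternates group-level and record-level events for each group, and finishes with "end".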
----------Value----------
A list with as many elements as there were groups in the result set.
----------Note----------
This is an experimental version implemented only in R (there are plans, time permitting, to implement it in S-Plus).
The terminology that we're using is closer to SQL than to R. What we refer to as “groups” are, in R, the individual levels of a factor (the grouping field in our terminology).
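For readers who think in R terms, the correspondence between groups and factor levels can be seen on an in-memory data frame. This sketch uses base R's split() and lapply() on hypothetical data; it is an in-memory analogue of the per-group result, not something dbApply itself does.

```r
## Hypothetical in-memory data; Agent plays the role of the grouping field.
df <- data.frame(Agent = c("a", "a", "b", "b", "b"),
                 DATA  = c(1, 3, 10, 20, 30))

## split() makes one data frame per factor level, i.e. one per "group".
by_level <- lapply(split(df, df$Agent), function(x) median(x$DATA))
print(by_level)
```

The result is a list with one element per level of the grouping field, which matches the shape described in the Value section.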
----------See Also----------
MySQL, dbSendQuery, fetch.
----------Examples----------
## Not run: 
## compute quantiles for each network agent
library(RMySQL)
con <- dbConnect(MySQL(), group = "vitalAnalysis")
res <- dbSendQuery(con,
    "select Agent, ip_addr, DATA from pseudo_data order by Agent")
out <- dbApply(res, INDEX = "Agent",
    FUN = function(x, grp) quantile(x$DATA, names = FALSE))
## release the result set and close the connection when done
dbClearResult(res)
dbDisconnect(con)
## End(Not run)