track-package(track)
R package: track
Overview of track package
Translator: 生物统计家园网 (machine translation bot LoveR)
----------Description----------
The track package sets up a link between R objects in memory and files on disk so that objects are automatically saved to files when they are changed. R objects in files are read in on demand and do not consume memory prior to being referenced. The track package also tracks times when objects are created and modified, and caches some basic characteristics of objects to allow for fast summaries of objects.
Each object is stored in a separate RData file in the standard format used by save(), so that objects can be manually picked out of or added to the track database if needed. The track database is a directory, usually named rdatadir, that contains an RData file for each object and several housekeeping files that are either plain text or RData files.
Tracking works by replacing a tracked variable with an active binding (see makeActiveBinding), which when accessed looks up information in an associated 'tracking environment' and reads or writes the corresponding RData file and/or gets or assigns the variable in the tracking environment. In the default mode of operation, R variables that are accessed are stored in memory for the duration of the top-level task (i.e., one expression evaluated from the prompt). A callback that is called each time a top-level task completes does three major things:
detects newly created or deleted variables, and adds them to or removes them from the tracking database as appropriate,
writes changed variables to the database, and
deletes cached objects from memory.
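The mechanism described above can be illustrated with base R alone. The following is a minimal sketch, not the track package's own code: the names track_sketch, db_dir and cache are hypothetical, the directory is a temporary one, and the file is written immediately on assignment rather than at the end of the top-level task.

```r
# Base-R sketch of the mechanism only; this is not the track package's code.
db_dir <- file.path(tempdir(), "rdatadir_sketch")   # hypothetical database directory
dir.create(db_dir, showWarnings = FALSE)
cache <- new.env()                                  # stands in for the tracking environment

track_sketch <- function(name, env) {
  file <- file.path(db_dir, paste0(name, ".RData"))
  store <- function(v) {
    assign(name, v, envir = cache)                 # keep a cached copy
    save(list = name, file = file, envir = cache)  # and write it to disk
  }
  store(get(name, envir = env, inherits = FALSE))
  rm(list = name, envir = env)
  # Replace the ordinary variable with an active binding.
  makeActiveBinding(name, function(v) {
    if (!missing(v)) {
      store(v)                                     # assignment: update cache and file
    } else {
      if (!exists(name, envir = cache, inherits = FALSE)) {
        load(file, envir = cache)                  # read on demand
      }
      get(name, envir = cache, inherits = FALSE)
    }
  }, env)
  invisible(file)
}

demo_env <- new.env()
assign("x", 1:10, envir = demo_env)
track_sketch("x", demo_env)
demo_env$x <- 42                    # the write goes through the binding to the file
rm(list = "x", envir = cache)       # simulate the end-of-task cache flush
value_after_flush <- demo_env$x     # reloaded from the RData file
```

After the simulated flush, the value is no longer held in the cache environment until the binding is read again, which is the sense in which objects do not consume memory before being referenced.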
Tracking is not particularly suitable for storing objects that contain environments, because those environments and their contents are written out in full in the saved file (in a live R session, environments are references, and there can be multiple references to one environment). Functions are among the most common objects that contain environments, and those environments can contain data objects local to the function (e.g., see the examples in the R FAQ in the section "Lexical scoping" under "What are the differences between R and S?" http://cran.r-project.org/doc/FAQ/R-FAQ.html#Lexical-scoping). Additionally, the results of some modeling functions contain environments, e.g., lm holds several references to the environment that contains the data. When an lm object is saved with save(), the environment containing the data, and all the other objects in that environment, can be saved in the same file.
To work with large data objects and modeling functions, consider first creating a tracking database that contains the data objects. Then, in a different R session (which can be running at the same time), use track.attach to attach the database of data objects at pos=2 on the search list. When working in this way, the data objects are kept in memory only while being used, and modeling functions that record environments in their results can be used successfully (though beware of modeling functions that store large amounts of data in their results). Alternatively, use modeling functions that do not store references to environments. The utility function show.envs from the track package shows which environments are referenced within an object (though it is not guaranteed to find them all).
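The cost of saving objects that carry environments can be seen with base R's serialize(), which writes the same kind of representation that save() stores. This is a small sketch; make_counter, big and the sizes involved are illustrative only.

```r
make_counter <- function() {
  big <- numeric(1e5)        # large local object that the closure never uses
  count <- 0
  function() {
    count <<- count + 1
    count
  }
}

counter <- make_counter()    # closure whose environment holds 'big' and 'count'
plain <- function() NULL     # closure over the global environment

# The closure's enclosing environment is written out in full,
# whereas the global environment is stored only as a reference.
size_counter <- length(serialize(counter, NULL))
size_plain <- length(serialize(plain, NULL))
captured <- ls(environment(counter))
```

Here size_counter is dominated by the 100000 doubles in big, even though the function body never touches them.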
The track package also provides a self-contained incremental history saving function that writes the most recent command to the file .Rincr_history at the end of each top-level task, along with a time stamp that does not appear in the interactive history. The standard history functionality (savehistory/loadhistory) in R writes the history only at the end of the session. Thus, if the R session terminates abnormally, history is lost.
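The idea of writing each command to a file as soon as its top-level task completes can be sketched with addTaskCallback() from base R. This is an illustration under stated assumptions, not the package's implementation: log_command is a hypothetical name, the file is a temporary stand-in for .Rincr_history, and the time stamp format is invented for the example.

```r
hist_file <- file.path(tempdir(), "Rincr_history_sketch")   # stand-in for .Rincr_history

log_command <- function(expr, value, ok, visible) {
  stamp <- format(Sys.time(), "##------ %Y-%m-%d %H:%M:%S ------##")
  con <- file(hist_file, open = "a")          # append, so earlier entries survive
  on.exit(close(con))
  writeLines(c(stamp, deparse(expr)), con)    # time stamp, then the command text
  TRUE                                        # TRUE keeps the callback registered
}

id <- addTaskCallback(log_command, name = "incr_history_sketch")
# ... commands entered here would be appended to hist_file as they complete ...
removeTaskCallback(id)

# The callback can also be exercised directly:
log_command(quote(x <- 1), NULL, TRUE, FALSE)
last_lines <- tail(readLines(hist_file), 2)
```

Because the file is appended to after every task, an abnormal termination loses at most the command that was running at the time.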
----------Details----------
There are four main reasons to use the track package:
conveniently handle many moderately large objects that would collectively exhaust memory or be inconvenient to manage in files by manually using save(), load(), and/or save.image();
have changed or newly created objects saved automatically at the end of each top-level command, which ensures objects are preserved in the event of accidental or abnormal termination of the R session, and which also makes startup and saving much faster when many large objects in the global environment must be loaded or saved;
keep track of creation and modification times on objects;
get fast summaries of basic characteristics of objects: class, size, dimension, etc.
There is an option to control whether tracked objects are cached in memory as well as being stored on disk. By default, objects are cached in memory for the duration of a top-level task. To save time when working with collections of objects that will all fit in memory, turn on caching and turn off cache flushing with track.options(cache=TRUE, cachePolicy="none"), or start tracking with track.start(..., cache=TRUE, cachePolicy="none"). A possible future improvement is to allow conditional and/or more intelligent caching of objects. Some of the data needed for this is already collected in the access counts and times that are recorded in the tracking summary.
A brief example of tracking some variables in the global environment is given in the Examples section at the end of this page.
The global environment is the default environment for tracking. Variables in other environments can also be tracked, but the environment must then be supplied as an argument to the track functions.
By default, newly created or deleted variables are automatically added to or removed from the tracking database. This feature can be disabled by supplying auto=FALSE to track.start(), or by calling track.auto(FALSE).
When tracking is stopped, all tracked variables are saved on disk and will no longer be accessible until tracking is started again.
Each object is stored in its own file in the tracking directory, in the format used by save()/load() (RData files).
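Since each file is an ordinary save() file, one object can be inspected by hand with load(). The sketch below creates its own stand-in file; the name a.RData is hypothetical and is not meant to describe how the track package names files inside a tracking directory.

```r
# Hypothetical file standing in for one object's file in a tracking directory.
obj_file <- file.path(tempdir(), "a.RData")
a <- c(1.5, 2.5)
save(a, file = obj_file)

# Load into a scratch environment so nothing in the workspace is overwritten.
scratch <- new.env()
loaded_names <- load(obj_file, envir = scratch)   # load() returns the object names
restored <- get(loaded_names[1], envir = scratch)
```

Loading into a separate environment avoids clobbering a variable of the same name in the global environment.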
----------List of basic functions and common calling patterns----------
For straightforward use of the track package, only a single call to track.start() need be made to start automatically tracking the global environment. If it is desired to save untrackable variables at the end of the session, track.stop() should be called before calling save.image() or q('yes'), because track.stop() will ensure that tracked variables are saved to disk and then remove them from the global environment, leaving save.image() to save only the untracked or untrackable variables. The basic functions used in automatic tracking are as follows:
track.start(dir=...): start tracking the global environment, with files saved in dir (the default is rdatadir).
track.summary(): print a summary of the basic characteristics of tracked variables: name, class, extent, and creation, modification and access times.
track.info(): print a summary of which tracking databases are currently active.
track.stop(pos=, all=): stop tracking. Any unsaved tracked variables are saved to disk. Unless keepVars=TRUE is supplied, all tracked variables become unavailable until tracking starts again.
track.attach(dir=..., pos=): attach an existing tracking database to the search list at the specified position. The default when attaching at positions other than 1 is to use readonly mode, but in non-readonly mode, changes to variables in the attached environment will be automatically saved to the database.
track.rescan(pos=): rescan a tracking directory that was attached by track.attach() at a position other than 1, and that is preferably readonly.
For the non-automatic mode, four other functions cover the majority of common usage:
track.start(dir=..., auto=TRUE/FALSE): start tracking the global environment, with files saved in dir
track(x): start tracking x; x in the global environment is replaced by an active binding, and x is saved in its corresponding file in the tracking directory and, if caching is on, in the tracking environment
track(x <- value): start tracking x
track(list=c('x', 'y')): start tracking specified variables
track(all=TRUE): start tracking all untracked variables in the global environment
untrack(x): stop tracking variable x; the R object x is put back as an ordinary object in the global environment
untrack(all=TRUE): stop tracking all variables in the global environment (but tracking is still set up)
untrack(list=...): stop tracking specified variables
track.remove(x): completely remove all traces of x from the global environment, tracking environment and tracking directory. Note that if variable x in the global environment is tracked, remove(x) will make x an "orphaned" variable: remove(x) will just remove the active binding from the global environment, and leave x in the tracking environment and on file, and x will reappear after restarting tracking.
----------Complete list of functions and common calling patterns----------
The track package provides many additional functions for controlling how tracking is performed (e.g., whether or not tracked variables are cached in memory), examining the state of tracking (showing which variables are tracked, untracked, orphaned, masked, etc.) and repairing tracking environments and databases that have become inconsistent or incomplete (this may result from resource limitations, e.g., being unable to write a save file due to lack of disk space, or from manual tinkering, e.g., dropping a new save file into a tracking directory).
The functions that can be used to set up and take down tracking are:
track.start(dir=...): start tracking, using the supplied directory
track.stop(): stop tracking (any unsaved tracked variables are saved to disk and all tracked variables become unavailable until tracking starts again)
track.dir(): return the path of the tracking directory
Functions for tracking and stopping tracking variables:
track(x), track(var <- value), track(list=...), track(all=TRUE): start tracking variable(s)
track.load(file=...): load some objects from an RData file into the tracked environment
untrack(x, keep.in.db=FALSE), untrack(list=...), untrack(all=TRUE): stop tracking variable(s); the value is left in place and, optionally, it is also left in the database
Functions for getting status of tracking and summaries of variables:
track.summary(): return a data frame containing a summary of the basic characteristics of tracked variables: name, class, extent, and creation, modification and access times.
track.status(): return a data frame containing information about the tracking status of variables: whether they are saved to disk or not, etc.
track.info(): return a data frame containing information about which tracking databases are currently active.
env.is.tracked(): tell whether an environment is currently tracked
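To give a feel for the kind of data frame a summary of object characteristics involves, here is a base-R sketch that collects name, class, extent and size for the objects in an environment. It is an illustration only: summary_sketch is a hypothetical helper, its column names are invented, and it records none of the creation, modification or access times that the track package maintains.

```r
summary_sketch <- function(env) {
  obj_names <- ls(env, all.names = TRUE)
  rows <- lapply(obj_names, function(n) {
    obj <- get(n, envir = env, inherits = FALSE)
    ext <- if (is.null(dim(obj))) length(obj) else dim(obj)
    data.frame(name = n,
               class = class(obj)[1],
               extent = paste0("[", paste(ext, collapse = "x"), "]"),
               bytes = as.numeric(object.size(obj)),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)       # NULL when the environment is empty
}

e <- new.env()
assign("x", 1:10, envir = e)
assign("m", matrix(0, nrow = 2, ncol = 3), envir = e)
s <- summary_sketch(e)
```

Unlike this sketch, which must read every object, the package caches these characteristics so that summaries do not require loading the objects from disk.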
The remaining functions allow the user to more closely manage variable tracking, but are less likely to be of use to new users.
Functions for getting status of tracking and summaries of variables:
tracked(): return the names of tracked variables
untracked(): return the names of untracked variables
untrackable(): return the names of variables that cannot be tracked
track.unsaved(): return the names of variables whose copy on file is out-of-date
track.orphaned(): return the names of once-tracked variables that have lost their active binding (should not happen)
track.masked(): return the names of once-tracked variables whose active binding has been overwritten by an ordinary variable (should not happen)
Functions for managing tracking and tracked variables:
track.options(): examine and set options to control tracking
track.remove(): completely remove all traces of a tracked variable
track.save(): write unsaved variables to disk
track.flush(): write unsaved variables to disk, and remove from memory
track.forget(): delete cached versions without saving to file (file version will be retrieved next time the variable is accessed)
track.rescan(): reload variable values from disk (can forget all cached vars, remove no-longer-existing tracked vars)
track.load(): load variables from a saved RData file into the tracking session
track.copy() and track.move(): copy or move variables from one tracking db to another
track.rename(): rename variables in a tracking db
Functions for recovering from errors:
track.rebuild(): rebuild tracking information from objects in memory or on disk
track.flush(): write unsaved variables to disk, and remove from memory
Design and internals of tracking:
track.design
----------Note----------
Some special kinds of objects don't work properly if referenced as active bindings and/or stored in a save file. One example is RODBC connections. To make it easy to work with such objects, two ways of excluding variables from automatic tracking are provided: the autoTrackExcludePattern option (a vector of regular expressions: variables whose names match one of these will not be tracked), and the autoTrackExcludeClass option (a vector of class names: variables whose class matches one of these will not be tracked). New values can be added to these options with track.options().
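The two exclusion rules can be pictured as a simple test on a variable's name and class. The following base-R sketch shows one plausible reading of that rule; is_excluded_sketch, the example patterns and the fake connection object are all hypothetical, and the package's actual matching logic may differ.

```r
is_excluded_sketch <- function(name, value,
                               patterns = character(), classes = character()) {
  # Excluded if the name matches any regular expression ...
  by_name <- any(vapply(patterns, function(p) grepl(p, name), logical(1)))
  # ... or if the object inherits from any listed class.
  by_class <- length(classes) > 0 && inherits(value, classes)
  by_name || by_class
}

patterns <- c("^\\.", "^tmp_")                  # hypothetical name patterns
classes <- c("RODBC")                           # class to exclude
conn <- structure(list(), class = "RODBC")      # fake object; only its class matters

excl_conn <- is_excluded_sketch("conn", conn, patterns, classes)
excl_tmp <- is_excluded_sketch("tmp_x", 1, patterns, classes)
excl_x <- is_excluded_sketch("x", 1, patterns, classes)
```

In this sketch conn is excluded by class, tmp_x by name, and x is not excluded.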
----------Author(s)----------
Tony Plate <tplate@acm.org>
----------References----------
News, 6(4):19-24, October 2006. http://cran.r-project.org/doc/Rnews and http://sandybox.typepad.com/software
2002. http://cran.r-project.org/doc/Rnews
----------See Also----------
Design of the track package.
Potential future features of the track package.
Documentation for save and load (in 'base' package).
Documentation for makeActiveBinding and related functions (in 'base' package).
Inspiration from the packages g.data and filehash.
Description of the facility (addTaskCallback) for adding a callback function that is called at the end of each top-level task (each time R returns to the prompt after completing a command): http://developer.r-project.org/TaskHandlers.pdf.
----------Examples----------
## Not run:
library(track)
# start tracking the global environment using directory 'rdatadir'
# inside dontrun to avoid creating/removing rdatadir
track.start()
a <- 1
b <- 2
ls()
track.status()
track.summary()
track.info()
track.stop()
# variables are now gone
ls()
# bring them back
track.start()
ls()
track.stop()
## End(Not run)