track.design(trackObjs)
Design of a tracking environment
Description
This document describes the layout of a tracking environment. Object tracking works by replacing a variable with an active binding, and keeping the actual value of the variable on disk and/or in another environment. Tracked objects are automatically resaved to disk when they are changed. Basic characteristics, such as class, size, extent, and creation and modification times are recorded in a summary of all tracked objects.
Details
Object tracking works by replacing a variable with an active binding, and keeping the actual value of the variable on disk and/or in another environment. Whenever the variable is fetched or assigned, the active binding is called, and it writes the object to disk if necessary, and records basic characteristics of the objects in a summary of all objects, including creation, modification and access times.
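The mechanism described above can be sketched in base R. This is a minimal illustration, not the trackObjs implementation: the helper trackSketch() and the variables trackingEnv, trackingDir and visibleEnv are hypothetical names, and the sketch omits the summary bookkeeping (creation, modification and access times).

```r
# Hypothetical stand-ins for the tracking environment and tracking directory.
trackingEnv <- new.env()
trackingDir <- tempfile("trackdir")
dir.create(trackingDir)

# The visible environment that will hold the active binding.
visibleEnv <- new.env()

# Install an active binding called `name` in `visible`. Fetches return the
# value cached in `store`; assignments update the cache and resave the file.
trackSketch <- function(name, value, visible, store, dir) {
  file <- file.path(dir, paste0(name, ".rda"))
  writeOut <- function(v) {
    assign(name, v, envir = store)
    save(list = name, envir = store, file = file)
  }
  writeOut(value)
  makeActiveBinding(name, function(v) {
    if (!missing(v)) writeOut(v)            # called with a value on assignment
    get(name, envir = store, inherits = FALSE)
  }, visible)
  invisible(file)
}

xFile <- trackSketch("x", 1:3, visibleEnv, trackingEnv, trackingDir)
visibleEnv$x <- c(10, 20)                   # goes through the binding and resaves

# Load the file into a fresh environment to confirm the saved copy changed.
check <- new.env()
load(xFile, envir = check)
savedX <- get("x", envir = check)
```

Keeping the value in a separate environment means the visible environment holds only the binding function, which is what allows the cached copy to be dropped and reread from file later.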
A tracking environment can be linked to one environment on the search path, but the tracking environment is not on the search path itself. An environment can only have one tracking environment linked to it. Variables cannot be tracked automatically: they must be registered with the tracking environment using the function track().
Any user-created environment on the search path, or the global environment, can be tracked.
The format used to store R objects in files is the one used by save()/load() – the objects in those files can be read using load() if desired.
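As a small illustration of that file format, the following base-R sketch saves one object with save() and reads it back with load(). The file name x.rda and the environments src and restored are placeholders chosen for the example, not paths that trackObjs creates here.

```r
# An object held in its own environment, standing in for a tracked object.
src <- new.env()
assign("x", data.frame(a = 1:3), envir = src)

# Write it in the save()/load() format (placeholder file name).
rdaFile <- file.path(tempdir(), "x.rda")
save(list = "x", envir = src, file = rdaFile)

# Read it back into a fresh environment; load() returns the object names.
restored <- new.env()
loadedNames <- load(rdaFile, envir = restored)
```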
The various variables and files involved in tracking are as follows (assuming the RData suffix being used is "rda"). Note that the default tracked visible environment is the global environment.
<pre>
Tracked Visible Environment
(on search list)
attr(., "trackingEnv") -> Tracking Environment       +-> Tracking Directory (files)
+-----------------+       (not on search list)      /   |
|                 |       attr(., "trackingDir")        |
|                 |       +-------------------------+   +
|                 |       | .trackingFileMap        |   +- filemap.txt
|                 |       | .trackingSummary        |   +- .trackingSummary.rda
|                 |       | .trackingUnsaved        |   |
|                 |       | .trackingSummaryChanged |   |
|                 |       | .trackingOptions        |   |
| x (*)           |       | x (@)                   |   +- x.rda
| abc (*)         |       | abc (@)                 |   +- abc.rda
| Y (*)           |       | Y (@)                   |   +- _1.rda
| x1              |       +-------------------------+
| x2              |
+-----------------+
</pre>
Variables marked (*) are tracked; each is actually an active binding that refers to the corresponding variable in the tracking environment.
Variables marked (@) may or may not exist: if they do not exist in the tracking environment, they are read automatically from file when the corresponding tracked object is accessed.
The "trackingEnv" attribute on the tracked environment is the tracking environment. This is implemented as an attribute on the tracked environment rather than as a variable in the tracked environment so that save.image() on the tracked environment will ignore the tracking environment. If the tracking environment were stored as a variable in the tracked environment, save.image() could end up storing two copies of every tracked variable: one when it accessed the active binding (it stores a copy of the object: save() doesn't know it's an active binding); and another if the object is cached in the tracking environment.
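The effect of using an attribute rather than a variable can be seen with plain environments. In this sketch, visible and tracking are ordinary environments created for the example; only the attribute name "trackingEnv" comes from the text above.

```r
visible <- new.env()     # stands in for the tracked (visible) environment
tracking <- new.env()    # stands in for the tracking environment
assign("x", 1, envir = visible)

# Link the two through an attribute on the visible environment.
attr(visible, "trackingEnv") <- tracking

# The attribute is not a variable, so listing the environment shows only x.
visibleNames <- ls(visible, all.names = TRUE)
linked <- attr(visible, "trackingEnv")
```

Anything that works from the list of variables in the environment therefore never sees the tracking environment.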
The "trackingDir" attribute on the tracking environment specifies the absolute pathname of the directory under which tracked objects are stored on file. It uses the absolute pathname because the current directory of the R session can be changed using setwd(), which would result in losing a relative pathname.
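A short base-R sketch of why a relative pathname would be lost: the directory names below (base, trackdir1, elsewhere) are made up for the illustration, and normalizePath() is used here as one way to obtain an absolute pathname, not necessarily what trackObjs does internally.

```r
oldWd <- getwd()

# A scratch area containing a directory that plays the tracking directory.
base <- tempfile("absdemo")
dir.create(base)
base <- normalizePath(base, winslash = "/")
dir.create(file.path(base, "trackdir1"))
dir.create(file.path(base, "elsewhere"))

setwd(base)
relDir <- "trackdir1"                              # relative to the current directory
absDir <- normalizePath(relDir, winslash = "/")    # absolute pathname

# After the working directory changes, only the absolute pathname still resolves.
setwd(file.path(base, "elsewhere"))
relStillFound <- dir.exists(relDir)
absStillFound <- dir.exists(absDir)

setwd(oldWd)
```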
.trackingFileMap stores the base part of the file name corresponding to each tracked object as a named character vector (the names on the vector are the object names). Objects that do not have simple names have an associated file name like "_NNN" where "NNN" is a number. For example, the .trackingFileMap for the above configuration could be c(abc="abc", x="x", Y="_1"). Simple object names are those conforming to the following rules:
are less than 55 characters long
consist only of lower-case letters, digits 0 through 9, "." and "_"
begin with a lower-case letter
are not one of the following: con, prn, aux, nul, com1 through com9, lpt1 through lpt9, and do not begin with one of these names followed by a period (i.e., prn.foo and prn.foo.bar are both not simple names) (these are special file names under Microsoft Windows - see http://en.wikipedia.org/wiki/Filename and http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/fs/naming_a_file.asp)
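The four rules above can be written as a single predicate. This is a sketch of the rules as stated here, not the function trackObjs uses; the name isSimpleName() is hypothetical.

```r
# TRUE for names that satisfy all four "simple name" rules listed above.
isSimpleName <- function(name) {
  reserved <- c("con", "prn", "aux", "nul",
                paste0("com", 1:9), paste0("lpt", 1:9))
  # Part of the name before the first period (the whole name if there is none).
  firstPart <- sub("\\..*$", "", name)
  nchar(name) < 55 &
    grepl("^[a-z][a-z0-9._]*$", name, perl = TRUE) &   # allowed characters, lower-case start
    !(firstPart %in% reserved)                         # not a reserved Windows device name
}

simple <- isSimpleName(c("x", "abc", "Y", "prn.foo", "prn.foo.bar",
                         "console", "a_b.1", "1x"))
```

With perl = TRUE the character class [a-z] is matched as ASCII, which avoids locale-dependent ranges accepting upper-case letters.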
The object .trackingFileMap is always kept in memory and is always saved to disk (as text in the file filemap.txt) whenever it is changed.
.trackingSummary is a data frame recording various basic characteristics of the tracked objects, such as class, size and extent, and also times of creation, and most recent modification and access. The tracking summary should be accessed using the function track.summary().
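To give a feel for what such a summary holds, here is a rough base-R sketch that records class, size and extent for every object in an environment. The function summarizeObjects() and its column names are assumptions made for this example; the real columns are whatever track.summary() returns, and the time stamps are left out.

```r
# Build a small data frame of basic characteristics, one row per object.
summarizeObjects <- function(envir) {
  objNames <- ls(envir, all.names = TRUE)
  describe <- function(nm) {
    obj <- get(nm, envir = envir)
    ext <- if (is.null(dim(obj))) length(obj) else dim(obj)
    data.frame(class = class(obj)[1],
               size = as.numeric(utils::object.size(obj)),   # bytes
               extent = paste0("[", paste(ext, collapse = "x"), "]"),
               row.names = nm, stringsAsFactors = FALSE)
  }
  do.call(rbind, lapply(objNames, describe))
}

demo <- new.env()
assign("x", 1:10, envir = demo)
assign("m", matrix(0, 2, 3), envir = demo)
objSummary <- summarizeObjects(demo)
```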
.trackingSummaryChanged: A logical flag indicating whether or not the copy of the tracking summary on disk is in sync with the version in memory. To reduce overhead on accessing objects, there is an option to not resave the tracking summary when it is changed on accessing an object; this variable indicates whether it has been changed.
.trackingUnsaved: If the tracking options are set up so that objects are not automatically written to files on assignment, this variable contains a vector of names of all objects that have not been saved.
.trackingOptions: The tracking options are accessed and changed by the track.options() function. They are kept in memory, and also written to disk whenever they are changed.
The tracking directory is organized as an R package. Its layout is as follows (supposing, for example, that attr(trackingEnv, "trackingDir") is /tmp/trackdir1, and .trackingFileMap is c(abc="abc", x="x", Y="_1")):
<pre>
/tmp/trackdir1
 |
 +- filemap.txt
 +- .trackingSummary.rda
 +- x.rda
 +- abc.rda
 +- _1.rda
</pre>
Terminology
One could describe a tracking environment as "attached" to the tracked environment, but using that term would risk confusion with the role of the attach() function and the search path in R. So the trackObjs package instead says that a tracking environment is "linked" to the tracked environment.
track: The trackObjs package tracks variables by setting up a one-to-one relationship between R objects and files on disk, so that when an object in R is modified, the file on disk is automatically updated.
tracked environment: A tracked environment contains user variables and is usually on the search path.
tracked object: A tracked object is a variable in a tracked environment that has an active binding, so that when it is modified, the corresponding file on disk is also modified.
untracked object: An untracked object in a tracked environment is an ordinary object that is not tracked and has no corresponding file.
tracking environment: A tracking environment is a special environment used by the trackObjs package to track objects in the tracked environment.
linked: A tracking environment is linked to a tracked environment by the trackingEnv attribute on the tracked environment, which points to the tracking environment.
start tracking, stop tracking: Tracking is started by creating a tracking environment, linking it to the tracked environment, and setting up bindings for tracked objects.
tracking database: A tracking database is the collection of files and directories that stores the tracking information.
active tracking database: A tracking database that is currently linked to an environment in a running R session.
Untrackable variables
Only ordinary variables can be tracked – variables that are active bindings cannot be tracked.
Several variable names are reserved and cannot be tracked: .trackingEnv, .trackingFileMap, .trackingUnsaved, .trackingSummary, .trackingSummaryChanged, .trackingOptions. Additionally, any variable with a newline character ("\n") as part of its name cannot be tracked (the main reason for this is that the mapping from object names to file names is stored in a text file, and newline character delimits the name).
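The restrictions in this section can be collected into one check. The function isTrackable() below is a hypothetical helper written for illustration; it uses only base R (bindingIsActive() and ls()) and is not part of trackObjs.

```r
# Names reserved by the tracking machinery, as listed above.
reservedNames <- c(".trackingEnv", ".trackingFileMap", ".trackingUnsaved",
                   ".trackingSummary", ".trackingSummaryChanged",
                   ".trackingOptions")

# TRUE if `name` is an ordinary variable in `envir` that could be tracked.
isTrackable <- function(name, envir) {
  if (name %in% reservedNames) return(FALSE)            # reserved name
  if (grepl("\n", name, fixed = TRUE)) return(FALSE)    # newline in the name
  if (!(name %in% ls(envir, all.names = TRUE))) return(FALSE)
  !bindingIsActive(name, envir)                         # active bindings are excluded
}

e <- new.env()
assign("a", 1, envir = e)
assign("new\nline", 2, envir = e)
assign(".trackingOptions", list(), envir = e)
makeActiveBinding("b", function() 42, e)

trackable <- vapply(c("a", "b", "new\nline", ".trackingOptions", "absent"),
                    isTrackable, logical(1), envir = e, USE.NAMES = FALSE)
```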
The file map
The mapping from object names to file names is stored in the file filemap.txt. This data is stored as an ordinary text file to make it easy for users to see the object-file mappings outside of R.
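As a sketch of how a named character vector can be kept in a plain text file, the following writes one mapping per line and reads it back. The "filebase:objectname" line format, and the helper names writeFileMap() and readFileMap(), are assumptions made for this example; the text above does not specify the actual layout of the file.

```r
# Write one "filebase:objectname" line per tracked object (assumed format).
writeFileMap <- function(fileMap, path) {
  writeLines(paste(fileMap, names(fileMap), sep = ":"), path)
}

# Read the lines back into a named character vector.
readFileMap <- function(path) {
  lines <- readLines(path)
  bases <- sub(":.*$", "", lines)                  # text before the first colon
  objNames <- substring(lines, nchar(bases) + 2)   # everything after it
  stats::setNames(bases, objNames)
}

fileMap <- c(abc = "abc", x = "x", Y = "_1", "a:b" = "_2")
mapPath <- file.path(tempdir(), "filemap.txt")
writeFileMap(fileMap, mapPath)
roundTrip <- readFileMap(mapPath)
```

Putting the file base first keeps the split unambiguous, because a file base never contains a colon while an object name may. One line per object is also why a name containing a newline cannot be stored.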
Implementation considerations
The reason that objects must be explicitly registered for tracking is that there is currently no way of setting up a function to be called when a new object is created, so new objects are always created as ordinary R objects. Similarly, the R remove() function does not have any hooks, so if remove() is called on a tracked variable, it will just remove the active binding in the visible environment, but will not disturb the underlying tracking environment. The track.remove() function will completely remove a tracked variable from the visible environment and the underlying tracking environment (including deleting any associated disk file).
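The behaviour of remove() on an active binding can be reproduced with two plain environments. Here store and visible are stand-ins created for the example; nothing from trackObjs is called.

```r
store <- new.env()      # plays the tracking environment
visible <- new.env()    # plays the tracked (visible) environment
assign("x", 1:5, envir = store)

# Active binding in the visible environment that delegates to the store.
makeActiveBinding("x", function(v) {
  if (!missing(v)) assign("x", v, envir = store)
  get("x", envir = store)
}, visible)

# Removing the variable deletes only the binding, not the stored value.
rm("x", envir = visible)
visibleAfter <- ls(visible, all.names = TRUE)
storeAfter <- ls(store, all.names = TRUE)
```

After the call, the visible environment is empty while the store still holds x, which is the orphaned state that a function such as track.remove() is there to avoid.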
Object tracking was intended to be used in situations where large numbers of large objects must be manipulated. Consequently, there is a good chance of exhausting resources while using the trackObjs package. The trackObjs code tries to check return codes when creating objects or writing files, and in cases where it is unable to complete an operation it tries to leave the tracking environment in a state from which objects can be salvaged. The functions track.rebuild() and track.flush() are provided to help recover from situations where resource limitations prevented successful operation. Note that files are generally written in an "unsafe" manner (i.e., existing files can be overwritten with partial new files), but in these cases the data is retained in memory and can be rewritten after resolving file system problems.
The R function exists() should be used with care on tracked objects, because it will actually fetch the object, possibly needing to read it from disk. In the trackObjs code, exists("x") is not used to check for the existence of a possibly tracked object x; instead, an idiom like is.element("x", objects(all=TRUE)) is used.
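The name-listing idiom can be checked against an active binding that counts how often it is evaluated. The environment e and the counter are constructed for this illustration only; the sketch shows that listing names does not evaluate the binding, and makes no claim about how exists() behaves in any particular R version.

```r
e <- new.env()
counter <- new.env()
counter$n <- 0

# An active binding that records every evaluation.
makeActiveBinding("x", function() {
  counter$n <- counter$n + 1
  "value"
}, e)

# Existence checks based on the list of names do not evaluate the binding.
found <- is.element("x", objects(e, all.names = TRUE))
notFound <- is.element("y", objects(e, all.names = TRUE))
countAfterCheck <- counter$n

# Actually fetching the variable does evaluate it.
value <- e$x
countAfterFetch <- counter$n
```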
These statements about the available facilities in R were true as of R-2.4.1 (released Dec 2006).
The rules for how variable names are mapped to file names are based on trying to use file names that will work properly on all three operating systems R works on (Linux, Windows, and Mac OS X). A somewhat obscure point that must be taken into account is the case-insensitivity of Mac OS X and Windows. Modern versions of these operating systems appear to respect case in file names, but only because they are case preserving; they are in fact still case insensitive. This means that a file created with the name "X.rda" is the same file as "x.rda". Here is a short shell transcript showing this behavior in a bash shell running under Windows and Mac OS X (it's the same in both).
<pre>
$ echo 123 > X
$ cat x
123
$ echo 456 > x
$ cat x
456
$ cat X
456
</pre>
Thus, in order to work on all of these operating systems, file mapping must be used to create different file names for the R objects "x" and "X" (which are in fact different in R).
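The consequence for the file map can be sketched as follows. The function assignFileBases() is hypothetical and deliberately partial: it applies only the lower-case character rule (not the length or reserved-name rules) and numbers the remaining names in order, which is an assumption about numbering made for this example.

```r
# Give each object a file base: its own name when that is a lower-case
# simple name, otherwise a generated "_N" name.
assignFileBases <- function(objNames) {
  simple <- grepl("^[a-z][a-z0-9._]*$", objNames, perl = TRUE)
  bases <- objNames
  bases[!simple] <- paste0("_", seq_len(sum(!simple)))
  stats::setNames(bases, objNames)
}

# "x" and "X" differ in R but would name the same file on a
# case-insensitive file system.
objNames <- c("x", "X")
clashIfUsedDirectly <- anyDuplicated(tolower(objNames)) > 0

bases <- assignFileBases(objNames)
clashAfterMapping <- anyDuplicated(tolower(bases)) > 0
```

Because every name kept as-is is already entirely lower case and the generated names start with "_", no two file bases can differ only in case.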
Portability
Tracking directories are intended to be operating-system independent and completely portable across different operating systems.
Author(s)
Tony Plate <tplate@acm.org>
References
Roger D. Peng. Interacting with data using the filehash package. R News, 6(4):19-24, October 2006. http://cran.r-project.org/doc/Rnews and http://sandybox.typepad.com/software
David E. Brahm. Delayed data packages. R News, 2(3):11-12, December 2002. http://cran.r-project.org/doc/Rnews
See Also
Overview of the trackObjs package.
Documentation for makeActiveBinding and related functions (in 'base' package).
Inspiration from the packages g.data and filehash.