
R language, track package: help documentation for the track.design() function

Posted on 2012-10-1 11:19:09
track.design(track)
R package containing track.design(): track

Design of a tracking environment


Description

This document describes the layout of a tracking environment.  Object tracking works by replacing a variable with an active binding, and keeping the actual value of the variable on disk and/or in another environment.  Tracked objects are automatically resaved to disk when they are changed.  Basic characteristics, such as class, size, extent, and creation and modification times are recorded in a summary of all tracked objects.


Details

Object tracking works by replacing a variable with an active binding, and keeping the actual value of the variable on disk and/or in another environment.  Whenever the variable is fetched or assigned, the active binding is called, and it writes the object to disk if necessary, and records basic characteristics of the objects in a summary of all objects, including creation, modification and access times.
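The mechanism described above can be illustrated with base R alone. The following is a minimal sketch, not the track package's implementation: the backing file `store` and the environment `env` are hypothetical names chosen for the example, and only a single variable is handled.

```r
# Minimal sketch of a disk-backed variable, using only base R.
store <- tempfile(fileext = ".rda")   # hypothetical backing file for "x"
env <- new.env()                      # hypothetical stand-in for the visible environment

# The binding function is called with no argument when the variable is
# fetched and with one argument when it is assigned.
makeActiveBinding("x", function(value) {
  if (missing(value)) {
    loaded <- new.env()
    load(store, envir = loaded)       # read the value back from disk
    get("x", envir = loaded)
  } else {
    x <- value
    save(x, file = store)             # resave on every assignment
    invisible(value)
  }
}, env)

env$x <- 1:10      # runs the assignment branch, which writes the file
total <- sum(env$x)  # runs the fetch branch, which reads the file
```

This sketch rereads the file on every fetch; the text above notes that the real package can also keep the value in another environment.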

A tracking environment can be linked to one environment on the search path, but the tracking environment is not on the search path itself.  An environment on the search path can only have one tracking environment linked to it.  In standard use, variables are tracked automatically by a task callback function.  Alternatively, variables to track can be registered with the tracking environment using the function track().
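To illustrate the idea of automatically converting ordinary variables into tracked ones, here is a hedged sketch in base R. The names `tracking_env`, `visible_env`, `link_variable` and `track_untracked` are hypothetical and are not part of the track package; the sketch keeps values in a second environment rather than on disk and ignores reserved names and exclusion patterns.

```r
# Sketch only: convert ordinary variables into active bindings (base R).
tracking_env <- new.env()   # hypothetical stand-in for the tracking environment
visible_env  <- new.env()   # hypothetical stand-in for the tracked environment

# Create one active binding in `visible` that reads and writes `tracking`.
link_variable <- function(name, visible, tracking) {
  makeActiveBinding(name, function(value) {
    if (missing(value)) get(name, envir = tracking, inherits = FALSE)
    else assign(name, value, envir = tracking)
  }, visible)
}

# Move every ordinary variable into `tracking` and leave a binding behind.
track_untracked <- function(visible, tracking) {
  for (name in ls(visible, all.names = TRUE)) {
    if (bindingIsActive(name, visible)) next    # already tracked
    assign(name, get(name, envir = visible), envir = tracking)
    rm(list = name, envir = visible)
    link_variable(name, visible, tracking)
  }
  TRUE   # a task callback returns TRUE to stay registered
}

visible_env$a <- 1:3
track_untracked(visible_env, tracking_env)

# In an interactive session a function like this could be registered to run
# after each top-level task (left commented out here on purpose):
# addTaskCallback(function(...) track_untracked(globalenv(), tracking_env))
```

Calling the conversion function by hand, as above, makes the sketch easy to test; registering it as a task callback is what would make the conversion automatic.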

Any user-created environment on the search path, or the global environment, can be tracked.

The format used to store R objects in files is the one used by save()/load() – the objects in those files can be read using load() if desired.

The various variables and files involved in tracking are as follows (assuming the RData suffix being used is "rda"). Note that the default tracked visible environment is the global environment.

Variables marked (*) are tracked and are actually active bindings that refer to the corresponding variable in the tracking environment.  There can also be untracked variables in the visible tracked environment, but in the standard mode of operation these are detected by the end-of-task callback function and are immediately converted to tracked variables (except for variables with reserved names like .trackingSummary, and variables matching exclude patterns; see options autoTrackExcludePattern and autoTrackExcludeClass in track.options).

Variables marked (@) may or may not exist.  If they do not exist in the tracking environment, they will be automatically read from file when the corresponding tracked object is accessed.

The "trackingEnv" attribute on the tracked environment is the tracking environment.  This is implemented as an attribute on the tracked environment rather than as a variable in the tracked environment so that save.image() on the tracked environment will ignore the tracking environment.  If the tracking environment were stored as a variable in the tracked environment, save.image() could end up storing two copies of every tracked variable: one when it accessed the active binding (it stores a copy of the object: save() doesn't know it's an active binding); and another if the object is cached in the tracking environment.

The "trackingDir" attribute on the tracking environment specifies the absolute pathname of the directory under which tracked objects are stored on file.  It uses the absolute pathname because the current directory of the R session can be changed using setwd(), which would result in losing a relative pathname.
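The two attributes described above can be mimicked in base R. This is a sketch under stated assumptions: `tracked` and `tracking` are hypothetical environments created for the example, and a temporary directory stands in for the tracking directory.

```r
# Sketch: link two environments through attributes rather than variables.
tracked  <- new.env()   # hypothetical tracked (visible) environment
tracking <- new.env()   # hypothetical tracking environment

# The link is an attribute, so it is not a variable in `tracked`.
attr(tracked, "trackingEnv") <- tracking

# Store the directory as an absolute path so a later setwd() cannot break it.
attr(tracking, "trackingDir") <- normalizePath(tempdir(), winslash = "/",
                                               mustWork = TRUE)

assign("x", 1, envir = tracked)
visible_names <- ls(tracked, all.names = TRUE)   # the attribute is not listed
```

Because the link is not a variable, anything that enumerates the variables of the tracked environment does not see the tracking environment.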

.trackingFileMap stores the base part of the file name corresponding to each tracked object as a named character vector (the names on the vector are the object names).  Objects that do not have simple names have an associated file name like "_NNN" where "NNN" is a number.  For example, the .trackingFileMap for the above configuration could be c(abc="abc", x="x", Y="_1"). Simple object names are those that conform to all of the following rules:

are less than 55 characters long

consist only of lower-case letters, digits 0 through 9, "." and "_"

begin with a lower-case letter

are not one of the following: con, prn, aux, nul, com1 through com9, lpt1 through lpt9, and do not begin with one of these names followed by a period (i.e., prn.foo and prn.foo.bar are both not simple names).  These are special file names under Microsoft Windows; see http://en.wikipedia.org/wiki/Filename and http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/fs/naming_a_file.asp
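The rules above can be written as a small predicate. This is a sketch of the rules as stated in the text, not the package's own code; the function name `is_simple_name` is hypothetical.

```r
# Sketch: test whether object names are "simple" under the rules listed above.
is_simple_name <- function(name) {
  reserved <- c("con", "prn", "aux", "nul",
                paste0("com", 1:9), paste0("lpt", 1:9))
  # Part of the name before the first period ("prn.foo.bar" -> "prn").
  stem <- sub("\\..*$", "", name)
  nchar(name) < 55 &                          # less than 55 characters
    grepl("^[a-z][a-z0-9._]*$", name) &       # allowed characters, starts with a-z
    !(stem %in% reserved)                     # not a Windows device name
}

simple <- is_simple_name(c("abc", "x", "Y", "prn.foo", "com1", "x_1", "1x"))
```

Under these rules "Y" is not simple because it is upper case, which is consistent with the example file map above giving it the file name "_1".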

The object .trackingFileMap is always kept in memory and is always saved to disk (as text in the file filemap.txt) whenever it is changed.

.trackingSummary is a data frame recording various basic characteristics of the tracked objects, such as class, size and extent, and also times of creation, and most recent modification and access.  The tracking summary should be accessed using the function track.summary().

.trackingSummaryChanged: A logical flag indicating whether or not the copy of the tracking summary on disk is in sync with the version in memory.  To reduce overhead on accessing objects, there is an option to not resave the tracking summary when it is changed by accessing an object; this variable indicates whether it has been changed.

.trackingUnsaved: If the tracking options are set up so that objects are not automatically written to files on assignment, this variable contains a vector of names of all objects that have not been saved.

.trackingOptions: The tracking options are accessed and changed by the track.options() function.  They are kept in memory, and also written to disk whenever they are changed.

The tracking directory is organized as an R package.  Its layout is as follows (supposing, for example, that attr(trackingEnv, "trackingDir") is /tmp/trackdir1, and .trackingFileMap is c(abc="abc", x="x", Y="_1")):


Terminology

One could describe a tracking environment as "attached" to the tracked environment, but using that term would risk confusion with the role of the attach() function and the search path in R.  So, instead, the track package says that a tracking environment is "linked" to the tracked environment.




track: The track package tracks variables, by setting up a one-to-one relationship between R objects and files on disks so that when an object in R is modified, the file on disk

tracked environment: A tracked environment contains

tracked object: A tracked object (in a tracked environment) that has an active binding so that when it is modified, the corresponding file on disk is

untracked object: An untracked object in a tracked environment is an ordinary object that is not tracked and has no

tracking environment: A tracking environment is a special environment used by the track package to track

linked: A tracking environment is linked to a tracked environment (by the trackingEnv attribute on the tracked

start tracking, stop tracking: Tracking is started by creating a tracking environment, linking it to the tracked

tracking database: A tracking database is the

active tracking database: A tracking database that is


Untrackable variables – reserved names

Only ordinary variables can be tracked – variables that are active bindings cannot be tracked.

Several variable names are reserved and cannot be tracked: .trackingEnv, .trackingFileMap, .trackingUnsaved, .trackingSummary, .trackingSummaryChanged, .trackingOptions.  Additionally, any variable with a newline character ("\n") as part of its name cannot be tracked (the main reason for this is that the mapping from object names to file names is stored in a text file, and the newline character delimits the entries).


The file map

The mapping from object names to file names is stored in the file fileMap.txt.  This data is stored as an ordinary text file to make it easy for users to see the object-file mappings outside of R.
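As an illustration of why a plain text file is convenient (and why newlines in object names are a problem), here is a sketch that writes a file map to text and reads it back. The one-entry-per-line "file:object" layout is an assumption made for this example; the actual format used by the package is not specified in this document.

```r
# Sketch: round-trip a file map through a text file (layout is assumed).
file_map <- c(abc = "abc", x = "x", Y = "_1")   # object name -> base file name
path <- file.path(tempdir(), "fileMap.txt")

# One entry per line: file name first, then the object name.
writeLines(paste(file_map, names(file_map), sep = ":"), path)

lines <- readLines(path)
files   <- sub(":.*$", "", lines)       # text before the first colon
objects <- sub("^[^:]*:", "", lines)    # everything after the first colon
restored <- structure(files, names = objects)
```

Putting the file name first means the split only depends on the file name containing no colon; an object name containing a newline would still break the one-entry-per-line layout.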


Implementation considerations

The reason that objects must be explicitly registered for tracking is that there is currently no way of setting up a function to be called when a new object is created, so new objects are always created as ordinary R objects.  Similarly, the R remove() function does not have any hooks, so if remove() is called on a tracked variable, it will just remove the active binding in the visible environment, but will not disturb the underlying tracking environment.  The track.remove() function will completely remove a tracked variable from the visible environment and the underlying tracking environment (including deleting any associated disk file).
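The behaviour of remove() on an active binding can be seen with base R alone. In this sketch `visible` and `tracking` are hypothetical environments standing in for the tracked and tracking environments; no track functions are used.

```r
# Sketch: rm() deletes only the active binding, not the stored value.
tracking <- new.env()   # hypothetical tracking environment
visible  <- new.env()   # hypothetical visible (tracked) environment

assign("x", 42, envir = tracking)
makeActiveBinding("x", function(value) {
  if (missing(value)) get("x", envir = tracking)
  else assign("x", value, envir = tracking)
}, visible)

rm(list = "x", envir = visible)

in_visible  <- is.element("x", ls(visible, all.names = TRUE))
in_tracking <- is.element("x", ls(tracking, all.names = TRUE))
```

After the call to rm(), the variable is gone from the visible environment but its value is still held in the other environment, which is the leftover state that a function such as track.remove() is described as cleaning up.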

Object tracking was intended to be used in situations where large numbers of large objects must be manipulated.  Consequently, there is a good chance of exhausting resources while using the track package.  The track code tries to check return codes when creating objects or writing files, and in cases where it is unable to complete an operation it tries to leave the tracking environment in a state from which objects can be salvaged.  The functions track.rebuild() and track.flush() are provided to help recover from situations where resource limitations prevented successful operation.  Note that files are generally written in an "unsafe" manner (i.e., existing files can be overwritten with partial new files), but in these cases data is retained in memory and can be rewritten after resolving file system problems.

The R function exists() should be used with care on tracked objects, because it will actually fetch the object, possibly needing to read it from disk.  In the track code, exists("x") is not used to check for the existence of a possibly tracked object x; instead, an idiom like is.element("x", objects(all=TRUE)) is used.
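The name-listing idiom can be checked with a binding that counts how often it is fetched. This is a sketch in base R; the `counter` and `env` objects are hypothetical, and the sketch deliberately makes no claim about what exists() does in any particular R version.

```r
# Sketch: check for a variable by name without fetching its value.
counter <- new.env()
counter$fetches <- 0
env <- new.env()   # hypothetical stand-in for a tracked environment

makeActiveBinding("x", function(value) {
  counter$fetches <- counter$fetches + 1   # count every fetch or assignment
  "value of x"
}, env)

# Listing names does not call the binding function.
present <- is.element("x", objects(env, all.names = TRUE))
fetches_after_check <- counter$fetches

value <- env$x                             # an actual fetch does call it
fetches_after_fetch <- counter$fetches
```

For a disk-backed variable, each fetch counted here would correspond to a potential read from file, which is the cost the idiom avoids.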

These statements about the available facilities in R were true as of R-2.4.1 (released Dec 2006).

The rules for how variable names are mapped to file names are based on trying to use file names that will work properly on all three operating systems R works on (Linux, Windows, and Mac OS X).  A somewhat obscure point that must be taken into account is the case-insensitivity of Mac OS X and Windows.  Even though modern versions of these operating systems appear to respect case in their file names, this is because they are case preserving; they are in fact still case insensitive.  This means that a file created with the name "X.rda" is the same file as "x.rda". Here is a short shell transcript showing this behavior in a bash shell running under Windows and Mac OS X (it's the same in both).
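The same point can be probed from within R. The sketch below is not the shell transcript referred to above; it is an illustrative check with hypothetical helper names (`is_case_insensitive`, `case_collisions`), and its first result depends on the file system it runs on.

```r
# Sketch: detect a case-insensitive file system and colliding object names.
is_case_insensitive <- function(dir = tempdir()) {
  probe <- tempfile("CaseProbe", tmpdir = dir)   # mixed-case file name
  file.create(probe)
  on.exit(unlink(probe))
  # If the lower-cased name also "exists", the file system ignores case.
  file.exists(file.path(dir, tolower(basename(probe))))
}

# Object names that would map to the same file if used directly as file names.
case_collisions <- function(names) {
  folded <- tolower(names)
  names[duplicated(folded) | duplicated(folded, fromLast = TRUE)]
}

insensitive <- is_case_insensitive()
clashes <- case_collisions(c("x", "X", "abc"))
```

Restricting simple names to lower case, as the rules earlier in this document do, avoids such collisions without needing to know which kind of file system is in use.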


Portability

Tracking directories are intended to be operating-system independent and completely portable across different operating systems.


Compression

Saved R objects are compressed by default in R and by the track package.  Decompression speed is very important for interactive response when using track, because each time an object is accessed, it is read from its file (unless the object is cached).  Of the compression algorithms available as of R-2.12.0, which are gzip, bzip2, and xz, gzip is the winner in terms of speed.  The default compression level in R for gzip is 6, but level 1 gives faster compression with slightly larger files (though decompression is not faster).  The lzop compression algorithm (http://www.lzop.org) is faster still, but it is not yet available in R.
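The compression settings mentioned above map onto arguments of save(). The following sketch writes the same object at gzip levels 6 and 1 so that sizes and load times can be compared on a given machine; the test data and file names are made up for the example, and no particular timing or size result is claimed.

```r
# Sketch: save the same object at two gzip compression levels.
x <- rep(seq_len(1000), times = 100)    # made-up test data

f6 <- tempfile(fileext = ".rda")
f1 <- tempfile(fileext = ".rda")
save(x, file = f6, compress = "gzip", compression_level = 6)
save(x, file = f1, compress = "gzip", compression_level = 1)

sizes <- file.size(c(f6, f1))           # compare sizes on your own data

# Time repeated loads; results vary by machine and by object.
timing <- system.time(for (i in 1:20) load(f1, envir = new.env()))

reloaded <- new.env()
load(f1, envir = reloaded)
```

Measuring with objects typical of the intended workload is more informative than this toy vector, since compressibility varies a great deal between data sets.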

Here are some comparisons and benchmarks of various compression programs:

http://www.linuxjournal.com/node/8051/print


http://tukaani.org/lzma/benchmarks.html


http://stephane.lesimple.fr/wiki/blog/lzop_vs_compress_vs_gzip_vs_bzip2_vs_lzma_vs_lzma2-xz_benchmark_reloaded


http://aliver.wordpress.com/2010/06/22/huge-unix-file-compresser-shootout-with-tons-of-datagraphs


http://www.maximumcompression.com/index.html


http://mattmahoney.net/dc/text.html


Compression/decompression is nicely handled in R: only the call to save() has arguments for compression.  Decompression in load() is handled automatically using a standard code (magic) at the start of the saved file.  Saved files can also be compressed or decompressed outside of R, and load() will still handle them correctly, provided the compression used is one of the types that R knows about.
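The "magic" code can be inspected directly. This sketch saves one object with gzip compression and once without compression, reads the first two bytes of the gzip file, and loads both files with the same load() call; the file names are hypothetical.

```r
# Sketch: load() needs no compression argument; it inspects the file itself.
x <- 1:100
f_gz    <- tempfile(fileext = ".rda")
f_plain <- tempfile(fileext = ".rda")

save(x, file = f_gz, compress = "gzip")
save(x, file = f_plain, compress = FALSE)

# A gzip stream starts with the two bytes 1f 8b.
magic <- readBin(f_gz, what = "raw", n = 2)

from_gz <- new.env()
from_plain <- new.env()
load(f_gz, envir = from_gz)          # same call for both files
load(f_plain, envir = from_plain)
```

Only save() is told which compression to use; both files load with an identical call.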


Author(s)


Tony Plate <tplate@acm.org>



References

News, 6(4):19-24, October 2006. http://cran.r-project.org/doc/Rnews and http://sandybox.typepad.com/software
2002.  http://cran.r-project.org/doc/Rnews

See Also

Overview of the track package.

Documentation for makeActiveBinding and related functions (in 'base' package).

Inspiration from the packages g.data and filehash.

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

