R language: help documentation for the scan() function (Chinese-English parallel text)

Posted on 2012-2-16 19:50:31
scan(base)
scan() belongs to R package: base

                                        Read Data Values

                                         Translator: 生物统计家园网 robot LoveR

----------Description----------

Read data into a vector or list from the console or file.


----------Usage----------


scan(file = "", what = double(), nmax = -1, n = -1, sep = "",
     quote = if(identical(sep, "\n")) "" else "'\"", dec = ".",
     skip = 0, nlines = 0, na.strings = "NA",
     flush = FALSE, fill = FALSE, strip.white = FALSE,
     quiet = FALSE, blank.lines.skip = TRUE, multi.line = TRUE,
     comment.char = "", allowEscapes = FALSE,
     fileEncoding = "", encoding = "unknown", text)



----------Arguments----------

Argument: file
the name of a file to read data values from.  If the specified file is "", then input is taken from the keyboard (or whatever stdin() reads if input is redirected or R is embedded). (In this case input can be terminated by a blank line or an EOF signal, Ctrl-D on Unix and Ctrl-Z on Windows.)  Otherwise, the file name is interpreted relative to the current working directory (given by getwd()), unless it specifies an absolute path. Tilde-expansion is performed where supported. When running R from a script, file="stdin" can be used to refer to the process's stdin file stream.  As from R 2.10.0 this can be a compressed file (see file).  Alternatively, file can be a connection, which will be opened if necessary, and if so closed at the end of the function call.  Whatever mode the connection is opened in, any of LF, CRLF or CR will be accepted as the EOL marker for a line and so will match sep = "\n".  file can also be a complete URL.  (For the supported URL schemes, see the "URLs" section of the help for url.)  To read a data file not in the current encoding (for example a Latin-1 file in a UTF-8 locale or conversely) use a file connection setting its encoding argument (or scan's fileEncoding argument).  
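As a minimal sketch of the compressed-file and connection cases described for file (the temporary file and its contents are made up for illustration):

```r
# Write a small gzip-compressed file; the path comes from tempfile()
tf <- tempfile(fileext = ".gz")
con <- gzfile(tf, "w")
writeLines(c("1 2 3", "4 5 6"), con)
close(con)

# scan() can be given the file name directly ...
from_name <- scan(tf, quiet = TRUE)

# ... or an unopened connection, which scan() opens and closes itself
from_con <- scan(gzfile(tf), quiet = TRUE)

unlink(tf)  # tidy up
```

Both calls should return the same six numeric values.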


Argument: what
the type of what gives the type of data to be read. The supported types are logical, integer, numeric, complex, character, raw and list. If what is a list, it is assumed that the lines of the data file are records each containing length(what) items ("fields") and the list components should have elements which are one of the first six types listed or NULL, see section "Details" below.
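A small sketch of reading records with a list-valued what may help here. The field names (name, age, score) and the sample data are invented for illustration:

```r
# Two records, three whitespace-separated fields each (hypothetical data)
txt <- "alice 30 1.5\nbob 25 2.5"

# Each component of `what` sets the type of one field:
# "" = character, 0L = integer, 0 = numeric (double)
rec <- scan(text = txt,
            what = list(name = "", age = 0L, score = 0),
            quiet = TRUE)

rec$name   # character vector, one element per record
rec$age    # integer vector
rec$score  # numeric vector
```

The result is a list with one vector per field, carrying the names given in what.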


Argument: nmax
integer: the maximum number of data values to be read, or if what is a list, the maximum number of records to be read.  If omitted or not positive or an invalid value for an integer (and nlines is not set to a positive value), scan will read to the end of file.


Argument: n
integer: the maximum number of data values to be read, defaulting to no limit.  Invalid values will be ignored.
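To illustrate the difference between n and nmax, here is a short sketch using inline text (the input values are arbitrary):

```r
# n caps the number of data values read
v <- scan(text = "1 2 3 4 5", n = 3, quiet = TRUE)

# With a list `what`, nmax caps the number of records (here: lines of two fields)
recs <- scan(text = "1 2\n3 4\n5 6", what = list(0, 0), nmax = 2, quiet = TRUE)
```

Here v holds the first three values and recs holds the first two records, one vector per field.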


Argument: sep
by default, scan expects to read "white-space" delimited input fields.  Alternatively, sep can be used to specify a character which delimits fields.  A field is always delimited by an end-of-line marker unless it is quoted.  If specified this should be the empty character string (the default) or NULL or a character string containing just one single-byte character.
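A brief sketch of an explicit separator, using made-up input: once sep is set, spaces no longer split fields, but an end of line still does.

```r
# Comma-separated fields; the spaces are part of the field contents
fields <- scan(text = "a b,c d\ne f", what = "", sep = ",", quiet = TRUE)
fields
```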


Argument: quote
the set of quoting characters as a single character string or NULL.  In a multibyte locale the quoting characters must be ASCII (single-byte).


Argument: dec
decimal point character.  This should be a character string containing just one single-byte character.  (NULL and a zero-length character vector are also accepted, and taken as the default.)
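For instance, numbers written with a decimal comma can be read as follows (a minimal sketch with arbitrary values):

```r
# Whitespace-separated values that use "," as the decimal point
dec_vals <- scan(text = "1,5 2,25", dec = ",", quiet = TRUE)
dec_vals
```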


Argument: skip
the number of lines of the input file to skip before beginning to read data values.


Argument: nlines
if positive, the maximum number of lines of data to be read.


Argument: na.strings
character vector.  Elements of this vector are to be interpreted as missing (NA) values.  Blank fields are also considered to be missing values in logical, integer, numeric and complex fields.


Argument: flush
logical: if TRUE, scan will flush to the end of the line after reading the last of the fields requested. This allows putting comments after the last field, but precludes putting more than one record on a line.
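A minimal sketch of flush, assuming input in which each line carries two numeric fields followed by free text that should be ignored:

```r
# Two numeric fields per line, then trailing text we do not want to parse
txt <- "1 2 first record\n3 4 second record"

# flush = TRUE discards the rest of each line once both requested fields are read
flushed <- scan(text = txt, what = list(0, 0), flush = TRUE, quiet = TRUE)
flushed
```

Without flush = TRUE, scan would go on to read the trailing words as the next numeric field and stop with an error.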


Argument: fill
logical: if TRUE, scan will implicitly add empty fields to any lines with fewer fields than implied by what.
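As a sketch of fill, consider three-field records where one line is short (the data are invented):

```r
# The second line has only two of the three expected fields
txt <- "1 2 3\n4 5\n6 7 8"

# fill = TRUE pads the short line with an empty field, which is NA for numeric fields
filled <- scan(text = txt, what = list(0, 0, 0), fill = TRUE, quiet = TRUE)
filled
```

With the default fill = FALSE and multi.line = TRUE, the short record would instead be completed with the first value from the following line.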


Argument: strip.white
vector of logical value(s) corresponding to items in the what argument.  It is used only when sep has been specified, and allows the stripping of leading and trailing "white space" from character fields (numeric fields are always stripped).  Note: white space inside quoted strings is not stripped.  If strip.white is of length 1, it applies to all fields; otherwise, if strip.white[i] is TRUE and the i-th field is of mode character (because what[i] is) then the leading and trailing unquoted white space from field i is stripped.
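The effect of strip.white can be sketched with comma-separated character fields that carry padding (arbitrary sample input):

```r
txt <- " a , b "

# With an explicit sep, surrounding white space is kept by default
kept <- scan(text = txt, what = "", sep = ",", quiet = TRUE)

# strip.white = TRUE removes leading and trailing white space from character fields
stripped <- scan(text = txt, what = "", sep = ",", strip.white = TRUE, quiet = TRUE)
```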


Argument: quiet
logical: if FALSE (default), scan() will print a line, saying how many items have been read.


Argument: blank.lines.skip
logical: if TRUE blank lines in the input are ignored, except when counting skip and nlines.


Argument: multi.line
logical.  Only used if what is a list.  If FALSE, all of a record must appear on one line (but more than one record can appear on a single line).  Note that using fill = TRUE implies that a record will be terminated at the end of a line.


Argument: comment.char
character: a character vector of length one containing a single character or an empty string.  Use "" to turn off the interpretation of comments altogether (the default).
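A short sketch of comment handling, assuming "#" is chosen as the comment character (the input is made up):

```r
txt <- "1 2 # trailing note\n# whole-line comment\n3 4"

# Everything from "#" to the end of the line is discarded;
# a line that starts with "#" is treated as blank
vals <- scan(text = txt, comment.char = "#", quiet = TRUE)
vals
```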


Argument: allowEscapes
logical.  Should C-style escapes such as \n be processed or read verbatim (the default)?   Note that if not within quotes these could be interpreted as a delimiter (but not as a comment character).  The escapes which are interpreted are the control characters \a, \b, \f, \n, \r, \t, \v and octal and hexadecimal representations like \040 and \0x2A.  Any other escaped character is treated as itself, including backslash. Note that Unicode escapes (starting \u or \U: see Quotes) are never processed.


Argument: fileEncoding
character string: if non-empty declares the encoding used on a file (not a connection nor the keyboard) so the character data can be re-encoded.  See the "Encoding" section of the help for file, and the "R Data Import/Export Manual".


Argument: encoding
encoding to be assumed for input strings.  If the value is "latin1" or "UTF-8" it is used to mark character strings as known to be in Latin-1 or UTF-8: it is not used to re-encode the input (see fileEncoding).  See also "Details".


Argument: text
character string: if file is not supplied and this is, then data are read from the value of text via a text connection.


----------Details----------

The value of what can be a list of types, in which case scan returns a list of vectors with the types given by the types of the elements in what.  This provides a way of reading columnar data.  If any of the types is NULL, the corresponding field is skipped (but a NULL component appears in the result).

The type of what or its components can be one of the six atomic vector types or NULL (see is.atomic).

"White space" is defined for the purposes of this function as one or more contiguous characters from the set space, horizontal tab, carriage return and line feed.  It does not include form feed or vertical tab, but in Latin-1 and Windows 8-bit locales 'space' includes non-breaking space.

Empty numeric fields are always regarded as missing values. Empty character fields are scanned as empty character vectors, unless na.strings contains "" when they are regarded as missing values.
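The treatment of empty fields can be sketched as follows, using comma-separated input with an empty middle field (sample data only):

```r
# An empty numeric field is read as NA
nums <- scan(text = "1,,3", sep = ",", quiet = TRUE)

# An empty character field is read as the empty string by default ...
chr <- scan(text = "a,,c", what = "", sep = ",", quiet = TRUE)

# ... unless na.strings contains "", in which case it becomes NA
chr_na <- scan(text = "a,,c", what = "", sep = ",", na.strings = "", quiet = TRUE)
```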

The allowed input for a numeric field is optional whitespace followed by either NA or an optional sign followed by a decimal or hexadecimal constant (see NumericConstants), or NaN, Inf or infinity (ignoring case).  Out-of-range values are recorded as Inf, -Inf or 0.
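A few of the accepted numeric forms, as a minimal sketch with arbitrary values:

```r
# Hexadecimal constant, signed infinity, NaN, NA and exponent notation
x <- scan(text = "0x1A -Inf NaN NA 1e3", quiet = TRUE)
x
```

The hexadecimal constant 0x1A is read as 26, and the NA field becomes a missing value distinct from NaN.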

For an integer field the allowed input is optional whitespace, followed by either NA or an optional sign and one or more digits (0-9): all out-of-range values are converted to NA_integer_.

If sep is the default (""), the character \ in a quoted string escapes the following character, so quotes may be included in the string by escaping them.

If sep is non-default, the fields may be quoted in the style of ".csv" files where separators inside quotes ('' or "") are ignored and quotes may be put inside strings by doubling them.  However, if sep = "\n" it is assumed by default that one wants to read entire lines verbatim.
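The ".csv"-style quoting rules can be sketched like this (the input strings are invented):

```r
# First field contains the separator inside quotes;
# second field contains doubled quotes, which stand for one literal quote
txt <- '"a,b","say ""hi"""'

quoted <- scan(text = txt, what = "", sep = ",", quiet = TRUE)
quoted
```

The comma inside the first quoted field is not treated as a separator, and each doubled quote collapses to a single quote character.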

Quoting is only interpreted in character fields and in NULL fields (which might be skipping character fields).

Note that since sep is a separator and not a terminator, reading a file by scan("foo", sep="\n", blank.lines.skip=FALSE) will give an empty final line if the file ends in a linefeed and not if it does not.  This might not be what you expected; see also readLines.

If comment.char occurs (except inside a quoted character field), it signals that the rest of the line should be regarded as a comment and be discarded.  Lines beginning with a comment character (possibly after white space with the default separator) are treated as blank lines.

There is a line-length limit of 4095 bytes when reading from the console (which may impose a lower limit: see "An Introduction to R").

There is a check for a user interrupt every 1000 lines if what is a list, otherwise every 10000 items.

If file is a character string and fileEncoding is non-default, or it is a not-already-open connection with a non-default encoding argument, the text is converted to UTF-8 and declared as such (and the encoding argument to scan is ignored).  See the examples of readLines.


----------Value----------

If what is a list, a list of the same length and same names (if any) as what.

Otherwise, a vector of the type of what.

Character strings in the result will have a declared encoding if encoding is "latin1" or "UTF-8".


----------Note----------

The default for multi.line differs from S.  To read one record per line, use flush = TRUE and multi.line = FALSE. (Note that quoted character strings can still include embedded newlines.)

If the number of items is not specified, the internal mechanism re-allocates memory in powers of two and so could use up to three times as much memory as needed.  (It needs both old and new copies.)  If you can, specify either n or nmax whenever inputting a large vector, and nmax or nlines when inputting a large list.

Using scan on an open connection to read partial lines can lose chars: use an explicit separator to avoid this.

Having nul bytes in fields (including \0 if allowEscapes = TRUE) may lead to interpretation of the field being terminated at the nul.  They are not normally present in text files; see readBin.


----------References----------

The New S Language. Wadsworth & Brooks/Cole.

----------See Also----------

read.table for more user-friendly reading of data matrices; readLines to read a file a line at a time. write.

Quotes for the details of C-style escape sequences.

readChar and readBin to read fixed or variable length character strings or binary representations of numbers a few at a time from a connection.


----------Examples----------


cat("TITLE extra line", "2 3 5 7", "11 13 17", file="ex.data", sep="\n")
pp <- scan("ex.data", skip = 1, quiet= TRUE)
scan("ex.data", skip = 1)
scan("ex.data", skip = 1, nlines=1) # only 1 line after the skipped one
scan("ex.data", what = list("","","")) # flush is F -> read "7"
scan("ex.data", what = list("","",""), flush = TRUE)
unlink("ex.data") # tidy up

## "inline" usage
scan(text="1 2 3")


When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).


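Notes:
Note 1: For ease of study, this document was translated by 生物统计家园网's robot LoveR. It is intended only as a personal reference for learning R, and 生物统计家园 retains the copyright.
Note 2: Because the translation is automatic, inaccuracies are unavoidable. Reading the Chinese and English text side by side and comparing them carefully can help when learning R.
Note 3: If you find an inaccuracy, please reply at the end of this thread and we will revise the text over time.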

Posted on 2013-10-1 09:50:22
How exactly is the flush argument supposed to be used? The translation is so confusing it makes my head spin.