strsplit(base)
strsplit() belongs to R package: base
Split the Elements of a Character Vector
Translator: 生物统计家园网, robot LoveR
Description
Split the elements of a character vector x into substrings according to the matches to substring split within them.
Usage
strsplit(x, split, fixed = FALSE, perl = FALSE, useBytes = FALSE)
Arguments
Argument: x
character vector, each element of which is to be split. Other inputs, including a factor, will give an error.
Argument: split
character vector (or object which can be coerced to such) containing regular expression(s) (unless fixed = TRUE) to use for splitting. If empty matches occur, in particular if split has length 0, x is split into single characters. If split has length greater than 1, it is recycled along x.
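As a brief illustration of the recycling and empty-split rules for split, the following sketch may help. The input strings are made up for the example and are not part of the original help page.

```r
# Hypothetical inputs chosen only for illustration.
x <- c("a-b", "c_d", "e-f", "g_h")

# 'split' has length 2, so it is recycled along x:
# "-" is used for elements 1 and 3, "_" for elements 2 and 4.
pieces <- strsplit(x, c("-", "_"))

# An empty split (length 0, or "") breaks a string into single characters.
chars_a <- strsplit("abc", character(0))[[1]]
chars_b <- strsplit("abc", "")[[1]]
```

Here every element of pieces should contain two one-letter strings, and chars_a and chars_b should be the same vector of single characters.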
Argument: fixed
logical. If TRUE match split exactly, otherwise use regular expressions. Has priority over perl.
Argument: perl
logical. Should perl-compatible regexps be used?
Argument: useBytes
logical. If TRUE the matching is done byte-by-byte rather than character-by-character, and inputs with marked encodings are not converted. This is forced (with a warning) if any input is found which is marked as "bytes".
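The difference between character-wise and byte-wise matching is easiest to see with a non-ASCII string. The sketch below assumes a string marked as UTF-8 (written with a \u escape); the variable names are illustrative only.

```r
# "\u00e9" (e with an acute accent) is one character but two bytes in UTF-8.
s <- "\u00e9"

# Default: split character by character.
by_char <- strsplit(s, "")[[1]]

# useBytes = TRUE: split byte by byte, without converting the input.
by_byte <- strsplit(s, "", useBytes = TRUE)[[1]]

length(by_char)
length(by_byte)
```

The byte-wise result is generally not valid text on its own, so useBytes = TRUE is mainly useful when the input is ASCII or when the bytes themselves are of interest.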
Details
Argument split will be coerced to character, so you will see uses with split = NULL to mean split = character(0), including in the examples below.
Note that splitting into single characters can be done via split = character(0) or split = ""; the two are equivalent. The definition of "character" here depends on the locale: in a single-byte locale it is a byte, and in a multi-byte locale it is the unit represented by a "wide character" (almost always a Unicode code point).
A missing value of split does not split the corresponding element(s) of x at all.
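A small sketch of the missing-value rule, using made-up inputs: the first element is split on a space, while the second is paired with NA and so is returned unsplit.

```r
# Hypothetical inputs chosen only for illustration.
x <- c("a b", "c d")

# The second element of 'split' is NA, so x[2] is not split at all.
res <- strsplit(x, c(" ", NA))
res
```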
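The algorithm applied to each input string is

    repeat {
        if the string is empty
            break.
        if there is a match
            add the string to the left of the match to the output.
            remove the match and all to the left of it.
        else
            add the string to the output.
            break.
    }

Note that this means that if there is a match at the beginning of a (non-empty) string, the first element of the output is "", but if there is a match at the end of the string, the output is the same as with the match removed.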
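The per-string loop can be sketched in plain R: repeatedly find the first match, emit the text to its left, drop everything up to and including the match, and stop once the remainder is empty or no match is left. This is an illustration only, not the actual implementation; split_one is a hypothetical helper name, and the sketch assumes the pattern never matches the empty string.

```r
# Hypothetical helper: a simplified re-implementation of the splitting loop
# for a single string. Assumes 'pattern' only produces non-empty matches.
split_one <- function(s, pattern, fixed = FALSE) {
  out <- character(0)
  repeat {
    if (!nzchar(s)) break                      # string is empty: stop
    m <- regexpr(pattern, s, fixed = fixed)
    start <- as.integer(m)
    len <- attr(m, "match.length")
    if (start == -1L) {                        # no match: keep the rest
      out <- c(out, s)
      break
    }
    if (len < 1L) stop("empty matches are not handled in this sketch")
    out <- c(out, substr(s, 1L, start - 1L))   # text left of the match
    s <- substr(s, start + len, nchar(s))      # remove match and all before it
  }
  out
}

split_one("#a#", "#", fixed = TRUE)
split_one("a.b.c", ".", fixed = TRUE)
```

Because the loop stops as soon as the remaining string is empty, a match at the very end of the input does not produce a trailing empty string, whereas a match at the very start produces a leading one.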
Value
A list of the same length as x, the i-th element of which contains the vector of splits of x[i].
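Because the result is a list with one element per input string, it is usually post-processed with list tools. A minimal sketch with made-up inputs (the names first and second are illustrative):

```r
# Hypothetical inputs chosen only for illustration.
x <- c(first = "a,b,c", second = "d")
parts <- strsplit(x, ",")

# One list element per element of x.
length(parts) == length(x)

# Number of pieces per input string.
n_pieces <- lengths(parts)

# First piece of each input string, as a character vector.
first_piece <- vapply(parts, function(p) p[1], character(1))
```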
If any element of x or split is declared to be in UTF-8 (see Encoding), all non-ASCII character strings in the result will be in UTF-8 and have their encoding declared as UTF-8. As from R 2.10.0, for perl = TRUE, useBytes = FALSE all non-ASCII strings in a multibyte locale are translated to UTF-8.
Note
Prior to R 2.11.0 there was an argument extended which could be used to select "basic" regular expressions: this was often used when fixed = TRUE would be preferable. In the actual implementation (as distinct from the POSIX standard) the only difference was that ?, +, {, |, (, and ) were not interpreted as metacharacters.
See Also
paste for the reverse, grep and sub for string search and manipulation; also nchar, substr.
"regular expression" for the details of the pattern specification.
Examples
noquote(strsplit("A text I want to display with spaces", NULL)[[1]])
x <- c(as = "asfef", qu = "qwerty", "yuiop[", "b", "stuff.blah.yech")
# split x on the letter e
strsplit(x, "e")
unlist(strsplit("a.b.c", "."))
## [1] "" "" "" "" ""
## Note that 'split' is a regexp!
## If you really want to split on '.', use
unlist(strsplit("a.b.c", "\\."))
## [1] "a" "b" "c"
## or
unlist(strsplit("a.b.c", ".", fixed = TRUE))
## a useful function: rev() for strings
strReverse <- function(x)
    sapply(lapply(strsplit(x, NULL), rev), paste, collapse = "")
strReverse(c("abc", "Statistics"))
## get the first names of the members of R-core
a <- readLines(file.path(R.home("doc"), "AUTHORS"))[-(1:8)]
a <- a[(0:2) - length(a)]
(a <- sub(" .*", "", a))
# and reverse them
strReverse(a)
## Note that final empty strings are not produced:
strsplit(paste(c("", "a", ""), collapse = "#"), split = "#")[[1]]
# [1] ""  "a"
## and also an empty string is only produced before a definite match:
strsplit("", " ")[[1]]    # character(0)
strsplit(" ", " ")[[1]]   # [1] ""
When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).