R: reshape() function help documentation

Posted on 2012-2-16 21:47:45
reshape(stats)
reshape() belongs to the R package: stats

                                        Reshape Grouped Data

                                         Translator: 生物统计家园网, robot LoveR

----------Description----------

This function reshapes a data frame between "wide" format, in which repeated measurements are held in separate columns of the same record, and "long" format, in which the repeated measurements are held in separate records.
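
As a minimal sketch of the two layouts, the toy data frame below (hypothetical data, not part of the original help page) holds a measurement x at two time points in wide format and is converted to long format:

```r
# Hypothetical toy data: two subjects, x measured at two time points.
wide <- data.frame(id = 1:2, x.1 = c(5, 6), x.2 = c(7, 8))

# One row per subject and time point; x.1 and x.2 collapse into a single x.
long <- reshape(wide, direction = "long",
                varying = c("x.1", "x.2"), idvar = "id")
long
```

Each subject now occupies two rows, distinguished by the new time column.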


----------Usage----------


reshape(data, varying = NULL, v.names = NULL, timevar = "time",
        idvar = "id", ids = 1:NROW(data),
        times = seq_along(varying[[1]]),
        drop = NULL, direction, new.row.names = NULL,
        sep = ".",
        split = if (sep == "") {
            list(regexp = "[A-Za-z][0-9]", include = TRUE)
        } else {
            list(regexp = sep, include = FALSE, fixed = TRUE)}
        )




----------Arguments----------

Argument: data
a data frame.


Argument: varying
names of sets of variables in the wide format that correspond to single variables in long format ("time-varying"). This is canonically a list of vectors of variable names, but it can optionally be a matrix of names or a single vector of names. In each case, the names can be replaced by indices, which are interpreted as referring to names(data). See "Details" for further options.
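
The list form of varying can be sketched as follows. The data frame and the names height and weight are hypothetical, chosen only for illustration; the second call passes the same sets as column indices.

```r
# Hypothetical toy data: two measurements (h, w) at two visits.
wide <- data.frame(id = 1:3,
                   h1 = c(150, 160, 170), h2 = c(151, 161, 171),
                   w1 = c(50, 60, 70),    w2 = c(52, 62, 72))

# List form: one vector of wide column names per long-format variable.
by_name <- reshape(wide, direction = "long", idvar = "id",
                   varying = list(c("h1", "h2"), c("w1", "w2")),
                   v.names = c("height", "weight"))

# Index form: the same sets given as positions in names(wide).
by_index <- reshape(wide, direction = "long", idvar = "id",
                    varying = list(2:3, 4:5),
                    v.names = c("height", "weight"))
by_name
```

Both calls should produce the same long data frame, with one height and one weight column.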


Argument: v.names
names of variables in the long format that correspond to multiple variables in the wide format. See "Details".


Argument: timevar
the variable in long format that differentiates multiple records from the same group or individual. If more than one record matches, the first will be taken.


Argument: idvar
names of one or more variables in long format that identify multiple records from the same group or individual. These variables may also be present in wide format.


Argument: ids
the values to use for a newly created idvar variable in long format.


Argument: times
the values to use for a newly created timevar variable in long format. See "Details".
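
A small sketch of ids and times working together, assuming a wide data frame that has no identifier column of its own. The column names, the identifier labels and the week values are all hypothetical.

```r
# Hypothetical toy data without an id column.
wide <- data.frame(bp_a = c(120, 130), bp_b = c(118, 125))

long <- reshape(wide, direction = "long",
                varying = list(c("bp_a", "bp_b")), v.names = "bp",
                idvar = "patient", ids = c("p01", "p02"),  # values for the new id column
                timevar = "week", times = c(0, 6))         # values for the new time column
long
```

Here the week column takes the values supplied in times rather than the default 1, 2, and the patient column is filled from ids.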


Argument: drop
a vector of names of variables to drop before reshaping.
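
For instance, a column that should not be carried into the result can be removed in the same call. This is a sketch on hypothetical data; the notes column is invented for the example.

```r
# Hypothetical toy data with a free-text column we do not want to keep.
wide <- data.frame(id = 1:2, notes = c("a", "b"),
                   x.1 = c(5, 6), x.2 = c(7, 8))

long <- reshape(wide, direction = "long",
                varying = c("x.1", "x.2"), idvar = "id",
                drop = "notes")  # removed before reshaping
names(long)
```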


Argument: direction
character string, either "wide" to reshape to wide format, or "long" to reshape to long format.


Argument: new.row.names
character or NULL; a non-NULL value will be used for the row names of the result.


Argument: sep
A character vector of length 1, indicating a separating character in the variable names in the wide format. This is used for guessing the v.names and times arguments based on the names in varying. If sep == "", the split is just before the first numeral that follows an alphabetic character. This is also used to create variable names when reshaping to wide format.
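
The two roles of sep can be sketched with an underscore separator. The data are hypothetical; the first call uses sep to split names, the second to build them.

```r
# Wide to long: "x_1" and "x_2" are split at "_" into variable x and times 1, 2.
wide <- data.frame(id = 1:2, x_1 = c(5, 6), x_2 = c(7, 8))
long <- reshape(wide, direction = "long",
                varying = c("x_1", "x_2"), idvar = "id", sep = "_")

# Long to wide: sep is pasted between the variable name and the time value.
long2 <- data.frame(id = c(1, 1, 2, 2), time = c(1, 2, 1, 2),
                    x = c(5, 7, 6, 8))
wide2 <- reshape(long2, direction = "wide", idvar = "id",
                 timevar = "time", v.names = "x", sep = "_")
names(wide2)
```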


Argument: split
A list with three components, regexp, include, and (optionally) fixed. This allows an extended interface to variable name splitting. See "Details".
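
As a hedged illustration of split, suppose the wide columns follow a hypothetical naming scheme such as x_t1, x_t2, where the separator "_t" is longer than one character and should be dropped:

```r
# Hypothetical naming scheme: <variable>_t<time>.
wide <- data.frame(id = 1:2, x_t1 = c(5, 6), x_t2 = c(7, 8))

long <- reshape(wide, direction = "long",
                varying = c("x_t1", "x_t2"), idvar = "id",
                # include = FALSE: the matched text "_t" is dropped from the names
                split = list(regexp = "_t", include = FALSE))
long
```

The same result could be obtained here with sep = "_t"; the split argument only becomes necessary when the separator is better described by a regular expression.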


----------Details----------

The arguments to this function are described in terms of longitudinal data, as that is the application motivating the function. A "wide" longitudinal dataset will have one record for each individual, with some time-constant variables that occupy single columns and some time-varying variables that occupy a column for each time point. In "long" format there will be multiple records for each individual, with some variables being constant across these records and others varying across the records. A "long" format dataset also needs a "time" variable identifying which time point each record comes from and an "id" variable showing which records refer to the same person.

If a data frame a resulted from a previous reshape, the operation can be reversed simply by calling reshape(a). The direction argument is then optional, and the other arguments are taken from attributes stored on the data frame.
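
A round trip can be sketched as follows, on hypothetical data. The attribute name reshapeWide is what current versions of R attach to the result; treat it as an implementation detail rather than a documented interface.

```r
long <- data.frame(id = c(1, 1, 2, 2), time = c(1, 2, 1, 2),
                   x = c(5, 7, 6, 8))

wide <- reshape(long, direction = "wide", idvar = "id",
                timevar = "time", v.names = "x")

# The information needed to undo the reshape travels with the result.
names(attributes(wide))

# No direction or other arguments are needed to go back.
back <- reshape(wide)
back
```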

If direction = "wide" and no varying or v.names arguments are supplied, it is assumed that all variables except idvar and timevar are time-varying. They are all expanded into multiple variables in wide format.
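
A short sketch of this default, using hypothetical data with two measured variables and no v.names:

```r
long <- data.frame(id = c(1, 1, 2, 2), time = c(1, 2, 1, 2),
                   x = c(5, 7, 6, 8), y = c(0.1, 0.2, 0.3, 0.4))

# Neither varying nor v.names is given, so both x and y are expanded.
wide <- reshape(long, direction = "wide", idvar = "id", timevar = "time")
names(wide)
```

Every column other than id and time gets one wide column per time point.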

If direction = "long", the varying argument can be a vector of column names (or a corresponding index). The function will attempt to guess the v.names and times from these names. The default is variable names like x.1, x.2, where sep = "." specifies to split at the dot and drop it from the name. To have alphabetic names followed by numeric times, use sep = "".

Variable name splitting as described above is only attempted when varying is an atomic vector. If it is a list or a matrix, v.names and times will generally need to be specified, although they default to, respectively, the first variable name in each set and sequential times.
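
The defaults for the list form can be seen in a sketch like this one, where the column names pre and post are hypothetical and carry no time information that could be guessed:

```r
wide <- data.frame(id = 1:2, pre = c(5, 6), post = c(7, 8))

# varying is a list and v.names/times are omitted:
# the long variable is named after the first column in the set ("pre"),
# and times default to 1, 2.
long <- reshape(wide, direction = "long", idvar = "id",
                varying = list(c("pre", "post")))
long
```

Supplying v.names and times explicitly is usually clearer than relying on these defaults.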

Also, guessing is not attempted if v.names is given explicitly. In that case, the variables in varying are expected in the order x.1, y.1, x.2, y.2, that is, all variables for the first time point, then all variables for the second.
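
A sketch of that ordering with explicit v.names, on hypothetical data:

```r
wide <- data.frame(id = 1:2,
                   x.1 = c(5, 6), y.1 = c(0.1, 0.3),
                   x.2 = c(7, 8), y.2 = c(0.2, 0.4))

# v.names is given, so no guessing happens; varying is listed
# time point by time point: x.1, y.1, then x.2, y.2.
long <- reshape(wide, direction = "long", idvar = "id",
                varying = c("x.1", "y.1", "x.2", "y.2"),
                v.names = c("x", "y"))
long
```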

The split argument should not usually be necessary. The split$regexp component is passed to either strsplit() or regexpr(), where the latter is used if split$include is TRUE, in which case the splitting occurs after the first character of the matched string. In the strsplit() case, the separator is not included in the result, and it is possible to specify fixed-string matching using split$fixed.
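
The two splitting behaviours can be imitated directly with base R. This is only an illustration of the description above, not the internal code of reshape(); the variable names stem and time are chosen for the example.

```r
# include = FALSE: strsplit() drops the separator.
strsplit(c("x.1", "x.2"), ".", fixed = TRUE)

# include = TRUE: regexpr() finds the letter-digit pair, and the name is
# cut after the first character of the match, so nothing is dropped.
nms <- c("dose1", "dose2", "dose4")
pos <- as.integer(regexpr("[A-Za-z][0-9]", nms))
stem <- substr(nms, 1, pos)
time <- substr(nms, pos + 1, nchar(nms))
stem
time
```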


----------Value----------

The reshaped data frame, with added attributes to simplify reshaping back to the original form.


----------See Also----------

stack, aperm; relist for reshaping the result of unlist.


----------Examples----------


summary(Indometh)
wide <- reshape(Indometh, v.names = "conc", idvar = "Subject",
                timevar = "time", direction = "wide")
wide

reshape(wide, direction = "long")
reshape(wide, idvar = "Subject", varying = list(2:12),
        v.names = "conc", direction = "long")

## times need not be numeric
df <- data.frame(id = rep(1:4, rep(2,4)),
                 visit = I(rep(c("Before","After"), 4)),
                 x = rnorm(4), y = runif(4))
df
reshape(df, timevar = "visit", idvar = "id", direction = "wide")
## warns that y is really varying
reshape(df, timevar = "visit", idvar = "id", direction = "wide", v.names = "x")


## unbalanced 'long' data leads to NA fill in 'wide' form
df2 <- df[1:7, ]
df2
reshape(df2, timevar = "visit", idvar = "id", direction = "wide")

## Alternative regular expressions for guessing names
df3 <- data.frame(id = 1:4, age = c(40,50,60,50), dose1 = c(1,2,1,2),
                  dose2 = c(2,1,2,1), dose4 = c(3,3,3,3))
reshape(df3, direction = "long", varying = 3:5, sep = "")


## an example that isn't longitudinal data
state.x77 <- as.data.frame(state.x77)
long <- reshape(state.x77, idvar = "state", ids = row.names(state.x77),
                times = names(state.x77), timevar = "Characteristic",
                varying = list(names(state.x77)), direction = "long")

reshape(long, direction = "wide")

reshape(long, direction = "wide", new.row.names = unique(long$state))

## multiple id variables
df3 <- data.frame(school = rep(1:3, each = 4), class = rep(9:10, 6),
                  time = rep(c(1,1,2,2), 3), score = rnorm(12))
wide <- reshape(df3, idvar = c("school","class"), direction = "wide")
wide
## transform back
reshape(wide)


Please credit the source when reposting: 生物统计家园网 (http://www.biostatistic.net).


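Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the 生物统计家园网 robot. It is intended only as a personal reference for learning R; 生物统计家园 retains the copyright.
Note 2: Because the translation was produced automatically, some inaccuracies are unavoidable; reading the translation closely against the English text can help when learning R.
Note 3: If you find an inaccuracy, please reply to this post and we will correct it over time.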