R rockchalk package: help documentation for the summarize() function

loveR · Posted 2012-9-27 22:45:07

summarize(rockchalk)
summarize() belongs to R package: rockchalk

Sorts numeric from factor variables and returns separate summaries

Translator: LoveR, the 生物统计家园网 bot

----------Description----------

The work is done by the functions summarizeNumerics and summarizeFactors. See the help pages for those functions for complete details.
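
To make that division of labor concrete, here is a minimal base-R sketch of the kind of split that happens before the two helper functions do their work. It does not call rockchalk at all: the data frame `toy` and the names `is_num`, `num_part`, `fac_part`, `num_summary` and `fac_summary` are made up for illustration, and the real functions report more than the rough stand-ins shown here.

```r
# Hypothetical toy data: two numeric columns and one factor column.
toy <- data.frame(
  height = c(1.6, 1.7, 1.8, 1.9),
  weight = c(55, 62, 71, 80),
  group  = factor(c("a", "b", "a", "b"))
)

# Flag the numeric columns; everything else is treated as a factor here
# (an assumption of this sketch, not a statement about rockchalk).
is_num <- vapply(toy, is.numeric, logical(1))

num_part <- toy[, is_num, drop = FALSE]   # columns for a numeric summary
fac_part <- toy[, !is_num, drop = FALSE]  # columns for a factor summary

# Rough stand-ins for the two kinds of summary.
num_summary <- sapply(num_part, function(v) c(mean = mean(v), sd = sd(v)))
fac_summary <- lapply(fac_part, table)

print(num_summary)
print(fac_summary)
```

The numeric part comes back as a matrix and the factor part as a list of tables, which loosely mirrors the shape described under Value.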

----------Usage----------

  summarize(dat, ...)

----------Arguments----------

Argument: dat
A data frame

Argument: ...
Optional arguments that are passed to summarizeNumerics and summarizeFactors. These may be used:
  maxLevels: the maximum number of levels that will be reported.
  alphaSort: if TRUE (default), the columns are reorganized in alphabetical order. If FALSE, they are presented in the original order.
  digits: integer, used for number formatting output.
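
As a rough illustration of what these three options control, the base-R sketch below imitates each effect by hand. It is not rockchalk's implementation. The objects `toy`, `animal`, `max_levels`, `toy_sorted` and `top_levels` are hypothetical, and keeping the most frequent levels is an assumption of this sketch: the help text only says that maxLevels caps how many levels are reported.

```r
# Hypothetical data frame with columns deliberately out of alphabetical order.
toy <- data.frame(z1 = c(3, 1, 2), a1 = c(10, 20, 30), m1 = c(0.5, 0.2, 0.9))

# alphaSort-style reordering: sort the columns by name.
toy_sorted <- toy[, sort(names(toy)), drop = FALSE]
print(names(toy_sorted))

# maxLevels-style truncation: keep only the first `max_levels` levels,
# here taken in order of decreasing frequency (an assumption).
animal <- factor(c("pig", "cow", "pig", "duck", "pig", "cow"))
max_levels <- 2
top_levels <- sort(table(animal), decreasing = TRUE)[seq_len(max_levels)]
print(top_levels)

# digits-style formatting: round the numbers before printing.
print(round(colMeans(toy), digits = 1))
```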

----------Value----------

A list with 2 objects, numerics and factors. numerics is a matrix of summary information, while factors is a list of factor summaries.

----------Author(s)----------

Paul E. Johnson <pauljohn@ku.edu>

----------Examples----------

library(rockchalk)

set.seed(23452345)
N <- 100
x1 <- gl(12, 2, labels = LETTERS[1:12])
x2 <- gl(8, 3, labels = LETTERS[12:19])
x1 <- sample(x = x1, size=N, replace = TRUE)
x2 <- sample(x = x2, size=N, replace = TRUE)
z1 <- rnorm(N)
a1 <- rnorm(N, mean = 1.2, sd = 1.7)
a2 <- rpois(N, lambda = 10 + a1)
a3 <- rgamma(N, 0.5, 4)
b1 <- rnorm(N, mean = 1.3, sd = 1.4)
dat <- data.frame(z1, a1, x2, a2, x1, a3, b1)
summary(dat)

summarize(dat)

summarizeNumerics(dat)
summarizeFactors(dat, maxLevels = 5)

summarize(dat, alphaSort = FALSE)

summarize(dat, digits = 6, alphaSort = FALSE)

summarize(dat, digits = 22, alphaSort = FALSE)

summarize(dat, maxLevels = 2)

datsumm <- summarize(dat)

datsumm$numerics
datsumm[[1]]  ## same: gets numerics

datsumm$factors
datsumm[[2]]

## Use numerics output to make plots. First,
## transpose gives varnames x summary stat matrix
datsummNT <- t(datsumm$numerics)
datsummNT <- as.data.frame(datsummNT)

plot(datsummNT$mean, datsummNT$var, xlab = "The Means",
ylab = "The Variances")

plot(datsummNT$mean, datsummNT$var, xlab = "The Means",
ylab = "The Variances", type = "n")
text(datsummNT$mean, datsummNT$var, labels = rownames(datsummNT))

## Here's a little plot wrinkle.  Note variable names are 'out to the
##  edge' of the plot. If names are longer they don't stay inside
##  figure. See?

## Make the variable names longer

rownames(datsummNT)
rownames(datsummNT) <- c("boring var", "var with long name",
"tedious name var", "stupid varname", "buffoon not baboon")
plot(datsummNT$mean, datsummNT$var, xlab = "The Means",
ylab = "The Variances", type = "n")
text(datsummNT$mean, datsummNT$var, labels = rownames(datsummNT),
cex = 0.8)
## That's no good. Names run across the edges

## We could brute force the names outside the edges like
##  this
par(xpd = TRUE)
text(datsummNT$mean, datsummNT$var, labels = rownames(datsummNT),
cex = 0.8)
## but that is not much better
par(xpd = FALSE)

## Here is one fix. Make the unused space inside the plot larger by
## making xlim and ylim bigger.  I use the magRange function from
## rockchalk to easily expand range to 1.2 times its current size.
## Otherwise, long variable names do not fit inside plot.  magRange
## could be asymmetric if we want, but this use is symmetric.

rownames(datsummNT)
rownames(datsummNT) <- c("boring var", "var with long name",
"tedious name var", "stupid varname", "buffoon not baboon")
plot(datsummNT$mean, datsummNT$var, xlab = "The Means",
ylab = "The Variances", type = "n", xlim = magRange(datsummNT$mean,
      1.2), ylim = magRange(datsummNT$var, 1.2))
text(datsummNT$mean, datsummNT$var, labels = rownames(datsummNT),
cex = 0.8)

## Here's another little plot wrinkle.  If we don't do that to keep
## the names in bounds, we need some fancy footwork.  Note when a
## point is near the edge, I make sure the text prints toward the
## center of the graph.
plot(datsummNT$mean, datsummNT$var, xlab = "The Means",
ylab = "The Variances")
## calculate label positions. This is not as fancy as it could be.  If
##  there were lots of variables, we'd have to get smarter about
##  positioning labels on above, below, left, or right.
labelPos <- ifelse(datsummNT$mean - mean(datsummNT$mean,
na.rm = TRUE) > 0, 2, 4)
text(datsummNT$mean, datsummNT$var, labels = rownames(datsummNT),
cex = 0.8, pos = labelPos)

x <- data.frame(x = rnorm(N), y = gl(50, 2), z = rep(1:4,
25), ab = gl(2, 50))

summarize(x)
summarize(x, maxLevels = 15)

sumry <- summarize(x)
sumry[[1]]  ## another way to get the numerics output
sumry[[2]]  ## another way to get the factors output

dat <- data.frame(x = rnorm(N), y = gl(50, 2), z = factor(rep(1:4,
25), labels = c("A", "B", "C", "D")), animal = factor(ifelse(runif(N) <
0.2, "cow", ifelse(runif(N) < 0.5, "pig", "duck"))))

summarize(dat)

## Run this if you have internet access

## dat <- read.table(url("http://pj.freefaculty.org/guides/stat/DataSets/USNewsCollege/USNewsCollege.csv"),
## sep = ",")

## colnames(dat) <- c("fice", "name", "state", "private", "avemath",
##                   "aveverb", "avecomb", "aveact", "fstmath",
##                   "trdmath", "fstverb", "trdverb", "fstact",
##                   "trdact", "numapps", "numacc", "numenr",
##                   "pctten", "pctquart", "numfull", "numpart",
##                   "instate", "outstate", "rmbrdcst", "roomcst",
##                   "brdcst", "addfees", "bookcst", "prsnl",
##                   "pctphd", "pctterm", "stdtofac", "pctdonat",
##                   "instcst", "gradrate")

## dat$private <- factor(dat$private, labels = c("public",
##                                  "private"))
## sumry <- summarize(dat, digits = 2)
## sumry

## sumry[[1]]
## sumry[[2]]

## summarize(dat[, c("fice", "name", "private", "fstverb",
##                "avemath")], digits = 4)
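
The plotting examples above rely on two small ideas: widening the axis range so long labels fit, and pointing each label toward the center of the plot. The sketch below shows both in base R only. `expand_range` is a hypothetical helper written for this illustration; it is not rockchalk's magRange and may behave differently from it. It follows the description in the comments above (a symmetric expansion to 1.2 times the current size), and the vectors `means` and `vars` are made-up coordinates.

```r
# Hypothetical helper, not rockchalk's magRange: widen a range symmetrically
# about its midpoint so the new width is `mult` times the old width.
expand_range <- function(x, mult = 1.2) {
  r <- range(x, na.rm = TRUE)
  mid <- mean(r)
  half <- diff(r) / 2
  c(mid - mult * half, mid + mult * half)
}

means <- c(0, 2, 10)  # made-up x coordinates
vars <- c(1, 4, 3)    # made-up y coordinates

xlim_wide <- expand_range(means, 1.2)
ylim_wide <- expand_range(vars, 1.2)

# Labels go to the left (pos = 2) of points right of the average x value,
# and to the right (pos = 4) of the others, so text prints toward the center.
label_pos <- ifelse(means > mean(means), 2, 4)

print(xlim_wide)
print(ylim_wide)
print(label_pos)
```

The resulting `xlim_wide` and `ylim_wide` can be passed as `xlim` and `ylim` to plot(), and `label_pos` as `pos` to text(), in the same way the examples above pass the magRange results and labelPos.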

Please credit the source when reposting: 生物统计家园网 (http://www.biostatistic.net).
