SDData(SDisc)
R package: SDisc
Data container for SDisc analyses
----------Description----------
SDisc dataset container constructor taking as input a description of the data and an analysis prefix.
----------Usage----------
## Default S3 method:
SDData(x, prefix, dataOrig=NULL, TData=NULL, settings=NULL, initFun=list(SDDataCC), subset=NULL, ...)
## S3 method for class 'SDData'
print(x, rseed=NULL, range=1:3, allNumVars = FALSE, latex=FALSE, ...)
## S3 method for class 'SDData'
plot(x, q=NULL, est = 1, zlim = c(-2, 2), latex=FALSE, ...)
## S3 method for class 'SDData'
summary(object, q=NULL, latex=FALSE, digits = 3, ...)
## S3 method for class 'SDData'
predict(object, newdata, prefix = "Newdata", subset = NULL, ...)
----------Arguments----------
Argument: x
a data matrix, a previously instantiated SDData container, or an SDisc object from which the SDData will be extracted
Argument: dataOrig
the original data; NULL by default
Argument: TData
a set of TData operations to apply to the data
Argument: settings
a data matrix as generated by SDDataSettings, or the path to a CSV file that uses ";" as the field separator
Argument: initFun
a list of functions, each taking a data matrix as input and returning a treated matrix as output. The default, list(SDDataCC), returns only the complete-case records
Argument: prefix
a prefix that identifies the analysis in the dynamic report and defines the storage location
Argument: subset
a subset of record indices for the data set (row names)
Argument: rseed
an integer seed for the random number generator used to select a random set of rows and columns of the data matrix to display before and after the data treatments
Argument: range
a sequence of integers used to subset the randomly ordered vectors of column and row names
Argument: allNumVars
whether all numeric variables should be printed, including those not necessarily included in the cluster modeling; these variables are retrieved from dataOrig
Argument: q
a regular expression that limits the summary to a subset of the data treatments. When a character vector is provided, the data treatments matching the regular expression are plotted side by side in an image.
Argument: est
TODO
Argument: zlim
limits for the heatmap
Argument: object
an SDData data set container
Argument: digits
the number of digits to report in the SDData summary
Argument: newdata
a new data set to which the transformations estimated on the first SDData are applied
Argument: latex
whether the table should be returned as LaTeX code
Argument: ...
additional parameters to be passed to the subfunctions
----------Details----------
SDData is the data container constructor for SDisc analyses. It copies the original data, creates the working directories for figures and tables, and archives the data set as an RData file. The default data set initialization function (initFun) filters out incomplete cases because the clustering algorithm used in SDisc (Mclust) does not handle missing values. To select a data subset from a previous SDData, a selection index can also be passed.
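The complete-case initialization described above can be sketched in base R. This is a minimal illustration only: keep_complete_cases is a hypothetical stand-in and not SDisc's SDDataCC, and applying the functions of the list one after another is an assumption about how initFun is consumed.

```r
# Hypothetical stand-in for a complete-case initialization function.
# It is NOT SDisc's SDDataCC; it only mimics the documented behaviour.
keep_complete_cases <- function(m) {
  m[stats::complete.cases(m), , drop = FALSE]
}

# Toy numeric matrix with two incomplete records.
m <- as.matrix(iris[1:6, 1:4])
m[2, 1] <- NA
m[5, 3] <- NA

# initFun is documented as a list of functions; applying them in
# sequence is an assumption made for this sketch.
init_funs <- list(keep_complete_cases)
treated <- Reduce(function(acc, f) f(acc), init_funs, m)

dim(treated)
```

Rows 2 and 5 are dropped, so a clustering step that cannot handle missing values would only see the four remaining records.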
predict transforms a new data set using the transformation estimates (such as the mean and standard deviation) from another SDData container.
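The idea behind predict, reusing estimates from a first data set on new records, can be sketched with base R only. The split of iris into train and newdata and the choice of centring and scaling as the treatment are assumptions for illustration; the code does not call SDisc.

```r
# Illustrative split; SDisc itself is not used here.
train <- as.matrix(iris[1:100, 1:4])
newdata <- as.matrix(iris[101:150, 1:4])

# Estimates are computed on the first data set only.
est_mean <- colMeans(train)
est_sd <- apply(train, 2, sd)

# The same centring and scaling is then applied to both data sets,
# so the new records are expressed on the scale of the first ones.
train_scaled <- scale(train, center = est_mean, scale = est_sd)
new_scaled <- scale(newdata, center = est_mean, scale = est_sd)

round(colMeans(new_scaled), 2)
```

Because the estimates come from train, the columns of new_scaled are generally not centred at zero, which is the intended behaviour when new data are mapped into the space of an earlier analysis.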
print returns the data matrix that results from the initialization (e.g. complete cases) and from the data treatments applied to the different variables, as defined in the data settings configuration file. To verify the data treatments, an rseed can be provided to show, side by side, a random extract of the data matrix before and after the data treatments. The range parameter gives the number of rows and columns to extract randomly.
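The seeded before/after extract can be approximated in base R as follows. This is a sketch under assumptions: the seed value is arbitrary, range_idx mirrors the default range = 1:3 from the usage section, and scale() merely stands in for whatever data treatments the settings define.

```r
m <- as.matrix(iris[, 1:4])
rseed <- 42        # arbitrary seed chosen for this sketch
range_idx <- 1:3   # mirrors the default range = 1:3

# Shuffle row indices and column names, then keep the first few.
set.seed(rseed)
rows <- sample(nrow(m))[range_idx]
cols <- sample(colnames(m))[range_idx]

before <- m[rows, cols]
after <- scale(m)[rows, cols]  # scaling stands in for any treatment
colnames(after) <- paste0(colnames(after), ".treated")

# Untreated and treated values of the same cells, side by side.
cbind(before, after)
```

Fixing the seed makes the extract reproducible, so the same cells can be inspected again after the settings are changed.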
plot writes the boxplots and histograms for each variable of the data set container to PDF. If latex is set to TRUE, LaTeX code for Sweave vignettes is returned.
summary returns a summary of the data treatments applied to the data set. For mean, sd, scale, etc., it returns the estimates. For lm, it returns the estimates of the coefficients along with their standard errors and $p$-values, the $R^2$ and adjusted $R^2$ of the transformation, and the number of records on which the estimates were based.
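The quantities listed for lm treatments can all be read from a fitted model in base R. The sketch below is not SDisc's summary method: the regression of Sepal.Length on Petal.Length is an arbitrary example, and the layout of report and the use of signif with three digits are assumptions.

```r
# Arbitrary example model; SDisc would fit whatever the settings define.
fit <- lm(Sepal.Length ~ Petal.Length, data = iris)
s <- summary(fit)

# Estimates, standard errors and p-values of the coefficients.
coefs <- s$coefficients[, c("Estimate", "Std. Error", "Pr(>|t|)")]

report <- list(
  coefficients = signif(coefs, 3),
  r.squared = s$r.squared,
  adj.r.squared = s$adj.r.squared,
  n = nobs(fit)  # number of records used for the estimates
)

report
```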
----------Author(s)----------
Fabrice Colas
----------See Also----------
modelBasedEM, SDDataSettings, naPattern
----------Examples----------
settings <- SDDataSettings(iris)
settings['Species',] <- c(NA,FALSE, NA, NA, NA,NA)
x <- SDData(iris, settings=settings, prefix='iris')
summary(x)
### DO NOT RUN
# plot(x)