tree(tree)
tree() is part of the R package tree
Fit a Classification or Regression Tree
Translator: LoveR bot, 生物统计家园网
Description
A tree is grown by binary recursive partitioning using the response in the specified formula and choosing splits from the terms of the right-hand-side.
Usage
tree(formula, data, weights, subset,
     na.action = na.pass, control = tree.control(nobs, ...),
     method = "recursive.partition",
     split = c("deviance", "gini"),
     model = FALSE, x = FALSE, y = TRUE, wts = TRUE, ...)
Arguments
formula
A formula expression. The left-hand-side (response) should be a numeric vector to fit a regression tree, or a factor to produce a classification tree. The right-hand-side should be a series of numeric or factor variables separated by +; there should be no interaction terms. Both . and - are allowed; regression trees can have offset terms.
data
A data frame in which to preferentially interpret formula, weights and subset.
weights
Vector of non-negative observational weights; fractional weights are allowed.
subset
An expression specifying the subset of cases to be used.
na.action
A function to filter missing data from the model frame. The default is na.pass (to do nothing) as tree handles missing values (by dropping them down the tree as far as possible).
control
A list as returned by tree.control.
method
Character string giving the method to use. The only useful value other than the default "recursive.partition" is "model.frame".
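As a rough illustration of the "model.frame" value, the sketch below asks tree() to stop after building the model frame rather than growing a tree. It assumes the tree package is installed and uses the built-in iris data; the name mf is only for illustration.

```r
if (requireNamespace("tree", quietly = TRUE)) {
  library(tree)

  # With method = "model.frame", the model frame is returned
  # and no tree is grown.
  mf <- tree(Species ~ ., data = iris, method = "model.frame")

  class(mf)   # a data frame
  dim(mf)     # one row per case, one column per variable in the formula
}
```

Inspecting the model frame this way can be a convenient check that the formula picks up the intended variables before fitting.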
split
Splitting criterion to use: "deviance" (the default) or "gini".
model
If this argument is itself a model frame, then the formula and data arguments are ignored, and model is used to define the model. If the argument is logical and true, the model frame is stored as component model in the result.
x
Logical. If true, the matrix of variables for each case is returned.
y
Logical. If true, the response variable is returned.
wts
Logical. If true, the weights are returned.
...
Additional arguments that are passed to tree.control. Normally used for mincut, minsize or mindev.
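A minimal sketch of passing a control setting through ... follows. It assumes the tree package is installed and uses the built-in iris data; the object names are illustrative. Setting mindev = 0 is only one possible choice and is used here to relax the default stopping rule.

```r
if (requireNamespace("tree", quietly = TRUE)) {
  library(tree)

  # Fit with the default control settings.
  ir.default <- tree(Species ~ ., data = iris)

  # mindev is not an argument of tree(); it is passed on to tree.control.
  ir.deep <- tree(Species ~ ., data = iris, mindev = 0)

  # The same setting supplied explicitly through the control argument.
  ir.ctrl <- tree(Species ~ ., data = iris,
                  control = tree.control(nobs = nrow(iris), mindev = 0))

  # Compare the number of nodes (rows of the frame component).
  c(default = nrow(ir.default$frame), deep = nrow(ir.deep$frame))
}
```

Supplying control explicitly requires giving nobs yourself, which is why passing the settings through ... is usually more convenient.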
Details
A tree is grown by binary recursive partitioning using the response in the specified formula and choosing splits from the terms of the right-hand-side. Numeric variables are divided into X < a and X > a; the levels of an unordered factor are divided into two non-empty groups. The split which maximizes the reduction in impurity is chosen, the data set split and the process repeated. Splitting continues until the terminal nodes are too small or too few to be split.
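The split search for a single numeric predictor can be sketched in base R as follows. This is an illustration only, not the package's implementation: it assumes a regression tree whose impurity is the residual sum of squares about the node mean, and it assumes candidate cut points at midpoints between distinct values. The helper names node_deviance and best_numeric_split are hypothetical.

```r
# Impurity of a regression node: sum of squares about the node mean
# (an assumption for this sketch).
node_deviance <- function(y) sum((y - mean(y))^2)

# Try every candidate cut point a and keep the one that gives the
# largest reduction in impurity when the node is divided into
# x < a and x > a.
best_numeric_split <- function(x, y) {
  ux <- sort(unique(x))
  if (length(ux) < 2) return(NULL)            # nothing to split on
  cuts <- (head(ux, -1) + tail(ux, -1)) / 2   # midpoints between distinct values
  parent <- node_deviance(y)
  reduction <- vapply(cuts, function(a) {
    left <- x < a
    parent - node_deviance(y[left]) - node_deviance(y[!left])
  }, numeric(1))
  best <- which.max(reduction)
  list(cut = cuts[best], reduction = reduction[best])
}

x <- c(1, 2, 3, 10, 11, 12)
y <- c(1.0, 1.2, 0.8, 5.0, 5.2, 4.8)
s <- best_numeric_split(x, y)
s$cut   # 6.5
```

Growing a full tree would repeat this search over every predictor at every node and recurse on the two resulting subsets.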
Tree growth is limited to a depth of 31 by the use of integers to label nodes.
Factor predictor variables can have up to 32 levels. This limit is imposed for ease of labelling, but since their use in a classification tree with three or more levels in a response involves a search over 2^(k-1) - 1 groupings for k levels, the practical limit is much less.
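To give a feel for how quickly the 2^(k-1) - 1 count grows, the base R sketch below evaluates it and enumerates the groupings for a small factor. The helper names n_groupings and enumerate_groupings are hypothetical and are not part of the tree package.

```r
# Number of ways to divide k factor levels into two non-empty groups.
n_groupings <- function(k) 2^(k - 1) - 1

n_groupings(c(3, 4, 10))   # 3 7 511

# Enumerate the groupings explicitly. The first level is always kept in
# the left group so that mirror-image groupings are not counted twice.
enumerate_groupings <- function(levels) {
  k <- length(levels)
  stopifnot(k >= 2)
  rest <- levels[-1]
  out <- list()
  for (m in 0:(2^(k - 1) - 2)) {   # every pattern except "all levels on the left"
    in_left <- (m %/% 2^(seq_along(rest) - 1)) %% 2 == 1
    out[[length(out) + 1]] <- list(left  = c(levels[1], rest[in_left]),
                                   right = rest[!in_left])
  }
  out
}

g <- enumerate_groupings(c("a", "b", "c", "d"))
length(g)   # 7
```

At the 32-level limit the count is 2^31 - 1, which is why an exhaustive search is impractical long before the labelling limit is reached.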
Value
The value is an object of class "tree", which has the following components.
frame
A data frame with a row for each node, and row.names giving the node numbers. The columns include var, the variable used at the split (or "<leaf>" for a terminal node), n, the (weighted) number of cases reaching that node, dev the deviance of the node, yval, the fitted value at the node (the mean for regression trees, a majority class for classification trees) and split, a two-column matrix of the labels for the left and right splits at the node. Classification trees also have yprob, a matrix of fitted probabilities for each response level.
where
An integer vector giving the row number of the frame detailing the node to which each case is assigned.
terms
The terms of the formula.
call
The matched call to tree.
model
If model = TRUE, the model frame.
x
If x = TRUE, the model matrix.
y
If y = TRUE, the response.
wts
If wts = TRUE, the weights.
The object also has attributes xlevels and, for classification trees, ylevels.
A tree with no splits is of class "singlenode" which inherits from class "tree".
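The components described above can be inspected directly on a fitted object. The sketch below assumes the tree package is installed and uses the built-in iris data; the names leaves and node_of_case are illustrative.

```r
if (requireNamespace("tree", quietly = TRUE)) {
  library(tree)
  ir.tr <- tree(Species ~ ., data = iris)

  # One row per node; the row names are the node numbers.
  head(ir.tr$frame[, c("var", "n", "dev", "yval")])

  # Terminal nodes are marked "<leaf>" in the var column.
  leaves <- ir.tr$frame$var == "<leaf>"
  rownames(ir.tr$frame)[leaves]

  # where holds, for each case, the row of frame for its terminal node;
  # indexing the row names by it gives the node number per case.
  node_of_case <- rownames(ir.tr$frame)[ir.tr$where]
  table(node_of_case, iris$Species)

  # Response levels are stored as an attribute of a classification tree.
  attr(ir.tr, "ylevels")
}
```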
Author(s)
B. D. Ripley
References
Breiman, L., Friedman, J. H., Olshen, R. A. and Stone, C. J. (1984) Classification and Regression Trees. Wadsworth.
Ripley, B. D. (1996) Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge. Chapter 7.
See Also
tree.control, prune.tree
Examples
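library(tree)
# Regression tree: log10 of CPU performance, using the cpus data from MASS
data(cpus, package="MASS")
cpus.ltr <- tree(log10(perf) ~ syct+mmin+mmax+cach+chmin+chmax, cpus)
cpus.ltr
summary(cpus.ltr)
plot(cpus.ltr); text(cpus.ltr)
# Classification tree: iris species predicted from all other variables
ir.tr <- tree(Species ~ ., iris)
ir.tr
summary(ir.tr)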
Please credit the source when reposting: 生物统计家园网 (http://www.biostatistic.net).