R caTools package: LogitBoost() function help documentation

loveR · posted 2012-9-16 08:51:32

LogitBoost(caTools)
R package containing LogitBoost(): caTools

                                    LogitBoost Classification Algorithm

                                       Translator: 生物统计家园网 (biostatistic.net) bot LoveR

----------Description----------

Train the LogitBoost classification algorithm using decision stumps (one-node decision trees) as weak learners.

----------Usage----------

LogitBoost(xlearn, ylearn, nIter=ncol(xlearn))

----------Arguments----------

Argument: xlearn
A matrix or data frame with training data. Rows contain samples and columns contain features.

Argument: ylearn
Class labels for the training data samples: a response vector with one label for each row/component of xlearn. Can be a factor, a character vector, or a numeric vector.

Argument: nIter
An integer giving the number of boosting iterations to run, which is also the number of decision stumps that will be used. Defaults to ncol(xlearn).

----------Details----------

The function was adapted from the logitboost.R function written by Marcel Dettling; see the References and See Also sections. The code was modified to make it much faster for very large data sets. The speed-up was achieved by implementing an internal version of the decision stump classifier instead of calling rpart. That way, some of the most time-consuming operations are precomputed once instead of being performed at each iteration. Another difference is that the training and testing phases of the classification process were split into separate functions.
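The "precompute once" idea can be illustrated with a small sketch in base R. This is not the caTools source: the helper names presort_features and fit_stump are hypothetical, and the stump here minimizes weighted misclassification error for labels coded as -1/+1, which is a simplification of what a boosting round would fit. The point is only that the per-feature sort order is computed a single time and then reused whenever the weights change.

```r
# Hypothetical helpers, not part of caTools: sort once, reuse on every iteration.

# Sort order of every feature column, computed a single time.
presort_features <- function(x) {
  apply(as.matrix(x), 2, order)
}

# Find the stump (feature, threshold, sign) with the lowest weighted error.
# x: numeric matrix; y: labels coded -1/+1; w: non-negative weights;
# ord: matrix returned by presort_features(x).
# sign = 1 means "predict +1 above the threshold"; sign = -1 is the opposite.
fit_stump <- function(x, y, w, ord) {
  x <- as.matrix(x)
  best <- c(feature = NA, threshold = NA, sign = NA, error = Inf)
  total <- sum(w)
  for (j in seq_len(ncol(x))) {
    idx <- ord[, j]
    xs  <- x[idx, j]
    # Error of "predict +1 above the threshold" when the threshold sits just
    # after the k-th sorted sample: +1 samples at or below it are wrong,
    # and -1 samples above it are wrong.
    err_left  <- cumsum(w[idx] * (y[idx] == 1))
    err_right <- sum(w * (y == -1)) - cumsum(w[idx] * (y[idx] == -1))
    err <- err_left + err_right
    # Only gaps between distinct values are usable split points.
    valid <- which(diff(xs) > 0)
    if (length(valid) == 0) next
    for (s in c(1, -1)) {
      e <- if (s == 1) err[valid] else total - err[valid]
      k <- valid[which.min(e)]
      if (min(e) < best["error"]) {
        best <- c(feature = j, threshold = (xs[k] + xs[k + 1]) / 2,
                  sign = s, error = min(e))
      }
    }
  }
  best
}

# Usage: the sort order is built once; only the weights would change per round.
x   <- as.matrix(iris[, 1:4])
y   <- ifelse(iris$Species == "setosa", 1, -1)
ord <- presort_features(x)
w   <- rep(1, nrow(x))   # uniform weights for this illustration
stump <- fit_stump(x, y, w, ord)
print(stump)
```

In a boosting loop only w would be updated between calls to fit_stump, so the cost of sorting is paid once rather than at every iteration.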

----------Value----------

An object of class "LogitBoost" including components:

Component: Stump
List of decision stumps (one-node decision trees) used:

column 1: feature number for each stump, i.e. which column each stump operates on

column 2: threshold to be used for that column

column 3: bigger/smaller info: 1 means that if values in the column are above the threshold, then the corresponding samples will be labeled as lablist[1]. Value "-1" means the opposite.

If there are more than two classes, then several "Stumps" will be cbind'ed.

Component: lablist
Names of each class.
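To make the three Stump columns concrete, here is a minimal sketch of how a single row could be read for a two-class problem. The helper apply_stump is hypothetical and not part of caTools; treating "above threshold" as a strict comparison is an assumption, and the stump row is written by hand rather than taken from a fitted model. Actual prediction with predict.LogitBoost combines all stumps, so this shows only how one row is interpreted.

```r
# Hypothetical helper, not part of caTools: label samples using one stump row.
# stump_row: c(feature number, threshold, sign), following the column layout above.
apply_stump <- function(stump_row, x, lablist) {
  feature   <- stump_row[1]
  threshold <- stump_row[2]
  sign      <- stump_row[3]
  above <- x[, feature] > threshold          # assumption: strict comparison
  # sign = 1: values above the threshold get lablist[1]; sign = -1: the opposite.
  first <- if (sign == 1) above else !above
  ifelse(first, lablist[1], lablist[2])
}

# Hand-made stump for illustration (not fitted by LogitBoost).
x         <- as.matrix(iris[, 1:4])
stump_row <- c(3, 2.45, -1)                  # Petal.Length, threshold 2.45, sign -1
lablist   <- c("setosa", "other")
pred <- apply_stump(stump_row, x, lablist)
print(table(pred, iris$Species))
```

With a fitted model, the rows of model$Stump could be inspected in the same way, keeping in mind that with more than two classes several such three-column blocks are bound together.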

----------Author(s)----------

Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com

----------References----------

Expression Data, available on the web page  http://stat.ethz.ch/~dettling/boosting.html.

----------See Also----------

predict.LogitBoost contains the prediction half of the LogitBoost code.

The logitboost function from the logitboost library (not on CRAN or BioConductor, but available at http://stat.ethz.ch/~dettling/boosting.html) is very similar but much slower on very large datasets. It also performs optional cross-validation.

----------Examples----------

  library(caTools)  # provides LogitBoost() and sample.split()
  data(iris)
  Data  = iris[,-5]
  Label = iris[, 5]

  # basic interface
  model = LogitBoost(Data, Label, nIter=20)
  Lab = predict(model, Data)
  Prob  = predict(model, Data, type="raw")
  t    = cbind(Lab, Prob)
  t[1:10, ]

  # two alternative call syntaxes
  p=predict(model,Data)
  q=predict.LogitBoost(model,Data)
  pp=p[!is.na(p)]; qq=q[!is.na(q)]
  stopifnot(pp == qq)

  # accuracy increases with nIter (at least for the training set)
  table(predict(model, Data, nIter= 2), Label)
  table(predict(model, Data, nIter=10), Label)
  table(predict(model, Data),          Label)

  # example of splitting the data into train and test sets
  mask = sample.split(Label)
  model = LogitBoost(Data[mask,], Label[mask], nIter=10)
  table(predict(model, Data[!mask,], nIter=2), Label[!mask])
  table(predict(model, Data[!mask,]),       Label[!mask])

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

