R BioSeqClass package: selectWeka() function help documentation

loveR · posted 2012-2-25 13:43:10

selectWeka(BioSeqClass)
R package containing selectWeka(): BioSeqClass

                                    Feature Selection by Weka

                                       Translator: LoveR, the translation robot of 生物统计家园网

----------Description----------

Performs feature selection using Weka.

----------Usage----------

  selectWeka(train, evaluator="CfsSubsetEval", search="BestFirst", n)

----------Arguments----------

Argument: train
A data frame containing the feature matrix and the class labels of the training set.

Argument: evaluator
A string naming the feature selection method used by WEKA. This must be one of the strings "CfsSubsetEval", "ChiSquaredAttributeEval", "InfoGainAttributeEval", or "SVMAttributeEval".

Argument: search
A string naming the search method used by WEKA. This must be one of the strings "BestFirst" or "Ranker".

Argument: n
An integer giving the number of features to select.
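The argument constraints above can be sketched in base R. This is only an illustration: check_select_args is a hypothetical helper, not part of BioSeqClass, and putting the class label in the last column of the toy data frame is an assumption of this sketch rather than something this page states.

```r
# Hypothetical helper (not part of BioSeqClass): validates arguments in the
# way the argument descriptions suggest.
check_select_args <- function(train,
                              evaluator = c("CfsSubsetEval",
                                            "ChiSquaredAttributeEval",
                                            "InfoGainAttributeEval",
                                            "SVMAttributeEval"),
                              search = c("BestFirst", "Ranker"),
                              n = NULL) {
  stopifnot(is.data.frame(train), ncol(train) >= 2)
  evaluator <- match.arg(evaluator)  # stops with an error on any other string
  search <- match.arg(search)
  if (!is.null(n)) {
    stopifnot(is.numeric(n), length(n) == 1, n >= 1, n == round(n))
  }
  list(evaluator = evaluator, search = search, n = n)
}

# Toy training set: two feature columns plus a class label column.
# The position and name of the label column are assumptions of this sketch.
train <- data.frame(f1 = c(0.1, 0.9, 0.2, 0.8),
                    f2 = c(1, 0, 1, 0),
                    class = factor(c("neg", "pos", "neg", "pos")))

checked <- check_select_args(train,
                             evaluator = "InfoGainAttributeEval",
                             search = "Ranker",
                             n = 1)
```

Using match.arg keeps the list of allowed strings in one place, the function signature, so an unsupported evaluator or search name fails early with a readable message.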

----------Details----------

Parameter "evaluator" supports four feature selection methods provided by WEKA:
"CfsSubsetEval": Evaluates the worth of a subset of attributes by considering the individual predictive ability of each feature along with the degree of redundancy between them.
"ChiSquaredAttributeEval": Evaluates the worth of an attribute by computing the value of the chi-squared statistic with respect to the class.
"InfoGainAttributeEval": Evaluates attributes individually by measuring information gain with respect to the class.
"SVMAttributeEval": Evaluates the worth of an attribute by using an SVM classifier. Attributes are ranked by the square of the weight assigned by the SVM. Attribute selection for multiclass problems is handled by ranking attributes for each class separately using a one-vs-all method and then "dealing" from the top of each pile to give a final ranking.
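To make the per-attribute evaluators more concrete, here is a minimal base-R sketch of the two statistics named above, plus one plausible reading of the "dealing" step. It does not call WEKA and is not the code selectWeka runs. The function names (entropy, info_gain, chi_squared, deal_rankings) and the toy data are hypothetical, and the sketch assumes the features are already discrete.

```r
# Shannon entropy (in bits) of a discrete vector.
entropy <- function(x) {
  p <- table(x) / length(x)
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Information gain: H(label) - H(label | feature).
info_gain <- function(feature, label) {
  groups <- split(label, feature, drop = TRUE)
  conditional <- sum(sapply(groups, function(g) {
    length(g) / length(label) * entropy(g)
  }))
  entropy(label) - conditional
}

# Pearson chi-squared statistic of the feature-by-label contingency table.
chi_squared <- function(feature, label) {
  observed <- table(feature, label)
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  sum((observed - expected)^2 / expected)
}

# One reading of "dealing from the top of each pile": go round the per-class
# rankings in turn, each time taking the highest attribute not yet dealt.
deal_rankings <- function(piles) {
  all_attrs <- unique(unlist(piles))
  final <- character(0)
  while (length(final) < length(all_attrs)) {
    for (pile in piles) {
      remaining <- setdiff(pile, final)  # setdiff keeps the order of `pile`
      if (length(remaining) > 0) final <- c(final, remaining[1])
    }
  }
  final
}

# Toy data: f1 separates the classes perfectly, f2 only weakly.
train <- data.frame(
  f1 = c("a", "a", "b", "b", "a", "b"),
  f2 = c("x", "y", "x", "y", "x", "y"),
  label = c("pos", "pos", "neg", "neg", "pos", "neg"),
  stringsAsFactors = FALSE
)
features <- c("f1", "f2")
ig  <- sapply(train[features], info_gain,   label = train$label)
chi <- sapply(train[features], chi_squared, label = train$label)

# Hypothetical per-class rankings, as a one-vs-all scheme might produce them.
piles <- list(classA = c("f1", "f2", "f3"),
              classB = c("f3", "f1", "f2"),
              classC = c("f1", "f3", "f2"))
final_ranking <- deal_rankings(piles)
```

Both statistics score each attribute on its own, which is why they pair naturally with a ranking step rather than a subset search.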

Parameter "search" supports two feature subset search methods provided by WEKA:
"BestFirst": Searches the space of attribute subsets by greedy hill climbing augmented with a backtracking facility. Setting the number of consecutive non-improving nodes allowed controls the level of backtracking done. Best first may start with the empty set of attributes and search forward, or start with the full set of attributes and search backward, or start at any point and search in both directions (by considering all possible single attribute additions and deletions at a given point).
"Ranker": Ranks attributes by their individual evaluations.
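The two search strategies can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not WEKA's implementation: ranker, cfs_merit and best_first are hypothetical names, the search only goes forward from the empty set, and the subset score is a CFS-style merit built from absolute Pearson correlations on numeric data. How selectWeka itself applies n is not stated on this page, so n appears here only as the cut-off for the ranking.

```r
# Ranker: order attributes by their individual scores, keep the top n.
ranker <- function(scores, n = length(scores)) {
  head(names(sort(scores, decreasing = TRUE)), n)
}

# CFS-style merit of a subset: high feature-class correlation,
# low feature-feature correlation (an assumption of this sketch).
cfs_merit <- function(subset, X, y) {
  k <- length(subset)
  if (k == 0) return(0)
  r_cf <- mean(abs(cor(X[, subset, drop = FALSE], y)))
  r_ff <- 0
  if (k > 1) {
    cc <- abs(cor(X[, subset, drop = FALSE]))
    r_ff <- mean(cc[upper.tri(cc)])
  }
  k * r_cf / sqrt(k + k * (k - 1) * r_ff)
}

# Forward best-first search: always expand the best open node, and stop
# after `max_stale` consecutive expansions that do not improve the best merit.
best_first <- function(X, y, max_stale = 5) {
  features <- colnames(X)
  key <- function(s) paste(sort(s), collapse = ",")
  frontier <- list(list(subset = character(0), merit = 0))
  visited <- key(character(0))
  best <- frontier[[1]]
  stale <- 0
  while (length(frontier) > 0 && stale < max_stale) {
    i <- which.max(vapply(frontier, function(node) node$merit, numeric(1)))
    node <- frontier[[i]]
    frontier[[i]] <- NULL               # remove the expanded node
    improved <- FALSE
    for (f in setdiff(features, node$subset)) {
      child <- c(node$subset, f)
      if (key(child) %in% visited) next
      visited <- c(visited, key(child))
      m <- cfs_merit(child, X, y)
      frontier[[length(frontier) + 1]] <- list(subset = child, merit = m)
      if (m > best$merit) {
        best <- list(subset = child, merit = m)
        improved <- TRUE
      }
    }
    stale <- if (improved) 0 else stale + 1
  }
  best
}

# Toy numeric data: f1 tracks y, f2 nearly duplicates f1, f3 is unrelated to y.
y <- c(0, 0, 0, 0, 1, 1, 1, 1)
X <- data.frame(f1 = c(1, 2, 1, 2, 3, 4, 3, 4),
                f2 = c(1.1, 1.9, 1.2, 2.1, 2.9, 4.1, 3.2, 3.9),
                f3 = c(0, 1, 0, 1, 0, 1, 0, 1))

top2 <- ranker(c(f1 = 0.9, f2 = 0.1, f3 = 0.5), n = 2)
result <- best_first(X, y)
```

Because unexpanded nodes stay on the frontier, the search can return to an earlier, less greedy choice when the current path stops improving, which is the backtracking the description refers to.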

----------Author(s)----------

Hong Li

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

