R logicFS package: help documentation for the logicFS() function

loveR · posted 2012-2-25 23:34:01

logicFS(logicFS)
logicFS() belongs to the R package logicFS

Feature Selection with Logic Regression

Translator: LoveR, the translation robot of 生物统计家园网

----------Description----------

Identifies interesting interactions between binary variables using logic regression. Currently available for the classification, the linear regression, and the logistic regression approach of logreg, and for multinomial logic regression as implemented in mlogreg.

----------Usage----------

## Default S3 method:
logicFS(x, y, B = 100, useN = TRUE, ntrees = 1, nleaves = 8,
  glm.if.1tree = FALSE, replace = TRUE, sub.frac = 0.632,
  anneal.control = logreg.anneal.control(), onlyRemove = FALSE,
  prob.case = 0.5, addMatImp = TRUE, fast = FALSE, rand = NULL, ...)

## S3 method for class 'formula'
logicFS(formula, data, recdom = TRUE, ...)

----------Arguments----------

Argument: x
a matrix consisting of 0's and 1's. Each column must correspond to a binary variable and each row to an observation. Missing values are not allowed.
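
The requirements on x can be checked before starting a long run. The following is a minimal sketch in base R; the helper name is_binary_matrix is hypothetical and not part of logicFS.

```r
# Hypothetical helper (not a logicFS function): checks the documented
# requirements on x, i.e. a matrix of 0's and 1's without missing values.
is_binary_matrix <- function(x) {
  is.matrix(x) && !anyNA(x) && all(x %in% c(0, 1))
}

ok  <- matrix(c(0, 1, 1, 0, 1, 1), nrow = 3)   # 3 observations, 2 variables
bad <- matrix(c(0, 1, 2, 0, NA, 1), nrow = 3)  # contains a 2 and an NA

is_binary_matrix(ok)
is_binary_matrix(bad)
```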

Argument: y
a numeric vector or a factor specifying the values of a response for all the observations represented in x. Missing values are not allowed in y. If y is a numeric vector, it contains either the class labels (coded by 0 and 1), in which case the classification or the logistic regression approach of logic regression is used, or the values of a continuous response, in which case the linear regression approach is used. If the response is categorical, then y must be a factor naming the class labels of the observations.

Argument: B
an integer specifying the number of iterations.

Argument: useN
a logical value specifying whether the number of correctly classified out-of-bag observations should be used in the computation of the importance measure. If FALSE, the proportion of correctly classified out-of-bag observations is used instead.
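
The two quantities that useN switches between can be illustrated on toy data. The vectors below are made up for illustration, and the sketch only shows the count and the proportion themselves, not the importance measure that logicFS derives from them.

```r
# Toy out-of-bag labels and predictions for a single iteration
# (made-up values, not output of logicFS).
truth <- c(1, 0, 1, 1, 0, 0, 1, 0)
pred  <- c(1, 0, 0, 1, 0, 1, 1, 0)

n_correct    <- sum(pred == truth)    # count, the scale used if useN = TRUE
prop_correct <- mean(pred == truth)   # proportion, the scale used if useN = FALSE

n_correct      # 6
prop_correct   # 0.75
```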

Argument: ntrees
an integer indicating how many trees should be used. For a binary response: if ntrees is larger than 1, the logistic regression approach of logic regression is used; if ntrees is 1, the classification approach of logic regression is used by default (see glm.if.1tree). For a continuous response: a linear regression model with ntrees trees is fitted in each of the B iterations. For a categorical response: n.lev - 1 logic regression models with ntrees trees are fitted, where n.lev is the number of levels of the response (for details, see mlogreg).

Argument: nleaves
a numeric value specifying the maximum number of leaves used in all trees combined. For details, see the help page of the function logreg in the package LogicReg.

Argument: glm.if.1tree
a logical value. If ntrees is 1 and glm.if.1tree is TRUE, the logistic regression approach of logic regression is used instead of the classification approach. Ignored if ntrees is not 1 or the response is not binary.
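
The rules for which approach is used, as described for y, ntrees and glm.if.1tree, can be summarised in a small decision function. This is a sketch of the documented rules only; approach_for is a hypothetical helper, and how logicFS itself detects a binary response is an assumption here.

```r
# Hypothetical helper (not a logicFS function) that restates the documented
# rules for choosing the logic regression approach.
approach_for <- function(y, ntrees = 1, glm.if.1tree = FALSE) {
  if (is.factor(y)) return("multinomial")          # categorical response
  if (all(y %in% c(0, 1))) {                        # binary response (assumed check)
    if (ntrees > 1 || glm.if.1tree) return("logistic")
    return("classification")
  }
  "linear"                                          # continuous response
}

approach_for(c(0, 1, 1, 0))                        # single tree, default
approach_for(c(0, 1, 1, 0), ntrees = 3)            # several trees
approach_for(c(0, 1, 1, 0), glm.if.1tree = TRUE)   # single tree, logistic forced
approach_for(c(2.3, 4.1, 0.7))                     # continuous response
approach_for(factor(c("a", "b", "c")))             # categorical response
```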

Argument: replace
should the cases be sampled with replacement? If TRUE, a bootstrap sample of size length(y) is drawn from the length(y) observations in each of the B iterations. If FALSE, ceiling(sub.frac * length(y)) of the observations are drawn without replacement in each iteration.

Argument: sub.frac
a proportion specifying the fraction of the observations that are used in each iteration to build a classification rule if replace = FALSE. Ignored if replace = TRUE.
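
The two sampling schemes can be reproduced with base R's sample(). This is a minimal sketch of the sample sizes described above, with a made-up number of observations n; it does not call logicFS, and treating the undrawn observations as the out-of-bag set follows the usual bagging convention.

```r
set.seed(1)          # only to make this sketch reproducible
n        <- 50       # number of observations (made up)
sub.frac <- 0.632

# replace = TRUE: bootstrap sample of size n, drawn with replacement
boot_idx <- sample(n, n, replace = TRUE)

# replace = FALSE: ceiling(sub.frac * n) observations, drawn without replacement
sub_idx <- sample(n, ceiling(sub.frac * n), replace = FALSE)

# Observations that were not drawn form the out-of-bag set
oob_boot <- setdiff(seq_len(n), boot_idx)
oob_sub  <- setdiff(seq_len(n), sub_idx)

length(sub_idx)   # 32
length(oob_sub)   # 18
```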

Argument: anneal.control
a list containing the parameters for simulated annealing. See the help page of the function logreg.anneal.control in the LogicReg package.

Argument: onlyRemove
should the multiple-tree measure be used in the single-tree case? If TRUE, the prime implicants are only removed from the trees when determining the importance in the single-tree case. If FALSE, the original single-tree measure is computed for each prime implicant, i.e. a prime implicant is not only removed from the trees that contain it, but also added to the trees that do not contain this interaction. Ignored in all cases other than the classification case.

Argument: prob.case
a numeric value between 0 and 1. If the outcome of the logistic regression, i.e. the predicted probability, for an observation is larger than prob.case, this observation is classified as a case (coded by 1).
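
The thresholding rule for prob.case amounts to a strict comparison. The probabilities below are made up for illustration.

```r
prob.case <- 0.5
pred_prob <- c(0.10, 0.50, 0.51, 0.90)   # toy predicted probabilities

# "larger than prob.case" is a strict inequality, so 0.50 is not a case.
pred_class <- as.integer(pred_prob > prob.case)
pred_class   # 0 0 1 1
```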

Argument: addMatImp
should the matrix containing the improvements due to the prime implicants in each of the iterations be added to the output? (For each prime implicant, the importance is computed as the average over the B improvements.) Must be set to TRUE if standardized importances are to be computed using vim.norm, or if permutation-based importances are to be computed using vim.signperm.
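
The averaging over the B improvements can be sketched with a toy matrix. The layout (one row per prime implicant, one column per iteration), the row names, and the values are assumptions for illustration, not the documented structure of mat.imp.

```r
# Toy improvement matrix: 3 prime implicants (rows) x B = 4 iterations (columns).
# Layout, names and values are assumptions made for this sketch.
improvements <- matrix(c(2, 4, 3, 3,
                         0, 1, 0, 1,
                         5, 5, 6, 4),
                       nrow = 3, byrow = TRUE,
                       dimnames = list(c("X1 & X2", "!X3", "X4 & !X5"), NULL))

# Importance of each prime implicant: mean improvement over the B iterations
importance <- rowMeans(improvements)
importance
```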

Argument: fast
should a greedy search (as implemented in logreg) be used instead of simulated annealing?

Argument: rand
a numeric value. If specified, the random number generator is set into a reproducible state.
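
What a reproducible state means can be shown with base R's set.seed(). Whether logicFS passes rand to set.seed() internally is an assumption here; the sketch only demonstrates the general effect of fixing the seed.

```r
# Same seed, same random draws
set.seed(123)
a <- sample(10)
set.seed(123)
b <- sample(10)

identical(a, b)   # TRUE
```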

Argument: formula
an object of class formula describing the model that should be fitted.

Argument: data
a data frame containing the variables in the model. Each row of data must correspond to an observation. Each column, except for the column comprising the response, must correspond to a binary variable (coded by 0 and 1) or a factor (for details, see recdom). No missing values are allowed in data. The response must be binary (coded by 0 and 1), categorical, or continuous. If continuous, a linear model is fitted in each of the B iterations of logicFS. If categorical, the column of data specifying the response must be a factor; in this case, multinomial logic regressions are performed as implemented in mlogreg. Otherwise, depending on ntrees (and glm.if.1tree), the classification or the logistic regression approach of logic regression is used.

Argument: recdom
a logical value or a logical vector of length ncol(data) specifying whether a SNP should be transformed into two binary dummy variables coding for a recessive and a dominant effect. If recdom is TRUE (and a single logical value), then all factors/variables with three levels are coded by two dummy variables as described in make.snp.dummy. Each level of each of the other factors (including factors specifying a SNP that shows only two genotypes) is coded by one indicator variable. If recdom is FALSE (and a single logical value), each level of each factor is coded by an indicator variable. If recdom is a logical vector, all factors corresponding to an entry of recdom that is TRUE are assumed to be SNPs and are transformed into two binary variables as described above. All variables corresponding to entries of recdom that are TRUE (no matter whether recdom is a vector or a single value) must be coded either by the integers 1 (homozygous reference genotype), 2 (heterozygous), and 3 (homozygous variant), or by the number of minor alleles, i.e. 0, 1, and 2. The two coding schemes must not be mixed: it is not allowed that some SNPs are coded by 1, 2, and 3 while others are coded by 0, 1, and 2.
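
The dominant/recessive dummy coding of a single SNP can be sketched in base R. This illustrates the idea only; the variable names are made up, and the exact output format of make.snp.dummy is not reproduced here.

```r
# One SNP coded 1 (homozygous reference), 2 (heterozygous), 3 (homozygous variant)
snp <- c(1, 2, 3, 2, 1, 3)

# Dominant effect: at least one variant allele.
# Recessive effect: two variant alleles.
dominant  <- as.integer(snp >= 2)
recessive <- as.integer(snp == 3)

cbind(snp, dominant, recessive)

# The alternative coding by the number of minor alleles (0, 1, 2) is snp - 1
# for this SNP, and it leads to the same two dummy variables.
minor <- snp - 1
identical(as.integer(minor >= 1), dominant)
identical(as.integer(minor == 2), recessive)
```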

Argument: ...
for the formula method, optional parameters to be passed to the low-level function logicFS.default. Otherwise, ignored.

----------Value----------

An object of class logicFS containing

Component: primes
the prime implicants,

Component: vim
the importance of the prime implicants,

Component: prop
the proportion of logic regression models that contain the prime implicants,

Component: type
the type of model (1: classification, 2: linear regression, 3: logistic regression),

Component: param
further parameters (if addInfo = TRUE),

Component: mat.imp
the matrix containing the improvements if addMatImp = TRUE, otherwise NULL,

Component: measure
the name of the importance measure used,

Component: useN
the value of useN,

Component: threshold
NULL,

Component: mu
NULL.

----------Author(s)----------

Holger Schwender, holger.schwender@udo.edu

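----------References----------

Ruczinski, I., Kooperberg, C., and LeBlanc, M. L. (2003). Logic Regression. Journal of Computational and Graphical Statistics, 12, 475-511.
Schwender, H. and Ickstadt, K. (2008). Identification of SNP Interactions Using Logic Regression. Biostatistics, 9(1), 187-198.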

----------See Also----------

plot.logicFS, logic.bagging

----------Examples----------

# Load the package and the example data.
library(logicFS)
data(data.logicfs)

# For logic regression and hence logicFS, the variables must
# be binary. data.logicfs, however, contains categorical data
# with realizations 1, 2 and 3. Such data can be transformed
# into binary data by
bin.snps <- make.snp.dummy(data.logicfs)

# To speed up the search for the best logic regression models,
# only a small number of iterations is used in simulated annealing.
my.anneal <- logreg.anneal.control(start = 2, end = -2, iter = 10000)

# Feature selection using logic regression is then done by
log.out <- logicFS(bin.snps, cl.logicfs, B = 20, nleaves = 10,
   rand = 123, anneal.control = my.anneal)

# The output of logicFS can be printed
log.out

# One can specify another number of interactions that should be
# printed, here, e.g., 15.
print(log.out, topX = 15)

# The variable importance can also be plotted.
plot(log.out)

# With coded = FALSE, the original variable names are displayed in the plot.
plot(log.out, coded = FALSE)

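Please credit the source when reposting: from 生物统计家园网 (http://www.biostatistic.net).

Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the robot of 生物统计家园网. It is intended only as a personal reference for learning R; 生物统计家园 reserves the copyright.
Note 2: Because the translation was produced automatically, inaccuracies are unavoidable. Carefully comparing the Chinese and English text while reading can help in learning R.
Note 3: If you find inaccuracies, please reply at the end of this thread, and they will be revised over time.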
