R language rminer package: help documentation for the mining() function

loveR · posted on 2012-9-27 00:02:13

mining(rminer)
mining() belongs to the R package: rminer

 Powerful function that trains and tests a particular fit model under several runs and a given validation method

 Translator: LoveR, the Biostatistic.net (生物统计家园网) robot

----------Description----------

Powerful function that trains and tests a particular fit model under several runs and a given validation method. Because the number of fitted models can be huge, the models themselves are not stored. However, several useful statistics (e.g. predictions) are returned.

----------Usage----------

mining(x, data = NULL, Runs = 1, method = NULL, model = "default",
 task = "default", search = "heuristic", mpar = NULL,
 feature="none", scale = "default", transform = "none",
 debug = FALSE, ...)

----------Arguments----------

Argument: x
a symbolic description (formula) of the model to be fit. If x contains the data, then data=NULL (similar to x in ksvm, kernlab package).

Argument: data
an optional data frame (columns denote attributes, rows denote examples) containing the training data, when using a formula.

Argument: Runs
number of runs used (e.g. 1, 5, 10, 20, 30).

Argument: method
a vector with c(vmethod,vpar), where vmethod is one of:

all – all NROW examples are used as both training and test sets (no vpar is needed).

holdout – standard holdout method. If vpar<1 then NROW*vpar random samples are used for training and the remaining rows are used for testing. Otherwise (vpar>=1), vpar random samples are used for testing and the remaining rows are used for training. For classification tasks (prob or class) stratified sampling is used.

holdoutorder – similar to holdout, except that instead of random sampling, the first rows (up to the split point) are used for training and the remaining ones for testing (equal to mode="order" in holdout).

holdoutinc – incremental holdout retraining (e.g. used for spam data). Here, vpar is the batch size.

kfold – K-fold cross-validation method, where vpar is the number of folds.

kfoldo – similar to kfold, except that instead of random sampling, the order of the rows is used to build the folds.
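
The splitting rules behind some of the method options above can be illustrated with a small base-R sketch. It is not rminer's internal code: the helper names holdout_split and kfold_ids are hypothetical, only the vpar<1 case of holdout is covered, stratified sampling is omitted, and building ordered folds as contiguous blocks of rows is an assumption about how kfoldo behaves.

```r
# Illustrative base-R sketch of the splitting rules (not rminer's own code).
set.seed(1)
n <- 150                                   # NROW: number of examples

# holdout with vpar < 1: round(n * vpar) rows for training, the rest for testing.
# ordered = TRUE mimics "holdoutorder" (first rows train, remaining rows test).
holdout_split <- function(n, vpar, ordered = FALSE) {
  stopifnot(vpar > 0, vpar < 1)
  ntr <- round(n * vpar)
  idx <- if (ordered) seq_len(n) else sample(n)
  pos <- seq_len(n)
  list(tr = idx[pos <= ntr], ts = idx[pos > ntr])
}

# kfold: assign every row to one of k folds of near-equal size.
# ordered = TRUE (assumed "kfoldo" behaviour) uses contiguous blocks of rows.
kfold_ids <- function(n, k, ordered = FALSE) {
  fold <- rep(seq_len(k), length.out = n)
  if (ordered) sort(fold) else sample(fold)
}

h  <- holdout_split(n, 2/3)                # random 2/3 holdout
ho <- holdout_split(n, 2/3, ordered = TRUE)
folds <- kfold_ids(n, 3)                   # fold label for each row
print(lengths(h))
print(table(folds))
```

With method=c("kfold",3), each run would train on two of the three folds and test on the third, rotating so that every row is tested once per run.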

Argument: model
See fit for details.

Argument: task
See fit for details.

Argument: search
See fit for details.

Argument: mpar
See fit for details.

Argument: feature
See fit for more details about the feature="none", "sabs" or "sbs" options. For the mining function, additional options are feature=fmethod, where fmethod can be one of:

sens or sensg – compute the 1-D sensitivity analysis input importances ($sen), gradient measure.

sensv – compute the 1-D sensitivity analysis input importances ($sen), variance measure.

sensr – compute the 1-D sensitivity analysis input importances ($sen), range measure.

simp, simpg or s – equal to sensg but also computes the 1-D sensitivity responses ($sresponses, useful for graph="VEC").

simpv – equal to sensv but also computes the 1-D sensitivity responses (useful for graph="VEC").

simpr – equal to sensr but also computes the 1-D sensitivity responses (useful for graph="VEC").
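
The idea behind the 1-D sensitivity options above can be sketched in base R. This is a rough illustration rather than rminer's implementation: the helpers sens_responses and importance are hypothetical, lm stands in for the fitted model, the other inputs are assumed to be held at their mean while one input is varied over a fixed number of levels, and the gradient, variance and range formulas are assumed definitions. The element names n, l, x and y follow the $sresponses description in the Value section.

```r
# Illustrative sketch of 1-D sensitivity analysis (assumed definitions).
set.seed(1)
d <- data.frame(x1 = runif(100), x2 = runif(100))
d$y <- 3 * d$x1 + 0.5 * d$x2 + rnorm(100, sd = 0.01)
fit_lm <- lm(y ~ x1 + x2, data = d)        # stand-in for any fitted model

# Vary one input over `levels` values while the others stay at their mean.
sens_responses <- function(model, data, inputs, levels = 7) {
  base <- as.data.frame(lapply(data[inputs], mean))
  resp <- lapply(inputs, function(a) {
    grid <- base[rep(1, levels), , drop = FALSE]
    grid[[a]] <- seq(min(data[[a]]), max(data[[a]]), length.out = levels)
    list(n = a, l = levels, x = grid[[a]],
         y = as.numeric(predict(model, grid)))
  })
  names(resp) <- inputs
  resp
}

# Turn the responses into relative importances under one of three measures.
importance <- function(resp, type = c("gradient", "variance", "range")) {
  type <- match.arg(type)
  s <- sapply(resp, function(r) switch(type,
    gradient = mean(abs(diff(r$y))),
    variance = var(r$y),
    range    = diff(range(r$y))))
  s / sum(s)                               # importances sum to 1
}

resp <- sens_responses(fit_lm, d, c("x1", "x2"))
imp  <- importance(resp, "range")
print(round(imp, 3))
```

Plotting resp$x1$x against resp$x1$y gives the kind of curve that the "VEC" graph is described as showing for one attribute.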

Argument: scale
See fit for details.

Argument: transform
See fit for details.

Argument: debug
If TRUE, shows some information about each run.

Argument: ...
See fit for details.

----------Details----------

Powerful function that trains and tests a particular fit model under several runs and a given validation method (see [Cortez, 2010] for more details). Several Runs are performed. In each run, the same validation method is adopted (e.g. holdout) and several relevant statistics are stored. Warning: be patient, this function can require considerable computational effort, especially if a high number of Runs is used.
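
The run loop described above can be pictured with the following base-R sketch. It makes several assumptions: mining_sketch is a hypothetical helper, lm replaces the fit model, a random holdout is the validation method, and the mean absolute error is the stored metric. The fitted model is discarded after each run, and only the timings, targets, predictions and errors are kept.

```r
# Illustrative sketch of "several runs, same validation method" (not rminer code).
set.seed(1)
d <- data.frame(x = runif(120))
d$y <- 2 * d$x + rnorm(120, sd = 0.1)

mining_sketch <- function(formula, data, Runs = 3, ratio = 2/3) {
  target <- all.vars(formula)[1]           # name of the output variable
  res <- list(time = numeric(Runs), test = vector("list", Runs),
              pred = vector("list", Runs), error = numeric(Runs), runs = Runs)
  for (i in seq_len(Runs)) {
    t0 <- proc.time()[["elapsed"]]
    tr <- sample(nrow(data), round(ratio * nrow(data)))  # holdout split for this run
    m  <- lm(formula, data = data[tr, , drop = FALSE])   # model is not stored
    res$test[[i]] <- data[-tr, target]
    res$pred[[i]] <- as.numeric(predict(m, data[-tr, , drop = FALSE]))
    res$error[i]  <- mean(abs(res$test[[i]] - res$pred[[i]]))  # MAE of this run
    res$time[i]   <- proc.time()[["elapsed"]] - t0
  }
  res
}

M0 <- mining_sketch(y ~ x, d, Runs = 3)
print(M0$error)
```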

----------Value----------

A list with the components:

$time – vector with the time elapsed for each run.

$test – vector list, where each element contains the test (target) results for each run.

$pred – vector list, where each element contains the predicted results for each test set and each run.

$error – vector with an error metric for each run (the error depends on the metric parameter of mpar; valid options are explained in mmetric).

$mpar – data.frame with the mpar parameters of each fitted model; the sequence repeats Runs times (times vpar if kfold is used).

$model – the model.

$task – the task.

$method – the external validation method.

$sen – a matrix with the 1-D sensitivity analysis input importances. The number of rows is Runs times vpar if kfold is used, otherwise Runs.

$sresponses – a vector list with a size equal to the number of attributes (useful for graph="VEC"). Each element contains a list with the 1-D sensitivity analysis input responses (n – name of the attribute; l – number of levels; x – attribute values; y – 1-D sensitivity responses). Important note: sresponses (and "VEC" graphs) are only available if feature is "sabs" or one of the "simp"-related options (see feature).

$runs – the Runs.

$attributes – vector list with all attributes (features) selected in each run (and fold if kfold) if a feature selection algorithm is used.

$feature – the feature.
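
Per-run values such as $error are usually summarised across runs. The base-R sketch below shows one common way to do this, a mean with a Student's t confidence half-width. The numbers are made up for illustration only, the helper run_summary is hypothetical, and this is not claimed to be how rminer's own meanint is implemented.

```r
# Illustrative summary of per-run errors: mean and t-based confidence half-width.
errors <- c(0.21, 0.19, 0.25, 0.22, 0.20)  # made-up per-run errors (illustration only)

run_summary <- function(x, level = 0.95) {
  n <- length(x)
  half <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
  c(mean = mean(x), half_width = half)
}

s <- run_summary(errors)
print(s)
```

A small number of runs gives a wide interval, which is one reason larger values of Runs (e.g. 20 or 30) appear among the suggested settings.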

----------Note----------

See also http://www3.dsi.uminho.pt/pcortez/rminer.html

----------Author(s)----------

Paulo Cortez http://www3.dsi.uminho.pt/pcortez

----------References----------

To check for more details about rminer and for citation purposes:

P. Cortez. Data Mining with Neural Networks and Support Vector Machines Using the R/rminer Tool. In P. Perner (Ed.), Advances in Data Mining - Applications and Theoretical Aspects, 10th Industrial Conference on Data Mining (ICDM 2010), Lecture Notes in Artificial Intelligence 6171, pp. 572-583, Berlin, Germany, July 2010. Springer. ISBN: 978-3-642-14399-1.
@Springer: http://www.springerlink.com/content/e7u36014r04h0334
http://www3.dsi.uminho.pt/pcortez/2010-rminer.pdf

----------See Also----------

fit, predict.fit, mgraph, mmetric, savemining, holdout and Importance.

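----------Examples----------

library(rminer)  # provides mining(), mmetric() and mgraph()

### simple regression example
# synthetic data: two Gaussian inputs and a smooth nonlinear target
x1=rnorm(200,100,20); x2=rnorm(200,100,20)
y=0.7*sin(x1/(25*pi))+0.3*sin(x2/(25*pi))
# two runs of an "mlpe" model; method is left at its default
M=mining(y~x1+x2,Runs=2,model="mlpe",search=2)
print(M)
print(mmetric(M,metric="MAE"))  # mean absolute error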

### classification example (task="prob")
data(iris)
# 10 runs of 3-fold cross-validation with a "dt" model
M=mining(Species~.,iris,Runs=10,method=c("kfold",3),model="dt")
print(mmetric(M,metric="CONF"))
print(mmetric(M,metric="AUC"))
print(meanint(mmetric(M,metric="AUC")))
# ROC and LIFT curves for target class 2 (versicolor)
mgraph(M,graph="ROC",TC=2,baseline=TRUE,Grid=10,leg="Versicolor",
 main="Versicolor ROC")
mgraph(M,graph="LIFT",TC=2,baseline=TRUE,Grid=10,leg="Versicolor",
 main="Versicolor LIFT")
# same setup with an "svm" model, then both ROC curves in one graph
M2=mining(Species~.,iris,Runs=10,method=c("kfold",3),model="svm")
L=vector("list",2)
L[[1]]=M;L[[2]]=M2
mgraph(L,graph="ROC",TC=2,baseline=TRUE,Grid=10,leg=c("DT","SVM"),main="ROC")

### regression example
data(sin1reg)
# 3 runs of a 2/3 holdout, with feature="sabs" feature selection
M=mining(y~.,data=sin1reg,Runs=3,method=c("holdout",2/3),model="mlpe",
 search="heuristic5",mpar=c(50,3,"kfold",3,"MAE"),feature="sabs")
print(mmetric(M,metric="MAE"))
print(M$mpar)
cat("median H nodes:",medianminingpar(M)[1],"\n")
print(M$attributes)  # attributes selected in each run
mgraph(M,graph="RSC",Grid=10,main="sin1 MLPE scatter plot")
mgraph(M,graph="REP",Grid=10,main="sin1 MLPE scatter plot",sort=FALSE)
mgraph(M,graph="REC",Grid=10,main="sin1 MLPE REC")
mgraph(M,graph="IMP",Grid=10,main="input importances",xval=0.1,leg=names(sin1reg))
mgraph(M,graph="VEC",Grid=10,main="x1 VEC curve",xval=1,leg=names(sin1reg)[1])

### another classification example
data(iris)
# 2 runs of 2-fold cross-validation; feature="s" also stores the sensitivity responses
M=mining(Species~.,data=iris,Runs=2,method=c("kfold",2),model="svm",
 search="heuristic",mpar=c(NA,NA,"kfold",3,"AUC"),feature="s")
print(mmetric(M,metric="AUC",TC=2))
mgraph(M,graph="ROC",TC=2,baseline=TRUE,Grid=10,leg="SVM",main="ROC",intbar=FALSE)
mgraph(M,graph="IMP",TC=2,Grid=10,main="input importances",xval=0.1,
 leg=names(iris),axis=1)
mgraph(M,graph="VEC",TC=2,Grid=10,main="Petal.Width VEC curve",
 data=iris,xval=4)

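When reposting, please credit the source: Biostatistic.net (生物统计家园网, http://www.biostatistic.net).

Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the Biostatistic.net robot. It is intended only as a personal reference for learning R, and Biostatistic.net retains the copyright.
Note 2: Because the translation was produced automatically by a robot, some inaccuracies are inevitable. Carefully comparing the Chinese and English text while reading can help with learning R.
Note 3: If you find inaccuracies, please reply at the end of this thread and we will gradually revise them.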
