R rminer package mining() function: Chinese help documentation (Chinese-English parallel text)

Posted on 2012-9-27 00:02:13
mining(rminer)
mining() belongs to the R package: rminer

Powerful function that trains and tests a particular fit model under several runs and a given validation method

Translator: LoveR, the robot of 生物统计家园网 (biostatistic.net)

----------Description----------

Powerful function that trains and tests a particular fit model under several runs and a given validation method. Since there can be a huge number of models, the fitted models are not stored. However, several useful statistics (e.g. predictions) are returned.
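
The idea of repeating one validation method over several runs, keeping the statistics but discarding the fitted models, can be sketched in base R. This is only an illustration and not rminer's code: lm() stands in for the fit model, and run_holdout is a hypothetical helper introduced here.

```r
# Base-R sketch of "several runs under one validation method".
# Assumptions: lm() replaces the real fit model; run_holdout() is hypothetical.
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n, 100, 20), x2 = rnorm(n, 100, 20))
d$y <- 0.7 * sin(d$x1 / (25 * pi)) + 0.3 * sin(d$x2 / (25 * pi))

run_holdout <- function(data, ratio = 2/3) {
  # random split: 'ratio' of the rows for training, the rest for testing
  train_idx <- sample(nrow(data), round(ratio * nrow(data)))
  t0 <- proc.time()[["elapsed"]]
  m <- lm(y ~ x1 + x2, data = data[train_idx, ])  # local object, not returned
  pred <- predict(m, newdata = data[-train_idx, ])
  list(time = proc.time()[["elapsed"]] - t0,
       test = data$y[-train_idx],
       pred = unname(pred))
}

Runs <- 3
res <- lapply(seq_len(Runs), function(i) run_holdout(d))

# Collect per-run statistics only; no fitted model is kept
result <- list(time = sapply(res, `[[`, "time"),
               test = lapply(res, `[[`, "test"),
               pred = lapply(res, `[[`, "pred"))
str(result, max.level = 1)
```

Returning only the targets, predictions and timings keeps memory use flat however many runs are requested.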


----------Usage----------


mining(x, data = NULL, Runs = 1, method = NULL, model = "default",
       task = "default", search = "heuristic", mpar = NULL,
       feature="none", scale = "default", transform = "none",
       debug = FALSE, ...)



----------Arguments----------

Argument: x
A symbolic description (formula) of the model to be fit. If x itself contains the data, then data=NULL (similar to x in ksvm from the kernlab package).


Argument: data
An optional data frame (columns denote attributes, rows denote examples) containing the training data, used when x is a formula.


Argument: Runs
Number of runs used (e.g. 1, 5, 10, 20, 30).


Argument: method
A vector with c(vmethod,vpar), where vmethod is one of:

all -- all NROW examples are used as both training and test sets (no vpar is needed).

holdout -- standard holdout method. If vpar<1 then NROW*vpar random samples are used for training and the remaining rows are used for testing. Otherwise, vpar random samples are used for testing and the remaining rows are used for training. For classification tasks (prob or class) stratified sampling is used.

holdoutorder -- similar to holdout, except that instead of random sampling, the first rows (up to the split) are used for training and the remaining ones for testing (equal to mode="order" in holdout).

holdoutinc -- incremental holdout retraining (e.g. used for spam data). Here, vpar is the batch size.

kfold -- K-fold cross-validation method, where vpar is the number of folds.

kfoldo -- similar to kfold, except that instead of random sampling, the order of the rows is used to build the folds.
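
The difference between the random and the ordered variants can be sketched as index generators in base R. The helpers holdout_idx and kfold_idx are hypothetical names used only for this illustration; rminer's own implementation may differ, and the stratified sampling used for classification tasks is left out. Only the vpar<1 case of holdout is covered.

```r
# Hypothetical helpers (base R only), not part of rminer.
holdout_idx <- function(n, vpar, ordered = FALSE) {
  stopifnot(vpar > 0, vpar < 1)          # sketch covers the ratio case only
  n_train <- round(n * vpar)
  # ordered: first rows train (holdoutorder); otherwise a random sample (holdout)
  tr <- if (ordered) seq_len(n_train) else sort(sample(n, n_train))
  list(tr = tr, ts = setdiff(seq_len(n), tr))
}

kfold_idx <- function(n, k, ordered = FALSE) {
  fold <- rep(seq_len(k), length.out = n)
  # ordered: consecutive blocks of rows (kfoldo); otherwise shuffled labels (kfold)
  fold <- if (ordered) sort(fold) else sample(fold)
  split(seq_len(n), fold)                # list of k test-index vectors
}

set.seed(42)
h   <- holdout_idx(150, 2/3)
ho  <- holdout_idx(150, 2/3, ordered = TRUE)
kf  <- kfold_idx(150, 3)
kfo <- kfold_idx(150, 3, ordered = TRUE)
print(lengths(kf))
```

In the k-fold case each element of the returned list serves once as the test set while the remaining rows are used for training.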


Argument: model
See fit for details.


Argument: task
See fit for details.


Argument: search
See fit for details.


Argument: mpar
See fit for details.


Argument: feature
See fit for more details about the feature="none", "sabs" or "sbs" options.
For the mining function, additional options are feature=fmethod, where fmethod can be one of:

sens or sensg -- compute the 1-D sensitivity analysis input importances ($sen), gradient measure.

sensv -- compute the 1-D sensitivity analysis input importances ($sen), variance measure.

sensr -- compute the 1-D sensitivity analysis input importances ($sen), range measure.

simp, simpg or s -- equal to sensg but also computes the 1-D sensitivity responses ($sresponses, useful for graph="VEC").

simpv -- equal to sensv but also computes the 1-D sensitivity responses (useful for graph="VEC").

simpr -- equal to sensr but also computes the 1-D sensitivity responses (useful for graph="VEC").
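
A rough picture of what a 1-D sensitivity analysis computes is given by the base-R sketch below. It assumes that each input is varied over a few levels while the others are held at their mean, and that the range, gradient (mean absolute difference between consecutive responses) and variance of the responses are then turned into relative importances. The function sens_1d, the use of lm() and the exact formulas are assumptions made for this illustration; rminer's own computation may differ in detail.

```r
# Hypothetical base-R sketch of 1-D sensitivity analysis; lm() stands in for
# the fitted model and sens_1d() is not an rminer function.
set.seed(2)
d <- data.frame(x1 = runif(100), x2 = runif(100))
d$y <- 3 * d$x1 + 0.5 * d$x2 + rnorm(100, sd = 0.01)
m <- lm(y ~ x1 + x2, data = d)

sens_1d <- function(model, data, inputs, levels = 6) {
  base <- as.data.frame(lapply(data[inputs], mean))  # one row, inputs at their mean
  resp <- lapply(inputs, function(a) {
    grid <- base[rep(1, levels), , drop = FALSE]
    grid[[a]] <- seq(min(data[[a]]), max(data[[a]]), length.out = levels)
    # n: attribute name, l: number of levels, x: attribute values, y: responses
    list(n = a, l = levels, x = grid[[a]],
         y = unname(predict(model, newdata = grid)))
  })
  measure <- function(f) sapply(resp, function(r) f(r$y))
  raw <- list(range    = measure(function(y) max(y) - min(y)),
              gradient = measure(function(y) mean(abs(diff(y)))),
              variance = measure(var))
  # normalise each measure so the importances sum to 1
  imp <- lapply(raw, function(v) setNames(v / sum(v), inputs))
  list(sen = imp, sresponses = resp)
}

s <- sens_1d(m, d, c("x1", "x2"))
print(s$sen$range)
```

Here x1 has the larger coefficient, so it receives the larger importance under all three measures; the stored responses are the kind of data a VEC curve would plot.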


Argument: scale
See fit for details.


Argument: transform
See fit for details.


Argument: debug
If TRUE, shows some information about each run.


Argument: ...
See fit for details.


----------Details----------

Powerful function that trains and tests a particular fit model under several runs and a given validation method (see [Cortez, 2010] for more details).
Several Runs are performed. In each run, the same validation method is adopted (e.g. holdout) and several relevant statistics are stored. Warning: be patient, this function can require some computational effort, especially if a high number of Runs is used.
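
Because every run yields its own value of an error metric, the results are usually summarised across runs. The sketch below shows one common way to do this, a mean with a Student's t confidence interval, in base R. The error values are synthetic and summarise_runs is a hypothetical helper, not an rminer function.

```r
# Synthetic per-run error values, for illustration only (not real results)
set.seed(7)
err <- abs(rnorm(10, mean = 0.2, sd = 0.02))

# Hypothetical helper: mean and t-based confidence interval over the runs
summarise_runs <- function(v, level = 0.95) {
  n <- length(v)
  half <- qt(1 - (1 - level) / 2, df = n - 1) * sd(v) / sqrt(n)
  c(mean = mean(v), lower = mean(v) - half, upper = mean(v) + half)
}

s_runs <- summarise_runs(err)
print(s_runs)
```

More runs narrow the interval, which is the usual reason for accepting the extra computation.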


----------Value----------

A list with the components:

$time -- vector with the time elapsed for each run.

$test -- vector list, where each element contains the test (target) results for each run.

$pred -- vector list, where each element contains the predicted results for each test set and each run.

$error -- vector with an error metric for each run (the error depends on the metric parameter of mpar; valid options are explained in mmetric).

$mpar -- data.frame with the mpar parameters of each fit model; the sequence repeats Runs times (Runs times vpar if kfold is used).

$model -- the model.

$task -- the task.

$method -- the external validation method.

$sen -- a matrix with the 1-D sensitivity analysis input importances. The number of rows is Runs times vpar if kfold is used, otherwise Runs.

$sresponses -- a vector list with a size equal to the number of attributes (useful for graph="VEC"). Each element contains a list with the 1-D sensitivity analysis input responses (n -- name of the attribute; l -- number of levels; x -- attribute values; y -- 1-D sensitivity responses).
Important note: sresponses (and "VEC" graphs) are only available if feature is "sabs" or one of the "simp" related options (see feature).

$runs -- the Runs.

$attributes -- vector list with all attributes (features) selected in each run (and fold if kfold) if a feature selection algorithm is used.

$feature -- the feature.
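
Since $test and $pred are parallel lists with one element per run, a per-run metric can be computed by walking both lists together. The sketch below uses a mock object with synthetic values for a regression task. M_mock is hypothetical and only mimics the $test, $pred and $runs layout described above; it is not produced by mining().

```r
# Mock object (synthetic values) with the $test / $pred / $runs layout
set.seed(3)
M_mock <- list(test = replicate(3, rnorm(20), simplify = FALSE))
M_mock$pred <- lapply(M_mock$test, function(y) y + rnorm(length(y), sd = 0.1))
M_mock$runs <- 3

# Mean absolute error of each run: pair the i-th targets with the i-th predictions
mae_per_run <- mapply(function(y, p) mean(abs(y - p)),
                      M_mock$test, M_mock$pred)
print(mae_per_run)
```

With a real mining() result the same numbers would normally be obtained through mmetric, as in the examples of this page.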


----------Note----------

See also http://www3.dsi.uminho.pt/pcortez/rminer.html


----------Author(s)----------

Paulo Cortez http://www3.dsi.uminho.pt/pcortez




----------References----------


To check for more details about rminer and for citation purposes:
P. Cortez.
Data Mining with Neural Networks and Support Vector Machines Using the R/rminer Tool.
In P. Perner (Ed.), Advances in Data Mining - Applications and Theoretical Aspects, 10th Industrial Conference on Data Mining (ICDM 2010), Lecture Notes in Artificial Intelligence 6171, pp. 572-583, Berlin, Germany, July, 2010. Springer. ISBN: 978-3-642-14399-1.
@Springer: http://www.springerlink.com/content/e7u36014r04h0334
http://www3.dsi.uminho.pt/pcortez/2010-rminer.pdf

----------See Also----------

fit, predict.fit, mgraph, mmetric, savemining, holdout and Importance.


----------Examples----------


### simple regression example
x1=rnorm(200,100,20); x2=rnorm(200,100,20)
y=0.7*sin(x1/(25*pi))+0.3*sin(x2/(25*pi))
M=mining(y~x1+x2,Runs=2,model="mlpe",search=2)
print(M)
print(mmetric(M,metric="MAE"))

### classification example (task="prob")
data(iris)
M=mining(Species~.,iris,Runs=10,method=c("kfold",3),model="dt")
print(mmetric(M,metric="CONF"))
print(mmetric(M,metric="AUC"))
print(meanint(mmetric(M,metric="AUC")))
mgraph(M,graph="ROC",TC=2,baseline=TRUE,Grid=10,leg="Versicolor",
       main="versicolor ROC")
mgraph(M,graph="LIFT",TC=2,baseline=TRUE,Grid=10,leg="Versicolor",
       main="Versicolor LIFT")
M2=mining(Species~.,iris,Runs=10,method=c("kfold",3),model="svm")
L=vector("list",2)
L[[1]]=M;L[[2]]=M2
mgraph(L,graph="ROC",TC=2,baseline=TRUE,Grid=10,leg=c("DT","SVM"),main="ROC")

### regression example
data(sin1reg)
M=mining(y~.,data=sin1reg,Runs=3,method=c("holdout",2/3),model="mlpe",
         search="heuristic5",mpar=c(50,3,"kfold",3,"MAE"),feature="sabs")
print(mmetric(M,metric="MAE"))
print(M$mpar)
cat("median H nodes:",medianminingpar(M)[1],"\n")
print(M$attributes)
mgraph(M,graph="RSC",Grid=10,main="sin1 MLPE scatter plot")
mgraph(M,graph="REP",Grid=10,main="sin1 MLPE scatter plot",sort=FALSE)
mgraph(M,graph="REC",Grid=10,main="sin1 MLPE REC")
mgraph(M,graph="IMP",Grid=10,main="input importances",xval=0.1,leg=names(sin1reg))
mgraph(M,graph="VEC",Grid=10,main="x1 VEC curve",xval=1,leg=names(sin1reg)[1])

### another classification example
data(iris)
M=mining(Species~.,data=iris,Runs=2,method=c("kfold",2),model="svm",
         search="heuristic",mpar=c(NA,NA,"kfold",3,"AUC"),feature="s")
print(mmetric(M,metric="AUC",TC=2))
mgraph(M,graph="ROC",TC=2,baseline=TRUE,Grid=10,leg="SVM",main="ROC",intbar=FALSE)
mgraph(M,graph="IMP",TC=2,Grid=10,main="input importances",xval=0.1,
       leg=names(iris),axis=1)
mgraph(M,graph="VEC",TC=2,Grid=10,main="Petal.Width VEC curve",
       data=iris,xval=4)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).


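Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the robot of 生物统计家园网. It is intended only for personal study of R and for reference; 生物统计家园 reserves the copyright.
Note 2: Because this is an automatic machine translation, some inaccuracies are unavoidable. Carefully comparing the Chinese and English text while reading can help with learning R.
Note 3: If you find an inaccuracy, please reply in this thread and it will be revised over time.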