tunePareto(TunePareto)
tunePareto() belongs to the R package TunePareto
Generic function for multi-objective parameter tuning of classifiers
----------Description----------
This generic function tunes parameters of arbitrary classifiers in a multi-objective setting and returns the Pareto-optimal parameter combinations.
----------Usage----------
tunePareto(..., data, labels,
           classifier, parameterCombinations,
           sampleType = c("full", "uniform",
                          "latin", "halton",
                          "niederreiter", "sobol",
                          "evolution"),
           numCombinations,
           mu = 10, lambda = 20, numIterations = 100,
           objectiveFunctions, objectiveBoundaries,
           keepSeed = TRUE, useSnowfall = FALSE, verbose = TRUE)
----------Arguments----------
Argument: data
The data set to be used for the parameter tuning. This is usually a matrix or data frame with the samples in the rows and the features in the columns.
Argument: labels
A vector of class labels for the samples in data.
Argument: classifier
A TuneParetoClassifier wrapper object containing the classifier to tune. A number of state-of-the-art classifiers are included in TunePareto (see predefinedClassifiers). Custom classifiers can be employed using tuneParetoClassifier.
Argument: parameterCombinations
If not all combinations of parameter ranges for the classifier are meaningful, you can set this parameter instead of specifying parameter values in the ... argument. It holds an explicit list of possible combinations, where each element of the list is a named sublist with one entry for each parameter.
Argument: sampleType
Determines the way parameter configurations are sampled. If sampleType="full", all possible combinations are tried. This is only possible if all supplied parameter ranges are discrete or if the combinations are supplied explicitly in parameterCombinations. If sampleType="uniform", numCombinations combinations are drawn uniformly at random. If sampleType="latin", Latin Hypercube sampling is applied. This is particularly encouraged when tuning continuous parameters. If sampleType is "halton", "niederreiter" or "sobol", numCombinations parameter combinations are drawn on the basis of the corresponding quasi-random sequence (initialized at a random step to ensure that different values are drawn in repeated runs). This is also encouraged when tuning continuous parameters. sampleType="niederreiter" and sampleType="sobol" require the gsl package to be installed. If sampleType="evolution", an evolutionary algorithm is applied. In detail, this employs mu+lambda Evolution Strategies with uncorrelated mutations and non-dominated sorting for survivor selection. This is encouraged when the space of possible parameter configurations is very large. For smaller parameter spaces, the above sampling methods may be faster.
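To make the idea behind Latin Hypercube sampling concrete, here is a minimal base-R sketch. It is not the code TunePareto uses internally; the helper lhs_unit and the parameter ranges are hypothetical and chosen only for illustration. Each dimension is cut into n strata, and every stratum is hit exactly once.

```r
# Illustrative sketch only: NOT TunePareto's internal implementation.
set.seed(1)
n <- 5                               # plays the role of numCombinations
cost_range <- c(0.001, 10)           # a continuous interval (hypothetical)
kernels <- c("linear", "radial")     # a discrete range (hypothetical)

# Hypothetical helper: one point per stratum ((i-1)/n, i/n),
# strata visited in random order, uniform jitter inside each stratum.
lhs_unit <- function(n) (sample(n) - runif(n)) / n

u_cost <- lhs_unit(n)
u_kernel <- lhs_unit(n)

samples <- data.frame(
  # Map the unit interval onto the continuous range.
  cost = cost_range[1] + u_cost * (cost_range[2] - cost_range[1]),
  # Map the unit interval onto the indices of the discrete values.
  kernel = kernels[pmin(floor(u_kernel * length(kernels)) + 1, length(kernels))],
  stringsAsFactors = FALSE
)
print(samples)
```

Compared with plain uniform sampling, this guarantees that the drawn cost values are spread over the whole interval even for small n.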
Argument: numCombinations
If this parameter is set, at most numCombinations randomly chosen parameter configurations are tested. Otherwise, all possible combinations of the supplied parameter ranges are tested.
Argument: mu
The number of individuals used in the Evolution Strategies if sampleType="evolution".
Argument: lambda
The number of offspring per generation in the Evolution Strategies if sampleType="evolution".
Argument: numIterations
The number of iterations/generations for which the evolutionary algorithm is run if sampleType="evolution".
Argument: objectiveFunctions
A list of objective functions used to tune the parameters. There are a number of predefined objective functions (see predefinedObjectiveFunctions). Custom objective functions can be created using createObjective.
Argument: objectiveBoundaries
If this parameter is set, it specifies boundaries of the objective functions for valid solutions. That is, each element of the supplied vector specifies the upper or lower limit of an objective (depending on whether the objective is maximized or minimized). Parameter combinations that do not meet all these restrictions are not included in the result set, even if they are Pareto-optimal. If only some of the objectives should have bounds, supply NA for the remaining objectives.
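The effect of objectiveBoundaries can be sketched in base R as a filter on a matrix of objective values. This is an illustration of the semantics described above, not the package's own code: the objective values and the helper within_bounds are hypothetical, and treating the limits as inclusive is an assumption.

```r
# Hypothetical objective values for four parameter configurations.
objective_values <- matrix(c(0.04, 0.90,
                             0.08, 0.97,
                             0.02, 0.70,
                             0.12, 0.99),
                           ncol = 2, byrow = TRUE,
                           dimnames = list(NULL, c("error", "sensitivity")))
minimize <- c(TRUE, FALSE)   # error is minimized, sensitivity is maximized
boundaries <- c(0.1, NA)     # upper limit for error, no bound on sensitivity

# Hypothetical helper: TRUE for rows that satisfy every non-NA bound.
# Assumption: limits are inclusive.
within_bounds <- function(values, minimize, boundaries) {
  ok <- rep(TRUE, nrow(values))
  for (j in seq_along(boundaries)) {
    if (is.na(boundaries[j])) next   # NA means "no restriction"
    col_ok <- if (minimize[j]) {
      values[, j] <= boundaries[j]   # upper limit for minimized objectives
    } else {
      values[, j] >= boundaries[j]   # lower limit for maximized objectives
    }
    ok <- ok & col_ok
  }
  ok
}

keep <- within_bounds(objective_values, minimize, boundaries)
print(objective_values[keep, , drop = FALSE])
```

In this sketch the fourth configuration is dropped because its error exceeds the limit, although it has the highest sensitivity.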
Argument: keepSeed
If this is true, the random seed is reset to the same value for each of the tested parameter configurations. This is an easy way to guarantee comparability in randomized objective functions. E.g., cross-validation runs of the classifiers will all start with the same seed, which results in the same partitions. Attention: If you set this parameter to FALSE, you must ensure that all configurations are treated equally in the objective functions: There may be randomness in processes such as classifier training, but there should be no random difference in the rating itself. In particular, the choice of subsets for subsampling experiments should always be the same for all configurations. For example, you can provide precalculated fold lists to the cross-validation objectives in the foldList parameter. If parameter configurations are rated under varying conditions, this may yield over-optimistic or over-pessimistic ratings for some configurations due to outliers.
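The following sketch shows one way to keep configurations comparable with keepSeed = FALSE by passing a precalculated fold list. It assumes the TunePareto function generateCVRuns and the foldList argument of cvError behave as in the package manual; the fold counts and the values of k are arbitrary.

```r
library(TunePareto)

labels <- iris[, ncol(iris)]

# Compute the cross-validation partitions once, so that every
# parameter configuration is rated on identical folds.
folds <- generateCVRuns(labels = labels, ntimes = 2, nfold = 5,
                        stratified = TRUE)

result <- tunePareto(data = iris[, -ncol(iris)],
                     labels = labels,
                     classifier = tunePareto.knn(),
                     k = c(1, 3, 5),
                     objectiveFunctions = list(cvError(foldList = folds)),
                     keepSeed = FALSE,
                     verbose = FALSE)
print(result)
```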
Argument: useSnowfall
If this parameter is true, the routine loads the snowfall package and processes the parameter configurations in parallel. Please note that the snowfall cluster has to be initialized properly before running the tuning function and stopped after the run.
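A minimal sketch of a parallel run is shown below. It uses sfInit and sfStop from the snowfall package to start and stop the cluster around the tuning call; the choice of two CPUs and the tuned values of k are arbitrary assumptions.

```r
library(TunePareto)
library(snowfall)

# Start the cluster before tuning (cpus = 2 is an arbitrary choice).
sfInit(parallel = TRUE, cpus = 2)

# tryCatch(..., finally = ) makes sure the cluster is stopped
# even if the tuning call fails.
result <- tryCatch(
  tunePareto(data = iris[, -ncol(iris)],
             labels = iris[, ncol(iris)],
             classifier = tunePareto.knn(),
             k = c(1, 3, 5, 7, 9),
             objectiveFunctions = list(cvError(10, 10)),
             useSnowfall = TRUE,
             verbose = FALSE),
  finally = sfStop()
)
print(result)
```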
Argument: verbose
If this parameter is true, status messages are printed. In particular, the algorithm prints the currently tested combination.
Argument: ...
The parameters of the classifier and predictor functions that should be tuned. The names of the parameters must correspond to the parameters specified in classifierParameterNames and predictorParameterNames of tuneParetoClassifier. Each supplied argument describes the possible values of a single parameter. These can be specified in two ways: Discrete parameter ranges are specified as lists of possible values. Continuous parameter ranges are specified as intervals using as.interval. The algorithm then generates combinations of possible parameter values. Alternatively, the combinations can be defined explicitly using the parameterCombinations parameter.
----------Details----------
This is a generic function that allows for parameter tuning of a wide variety of classifiers. You can either specify the values or intervals of tuned parameters in the ... argument, or supply selected combinations of parameter values using parameterCombinations. In the first case, combinations of parameter values specified in the ... argument are generated. If sampleType="uniform", sampleType="latin", sampleType="halton", sampleType="niederreiter" or sampleType="sobol", a random subset of the possible combinations is drawn. If sampleType="evolution", random parameter combinations are generated and optimized using Evolution Strategies.
In the latter case, only the parameter combinations specified explicitly in parameterCombinations are tested. This is useful if certain parameter combinations are invalid. You can create parameter combinations by concatenating results of calls to allCombinations. Only sampleType="full" is allowed in this mode.
For each of the combinations, the specified objective functions are calculated. This usually involves training and testing a classifier. From the resulting objective values, the non-dominated parameter configurations are calculated and returned.
The ... argument is the first argument of tunePareto for technical reasons (to prevent partial matching of the supplied parameters with argument names of tunePareto). This requires all arguments to be named.
----------Value----------
Returns a list of class TuneParetoResult with the following components:
Component: bestCombinations
A list of Pareto-optimal parameter configurations. Each element of the list consists of a sub-list with named elements corresponding to the parameter values.
Component: bestObjectiveValues
A matrix containing the objective function values of the Pareto-optimal configurations in bestCombinations. Each row corresponds to a parameter configuration, and each column corresponds to an objective function.
Component: testedCombinations
A list of all tested parameter configurations with the same structure as bestCombinations.
Component: testedObjectiveValues
A matrix containing the objective function values of all tested configurations with the same structure as bestObjectiveValues.
Component: dominationMatrix
A Boolean matrix specifying which parameter configurations dominate each other. If a configuration i dominates a configuration j, the entry in the ith row and the jth column is TRUE.
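The domination relation can be sketched in base R as follows. This is an illustration of the concept with hypothetical objective values, not the package's internal code: configuration i dominates configuration j if it is no worse in every objective and strictly better in at least one.

```r
# Hypothetical objective values: rows are configurations, columns are objectives.
values <- matrix(c(0.04, 0.90,
                   0.08, 0.85,
                   0.02, 0.70,
                   0.04, 0.95),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(paste0("config", 1:4),
                                 c("error", "sensitivity")))
minimize <- c(TRUE, FALSE)   # error is minimized, sensitivity is maximized

# Flip the sign of maximized objectives so that smaller is always better.
oriented <- sweep(values, 2, ifelse(minimize, 1, -1), `*`)

n <- nrow(oriented)
domination <- matrix(FALSE, n, n,
                     dimnames = list(rownames(values), rownames(values)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    # Entry [i, j] is TRUE if configuration i dominates configuration j.
    domination[i, j] <- all(oriented[i, ] <= oriented[j, ]) &&
                        any(oriented[i, ] <  oriented[j, ])
  }
}

# Pareto-optimal configurations are those that no other configuration dominates.
pareto_optimal <- rownames(values)[colSums(domination) == 0]
print(domination)
print(pareto_optimal)
```

Here config4 dominates config1 and config2, config1 dominates config2, and config3 and config4 form the Pareto front.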
Component: minimizeObjectives
A Boolean vector specifying which of the objectives are minimization objectives. This is derived from the objective functions supplied to tunePareto.
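As a brief usage sketch, the components listed above can be read from the result with the usual list accessor. The tuning call mirrors the k-NN example from this page; the variable name result is arbitrary.

```r
library(TunePareto)

result <- tunePareto(data = iris[, -ncol(iris)],
                     labels = iris[, ncol(iris)],
                     classifier = tunePareto.knn(),
                     k = c(1, 3, 5, 7, 9),
                     objectiveFunctions = list(cvError(10, 10),
                                               reclassError()),
                     verbose = FALSE)

# Pareto-optimal parameter values, one named sub-list per configuration
str(result$bestCombinations)

# One row per Pareto-optimal configuration, one column per objective
print(result$bestObjectiveValues)

# Number of configurations that were evaluated in total
print(length(result$testedCombinations))

# Which objectives are minimized
print(result$minimizeObjectives)
```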
----------See Also----------
predefinedClassifiers, predefinedObjectiveFunctions, createObjective, allCombinations
----------Examples----------
library(TunePareto)

# tune 'k' of a k-NN classifier
# on the 'iris' data set --
# see ?knn
print(tunePareto(data = iris[, -ncol(iris)],
                 labels = iris[, ncol(iris)],
                 classifier = tunePareto.knn(),
                 k = c(1,3,5,7,9),
                 objectiveFunctions = list(cvError(10, 10),
                                           reclassError())))
# example using predefined parameter configurations,
# as certain combinations of k and l are invalid:
comb <- c(allCombinations(list(k=1,l=0)),
          allCombinations(list(k=3,l=0:2)),
          allCombinations(list(k=5,l=0:4)),
          allCombinations(list(k=7,l=0:6)))
print(tunePareto(data = iris[, -ncol(iris)],
                 labels = iris[, ncol(iris)],
                 classifier = tunePareto.knn(),
                 parameterCombinations = comb,
                 objectiveFunctions = list(cvError(10, 10),
                                           reclassError())))
# tune 'cost' and 'kernel' of an SVM on
# the 'iris' data set using Latin Hypercube sampling --
# see ?svm and ?predict.svm
print(tunePareto(data = iris[, -ncol(iris)],
                 labels = iris[, ncol(iris)],
                 classifier = tunePareto.svm(),
                 cost = as.interval(0.001,10),
                 kernel = c("linear", "polynomial",
                            "radial", "sigmoid"),
                 sampleType="latin",
                 numCombinations=20,
                 objectiveFunctions = list(cvError(10, 10),
                                           cvSensitivity(10, 10, caseClass="setosa"))))
# tune the same parameters using Evolution Strategies
print(tunePareto(data = iris[, -ncol(iris)],
                 labels = iris[, ncol(iris)],
                 classifier = tunePareto.svm(),
                 cost = as.interval(0.001,10),
                 kernel = c("linear", "polynomial",
                            "radial", "sigmoid"),
                 sampleType="evolution",
                 numCombinations=20,
                 numIterations=20,
                 objectiveFunctions = list(cvError(10, 10),
                                           cvSensitivity(10, 10, caseClass="setosa"),
                                           cvSpecificity(10, 10, caseClass="setosa"))))