
R help documentation for the sofia() function in the RSofia package (Chinese-English)

Posted 2012-9-28 22:14:01
sofia(RSofia)
sofia() belongs to R package: RSofia

                                        Fitting sofia-ml models

                                         Translator: biostatistic.net robot LoveR

----------Description----------

sofia is used to fit classification and regression models provided by D. Sculley's sofia-ml.


----------Usage----------



## S3 method for class 'formula'
sofia(x, data, random_seed = floor(runif(1, 1, 65535)), lambda = 0.1,
    iterations = 1e+05, learner_type = c("pegasos", "sgd-svm",
        "passive-aggressive", "margin-perceptron", "romma", "logreg-pegasos"),
    eta_type = c("pegasos", "basic", "constant"), loop_type = c("stochastic",
        "balanced-stochastic", "rank", "roc", "query-norm-rank",
        "combined-ranking", "combined-roc"), rank_step_probability = 0.5,
    passive_aggressive_c = 1e+07, passive_aggressive_lambda = 0,
    perceptron_margin_size = 1, training_objective = FALSE, hash_mask_bits = 0,
    verbose = FALSE, reserve = 0, ...)

## S3 method for class 'character'
sofia(x, random_seed = floor(runif(1, 1, 65535)), lambda = 0.1,
    iterations = 1e+05, learner_type = c("pegasos", "sgd-svm",
        "passive-aggressive", "margin-perceptron", "romma", "logreg-pegasos"),
    eta_type = c("pegasos", "basic", "constant"), loop_type = c("stochastic",
        "balanced-stochastic", "rank", "roc", "query-norm-rank",
        "combined-ranking", "combined-roc"), rank_step_probability = 0.5,
    passive_aggressive_c = 1e+07, passive_aggressive_lambda = 0,
    perceptron_margin_size = 1, training_objective = FALSE, no_bias_term = FALSE,
    dimensionality = 150000, hash_mask_bits = 0,
    verbose = FALSE, buffer_mb = 40, ...)



----------Arguments----------

Argument: x
a formula object, or a character string giving the path to a file


Argument: data
data on which to parse the formula, when the model is specified via a formula


Argument: random_seed
an integer. Makes the algorithm use this seed. Can be useful in testing and parameter tuning.


Argument: lambda
a numeric scalar. Value of lambda for SVM regularization, used by both Pegasos SVM and SGD-SVM.


Argument: iterations
an integer. Number of stochastic gradient steps to take.


Argument: learner_type
a character string indicating which type of learner to use. One of "pegasos" (default), "sgd-svm", "passive-aggressive", "margin-perceptron", "romma", "logreg-pegasos"


Argument: eta_type
a character string indicating the type of learning-rate update to use. One of "pegasos" (default), "basic", "constant"


Argument: loop_type
a character string indicating the type of sampling loop to use for training. One of:
"stochastic" - perform normal stochastic sampling for stochastic gradient descent, for training binary classifiers. On each iteration, pick a new example uniformly at random from the data set.
"balanced-stochastic" - perform balanced sampling from positives and negatives in the data set. Each iteration samples one positive example uniformly at random from the set of all positives, and one negative example uniformly at random from the set of all negatives. This can be useful for training binary classifiers with a minority-class distribution.
"rank" - perform indexed sampling of candidate pairs for pairwise learning-to-rank. Useful when there are examples from several different qid groups.
"roc" - perform indexed sampling to optimize ROC area.
"query-norm-rank" - perform sampling of candidate pairs, giving equal weight to each qid group regardless of its size. Currently implemented with rejection sampling rather than indexed sampling, so this may run more slowly.
"combined-ranking" - perform the CRR algorithm for combined regression and ranking. Alternates between pairwise rank-based steps and standard stochastic gradient steps on single examples. Relies on rank_step_probability to balance between these two kinds of updates.
"combined-roc" - perform the CRR algorithm for combined regression and ROC-area optimization. Alternates between pairwise ROC-optimization-based steps and standard stochastic gradient steps on single examples. Relies on rank_step_probability to balance between these two kinds of updates. This can be faster than combined-ranking when there are exactly two classes.
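As a minimal sketch of how these loop types are selected in practice, assuming the irismod data set that ships with RSofia (as in the Examples section below):

```r
library(RSofia)
data(irismod)

# Balanced sampling: one positive and one negative example per iteration,
# useful when Is.Virginica is the minority class.
m_bal <- sofia(Is.Virginica ~ ., data = irismod,
               loop_type = "balanced-stochastic")

# Indexed sampling that directly optimizes ROC area.
m_roc <- sofia(Is.Virginica ~ ., data = irismod,
               loop_type = "roc")
```

Only loop_type changes between the two calls; all other arguments keep their defaults.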


Argument: rank_step_probability
a numeric scalar. Probability of taking a rank step (as opposed to a standard stochastic gradient step) in a combined-ranking or combined-ROC loop.


Argument: passive_aggressive_c
a numeric scalar. Maximum size of any step taken in a single passive-aggressive update.


Argument: passive_aggressive_lambda
a numeric scalar. Lambda for Pegasos-style projection for the passive-aggressive update. When set to 0 (default) no projection is performed.


Argument: perceptron_margin_size
width of the margin for perceptron with margins. The default of 1 is equivalent to unregularized SVM loss.


Argument: training_objective
logical. When TRUE, computes the value of the standard SVM objective function on the training data, after training.


Argument: dimensionality
an integer. Index of the largest feature in the training data set, plus one.


Argument: hash_mask_bits
an integer. When set to a non-zero value, causes the use of a hashed weight vector with hashed cross-product features. This allows learning on conjunctions of features, at some increase in computational cost. Note that this flag must be set both in training and testing to function properly. The size of the hash table is set to 2^hash_mask_bits. The default value of 0 means hash cross-products are not used.
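A hedged illustration of the 2^hash_mask_bits sizing described above, again assuming the irismod data from the package:

```r
library(RSofia)
data(irismod)

# hash_mask_bits = 10 allocates a hash table of 2^10 = 1024 slots for
# hashed cross-product features. The same value must be used again when
# the fitted model is applied to test data, or the feature hashing will
# not line up.
m_hash <- sofia(Is.Virginica ~ ., data = irismod, hash_mask_bits = 10)
```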


Argument: verbose
logical.


Argument: no_bias_term
logical. When set, causes the bias term x_0 to be set to 0 for every feature vector loaded from files, rather than the default of x_0 = 1. Setting this flag is equivalent to forcing a decision threshold of exactly 0. The same setting of this flag should be used for training and testing. Note that this flag has no effect for rank and ROC optimization. Default: not set. To set this flag using the formula interface, use ( Y ~ -1 + . )
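The formula trick mentioned above can be sketched as follows (assuming irismod from the package examples):

```r
library(RSofia)
data(irismod)

# Dropping the intercept with -1 in the formula is the formula-interface
# equivalent of no_bias_term = TRUE: the bias term x_0 is fixed at 0,
# forcing a decision threshold of exactly 0.
m_nobias <- sofia(Is.Virginica ~ -1 + ., data = irismod)
```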


Argument: reserve
an integer. Experimental: should the vector be explicitly reserved for data?


Argument: buffer_mb
an integer. Size of the buffer to use when reading/writing files, in MB.


Argument: ...
items passed to methods.


----------Value----------

sofia returns an object of class "sofia".

An object of class "sofia" is a list containing at least the following components:


Component: par
a list containing the parameters specified in training the model


Component: weights
a numeric vector of the parameter weights (the model)


Component: training_time
time used to fit the model (does not include I/O time)

If the method was called via the formula interface, it will additionally include:


Component: formula
the formula with the specification of the model



----------See Also----------

http://code.google.com/p/sofia-ml/



----------Examples----------



data(irismod)

model.logreg <- sofia(Is.Virginica ~ ., data=irismod, learner_type="logreg-pegasos")
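A hedged follow-up to the example above: RSofia also provides a predict method for fitted "sofia" objects. The prediction_type argument shown here is an assumption based on the package's documented interface; check ?predict.sofia for the exact signature.

```r
# Score the training data with the fitted logistic-Pegasos model.
# prediction_type = "logistic" (assumed) maps raw linear scores through
# a logistic link to produce probabilities.
p <- predict(model.logreg, newdata = irismod, prediction_type = "logistic")
head(p)
```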


Reprinted from biostatistic.net (http://www.biostatistic.net); please credit the source when reposting.


Notes:
Note 1: To aid learning, this document was machine-translated by biostatistic.net's robot LoveR, for personal R-study reference only; biostatistic.net retains the copyright.
Note 2: Being an automatic machine translation, inaccuracies are inevitable; compare the Chinese and English texts carefully when using it, which can also help with learning R.
Note 3: If you find inaccuracies, please reply below this post and we will revise them gradually.