sparsenet(sparsenet)
sparsenet() belongs to the R package: sparsenet
Fit a linear model regularized by the nonconvex MC+ sparsity penalty
----------Description----------
Sparsenet uses coordinate descent on the MC+ nonconvex penalty family and fits a surface of solutions over the two-dimensional parameter space. This penalty family is indexed by an overall strength parameter lambda (as in the lasso) and a convexity parameter gamma. Gamma = infinity corresponds to the lasso, and gamma = 1 corresponds to best-subset selection.
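For orientation, the MC+ penalty on a single coefficient can be written as below. This is the form commonly given in the MC+/SparseNet literature, not something stated on this page, and the symbol t (one coefficient) is introduced here only for illustration.

```latex
P(t;\lambda,\gamma)
  = \lambda \int_0^{|t|} \left(1 - \frac{x}{\gamma\lambda}\right)_{+} dx
  = \begin{cases}
      \lambda |t| - \dfrac{t^{2}}{2\gamma}, & |t| \le \gamma\lambda, \\[6pt]
      \dfrac{\gamma\lambda^{2}}{2},         & |t| > \gamma\lambda .
    \end{cases}
```

As gamma grows to infinity the quadratic term vanishes and the penalty reduces to the lasso penalty lambda*|t|; for |t| > gamma*lambda the penalty is constant, so large coefficients are not shrunk.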
----------Usage----------
sparsenet(x, y, weights, exclude, dfmax = nvars + 1, pmax = min(dfmax * 2, nvars),
          ngamma = 9, nlambda = 50, max.gamma = 150, min.gamma = 1.000001,
          lambda.min.ratio = ifelse(nobs < nvars, 0.01, 1e-04), lambda = NULL,
          gamma = NULL, parms = NULL, warm = c("lambda", "gamma", "both"),
          thresh = 1e-05, maxit = 1e+06)
----------Arguments----------
Argument: x
Input matrix of predictors, of dimension nobs x nvars.
Argument: y
Response vector.
Argument: weights
Observation weights; default is 1 for each observation.
Argument: exclude
Indices of variables to be excluded from the model. Default is none.
Argument: dfmax
Limit the maximum number of variables in the model. Useful for very large nvars, if a partial path is desired.
Argument: pmax
Limit the maximum number of variables ever to be nonzero.
Argument: ngamma
Number of gamma values, if gamma is not supplied; default is 9.
Argument: nlambda
Number of lambda values, if lambda is not supplied; default is 50.
Argument: max.gamma
Largest gamma value to be used, apart from infinity (lasso), if gamma is not supplied; default is 150.
Argument: min.gamma
Smallest value of gamma to use; should be > 1. Default is 1.000001.
Argument: lambda.min.ratio
Smallest value for lambda, as a fraction of lambda.max, the (data-derived) entry value (i.e. the smallest value for which all coefficients are zero). The default depends on the sample size nobs relative to the number of variables nvars. If nobs >= nvars, the default is 0.0001, close to zero. If nobs < nvars, the default is 0.01. A very small value of lambda.min.ratio will lead to a saturated fit in the nobs < nvars case.
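The relationship between lambda.max, lambda.min.ratio and nlambda can be sketched in base R. This is only an illustration: it assumes a log-spaced grid (as glmnet uses), the helper name make_lambda_grid is hypothetical, and the value 2.5 for lambda.max is a made-up placeholder rather than something computed from data.

```r
# Hypothetical helper (not part of sparsenet): a decreasing, log-spaced
# lambda grid running from lambda_max down to lambda_max * lambda_min_ratio.
# The log spacing is an assumption borrowed from glmnet.
make_lambda_grid <- function(lambda_max, nlambda = 50, lambda_min_ratio = 1e-4) {
  exp(seq(log(lambda_max), log(lambda_max * lambda_min_ratio),
          length.out = nlambda))
}

nobs <- 100
nvars <- 1000
# Same default rule as in the Usage section
ratio <- ifelse(nobs < nvars, 0.01, 1e-04)

# lambda_max = 2.5 is a placeholder; sparsenet derives it from the data
lambda_grid <- make_lambda_grid(lambda_max = 2.5, nlambda = 50,
                                lambda_min_ratio = ratio)
range(lambda_grid)
```

A grid built this way is already in decreasing order, which is the order the lambda argument expects.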
Argument: lambda
A user-supplied lambda sequence, in decreasing order. Typical usage is to have the program compute its own lambda sequence based on nlambda and lambda.min.ratio. Supplying a value of lambda overrides this. WARNING: use with care. Do not supply a single value for lambda (for predictions after CV use predict() instead). Supply instead a decreasing sequence of lambda values. sparsenet relies on its warm starts for speed, and it is often faster to fit a whole path than to compute a single fit.
Argument: gamma
Sparsity parameter vector, with 1 < gamma < infinity. Gamma = 1 corresponds to best-subset regression, gamma = infinity to the lasso. Should be given in decreasing order.
Argument: parms
An optional three-dimensional array of dimension 2 x ngamma x nlambda. Here the user can supply exactly the (gamma, lambda) pairs that are to be traversed by the coordinate descent algorithm.
Argument: warm
How to traverse the grid. Default is "lambda", meaning warm starts from the previous lambda with the same gamma. "gamma" means the opposite: warm starts from the previous gamma for the same lambda. "both" tries both warm starts and uses the one that improves the criterion the most.
Argument: thresh
Convergence threshold for coordinate descent. Each coordinate-descent loop continues until the maximum change in the objective after any coefficient update is less than thresh times the null RSS. Default value is 1e-5.
Argument: maxit
Maximum number of passes over the data for all lambda/gamma values; default is 10^6.
----------Details----------
This algorithm operates like glmnet, whose alpha parameter moves the penalty between lasso and ridge; here gamma moves it between lasso and best subset. The algorithm traverses the two-dimensional gamma/lambda array in a nested loop, with decreasing gamma in the outer loop and decreasing lambda in the inner loop. Because of the nature of the MC+ penalty, each coordinate update is a convex problem with a simple two-threshold shrinking scheme: if |beta| < lambda, set it to zero; if |beta| > lambda*gamma, leave it alone; if |beta| lies in between, shrink it proportionally. Note that this algorithm ALWAYS standardizes the columns of x and y to have mean zero and variance 1 (using 1/N averaging) before it computes its fit. The coefficients are reported on the original scale.
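The two-threshold scheme described above can be sketched in base R. This is a minimal illustration for a single coefficient on standardized predictors, not the package's own (compiled) implementation; the function name mcplus_threshold is hypothetical, and the exact middle-zone formula is the one usually given for MC+ rather than something spelled out on this page.

```r
# Hypothetical helper (not part of sparsenet): MC+ threshold operator for one
# coefficient computed on standardized predictors. Requires gamma > 1.
mcplus_threshold <- function(beta, lambda, gamma) {
  stopifnot(gamma > 1, lambda >= 0)
  a <- abs(beta)
  ifelse(a <= lambda,
         0,                                   # below lambda: set to zero
         ifelse(a > lambda * gamma,
                beta,                         # above lambda*gamma: leave alone
                sign(beta) * (a - lambda) / (1 - 1 / gamma)))  # in between: shrink
}

beta_in <- c(-4, -2, -0.5, 0.5, 2, 4)
beta_out <- mcplus_threshold(beta_in, lambda = 1, gamma = 3)
print(beta_out)

# With gamma = Inf the operator coincides with lasso soft-thresholding
soft <- sign(beta_in) * pmax(abs(beta_in) - 1, 0)
lasso_like <- mcplus_threshold(beta_in, lambda = 1, gamma = Inf)
```

With lambda = 1 and gamma = 3, inputs of magnitude 4 pass through unchanged, inputs of magnitude 0.5 are zeroed, and inputs of magnitude 2 are mapped to 1.5 in magnitude. The operator is continuous at both thresholds.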
----------Value----------
An object of class "sparsenet", with a number of components. Usually one accesses the components via generic functions such as coef(), plot(), predict(), etc.
Component: call
The call that produced this object.
Component: rsq
The percentage of variance explained on the training data; an ngamma x nlambda matrix.
Component: jerr
Error flag, for warnings and errors (largely for internal debugging).
Component: coefficients
A coefficient list with ngamma elements; each of these is itself a list with various components: the matrix beta of coefficients, its dimension dim, the vector of intercepts, the lambda sequence, the gamma value, and the sequence of df (number of nonzero coefficients) for each solution.
Component: parms
Irrespective of how the parameters were input, the three-way array of the values actually used.
Component: gamma
The gamma values used.
Component: lambda
The lambda values used.
Component: max.lambda
The entry value for lambda.
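As a rough sketch of how these components might be inspected, the snippet below fits a small simulated problem and looks at a few of the fields listed above. It assumes the sparsenet package is installed; the data sizes are arbitrary choices for illustration, and names() is used to show the actual element names rather than assuming them.

```r
library(sparsenet)

set.seed(1)  # the simulated data are random; fix the seed for reproducibility
dat <- gendata(100, 50, nonzero = 5, rho = 0.3, snr = 3)
fit <- sparsenet(dat$x, dat$y)

dim(fit$rsq)        # one row per gamma value, one column per lambda value
length(fit$gamma)   # number of gamma values used
length(fit$lambda)  # number of lambda values used

# Coefficients for the first gamma value on the grid
first <- fit$coefficients[[1]]
names(first)        # lists the components actually stored
dim(first$beta)     # coefficient matrix, as described under "coefficients"
```

In everyday use the generics coef(), predict() and plot() are usually more convenient than reaching into the object directly.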
----------Author(s)----------
Rahul Mazumder, Jerome Friedman and Trevor Hastie
Maintainer: Trevor Hastie <hastie@stanford.edu>
----------See Also----------
glmnet package; the predict, coef, print and plot methods; and the cv.sparsenet function.
----------Examples----------
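library(sparsenet)
set.seed(1)  # gendata() simulates random data; fix the seed for reproducibility
# Simulate 100 observations on 1000 predictors with 30 nonzero coefficients
train.data <- gendata(100, 1000, nonzero = 30, rho = 0.3, snr = 3)
fit <- sparsenet(train.data$x, train.data$y)
# 3 x 3 layout: room for one plot per gamma value (9 by default)
par(mfrow = c(3, 3))
plot(fit)
par(mfrow = c(1, 1))
# Cross-validate over the gamma/lambda grid and plot the result
fitcv <- cv.sparsenet(train.data$x, train.data$y, trace.it = TRUE)
plot(fitcv)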