R language speedglm package: speedlm() function help documentation

loveR · posted on 2012-9-30 14:55:22

speedlm(speedglm)
speedlm() belongs to the R package: speedglm

                                    Fitting Linear Models to Large Data Sets

                                       Translator: LoveR, the biostatistic.net robot

----------Description----------

The functions of class 'speedlm' may speed up the fitting of LMs to large data sets. High performance can be obtained, especially if R is linked against an optimized BLAS such as ATLAS.

----------Usage----------

# S3 method of class 'data.frame'
speedlm(formula,data,weights=NULL,offset=NULL,sparse=NULL,set.default=list(),...)

# S3 method of class 'matrix'
speedlm.fit(y,X,intercept=FALSE,offset=NULL,row.chunk=NULL,sparselim=.9,camp=.01,
                  eigendec=TRUE,tol.solve=.Machine$double.eps,sparse=NULL,
                  tol.values=1e-7,tol.vectors=1e-7, method = "eigen",...)

speedlm.wfit(y,X,w,intercept=FALSE,offset=NULL,row.chunk=NULL,sparselim=.9,camp=.01,
                  eigendec=TRUE,tol.solve=.Machine$double.eps,sparse=NULL,
                  tol.values=1e-7,tol.vectors=1e-7, method = "eigen",...)

# S3 method of class 'speedlm' (object) and 'data.frame' (data)
update.speedlm(object,data,weights=NULL,offset=NULL,sparse=NULL,all.levels=FALSE,
            set.default=list(),...)

----------Arguments----------

Argument: formula
the same as in function lm.

Argument: data
the same as in function lm, but it must always be specified.

Argument: weights
the same as in function lm, but it must be specified as data$weights.

Argument: w
the same as weights.

Argument: intercept
a logical value which indicates whether an intercept is used.

Argument: offset
the same as in function lm.

Argument: X
the same as x in function lm.

Argument: y
the same as in function lm.

Argument: sparse
logical. Is the model matrix sparse? The default is NULL, in which case a quick sample survey of the matrix is made.

Argument: set.default
a list in which to specify the parameters to pass to the functions cp, control and is.sparse.

Argument: sparselim
a value in the interval [0, 1]. It indicates the minimal proportion of zeroes in the model matrix X for X to be considered sparse.

Argument: camp
see function is.sparse.

Argument: eigendec
logical. Should the rank of X be investigated? You may set it to FALSE if you are sure that X has full rank.

Argument: row.chunk
an integer, see the function cp for details.

Argument: tol.solve
see function solve.

Argument: tol.values
see function control.

Argument: tol.vectors
see function control.

Argument: method
see function control.

Argument: object
an object of class 'speedlm'.

Argument: all.levels
are all levels of any factors present in each data chunk? If so, set all.levels to TRUE to speed up the fitting.

Argument: ...
further optional arguments.
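
The sparselim argument is described above as a threshold on the proportion of zeroes in the model matrix. The base-R sketch below illustrates that idea only. It computes the proportion on the full matrix, whereas the documentation says the package makes a sample survey; the variable names prop_zero and looks_sparse are hypothetical and this is not the package's is.sparse code.

```r
# Illustrative only: a full-matrix zero-proportion check, not speedglm's is.sparse().
set.seed(3)
X <- matrix(rbinom(2000, 1, 0.05), 200, 10)  # simulated matrix, mostly zeroes

sparselim <- 0.9                 # same default threshold as in the usage above
prop_zero <- mean(X == 0)        # proportion of zero entries in X
looks_sparse <- prop_zero >= sparselim

prop_zero
looks_sparse
```
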

----------Details----------

Unlike the functions lm or biglm, the functions of class 'speedlm' do not use the QR decomposition but solve the normal equations directly. In some extreme cases this may cause problems of numerical stability, but it can take advantage of an optimized BLAS. The memory size of an object of class 'speedlm' is O(p^2), where p is the number of covariates. If an optimized BLAS library is not installed, calculations may still be sped up by setting row.chunk to some value, usually less than 1000, in set.default. See the function cp for details. Factors are permitted without limitations.
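
To make the contrast between the normal equations and the QR decomposition concrete, here is a minimal base-R sketch on simulated data. It does not call speedglm and is not the package's internal code; all variable names are illustrative assumptions.

```r
# Illustrative comparison of two least-squares solvers in base R.
set.seed(1)
n <- 500
p <- 3
X <- cbind(1, matrix(rnorm(n * p), n, p))     # model matrix with an intercept column
y <- drop(X %*% c(1, 2, -1, 0.5)) + rnorm(n)

# Normal equations: solve (X'X) b = X'y.
# Only the (p+1) x (p+1) matrix X'X and the vector X'y have to be kept.
XTX <- crossprod(X)         # X'X
Xy  <- crossprod(X, y)      # X'y
b_ne <- drop(solve(XTX, Xy))

# QR-based least squares, the approach used by lm().
b_qr <- qr.coef(qr(X), y)

all.equal(b_ne, unname(b_qr))
```

On a well-conditioned matrix such as this one the two solutions agree to numerical tolerance. The stability caveat in the text concerns ill-conditioned X, where forming X'X squares the condition number.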

----------Value----------

<table summary="R valueblock">
<tr valign="top"><td>coefficients</td> <td>the estimated coefficients.</td></tr>
<tr valign="top"><td>df.residual</td> <td>the residual degrees of freedom.</td></tr>
<tr valign="top"><td>XTX</td> <td>the product X'X (weighted, if applicable).</td></tr>
<tr valign="top"><td>A</td> <td>the product X'X (weighted, if applicable), not checked for singularity.</td></tr>
<tr valign="top"><td>Xy</td> <td>the product X'y (weighted, if applicable).</td></tr>
<tr valign="top"><td>ok</td> <td>the set of column indices of the model matrix where the model has been fitted.</td></tr>
<tr valign="top"><td>rank</td> <td>the numeric rank of the fitted linear model.</td></tr>
<tr valign="top"><td>pivot</td> <td>see the function control.</td></tr>
<tr valign="top"><td>RSS</td> <td>the estimated residual sum of squares of the fitted model.</td></tr>
<tr valign="top"><td>sparse</td> <td>a logical value indicating whether the model matrix is sparse.</td></tr>
<tr valign="top"><td>deviance</td> <td>the estimated deviance of the fitted model.</td></tr>
<tr valign="top"><td>weigths</td> <td>the weights used in the last updating.</td></tr>
<tr valign="top"><td>zero.w</td> <td>the number of non-zero weighted observations.</td></tr>
<tr valign="top"><td>n.obs</td> <td>the number of observations.</td></tr>
<tr valign="top"><td>nvar</td> <td>the number of independent variables.</td></tr>
<tr valign="top"><td>terms</td> <td>the terms object used.</td></tr>
<tr valign="top"><td>intercept</td> <td>a logical value which indicates whether an intercept has been used.</td></tr>
<tr valign="top"><td>call</td> <td>the matched call.</td></tr>
<tr valign="top"><td>...</td> <td>other values necessary to update the estimation.</td></tr>
</table>
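
The XTX and Xy entries are described as weighted products when weights are supplied. As a hedged base-R sketch of what such weighted products look like (an assumption about the usual weighted least-squares form, not taken from the package source), the following compares them against stats::lm.wfit on simulated data. The names XTX_w, Xy_w and b_w are hypothetical.

```r
# Illustrative weighted normal equations in base R; not speedglm's internal code.
set.seed(4)
n <- 200
X <- cbind(1, rnorm(n), rnorm(n))   # model matrix with an intercept column
y <- rnorm(n)
w <- runif(n, 0.5, 2)               # strictly positive observation weights

XTX_w <- crossprod(X * sqrt(w))     # X'WX, with W = diag(w)
Xy_w  <- crossprod(X, w * y)        # X'Wy
b_w   <- drop(solve(XTX_w, Xy_w))

# Reference solution from the QR-based weighted fitter in stats.
b_ref <- lm.wfit(X, y, w)$coefficients

all.equal(b_w, unname(b_ref))
```
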

----------Note----------

All the above functions return an object of class 'speedlm'.

----------Author(s)----------

Marco ENEA

----------References----------

In book of short papers, conference on “Statistical Methods for the analysis of large data-sets”, Italian Statistical Society, Chieti-Pescara, 23-25 September 2009, 411-414.
Klotz, J.H. (1995) Updating Simple Linear Regression. Statistica Sinica, 5, 399-403.
Bates, D. (2009) Comparing Least Squares Calculations. Technical report. Available at http://cran.rakanu.com/web/packages/Matrix/vignettes/Comparisons.pdf
Lumley, T. (2009) biglm: bounded memory linear and generalized linear models. R package version 0.7 http://CRAN.R-project.org/package=biglm.

----------See Also----------

summary.speedlm, speedglm, lm, and biglm

----------Examples----------

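library(speedglm)  # provides speedlm() and its update() method

# Simulate 1000 observations of a response and three covariates.
n <- 1000
k <- 3
y <- rnorm(n)
x <- round(matrix(rnorm(n * k), n, k), digits = 3)
colnames(x) <- c("s1", "s2", "s3")
da <- data.frame(y, x)

# Split the data into three chunks.
do1 <- da[1:300,]
do2 <- da[301:700,]
do3 <- da[701:1000,]

# Fit on the first chunk, then update the fit with the remaining chunks.
m1 <- speedlm(y ~ s1 + s2 + s3, data = do1)
m1 <- update(m1, data = do2)
m1 <- update(m1, data = do3)

# Fit the same model on the full data with lm() for comparison.
m2 <- lm(y ~ s1 + s2 + s3, data = da)
summary(m1)
summary(m2)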
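
Chunk-wise updating of this kind can be understood in terms of the normal equations: X'X and X'y are sums over rows, so they can be accumulated one chunk at a time. The base-R sketch below illustrates that idea on simulated data. It is an assumption about the general principle, not the package's actual update code, and the names chunks, b_chunked and b_full are hypothetical.

```r
# Illustrative chunk-wise accumulation of X'X and X'y in base R.
set.seed(2)
n <- 1000
da <- data.frame(y = rnorm(n), s1 = rnorm(n), s2 = rnorm(n), s3 = rnorm(n))
chunks <- list(da[1:300, ], da[301:700, ], da[701:1000, ])

XTX <- matrix(0, 4, 4)   # running sum of X'X (intercept + 3 covariates)
Xy  <- matrix(0, 4, 1)   # running sum of X'y
for (ch in chunks) {
  X   <- model.matrix(y ~ s1 + s2 + s3, data = ch)
  XTX <- XTX + crossprod(X)
  Xy  <- Xy + crossprod(X, ch$y)
}

# Coefficients from the accumulated products versus a single full-data fit.
b_chunked <- drop(solve(XTX, Xy))
b_full    <- coef(lm(y ~ s1 + s2 + s3, data = da))

all.equal(unname(b_chunked), unname(b_full))
```

Only the 4 x 4 and 4 x 1 accumulators are kept between chunks, which is consistent with the O(p^2) memory size stated in the Details.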

When reposting, please credit the source: biostatistic.net (http://www.biostatistic.net).

