R language: help documentation for the saddle.distn() function

Posted on 2012-2-16 18:05:42
saddle.distn(boot)
R package containing saddle.distn(): boot

                                         Saddlepoint Distribution Approximations for Bootstrap Statistics

                                         Translator: 生物统计家园网, robot LoveR

----------Description----------

Approximate an entire distribution using saddlepoint methods. This function can calculate simple and conditional saddlepoint distribution approximations for a univariate quantity of interest. For the simple saddlepoint the quantity of interest is a linear combination of W where W is a vector of random variables. For the conditional saddlepoint we require the distribution of one linear combination given the values of any number of other linear combinations. The distribution of W must be one of multinomial, Poisson or binary. The primary use of this function is to calculate quantiles of bootstrap distributions using saddlepoint approximations. Such quantiles are required by the function control to approximate the distribution of the linear approximation to a statistic.
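As a small illustration of "a linear combination of W", the sketch below writes a bootstrap mean as sum(A * W), where W holds multinomial resampling frequencies and A = x/n. The data vector and variable names are hypothetical and only base R is used; it is meant as a picture of the setup, not of what saddle.distn does internally.

```r
set.seed(1)
x <- c(3, 5, 7, 18, 43)   # hypothetical data, not from the help page
n <- length(x)

A <- x / n                # known coefficients of the linear combination

# W: how many times each observation appears in one bootstrap resample
W <- as.vector(rmultinom(1, size = n, prob = rep(1 / n, n)))

resample    <- rep(x, times = W)   # the resample implied by W
stat_linear <- sum(A * W)          # the statistic as a linear combination of W
stat_direct <- mean(resample)      # the same statistic computed directly
```

Both values agree, which is why the mean of a resample can be handled with A = x/n and the multinomial setting wdist = "m".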


----------Usage----------


saddle.distn(A, u = NULL, alpha = NULL, wdist = "m",
             type = "simp", npts = 20, t = NULL, t0 = NULL,
             init = rep(0.1, d), mu = rep(0.5, n), LR = FALSE,
             strata = NULL, ...)



----------Arguments----------

Argument: A
This is a matrix of known coefficients or a function which returns such a matrix. If a function then its first argument must be the point t at which a saddlepoint is required. The most common reason for A being a function would be if the statistic is not itself a linear combination of the W but is the solution to a linear estimating equation.
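The ratio statistic used in the Examples section is a typical case of a linear estimating equation: the ratio t solves sum((x - t*u) * W) = 0, so the coefficients depend on t. The following base R sketch checks this numerically; the data, the weights and the helper name A_col are hypothetical.

```r
x <- c(12, 30, 7, 51)   # hypothetical numerators
u <- c(10, 25, 9, 40)   # hypothetical denominators
w <- c(2, 0, 1, 1)      # hypothetical resampling frequencies (the W)

# The statistic itself is not linear in w ...
ratio <- sum(x * w) / sum(u * w)

# ... but at a fixed point t the coefficients x - t*u are known constants
A_col <- function(t) x - t * u

# The ratio is the root of the linear estimating equation sum(A(t) * W) = 0
root <- uniroot(function(t) sum(A_col(t) * w),
                interval = c(0, 100), tol = 1e-10)$root
```

This is the reason Afn in the Examples section takes t as its first argument and returns x - t*u as its first column.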


Argument: u
If A is a function then u must also be a function returning a vector with length equal to the number of columns of the matrix returned by A. Usually all components other than the first will be constants as the other components are the values of the conditioning variables. If A is a matrix with more than one column (such as when type = "cond") then u should be a vector with length one less than ncol(A). In this case u specifies the values of the conditioning variables. If A is a matrix with one column or a vector then u is not used.


Argument: alpha
The alpha levels for the quantiles of the distribution which should be returned. By default the 0.1, 0.5, 1, 2.5, 5, 10, 20, 50, 80, 90, 95, 97.5, 99, 99.5 and 99.9 percentiles are calculated.
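For reference, the default percentiles listed above can be written as probabilities as follows. This assumes alpha is supplied on the probability scale (values between 0 and 1), which the help text does not state explicitly; the variable names are illustrative.

```r
# Default percentiles from the help text, converted to probabilities
default_alpha <- c(0.1, 0.5, 1, 2.5, 5, 10, 20, 50,
                   80, 90, 95, 97.5, 99, 99.5, 99.9) / 100

# A narrower request, e.g. only the limits of a 95% interval
my_alpha <- c(0.025, 0.975)
```

Asking for fewer and less extreme levels, as in my_alpha, also shortens the range the function has to cover.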


Argument: wdist
The distribution of W. Possible values are "m" (multinomial), "p" (Poisson), or "b" (binary).


Argument: type
The type of saddlepoint to be used. Possible values are "simp" (simple saddlepoint) and "cond" (conditional). If wdist is "m", type is set to "simp".


Argument: npts
The number of points at which the saddlepoint approximation should be calculated and then used to fit the spline.


Argument: t
A vector of points at which the saddlepoint approximations are calculated. These points should extend beyond the extreme quantiles required but still be in the possible range of the bootstrap distribution. The observed value of the statistic should not be included in t as the distribution function approximation breaks down at that point. The points should, however, cover the entire effective range of the distribution including close to the centre. If t is supplied then npts is set to length(t). When t is not supplied, the function attempts to find the effective range of the distribution and then selects points to cover this range.


Argument: t0
If t is not supplied then a vector of length 2 should be passed as t0. The first component of t0 should be the centre of the distribution and the second should be an estimate of spread (such as a standard error). These two are then used to find the effective range of the distribution. The range finding mechanism does rely on an accurate estimate of location in t0[1].


Argument: init
When wdist is "m", this vector should contain the initial values to be passed to nlmin when it is called to solve the saddlepoint equations.


Argument: mu
The vector of parameter values for the distribution. The default is that the components of W are identically distributed.


Argument: LR
A logical flag. When LR is TRUE the Lugannani-Rice cdf approximations are calculated and used to fit the spline. Otherwise the cdf approximations used are based on Barndorff-Nielsen's r*.
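To make the difference between the two cdf approximations concrete, here is a toy comparison in base R. The setting is an assumption chosen because the exact answer is known (a sum of n independent Exponential(1) variables, whose cdf is pgamma(x, n)); it is not one of the distributions handled by saddle.distn, and saddle_cdf is a hypothetical helper, not a function from boot.

```r
n  <- 5
K  <- function(s) -n * log(1 - s)   # cumulant generating function of the sum
K2 <- function(s) n / (1 - s)^2     # its second derivative

saddle_cdf <- function(x, LR = FALSE) {
  s <- 1 - n / x                            # saddlepoint: solves K'(s) = x
  r <- sign(s) * sqrt(2 * (s * x - K(s)))   # signed root statistic
  v <- s * sqrt(K2(s))
  if (LR) {
    pnorm(r) + dnorm(r) * (1 / r - 1 / v)   # Lugannani-Rice form
  } else {
    pnorm(r + log(v / r) / r)               # form based on r*
  }
}

# Evaluate away from the centre x = n, where s = 0 and both formulas break down
x       <- c(2, 3, 8, 10)
exact   <- pgamma(x, shape = n)
cdf_lr  <- saddle_cdf(x, LR = TRUE)
cdf_rst <- saddle_cdf(x, LR = FALSE)
```

In this toy case both approximations track the exact cdf closely in the tails, and both are undefined at the centre, which matches the advice above to keep the observed value of the statistic out of t.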


Argument: strata
A vector giving the strata when the rows of A relate to stratified data. This is used only when wdist is "m".


Argument: ...
When A and u are functions any additional arguments are passed unchanged each time one of them is called.


----------Details----------

The range at which the saddlepoint is used is such that the cdf approximation at the endpoints is more extreme than required by the extreme values of alpha. The lower endpoint is found by evaluating the saddlepoint at the points t0[1]-2*t0[2], t0[1]-4*t0[2], t0[1]-8*t0[2] etc. until a point is found with a cdf approximation less than min(alpha)/10, then a bisection method is used to find the endpoint which has cdf approximation in the range (min(alpha)/1000, min(alpha)/10). Then a number of equally spaced points are chosen between the lower endpoint and t0[1] until a total of npts/2 approximations have been made. The remaining npts/2 points are chosen to the right of t0[1] in a similar manner. Any points which are very close to the centre of the distribution are then omitted as the cdf approximations are not reliable at the centre. A smoothing spline is then fitted to the probit of the saddlepoint distribution function approximations at the remaining points and the required quantiles are predicted from the spline.
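The procedure described above can be sketched in base R as follows. This is a simplified illustration, not the code in boot: a normal cdf stands in for the saddlepoint cdf approximation, the upper endpoint mirrors the lower one using 1 - max(alpha), and the cut-off for "very close to the centre" is an arbitrary assumption. The names approx_cdf, tail_prob and find_endpoint are hypothetical.

```r
# Stand-in for the saddlepoint cdf approximation (assumption: a normal cdf)
approx_cdf <- function(t) pnorm(t, mean = 10, sd = 2)

# Tail probability beyond t on one side (-1 = lower tail, +1 = upper tail)
tail_prob <- function(t, side) if (side < 0) approx_cdf(t) else 1 - approx_cdf(t)

find_endpoint <- function(t0, tail_alpha, side, max_steps = 50) {
  lo <- tail_alpha / 1000
  hi <- tail_alpha / 10
  inner <- t0[1]
  outer <- NA
  # Step outwards: t0[1] -/+ 2*t0[2], 4*t0[2], 8*t0[2], ...
  for (k in seq_len(max_steps)) {
    cand <- t0[1] + side * 2^k * t0[2]
    if (tail_prob(cand, side) < hi) { outer <- cand; break }
    inner <- cand
  }
  if (is.na(outer)) stop("Unable to find range")
  # Bisect between inner and outer until the tail probability is in (lo, hi)
  pt <- outer
  for (i in seq_len(max_steps)) {
    p <- tail_prob(pt, side)
    if (p > lo && p < hi) return(pt)
    if (p <= lo) outer <- pt else inner <- pt
    pt <- (inner + outer) / 2
  }
  stop("Unable to find range")
}

t0    <- c(10, 2)               # centre and spread
alpha <- c(0.025, 0.5, 0.975)
npts  <- 20

lower <- find_endpoint(t0, min(alpha), side = -1)
upper <- find_endpoint(t0, 1 - max(alpha), side = +1)

# npts/2 equally spaced points on each side of the centre
pts <- c(seq(lower, t0[1], length.out = npts / 2),
         seq(t0[1], upper, length.out = npts / 2))
# Omit points very close to the centre (the 0.1 * spread cut-off is an assumption)
pts <- pts[abs(pts - t0[1]) > 0.1 * t0[2]]

# Fit a smoothing spline on the probit scale and predict the quantiles
fit       <- smooth.spline(qnorm(approx_cdf(pts)), pts)
quantiles <- predict(fit, qnorm(alpha))$y
```

With a stand-in cdf that is exactly normal, the probit of the cdf is linear in t, so the predicted quantiles should be close to qnorm(alpha, 10, 2). If t0[2] were far too small, the outward stepping would exhaust max_steps and stop with "Unable to find range".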

Sometimes the function will terminate with the message "Unable to find range". There are two main reasons why this may occur. One is that the distribution is too discrete and/or the required quantiles too extreme; this can cause the function to be unable to find a point within the allowable range which is beyond the extreme quantiles. Another possibility is that the value of t0[2] is too small and so too many steps are required to find the range. The first problem cannot be solved except by asking for less extreme quantiles, although for very discrete distributions the approximations may not be very good. In the second case using a larger value of t0[2] will usually solve the problem.


----------Value----------

The returned value is an object of class "saddle.distn". See the help file for saddle.distn.object for a description of such an object.


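----------References----------

Booth, J.G. and Butler, R.W. (1990) Randomization distributions and saddlepoint approximations in generalized linear models. Biometrika, 77, 787–796.
Canty, A.J. and Davison, A.C. (1997) Implementation of saddlepoint approximations to resampling distributions. Computing Science and Statistics; Proceedings of the 28th Symposium on the Interface 248–253.
Davison, A.C. and Hinkley, D.V. (1997) Bootstrap Methods and their Application. Cambridge University Press.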


----------See Also----------

lines.saddle.distn, saddle, saddle.distn.object, smooth.spline


----------Examples----------


library(boot)  # provides saddle.distn() and the aircondit, city and bigcity data sets

#  The bootstrap distribution of the mean of the air-conditioning
#  failure data: fails to find value on R (and probably on S too)
air.t0 <- c(mean(aircondit$hours), sqrt(var(aircondit$hours)/12))
## Not run: saddle.distn(A = aircondit$hours/12, t0 = air.t0)

# alternatively using the conditional Poisson
saddle.distn(A = cbind(aircondit$hours/12, 1), u = 12, wdist = "p",
             type = "cond", t0 = air.t0)

# Distribution of the ratio of a sample of size 10 from the bigcity
# data, taken from Example 9.16 of Davison and Hinkley (1997).
ratio <- function(d, w) sum(d$x *w)/sum(d$u * w)
city.v <- var.linear(empinf(data = city, statistic = ratio))
bigcity.t0 <- c(mean(bigcity$x)/mean(bigcity$u), sqrt(city.v))
Afn <- function(t, data) cbind(data$x - t*data$u, 1)
ufn <- function(t, data) c(0,10)
saddle.distn(A = Afn, u = ufn, wdist = "b", type = "cond",
             t0 = bigcity.t0, data = bigcity)

# From Example 9.16 of Davison and Hinkley (1997) again, we find the
# conditional distribution of the ratio given the sum of city$u.
Afn <- function(t, data) cbind(data$x-t*data$u, data$u, 1)
ufn <- function(t, data) c(0, sum(data$u), 10)
city.t0 <- c(mean(city$x)/mean(city$u), sqrt(city.v))
saddle.distn(A = Afn, u = ufn, wdist = "p", type = "cond", t0 = city.t0,
             data = city)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

