rtmvnorm(tmvtnorm)
rtmvnorm() belongs to the R package tmvtnorm.
Sampling Random Numbers From The Truncated Multivariate Normal Distribution
Description
This function generates random numbers from the truncated multivariate normal distribution with mean vector mean and covariance matrix sigma (or, alternatively, precision matrix H), and lower and upper truncation points lower and upper, using either rejection sampling or Gibbs sampling.
Usage
rtmvnorm(n, mean = rep(0, nrow(sigma)),
  sigma = diag(length(mean)),
  lower = rep(-Inf, length = length(mean)),
  upper = rep(Inf, length = length(mean)),
  D = diag(length(mean)),
  H = NULL,
  algorithm = c("rejection", "gibbs", "gibbsR"),
  ...)

rtmvnorm.sparseMatrix(n, mean = rep(0, nrow(H)),
  H = sparseMatrix(i = 1:length(mean), j = 1:length(mean), x = 1),
  lower = rep(-Inf, length = length(mean)),
  upper = rep(Inf, length = length(mean)),
  ...)
Arguments
n
Number of random points to be sampled. Must be an integer >= 1.
mean
Mean vector, default is rep(0, length = ncol(x)).
sigma
Covariance matrix, default is diag(ncol(x)).
lower
Vector of lower truncation points, default is rep(-Inf, length = length(mean)).
upper
Vector of upper truncation points, default is rep(Inf, length = length(mean)).
D
Matrix of linear constraints, defaults to the identity matrix diag(length(mean)).
H
Precision matrix, default is NULL.
algorithm
Method used; possible methods are rejection sampling ("rejection", the default), the Fortran Gibbs sampler ("gibbs") and the old Gibbs sampler implementation in R ("gibbsR").
...
Additional parameters for Gibbs sampling, passed to the internal method rtmvnorm.gibbs(), such as burn.in.samples, start.value and thinning; see Details.
Details
The generation of random numbers from a truncated multivariate normal distribution is done using either rejection sampling or the Gibbs sampler.
Rejection sampling
Rejection sampling is done from the standard multivariate normal distribution, using the function rmvnorm of the mvtnorm package to generate proposals, which are accepted if they lie inside the support region and rejected otherwise. In order to speed up the generation of N samples from the truncated distribution, we first calculate the acceptance rate alpha from the truncation points and then generate N/alpha samples iteratively until we have N samples. This typically takes no more than 2-3 iterations. Rejection sampling can be very inefficient when the support region is small (e.g. in higher dimensions), which results in a very low acceptance rate alpha. In this case the Gibbs sampler is preferable.
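The accept/reject scheme can be illustrated with a short sketch in plain R (an illustration only, not the package's internal implementation; it assumes the mvtnorm package for rmvnorm() and pmvnorm(), and the helper name is hypothetical):

library(mvtnorm)
# minimal sketch of rejection sampling for lower <= x <= upper
rtmvnorm_reject_sketch <- function(n, mean, sigma, lower, upper) {
  # acceptance rate alpha = probability mass of the support region
  alpha <- pmvnorm(lower = lower, upper = upper, mean = mean, sigma = sigma)
  accepted <- matrix(numeric(0), ncol = length(mean))
  while (nrow(accepted) < n) {
    # propose roughly (n - #accepted)/alpha draws from the untruncated normal
    proposals <- rmvnorm(ceiling((n - nrow(accepted)) / alpha), mean = mean, sigma = sigma)
    inside <- apply(proposals, 1, function(x) all(x >= lower & x <= upper))
    accepted <- rbind(accepted, proposals[inside, , drop = FALSE])
  }
  accepted[1:n, , drop = FALSE]
}
# e.g. x <- rtmvnorm_reject_sketch(500, mean = c(1, 2),
#        sigma = matrix(c(4, 2, 2, 3), 2), lower = c(-Inf, -Inf), upper = c(1, 0))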
Gibbs sampling
The Gibbs sampler samples from univariate conditional distributions, so all samples can be accepted except for a burn-in period. The number of burn-in samples to be discarded can be specified, as well as a start value for the chain. If no start value is given, we determine a start value from the support region, using either the lower or the upper bound if it is finite, or 0 otherwise.
The Gibbs sampler has been reimplemented in Fortran 90 for performance reasons (algorithm="gibbs"). The old R implementation is still accessible through algorithm="gibbsR".
The arguments to be passed along with algorithm="gibbs" or algorithm="gibbsR" are listed below; an example call combining them is sketched after the list.
burn.in.samples: Number of samples in Gibbs sampling to be discarded as the burn-in phase; must be non-negative.
start.value: Start value (a vector of length length(mean)) for the MCMC chain. If one is specified, it must lie inside the support region (lower <= start.value <= upper). If none is specified, the start value is taken componentwise as the finite lower or upper boundary respectively, or zero if both boundaries are infinite.
thinning: Thinning factor for reducing the autocorrelation of random points in Gibbs sampling. Must be an integer >= 1. We create a Markov chain of length (n*thinning) and keep only those samples j = 1:(n*thinning) for which j %% thinning == 0.
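A minimal sketch of such a call, with arbitrary illustrative truncation points and tuning values (it assumes the tmvtnorm package is attached):

sigma <- matrix(c(4, 2, 2, 3), ncol = 2)
X <- rtmvnorm(n = 1000, mean = c(1, 2), sigma = sigma,
  lower = c(-Inf, -Inf), upper = c(1, 0),
  algorithm = "gibbs",
  burn.in.samples = 100,    # discard the first 100 draws as burn-in
  start.value = c(0, -1),   # must satisfy lower <= start.value <= upper
  thinning = 5)             # keep only every 5th draw of the chain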
Sampling with linear constraints
We extended the method to also simulate from a multivariate normal distribution subject to general linear constraints lower <= D x <= upper. For general D, both rejection sampling and Gibbs sampling according to Geweke (1991) are available.
Gibbs sampler and the use of the precision matrix H
Why is it important to have a random sampler that works with the precision matrix? Especially in Bayesian and spatial statistics, there are a number of high-dimensional applications where the precision matrix H is readily available, but is sometimes nearly singular and cannot be easily inverted to sigma. Additionally, it turns out that the Gibbs sampler formulas are much simpler in terms of the precision matrix than in terms of the covariance matrix. See the details of the Gibbs sampler implementation in the package vignette or, for example, Geweke (2005), pp. 171-172. (Thanks to Miguel Godinho de Matos from Carnegie Mellon University for pointing me to this.) Therefore, we now provide an interface for the direct use of the precision matrix H in rtmvnorm().
Gibbs sampler with sparse precision matrix H
The size of the covariance matrix sigma or precision matrix H - if expressed as a dense matrix - grows quadratically with the number of dimensions d. For high-dimensional problems (such as d > 5000), it is no longer efficient and appropriate to work with dense matrix representations, as one quickly runs into memory problems. It is interesting to note that in many applications the precision matrix, which holds the conditional dependencies, will be sparse, whereas the covariance matrix will be dense. Hence, expressing H as a sparse matrix significantly reduces the amount of memory needed to store it and allows much larger problems to be handled. In the current version of the package, the precision matrix (not sigma, since it will be dense in most cases) can be passed to rtmvnorm.sparseMatrix() as a sparseMatrix from the Matrix package. See the Examples section below for a usage example.
Warning
A word of caution is needed for useRs that are not familiar with Markov Chain Monte Carlo methods like Gibbs sampling:
Rejection sampling is exact in the sense that we are sampling directly from the target distribution, and the random samples generated are independent. It is therefore clearly the default method.
Markov Chain Monte Carlo methods are only approximate methods, which may suffer from several problems:
Poor mixing
Convergence problems
Correlation among samples
Diagnostic checks for Markov Chain Monte Carlo include trace plots, CUSUM plots and autocorrelation plots such as acf. For a survey see, for instance, Cowles (1996).
That is, consecutive samples generated from rtmvnorm(..., algorithm=c("gibbs", "gibbsR")) are correlated (see also Example 3 below). One way of reducing the autocorrelation among the random samples is "thinning" the Markov chain, that is, recording only a subset/subsequence of the chain. For example, one could record only every 100th sample, which clearly reduces the autocorrelation and "increases the independence". But thinning comes at the cost of higher computation time, since the chain has to run much longer. We refer to autocorrelation plots in order to determine optimal thinning.
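A minimal sketch of such diagnostic checks (assuming the tmvtnorm package is attached; the trace plot and acf come from base R graphics and stats, and the parameter values are only illustrative):

sigma <- matrix(c(4, 2, 2, 3), ncol = 2)
X <- rtmvnorm(n = 5000, mean = c(1, 2), sigma = sigma, upper = c(1, 0),
  algorithm = "gibbs", burn.in.samples = 100)
plot(X[, 1], type = "l", main = "Trace plot of component 1")   # trace plot
acf(X[, 1], main = "Autocorrelation of component 1")           # autocorrelation plot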
Author(s)
Stefan Wilhelm <Stefan.Wilhelm@financial.com>, Manjunath B G <bgmanjunath@gmail.com>
References
Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., Hothorn, T. (2009). mvtnorm: Multivariate Normal and t Distributions. R package version 0.9-7. URL http://CRAN.R-project.org/package=mvtnorm
Johnson, N. / Kotz, S. (1970). Distributions in Statistics: Continuous Multivariate Distributions. Wiley & Sons, pp. 70–73
Horrace, W. (2005). Some Results on the Multivariate Truncated Normal Distribution. Journal of Multivariate Analysis, 94, 209–221
Kotecha, J. H. and Djuric, P. M. (1999). Gibbs Sampling Approach for Generation of Truncated Multivariate Gaussian Random Variables. IEEE Computer Society, 1757–1760
Cowles, M. and Carlin, B. (1996). Markov Chain Monte Carlo Convergence Diagnostics: A Comparative Review. Journal of the American Statistical Association, 91, 883–904
Geweke, J. F. (1991). Efficient Simulation from the Multivariate Normal and Student-t Distributions Subject to Linear Constraints. Computer Science and Statistics. Proceedings of the 23rd Symposium on the Interface. Seattle, Washington, April 21-24, 1991, 571–578
Geweke, J. F. (2005). Contemporary Bayesian Econometrics and Statistics. Wiley & Sons, pp. 171–172
See Also
ptmvnorm, pmvnorm, rmvnorm, dmvnorm
Examples
################################################################################
#
# Example 1:
# rejection sampling in 2 dimensions
#
################################################################################
sigma <- matrix(c(4,2,2,3), ncol=2)
x <- rtmvnorm(n=500, mean=c(1,2), sigma=sigma, upper=c(1,0))
plot(x, main="samples from truncated bivariate normal distribution",
xlim=c(-6,6), ylim=c(-6,6),
xlab=expression(x[1]), ylab=expression(x[2]))
abline(v=1, lty=3, lwd=2, col="gray")
abline(h=0, lty=3, lwd=2, col="gray")
################################################################################
#
# Example 2:
# Gibbs sampler for 4 dimensions
#
################################################################################
C <- matrix(0.8, 4, 4)
diag(C) <- rep(1, 4)
lower <- rep(-4, 4)
upper <- rep(-1, 4)
# acceptance rate alpha
alpha <- pmvnorm(lower=lower, upper=upper, mean=rep(0,4), sigma=C)
alpha
# Gibbs sampler
X1 <- rtmvnorm(n=20000, mean = rep(0,4), sigma=C, lower=lower, upper=upper,
algorithm="gibbs", burn.in.samples=100)
# Rejection sampling
X2 <- rtmvnorm(n=5000, mean = rep(0,4), sigma=C, lower=lower, upper=upper)
colMeans(X1)
colMeans(X2)
plot(density(X1[,1], from=lower[1], to=upper[1]), col="red", lwd=2,
main="Kernel density estimates from random samples
generated by Gibbs vs. Rejection sampling")
lines(density(X2[,1], from=lower[1], to=upper[1]), col="blue", lwd=2)
legend("topleft",legend=c("Gibbs Sampling","Rejection Sampling"),
col=c("red","blue"), lwd=2, bty="n")
################################################################################
#
# Example 3:
# Autocorrelation plot for Gibbs sampler
# with and without thinning
#
################################################################################
sigma <- matrix(c(4,2,2,3), ncol=2)
X1 <- rtmvnorm(n=10000, mean=c(1,2), sigma=sigma, upper=c(1,0),
algorithm="rejection")
acf(X1)
# no autocorrelation among random points
X2 <- rtmvnorm(n=10000, mean=c(1,2), sigma=sigma, upper=c(1,0),
algorithm="gibbs")
acf(X2)
# exhibits autocorrelation among random points
X3 <- rtmvnorm(n=10000, mean=c(1,2), sigma=sigma, upper=c(1,0),
algorithm="gibbs", thinning=2)
acf(X3)
# reduced autocorrelation among random points
plot(density(X1[,1], to=1))
lines(density(X2[,1], to=1), col="blue")
lines(density(X3[,1], to=1), col="red")
################################################################################
#
# Example 4: Univariate case
#
################################################################################
X <- rtmvnorm(100, mean=0, sigma=1, lower=-1, upper=1)
################################################################################
#
# Example 5: Linear Constraints
#
################################################################################
mean <- c(0, 0)
sigma <- matrix(c(10, 0,
0, 1), 2, 2)
# Linear Constraints
#
# a1 <= x1 + x2 <= b1
# a2 <= x1 - x2 <= b2
#
# [ a1 ] <= [ 1  1 ] [ x1 ] <= [ b1 ]
# [ a2 ]    [ 1 -1 ] [ x2 ]    [ b2 ]
a <- c(-2, -2)
b <- c( 2, 2)
D <- matrix(c(1, 1,
1, -1), 2, 2)
X <- rtmvnorm(n=10000, mean, sigma, lower=a, upper=b, D=D, algorithm="gibbsR")
plot(X, main="Gibbs sampling for multivariate normal with linear constraints according to Geweke (1991)")
# mark linear constraints as lines
for (i in 1:nrow(D)) {
abline(a=a[i]/D[i, 2], b=-D[i,1]/D[i, 2], col="red")
abline(a=b[i]/D[i, 2], b=-D[i,1]/D[i, 2], col="red")
}
################################################################################
#
# Example 6: Using precision matrix H rather than sigma
#
################################################################################
lower <- c(-1, -1)
upper <- c(1, 1)
mean <- c(0.5, 0.5)
sigma <- matrix(c(1, 0.8, 0.8, 1), 2, 2)
H <- solve(sigma)
D <- matrix(c(1, 1, 1, -1), 2, 2)
X <- rtmvnorm(n=1000, mean=mean, H=H, lower=lower, upper=upper, D=D, algorithm="gibbs")
plot(X, main="Gibbs sampling with precision matrix and linear constraints")
################################################################################
#
# Example 7: Using sparse precision matrix H in high dimensions
#
################################################################################
## Not run:
d <- 1000
I_d <- sparseMatrix(i=1:d, j=1:d, x=1)
W <- sparseMatrix(i=c(1:d, 1:(d-1)), j=c(1:d, (2:d)), x=0.5)
H <- t(I_d - 0.5 * W) %*% (I_d - 0.5 * W)
lower <- rep(0, d)
upper <- rep(2, d)
# Gibbs sampler generates n=100 draws in d=1000 dimensions
X <- rtmvnorm.sparseMatrix(n=100, mean = rep(0,d), H=H, lower=lower, upper=upper,
burn.in.samples=100)
colMeans(X)
cov(X)
## End(Not run)