R: rsae package, makedata() function help documentation

loveR · posted 2012-9-28 21:15:39

makedata(rsae)
makedata() belongs to the R package: rsae

Synthetic data generation for the basic unit-level SAE model (incl. outlier contamination)

Translator: LoveR, the 生物统计家园网 (biostatistic.net) robot

----------Description----------

This function generates synthetic data with area-level variation. It was written to test several estimation methods. In addition, one may introduce contamination into the distributions (laws) of the model errors and/or the random effects (see Details, below).

----------Usage----------

makedata(seed=1024, intercept=1, beta=1, n=4, g=20, areaID=NULL,
      ve=1, ve.contam=41, ve.epsilon=0, vu=1, vu.contam=41,
      vu.epsilon=0)

----------Arguments----------

Argument: seed
an integer used to set the random seed via set.seed (default: seed=1024)

Argument: intercept
either a scalar giving the intercept of the fixed-effects model, or NULL (default: intercept=1)

Argument: beta
scalar or vector defining the fixed-effect coefficients (default: beta=1). For each given coefficient, a vector of realizations is drawn from the standard normal distribution.

Argument: n
integer, defining the number of units per area in balanced-data setups (default: n=4)

Argument: g
integer, defining the number of areas (default: g=20)

Argument: areaID
by default areaID=NULL. To generate synthetic unbalanced data, one may call makedata with a vector whose elements are area identifiers. This vector should contain a series of (integer-valued) area IDs. The number of areas is set equal to the number of unique IDs; see the rsae vignette for more details.
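To illustrate how a vector of area identifiers encodes an unbalanced design, the following base-R sketch uses a hypothetical ID vector (the values are assumptions chosen for illustration) and derives the number of areas and the per-area sample sizes from it. It does not call makedata itself.

```r
# Hypothetical area IDs for an unbalanced design (illustrative values only)
area_id <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3)

# Number of areas = number of unique IDs
g <- length(unique(area_id))

# Units per area (n_i), one entry per area
n_i <- as.vector(table(area_id))

print(g)
print(n_i)
```

According to the argument description above, such a vector could then be supplied as makedata(areaID = area_id).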

Argument: ve
scalar, defining the model/residual variance (default: ve=1)

Argument: ve.contam
scalar, defining the model variance of the outlier part in a mixture distribution (Tukey-Huber-type contamination model), e = (1-h)*N(0, ve) + h*N(0, ve.contam) (default: ve.contam=41)
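As a worked illustration of what the mixture implies: both components have mean zero, so the variance of e is the weighted average of the two component variances. The value h = 0.1 below is an assumption chosen for illustration (the default ve.epsilon is 0); ve = 1 and ve.contam = 41 are the defaults from the usage.

```latex
\operatorname{Var}(e) = (1-h)\,v_e + h\,v_e^{\text{contam}}
                      = 0.9 \cdot 1 + 0.1 \cdot 41
                      = 0.9 + 4.1
                      = 5
```

Under these assumed values, a 10% share of outliers raises the error variance from 1 to 5.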

Argument: ve.epsilon
scalar, defining the relative number of outliers (i.e., epsilon or h in the contamination mixture distribution). Typically, it takes values between 0 and 0.5, but it is not restricted to this interval (default: ve.epsilon=0)

Argument: vu
scalar, defining the (area-level) random-effect variance (default: vu=1)

Argument: vu.contam
scalar, defining the (area-level) random-effect variance of the outlier part in the contamination mixture distribution (cf. ve.contam; default: vu.contam=41)

Argument: vu.epsilon
scalar, defining the relative number of outliers in the contamination mixture distribution of the (area-level) random effects (cf. ve.epsilon; default: vu.epsilon=0)

----------Details----------

The function makedata generates synthetic datasets that may be used to study the behavior of different estimation methods. Let y_i denote the area-specific n_i-vector of the response variable for areas i=1,...,g. Define an (n_i \times p)-matrix X_i of realizations from the standard normal distribution, N(0,1), and let β denote a p-vector of regression coefficients. The y_i are then drawn from the law y_i \sim N(X_iβ, v_e I_i + v_u J_i), where v_e and v_u are the variances of the model error and the random effect, respectively, and I_i and J_i denote the (n_i \times n_i) identity matrix and matrix of ones, respectively.
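The law above is equivalent to the unit-level form y_{i,j} = x_{i,j}'β + u_i + e_{i,j}, where the random effect u_i is shared by all units of area i (which produces the v_u J_i term) and the errors e_{i,j} are independent (the v_e I_i term). The following base-R sketch simulates this form for a balanced design with one covariate plus an intercept, using the default values from the usage. It is an illustration of the model only, not the rsae implementation, and the variable names are assumptions.

```r
# Base-R sketch of the uncontaminated model; not the rsae implementation
set.seed(1024)
g <- 20; n <- 4              # number of areas, units per area (balanced)
intercept <- 1; beta <- 1    # fixed effects
ve <- 1; vu <- 1             # model-error and random-effect variances

area <- rep(seq_len(g), each = n)    # area index for every unit
x <- rnorm(g * n)                    # covariate drawn from N(0, 1)
u <- rnorm(g, sd = sqrt(vu))         # one random effect per area
e <- rnorm(g * n, sd = sqrt(ve))     # independent unit-level errors

# Unit-level response: fixed part + area effect + error
y <- intercept + beta * x + u[area] + e

sim <- data.frame(y = y, x = x, area = area)
head(sim)
```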

In addition, we allow the distributions of the model/residual error and of the area-level random effect to be contaminated (cf. Stahel and Welsh, 1997). Specifically, the laws of e_{i,j} and u_i are replaced by the Tukey-Huber contamination mixture:

e_{i,j} \sim (1-ε^{ve})N(0,v_e) + ε^{ve}N(0, v_e^{ε}),

u_{i} \sim (1-ε^{vu})N(0,v_u) + ε^{vu}N(0, v_u^{ε}),

where ε^{ve} and ε^{vu} regulate the degree of contamination; v_e^{ε} and v_u^{ε} define the variances of the contamination parts of the two mixture distributions.
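One way to draw from such a two-component mixture is to flag each draw as an outlier with probability ε and then sample from the normal distribution with the matching variance. The base-R sketch below does this; the helper name rcontam and its arguments are hypothetical, and the code is not taken from rsae.

```r
# Draw n values from (1 - eps) * N(0, v) + eps * N(0, v_contam).
# rcontam is a hypothetical helper, not part of rsae.
rcontam <- function(n, v, v_contam, eps) {
  is_outlier <- runif(n) < eps                          # TRUE with probability eps
  sd_i <- ifelse(is_outlier, sqrt(v_contam), sqrt(v))   # per-draw std. deviation
  rnorm(n, mean = 0, sd = sd_i)
}

set.seed(1)
e <- rcontam(10000, v = 1, v_contam = 41, eps = 0.1)
var(e)   # sample variance, inflated relative to v by the outlier component
```

The same helper could be used for the area-level effects u_i by passing the random-effect variances instead.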

Four different contamination setups are possible:

no contamination (i.e., ve.epsilon=vu.epsilon=0),

contaminated model error (i.e., ve.epsilon!=0 and vu.epsilon=0),

contaminated random effect (i.e., ve.epsilon=0 and vu.epsilon!=0),

both are contaminated (i.e., ve.epsilon!=0 and vu.epsilon!=0).
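The four setups can be written as argument settings for makedata, as in the sketch below. The epsilon value 0.1 is an illustrative assumption (any nonzero value switches contamination on), the setup names are hypothetical labels, and the calls are only made if the rsae package is installed.

```r
# Argument settings for the four contamination setups
# (0.1 is an assumed, illustrative contamination proportion)
setups <- list(
  none          = list(ve.epsilon = 0,   vu.epsilon = 0),
  model_error   = list(ve.epsilon = 0.1, vu.epsilon = 0),
  random_effect = list(ve.epsilon = 0,   vu.epsilon = 0.1),
  both          = list(ve.epsilon = 0.1, vu.epsilon = 0.1)
)

# Generate one synthetic dataset per setup, only if rsae is available;
# all other arguments keep their defaults
if (requireNamespace("rsae", quietly = TRUE)) {
  models <- lapply(setups, function(args) do.call(rsae::makedata, args))
}
```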

----------Value----------

An instance of the class saemodel.

----------Author(s)----------

Tobias Schoch

----------References----------

----------Examples----------

# load the package, then generate synthetic data with all defaults
library(rsae)
mymodel <- makedata()

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

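Notes:
Note 1: This document was translated by LoveR, the 生物统计家园网 robot, to make learning easier. It is intended only as a personal reference for learning R; 生物统计家园 retains the copyright.
Note 2: Because the translation was produced automatically, inaccuracies are unavoidable; check it carefully against the original English text.
Note 3: If you find an inaccuracy, reply to this thread and we will revise it over time.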
