simEUSILC(simPopulation)
simEUSILC() belongs to the R package simPopulation
Simulate EU-SILC population data
----------Description----------
Simulate population data for the European Union Statistics on Income and Living Conditions (EU-SILC).
----------Usage----------
simEUSILC(dataS, hid = "db030", wh = "db090", wp = "rb050",
    hsize = NULL, strata = "db040", pid = NULL, age = "age",
    gender = "rb090", categorizeAge = TRUE, breaksAge = NULL,
    categorical = c("pl030", "pb220a"),
    income = "netIncome", method = c("multinom", "twostep"),
    breaks = NULL, lower = NULL, upper = NULL,
    equidist = TRUE, probs = NULL, gpd = TRUE,
    threshold = NULL, est = "moments", const = NULL,
    alpha = 0.01, residuals = TRUE,
    components = c("py010n", "py050n", "py090n",
        "py100n", "py110n", "py120n", "py130n", "py140n"),
    conditional = c(getCatName(income), "pl030"),
    keep = TRUE, maxit = 500, MaxNWts = 1500,
    tol = .Machine$double.eps^0.5, seed)
----------Arguments----------
Argument: dataS
a data.frame containing EU-SILC survey data.
Argument: hid
a character string specifying the column of dataS that contains the household ID.
Argument: wh
a character string specifying the column of dataS that contains the household sample weights.
Argument: wp
a character string specifying the column of dataS that contains the personal sample weights.
Argument: hsize
an optional character string specifying a column of dataS that contains the household size. If NULL, the household sizes are computed.
Argument: strata
a character string specifying the column of dataS that defines the strata. Note that this is currently a required argument and only one stratification variable is supported.
Argument: pid
an optional character string specifying a column of dataS that contains the personal ID.
Argument: age
a character string specifying the column of dataS that contains the age of the persons (to be used for setting up the household structure).
Argument: gender
a character string specifying the column of dataS that contains the gender of the persons (to be used for setting up the household structure).
Argument: categorizeAge
a logical indicating whether age categories, rather than individual ages, should be used when simulating the additional categorical and continuous variables, in order to decrease computation time.
Argument: breaksAge
numeric; if categorizeAge is TRUE, an optional vector of two or more break points for constructing age categories, otherwise ignored.
Argument: categorical
a character vector specifying additional categorical variables of dataS that should be simulated for the population data.
Argument: income
a character string specifying the variable of dataS that contains the personal income (to be simulated for the population data).
Argument: method
a character string specifying the method to be used for simulating personal income. Accepted values are "multinom" (for using multinomial log-linear models combined with random draws from the resulting categories) and "twostep" (for using two-step regression models combined with random error terms).
Argument: breaks
if method is "multinom", an optional numeric vector of two or more break points for categorizing the personal income. If missing, break points are computed using weighted quantiles.
Arguments: lower, upper
numeric values; if method is "multinom" and breaks is NULL, these can be used to specify lower and upper bounds other than the minimum and maximum, respectively. Note that if gpd is TRUE (see below), upper defaults to Inf.
Argument: equidist
logical; if method is "multinom" and breaks is NULL, this indicates whether the (positive) default break points should be equidistant or whether there should be refinements in the lower and upper tail (see getBreaks).
Argument: probs
numeric vector with values in [0, 1]; if method is "multinom" and breaks is NULL, this gives probabilities for quantiles to be used as (positive) break points. If supplied, this is preferred over equidist.
Argument: gpd
logical; if method is "multinom", this indicates whether the upper tail of the personal income should be simulated by random draws from a (truncated) generalized Pareto distribution rather than a uniform distribution.
Argument: threshold
a numeric value; if method is "multinom", values for categories above threshold are drawn from a (truncated) generalized Pareto distribution.
Argument: est
a character string; if method is "multinom", the estimator to be used to fit the generalized Pareto distribution (see fitgpd).
Argument: const
numeric; if method is "twostep", this gives a constant to be added to the income before the log transformation.
Argument: alpha
numeric; if method is "twostep", this gives trimming parameters for the sample data. Trimming is done with respect to the variable specified by additional, which is an argument of simContinuous rather than of simEUSILC; here it presumably corresponds to income. If a numeric vector of length two is supplied, the first element gives the trimming proportion for the lower part and the second element the trimming proportion for the upper part. If a single numeric is supplied, it is used for both. With NULL, trimming is suppressed.
Argument: residuals
logical; if method is "twostep", this indicates whether the random error terms should be obtained by draws from the residuals. If FALSE, they are drawn from a normal distribution (median and MAD of the residuals are used as parameters).
Argument: components
a character vector specifying the income components in dataS (to be simulated for the population data).
Argument: conditional
an optional character vector specifying categorical conditioning variables for resampling of the income components. The fractions occurring in dataS are then drawn from the respective subsets defined by these variables.
Argument: keep
a logical indicating whether variables computed internally in the procedure (such as the original IDs of the corresponding households in the underlying sample, age categories or income categories) should be stored in the resulting population data.
Arguments: maxit, MaxNWts
control parameters to be passed to multinom and nnet. See the help file for nnet.
Argument: tol
if method is "twostep", a small positive numeric value or NULL (see simContinuous).
Argument: seed
optional; an integer value to be used as the seed of the random number generator, or an integer vector containing the state of the random number generator to be restored.
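The two forms accepted by seed can be illustrated with base R alone. The following is a minimal sketch and not the package's code: it assumes that a single integer behaves like a call to set.seed() and that a longer integer vector behaves like a saved copy of .Random.seed. The helper draw_three() is hypothetical and stands in for any code that consumes random numbers.

```r
# Hypothetical stand-in for any code that consumes random numbers.
draw_three <- function() runif(3)

# Form 1: a single integer seed, set twice, gives identical draws.
set.seed(1234)
a1 <- draw_three()
set.seed(1234)
a2 <- draw_three()

# Form 2: save the full generator state, then restore it later.
set.seed(42)
state <- .Random.seed          # integer vector describing the RNG state
b1 <- draw_three()
assign(".Random.seed", state, envir = globalenv())  # restore the state
b2 <- draw_three()

same_from_seed  <- identical(a1, a2)
same_from_state <- identical(b1, b2)
print(c(seed = same_from_seed, state = same_from_state))
```

Restoring a saved state is useful when a simulation should continue from exactly the point at which an earlier run started, rather than from a fresh seed.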
----------Value----------
A data.frame containing the simulated EU-SILC population data.
----------Note----------
This is a wrapper calling simStructure, simCategorical, simContinuous and simComponents.
----------Author(s)----------
Andreas Alfons and Stefan Kraft
----------See Also----------
simStructure, simCategorical, simContinuous, simComponents
----------Examples----------
## Not run:
## these take some time and are not run automatically
## copy & paste to the R command line
library(simPopulation)  # provides simEUSILC()
set.seed(1234)  # for reproducibility
data(eusilcS)   # load sample data
# multinomial model with random draws
eusilcM <- simEUSILC(eusilcS, upper = 200000, equidist = FALSE)
summary(eusilcM)
# two-step regression
eusilcT <- simEUSILC(eusilcS, method = "twostep")
summary(eusilcT)
## End(Not run)
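For the two-step method, the residuals argument chooses between two ways of generating random error terms. The following base R sketch illustrates that choice on toy data. It is not the package's implementation: the simulated data, the single-predictor lm() fit and all variable names are assumptions made for illustration, and the first step of the two-step model as well as the trimming controlled by alpha are left out.

```r
set.seed(1)                                   # toy data only
n   <- 200
age <- sample(20:65, n, replace = TRUE)
log_income <- 9 + 0.01 * age + rnorm(n, sd = 0.4)

# Regression on the log scale (stand-in for the second step).
fit <- lm(log_income ~ age)
res <- residuals(fit)

# Analogue of residuals = TRUE: resample the observed residuals.
err_resampled <- sample(res, n, replace = TRUE)

# Analogue of residuals = FALSE: normal errors with the median of the
# residuals as location and their MAD as scale.
err_normal <- rnorm(n, mean = median(res), sd = mad(res))

# Add the error terms to the fitted values and back-transform.
sim_resampled <- exp(fitted(fit) + err_resampled)
sim_normal    <- exp(fitted(fit) + err_normal)
print(summary(sim_resampled))
print(summary(sim_normal))
```

Resampling keeps the empirical shape of the residuals, including skewness and outliers, whereas the normal draws depend only on two robust summaries of them.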