R sampling package: postest() function help documentation

loveR · posted 2012-9-29 21:49:11

postest(sampling)
R package providing postest(): sampling

The poststratified estimator

Chinese translation by: LoveR, the robot of 生物统计家园网 (biostatistic.net)

----------Description----------

Computes the poststratified estimator of the population total.
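
For orientation, the textbook form of a poststratified estimator of a total is sketched below. The help page itself does not print a formula, so this is an assumption about what is being computed rather than a quotation: $s_g$ (the sampled units falling in poststratum $g$), $y_k$, $\pi_k$ and $N_g$ are notation introduced here, corresponding to the arguments y, pik and NG.

```latex
\hat{Y}_{\mathrm{post}}
  = \sum_{g=1}^{G} N_g \,
    \frac{\sum_{k \in s_g} y_k / \pi_k}{\sum_{k \in s_g} 1 / \pi_k}
```

Within each poststratum the weighted sample mean is scaled up by the known population count $N_g$, and the results are summed over the $G$ poststrata.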

----------Usage----------

postest(data, y, pik, NG, description=FALSE)

----------Arguments----------

Argument: data
data frame or data matrix; its number of rows is n, the sample size.

Argument: y
vector of the variable of interest; its length is equal to n, the sample size.

Argument: pik
vector of the first-order inclusion probabilities for the sampled units; its length is equal to n, the sample size.

Argument: NG
vector of population frequencies (counts), one per group G; for stratified sampling with poststratification, NG is a matrix of population frequencies, one per cell GH.

Argument: description
if TRUE, the estimator is printed for each poststratum; by default, FALSE.
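
To make the roles of y, pik and NG concrete, here is a minimal base-R sketch of the textbook poststratified estimator. It does not call the sampling package and is not claimed to be how postest() is implemented: the function name poststratified_total, its group argument and the toy numbers are all hypothetical, introduced only for illustration.

```r
# Hypothetical helper, NOT part of the sampling package: a base-R sketch of
# the textbook poststratified estimator of a population total.
poststratified_total <- function(y, pik, group, NG) {
  # y     : variable of interest for the n sampled units
  # pik   : first-order inclusion probabilities of the sampled units
  # group : poststratum label of each sampled unit
  # NG    : named vector of population counts; names match the labels in group
  stopifnot(length(y) == length(pik), length(y) == length(group))
  labels <- as.character(unique(group))
  stopifnot(all(labels %in% names(NG)))
  per_group <- vapply(labels, function(g) {
    in_g <- as.character(group) == g
    # weighted total divided by weighted count, scaled by the known count N_g
    NG[[g]] * sum(y[in_g] / pik[in_g]) / sum(1 / pik[in_g])
  }, numeric(1))
  sum(per_group)
}

# Toy data (made up): 4 sampled units, equal inclusion probabilities,
# two poststratum labels "a" and "b"
y     <- c(10, 20, 30, 50)
pik   <- rep(0.1, 4)
group <- c("a", "a", "b", "b")
NG    <- c(a = 30, b = 10)
estimate <- poststratified_total(y, pik, group, NG)
print(estimate)  # 30 * 15 + 10 * 40 = 850, up to floating-point rounding
```

With equal inclusion probabilities the ratio inside each poststratum reduces to the plain sample mean, which is why the toy result is simply 30 times 15 plus 10 times 40.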

----------See Also----------

poststrata

----------Examples----------

############
## Example 1
############
# stratified sampling and poststratification
# load the package that provides strata(), getdata(), poststrata(), postest()
library(sampling)
# Swiss municipalities data base
data(swissmunicipalities)
attach(swissmunicipalities)
# the variable 'REG' has 7 categories in the population
# it is used as the stratification variable
# computes the population stratum sizes
table(swissmunicipalities$REG)
# not run
#   1   2   3   4   5   6   7
# 589 913 321 171 471 186 245
# the sample stratum sizes are given by size=c(30,20,45,15,20,11,44)
# the method is simple random sampling without replacement
st=strata(swissmunicipalities,stratanames=c("REG"),
size=c(30,20,45,15,20,11,44), method="srswor")
# extracts the observed data
# the order of the columns is different from the order in the initial database
x=getdata(swissmunicipalities, st)
px=poststrata(x,"REG")
# computes the population frequency in each poststratum
ct=unique(px$data$REG)
yy=numeric(length(ct))
for(i in seq_along(ct))
  {
  xx=swissmunicipalities[REG==ct[i],]
  yy[i]=nrow(xx)
  }
yy
postest(px$data,y=px$data$Pop020,pik=px$data$Prob,NG=diag(yy),description=TRUE)
HTstrata(x$Pop020,x$Prob,x$Stratum)
# the two estimators are equal
############
## Example 2
############
# systematic sampling and poststratification
# Belgian municipalities data base
data(belgianmunicipalities)
Tot=belgianmunicipalities$Tot04
name=belgianmunicipalities$Commune
pik=inclusionprobabilities(Tot,200)
# selects a sample
s=UPsystematic(pik)
# the sample is
as.vector(name[s==1])
# extracts the observed data
b=getdata(belgianmunicipalities,s)
attach(belgianmunicipalities)
pb=poststrata(b,"Province")
# computes the population frequency in each group
ct=unique(pb$data$Province)
yy=numeric(length(ct))
for(i in seq_along(ct))
  {
  xx=belgianmunicipalities[Province==ct[i],]
  yy[i]=nrow(xx)
  }
postest(pb$data,y=pb$data$TaxableIncome,pik=pik[s==1],NG=yy,description=TRUE)
HTestimator(pb$data$TaxableIncome,pik=pik[s==1])
############
## Example 3
############
# cluster sampling and poststratification
# Swiss municipalities data base
data(swissmunicipalities)
# the variable 'REG' has 7 categories in the population
# it is used as the clustering variable
# the sample size is 3; the method is simple random sampling without replacement
cl=cluster(swissmunicipalities,clustername=c("REG"),size=3,method="srswor")
# extracts the observed data
# the order of the columns is different from the order in the initial database
# (named xc rather than c to avoid shadowing the base function c())
xc=getdata(swissmunicipalities, cl)
pc=poststrata(xc,"CT")
# computes the population frequency in each group
# (CT is referenced through the data frame, so no attach() is required)
ct=unique(pc$data$CT)
yy=numeric(length(ct))
for(i in seq_along(ct))
  {
  xx=swissmunicipalities[swissmunicipalities$CT==ct[i],]
  yy[i]=nrow(xx)
  }
postest(pc$data,y=pc$data$Pop020,pik=pc$data$Prob,NG=yy,description=TRUE)
############
## Example 4
############
# poststratification with two criteria
# artificial data frame
data=rbind(matrix(rep("nc",165),165,1,byrow=TRUE),matrix(rep("sc",70),70,1,byrow=TRUE))
data=cbind.data.frame(data,c(rep(1,100), rep(2,50), rep(3,15), rep(1,30),rep(2,40)),
1000*runif(235))
names(data)=c("state","region","income")
# computes the population stratum sizes
table(data$region,data$state)
# not run
#     nc  sc
#  1 100  30
#  2  50  40
#  3  15   0
# selects a sample of size 10
s=srswor(10,nrow(data))
# poststratification using region and state
ps=poststrata(data[s==1,],c("region","state"))
# computes the population frequency in each group
ct=unique(ps$data$poststratum)
yy=numeric(length(ct))
for(i in seq_along(ct))
  {
  # sampled units in poststratum i, and the state/region pair that defines it
  xy=ps$data[ps$data$poststratum==ct[i],]
  xstate=unique(xy$state)
  xregion=unique(xy$region)
  # population units with the same state/region pair
  xx=data[data$state==xstate & data$region==xregion,]
  yy[i]=nrow(xx)
  }
postest(ps$data,y=ps$data$income,pik=rep(10/nrow(data),10),NG=yy,description=TRUE)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

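Notes:
Note 1: For the convenience of learners, the Chinese translation of this document was produced by LoveR, the robot of 生物统计家园网. It is intended only as a personal reference for learning R; 生物统计家园 reserves the copyright.
Note 2: Because the translation was produced automatically, some inaccuracies are unavoidable. Reading the Chinese and English text side by side can help when learning R.
Note 3: If you find an inaccuracy, please reply to this post and it will be revised over time.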
