R language, SeleMix package: help documentation for the sel.edit() function

loveR · posted 2012-9-30 00:29:56

sel.edit(SeleMix)
sel.edit() belongs to the R package: SeleMix

                                       Influential Error Detection

                                       Translator: LoveR, the biostatistic.net robot

----------Description----------

Computes the score function and identifies influential errors.

----------Usage----------

   sel.edit(y, ypred, wgt=rep(1, nrow(as.matrix(y))),
            tot=colSums(ypred * wgt), t.sel=0.01)

----------Arguments----------

Argument: y
matrix or data frame containing the response variables

Argument: ypred
matrix of predicted values for the y variables

Argument: wgt
optional vector of sampling weights (default=1)

Argument: tot
optional vector containing reference estimates of the totals for the y variables. If omitted, it is computed as the (possibly weighted) sum of the predicted values

Argument: t.sel
threshold value for selective editing (default=0.01)

----------Details----------

This function ranks observations (rank) according to the importance of their potential errors. The ordering is based on the values of the global score function (global.score). The function also selects the units to be edited (sel) so that the expected residual error of every variable is below a predefined level of accuracy (t.sel). The global score (global.score) is the maximum of the local scores computed for each variable (y1.score, y2.score,...). Each local score is defined as the weighted (weights) absolute difference between the observed values (y1, y2,...) and the predicted values (y1.p, y2.p,...), standardised with respect to the reference total estimates (tot).
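The score definitions above can be sketched in base R. This is only a minimal illustration on made-up toy data, assuming the local score is wgt * |y - ypred| / tot as the paragraph describes; it does not reproduce the SeleMix source code, and all object names are illustrative.

```r
# Toy data (invented for illustration): 5 units, 2 variables
y <- matrix(c(10, 12, 50, 11,  9,
              20, 22, 21, 80, 19), ncol = 2,
            dimnames = list(NULL, c("y1", "y2")))
ypred <- matrix(c(10, 11, 12, 11, 10,
                  20, 21, 21, 22, 20), ncol = 2,
                dimnames = list(NULL, c("y1", "y2")))

wgt <- rep(1, nrow(y))        # unit sampling weights (the documented default)
tot <- colSums(ypred * wgt)   # reference totals (the documented default)

# Local scores: weighted absolute differences, divided column by column by tot
local.score <- sweep(wgt * abs(y - ypred), 2, tot, "/")

# Global score: the largest local score of each unit
global.score <- apply(local.score, 1, max)

# Units listed from most to least important
ord <- order(global.score, decreasing = TRUE)
print(round(local.score, 4))
print(ord)
```

In this toy set, unit 3 has a large discrepancy on y1 and unit 4 on y2, so those two units come first in the ordering.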

The units to be edited because they are affected by an influential error (sel=1) are selected with a two-step algorithm:
1) order the observations by global.score (decreasing order);
2) select the first k units such that, from the (k+1)th to the last observation, all the residual errors (y1.reserr, y2.reserr,...) for each variable are below t.sel.

The function also provides indicator flags (y1.sel, y2.sel,...) reporting which variables contain an influential error in a unit selected for revision.
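The two-step selection can be sketched as follows. The helper name select_units is hypothetical, and the residual error used here is an assumption: for each position in the ordering, it is taken as the absolute sum of the weighted differences of that unit and all lower-ranked units, divided by tot. The package may compute its residual errors differently, so treat this as an illustration of the selection logic rather than the SeleMix implementation.

```r
# Hypothetical helper, not part of SeleMix
select_units <- function(y, ypred, wgt = rep(1, nrow(as.matrix(y))),
                         tot = colSums(as.matrix(ypred) * wgt), t.sel = 0.01) {
  y <- as.matrix(y)
  ypred <- as.matrix(ypred)
  n <- nrow(y)

  err <- wgt * (y - ypred)                      # signed weighted differences
  score <- sweep(abs(err), 2, tot, "/")         # local scores
  global.score <- apply(score, 1, max)

  # Step 1: order the units by decreasing global score
  ord <- order(global.score, decreasing = TRUE)
  err.ord <- err[ord, , drop = FALSE]

  # Assumed residual error at each position: error left in the total
  # if that unit and all lower-ranked units stay unedited
  tail.sum <- matrix(apply(err.ord, 2, function(e) rev(cumsum(rev(e)))),
                     nrow = n)
  reserr <- sweep(abs(tail.sum), 2, tot, "/")

  # Step 2: k is the last position where any variable reaches t.sel,
  # so every position from k + 1 onwards is below the threshold
  exceeds <- apply(reserr >= t.sel, 1, any)
  k <- if (any(exceeds)) max(which(exceeds)) else 0

  rank <- integer(n)
  rank[ord] <- seq_len(n)
  cbind(global.score = global.score, rank = rank, sel = as.integer(rank <= k))
}

# Toy data (invented for illustration)
y <- cbind(y1 = c(10, 12, 50, 11, 9), y2 = c(20, 22, 21, 80, 19))
ypred <- cbind(y1 = c(10, 11, 12, 11, 10), y2 = c(20, 21, 21, 22, 20))
res <- select_units(y, ypred, t.sel = 0.05)
print(res)
```

With the loose toy threshold t.sel = 0.05, only the two units with large discrepancies (units 3 and 4) are flagged for editing.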

----------Value----------

sel.edit returns a data matrix containing the following columns:

y1, y2,...                 observed variables
y1.p, y2.p,...             predictions of y variables
weights                    sampling weights
y1.score, y2.score,...     local scores
global.score               global score
y1.reserr, y2.reserr,...   residual errors
y1.sel, y2.sel,...         influential error flags
rank                       rank according to global score
sel                        1 if the observation contains an influential error, 0 otherwise

----------Author(s)----------

M. Teresa Buglielli <bugliell@istat.it>, Ugo Guarnera <guarnera@istat.it>

----------References----------

Di Zio, M., Guarnera, U., Luzi, O. (2008) "Contamination models for the detection of outliers and influential errors in continuous multivariate data", UNECE Work Session on Statistical Data Editing, Vienna, 21-23 April 2008 (http://www.unece.org/stats/documents/2008.04.sde.htm).
Buglielli, M.T., Di Zio, M., Guarnera, U. (2010) "Use of Contamination Models for Selective Editing", European Conference on Quality in Survey Statistics Q2010, Helsinki, 4-6 May 2010.

----------Examples----------

library(SeleMix)

# Example 1: one response variable Y1 with one covariate X1
data(ex1.data)
ml.par <- ml.est(y=ex1.data[,"Y1"], x=ex1.data[,"X1"])
sel <- sel.edit(y=ex1.data[,"Y1"], ypred=ml.par$ypred)
head(sel)
# number of units flagged as containing an influential error
sum(sel[,"sel"])
# order the results by decreasing importance of the score
sel.ord <- sel[order(sel[,"rank"]), ]
# add the outlier and selection columns to the data
ex1.data <- cbind(ex1.data, tau=ml.par$tau, outlier=ml.par$outlier,
                  sel[,c("rank", "sel")])
# plot the data, marking outliers and influential errors
sel.pairs(ex1.data[,c("X1","Y1")], outl=ml.par$outlier, sel=sel[,"sel"])

# Example 2: several response variables, no covariates
data(ex2.data)
ml.par <- ml.est(y=ex2.data)
sel <- sel.edit(y=ex2.data, ypred=ml.par$ypred)
sel.pairs(ex2.data, outl=ml.par$outlier, sel=sel[,"sel"])

