contaminate(simFrame)
contaminate() belongs to the R package: simFrame
Contaminate data
Description
Generic function for contaminating data.
Usage
contaminate(x, control, ...)
## S4 method for signature 'data.frame,ContControl'
contaminate(x, control, i)
Arguments
x: the data to be contaminated.
control: a control object of a class inheriting from the virtual class "VirtualContControl", or a character string specifying such a control class (the default being "DCARContControl").
i: an integer giving the element of the slot epsilon of control to be used as contamination level.
...: if control is a character string or missing, the slots of the control object may be supplied as additional arguments. See "DCARContControl" and "DARContControl" for details on the slots.
Details
With the control classes implemented in simFrame, contamination is modeled as a two-step process. The first step is to select the observations to be contaminated; the second is to model the distribution of the outliers.
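The two steps can be illustrated with a minimal base-R sketch. This is not simFrame's internal code: the data set, the column name income and the parameter values are assumptions made up for illustration, and selection is done completely at random.

```r
# Base-R sketch of the two-step idea (illustrative, not simFrame internals)
set.seed(1)
x <- data.frame(id = 1:20, income = rnorm(20, mean = 20000, sd = 3000))
epsilon <- 0.1                      # assumed contamination level

# Step 1: select the observations to be contaminated (completely at random)
n <- nrow(x)
k <- round(epsilon * n)             # number of observations to contaminate
ind <- sample(n, k)

# Step 2: model the outliers by drawing from a separate distribution
x$income[ind] <- rnorm(k, mean = 5e+05, sd = 10000)

# Logical indicator marking the contaminated rows
x$.contaminated <- seq_len(n) %in% ind
```

Keeping the selection and the outlier distribution separate is what lets one be changed without touching the other.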
To extend the framework with a user-defined control class "MyContControl" (which must extend "VirtualContControl"), a method contaminate(x, control, i) with signature 'data.frame, MyContControl' needs to be implemented. If the contaminated observations need to be identified at a later stage of the simulation, e.g., to avoid conflicts when inserting missing values, a logical indicator variable ".contaminated" should be added to the returned data set.
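A rough sketch of such an extension follows. It assumes that "VirtualContControl" provides the slots target and epsilon and that they can be read with @. The slot multiplier and the rule of multiplying the selected values are hypothetical choices for illustration, not part of simFrame.

```r
library(simFrame)

# Hypothetical control class; 'multiplier' is an illustrative extra slot
setClass("MyContControl",
    representation(multiplier = "numeric"),
    prototype(multiplier = 10),
    contains = "VirtualContControl")

setMethod("contaminate",
    signature(x = "data.frame", control = "MyContControl"),
    function(x, control, i = 1) {
        eps <- control@epsilon[i]          # contamination level to use
        target <- control@target           # assumed: names of target columns
        if (is.null(target)) target <- names(x)[sapply(x, is.numeric)]
        n <- nrow(x)
        k <- round(eps * n)
        ind <- sample(n, k)                # step 1: select observations
        x$.contaminated <- seq_len(n) %in% ind
        if (k > 0) {                       # step 2: create the outliers
            x[ind, target] <- x[ind, target] * control@multiplier
        }
        x
    })

# Usage on simulated data
set.seed(123)
dat <- data.frame(V1 = rnorm(10), V2 = rnorm(10))
ctrl <- new("MyContControl", target = "V2", epsilon = 0.2, multiplier = 100)
res <- contaminate(dat, ctrl)
```

The returned data set carries the ".contaminated" indicator, so later simulation steps can tell which rows were altered.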
Value
A data.frame containing the contaminated data. In addition, the column ".contaminated", which consists of logicals indicating the contaminated observations, is added to the data.frame.
Methods
x = "data.frame", control = "character": contaminate data using a control class specified by the character string control. The slots of the control object may be supplied as additional arguments.
x = "data.frame", control = "ContControl": contaminate data as specified by the control object control.
x = "data.frame", control = "missing": contaminate data using a control object of class "ContControl". Its slots may be supplied as additional arguments.
Note
Since version 0.3, contaminate no longer checks whether the auxiliary variable with probability weights is numeric and contains only finite positive values (sample still throws an error in these cases). This check was removed to improve computational performance in simulation studies.
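Callers who still want this validation can run it once before the simulation starts. The helper below is a minimal sketch; check_weights is a hypothetical name and not a simFrame function.

```r
# Hypothetical helper: TRUE if 'w' is usable as probability weights,
# i.e. numeric, non-empty, and containing only finite positive values
check_weights <- function(w) {
    is.numeric(w) && length(w) > 0 && all(is.finite(w)) && all(w > 0)
}

ok  <- check_weights(c(0.2, 1.5, 3))   # valid weights
bad <- check_weights(c(1, NA, 2))      # a missing value makes them invalid
```

Checking once up front avoids repeating the test in every simulation run, in keeping with the performance motivation above.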
Author(s)
Andreas Alfons
References
Statistical Simulation: The R Package simFrame. Journal of Statistical Software, 37(3), 1–36. URL http://www.jstatsoft.org/v37/i03/.
Package simFrame for Statistical Simulation. In Aivazian, S., Filzmoser, P. and Kharin, Y. (editors) Computer Data Analysis and Modeling: Complex Stochastic Data and Systems, volume 2, 178–181. Minsk. ISBN 978-985-476-848-9.
Multivariate Outlier Detection in Incomplete Survey Data. Survey Methodology, 34(1), 91–103.
Data. 57th Session of the International Statistical Institute, Durban.
See Also
"DCARContControl", "DARContControl", "ContControl", "VirtualContControl"
Examples
library(simFrame)

## distributed completely at random
data(eusilcP)
sam <- draw(eusilcP[, c("id", "eqIncome")], size = 20)

# using a control object
dcarc <- ContControl(target = "eqIncome", epsilon = 0.05,
    dots = list(mean = 5e+05, sd = 10000), type = "DCAR")
contaminate(sam, dcarc)

# supply slots of control object as arguments
contaminate(sam, target = "eqIncome", epsilon = 0.05,
    dots = list(mean = 5e+05, sd = 10000))

## distributed at random
require(mvtnorm)
mean <- rep(0, 2)
sigma <- matrix(c(1, 0.5, 0.5, 1), 2, 2)
foo <- generate(size = 10, distribution = rmvnorm,
    dots = list(mean = mean, sigma = sigma))

# using a control object
darc <- DARContControl(target = "V2",
    epsilon = 0.2, fun = function(x) x * 100)
contaminate(foo, darc)

# supply slots of control object as arguments
contaminate(foo, "DARContControl", target = "V2",
    epsilon = 0.2, fun = function(x) x * 100)