MAsim.smyth(OCplus)
MAsim.smyth() belongs to the R package OCplus
Simulate two-sample microarray data
Description
These functions simulate two-sample microarray data from several different models.
Usage
MAsim(ng = 10000, n = 10, n1 = n, n2 = n, D = 1, p0 = 0.9, sigma = 1)
MAsim.var(ng = 10000, n = 10, n1 = n, n2 = n, D = 1, p0 = 0.9)
MAsim.smyth(ng = 10000, n = 10, n1 = n, n2 = n, p0 = 0.9, d0 = 4,
            s2_0 = 4, v0 = 2)
MAsim.real(xdat, grp, n, n1, n2, D = 1, p0 = 0.9, replace = TRUE)
Arguments
Argument: ng
number of genes
Argument: n, n1, n2
number of samples per group; by default balanced, except for MAsim.real.
Argument: p0
proportion of non-differentially expressed genes (with the default of 0.9, about 10% of genes are differentially expressed)
Argument: D
effect size for differentially expressed genes, in units of the gene-specific standard deviation (sigma in MAsim).
Argument: sigma
standard deviation, constant for all genes
Argument: d0, s2_0, v0
prior parameters for effect size and variability across genes in Smyth's model, see Details.
Argument: xdat, grp
expression data and grouping variable for an existing microarray data set, as specified in EOC.
Argument: replace
logical switch indicating whether to sub-sample (replace=FALSE) or bootstrap (replace=TRUE) from the existing data. Note that when sub-sampling, the specified group sizes have to be smaller than the real group sizes.
Details
MAsim simulates normally distributed data with a constant standard deviation sigma across genes and a fixed effect size D; the sign of the effect is split randomly, with equal probability, between up- and down-regulation, and effects are added to the second group. MAsim.var does the same, but instead of using a fixed variance across genes, it simulates gene-specific variances from a standard exponential distribution.
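The two models described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the OCplus implementation: the helper name `sim_two_sample`, the random per-gene draw of the DE status, and the column labels "0" and "1" are hypothetical choices, and `p0` is assumed to be the proportion of genes that are not differentially expressed.

```r
# Sketch only: base-R version of the two models described above.
# sim_two_sample() is a hypothetical helper, not part of OCplus.
sim_two_sample <- function(ng = 1000, n1 = 10, n2 = 10, D = 1, p0 = 0.9,
                           sigma = 1, gene_specific = FALSE) {
  # Per-gene standard deviation: constant (MAsim-like) or the square root of
  # a standard exponential variance (MAsim.var-like)
  sd_g <- if (gene_specific) sqrt(rexp(ng, rate = 1)) else rep(sigma, ng)
  # Normal noise; row g is scaled to standard deviation sd_g[g]
  x <- matrix(rnorm(ng * (n1 + n2)), nrow = ng) * sd_g
  # Assumption: p0 is the proportion of genes that are NOT DE
  de <- runif(ng) > p0
  # Random sign: up- or down-regulation with equal probability
  sgn <- sample(c(-1, 1), ng, replace = TRUE)
  # Effect of D gene-specific standard deviations, added to the second group only
  grp2 <- n1 + seq_len(n2)
  x[de, grp2] <- x[de, grp2] + (sgn * D * sd_g)[de]
  colnames(x) <- rep(c("0", "1"), c(n1, n2))
  attr(x, "DE") <- de
  x
}

set.seed(1)
sim_const <- sim_two_sample(ng = 500, n1 = 5, n2 = 5)
sim_var <- sim_two_sample(ng = 500, n1 = 5, n2 = 5, gene_specific = TRUE)
dim(sim_const)
table(attr(sim_var, "DE"))
```

Setting `gene_specific = TRUE` switches from the constant-variance model to the exponential-variance model; everything else stays the same.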
MAsim.smyth simulates from the model suggested in Smyth (2004), using a normal error distribution. The variances are assumed to follow an inverse chi-squared distribution with d0 degrees of freedom, scaled by s2_0; consequently, large values of d0 lead to similar variances across genes, whereas small values lead to very different variances between genes. The effect sizes for differentially expressed genes are assumed to follow a normal distribution with mean zero and variance equal to v0 times the previously simulated gene-specific variance; consequently, large values of v0 lead to large effects in the model.
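The hierarchical model described above can be sketched in base R as follows. This is an illustration, not the OCplus code: the helper name `sim_smyth_sketch` is hypothetical, the parameterisation of the scaled inverse chi-squared draw (`d0 * s2_0 / rchisq(ng, d0)`) is an assumption, as is adding the effects to the second group, and `p0` is assumed to be the proportion of genes that are not differentially expressed.

```r
# Sketch only: base-R version of the hierarchical model described above.
# sim_smyth_sketch() is a hypothetical helper, not part of OCplus.
sim_smyth_sketch <- function(ng = 1000, n1 = 10, n2 = 10, p0 = 0.9,
                             d0 = 4, s2_0 = 4, v0 = 2) {
  # Gene-specific variances: scaled inverse chi-squared with d0 degrees of
  # freedom (assumed parameterisation: sigma2 = d0 * s2_0 / chi-squared(d0))
  sigma2 <- d0 * s2_0 / rchisq(ng, df = d0)
  # Normal errors with the gene-specific standard deviation in each row
  x <- matrix(rnorm(ng * (n1 + n2)), nrow = ng) * sqrt(sigma2)
  # Assumption: p0 is the proportion of genes that are NOT DE
  de <- runif(ng) > p0
  # Effects for DE genes: normal, mean zero, variance v0 * sigma2
  effect <- ifelse(de, rnorm(ng, mean = 0, sd = sqrt(v0 * sigma2)), 0)
  # Assumption: effects are added to the second group
  grp2 <- n1 + seq_len(n2)
  x[, grp2] <- x[, grp2] + effect
  attr(x, "DE") <- de
  list(data = x, sigma2 = sigma2)
}

set.seed(1)
# Large d0 gives similar variances across genes, small d0 very different ones
spread_small_d0 <- sd(log(sim_smyth_sketch(d0 = 2)$sigma2))
spread_large_d0 <- sd(log(sim_smyth_sketch(d0 = 100)$sigma2))
c(small_d0 = spread_small_d0, large_d0 = spread_large_d0)
```

Comparing the spread of the log-variances for a small and a large d0 shows the behaviour stated in the text: the larger d0 is, the more alike the gene-wise variances become.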
Finally, MAsim.real uses existing data sets, real or simulated, to generate simulated data with fixed effect sizes: for each group, the specified number of samples is drawn either with or without replacement from the columns of xdat; for each gene, the group means are subtracted from the resampled data, so that the groupwise and overall means for each gene are zero. Then, noise from an appropriate t-distribution is added to each group to break the sum-to-zero constraint in a consistent manner. The specified effect (evenly split between up- and down-regulation) for the differentially expressed genes is again added to the second group.
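The resampling and centring steps described above can be sketched in base R as follows. The helper name `resample_centre` and the toy data are hypothetical, and this is not the OCplus implementation. The t-distributed noise and the added effect are deliberately left out, because the text does not specify the parameters of the t-distribution.

```r
# Sketch only: the resampling and centring steps described above, in base R.
# resample_centre() is a hypothetical helper, not part of OCplus.
resample_centre <- function(xdat, grp, n1, n2, replace = TRUE) {
  lev <- unique(grp)
  stopifnot(length(lev) == 2)
  # Draw the requested number of columns from each original group
  # (each group is assumed to contain more than one column)
  idx1 <- sample(which(grp == lev[1]), n1, replace = replace)
  idx2 <- sample(which(grp == lev[2]), n2, replace = replace)
  x1 <- xdat[, idx1, drop = FALSE]
  x2 <- xdat[, idx2, drop = FALSE]
  # Subtract the gene-wise group means, so each gene has mean zero per group
  x1 <- x1 - rowMeans(x1)
  x2 <- x2 - rowMeans(x2)
  out <- cbind(x1, x2)
  colnames(out) <- rep(as.character(lev), c(n1, n2))
  out
}

set.seed(1)
toy <- matrix(rnorm(50 * 12, mean = 8), nrow = 50)  # toy "existing" data
toy_grp <- rep(c(0, 1), each = 6)
boot <- resample_centre(toy, toy_grp, n1 = 8, n2 = 8, replace = TRUE)   # bootstrap
sub <- resample_centre(toy, toy_grp, n1 = 4, n2 = 4, replace = FALSE)  # sub-sample
max(abs(rowMeans(boot[, 1:8])))  # numerically zero after centring
```

With `replace = TRUE` the requested group sizes may exceed the original ones (here 8 from 6), whereas with `replace = FALSE` `sample()` would fail if more columns were requested than exist.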
Value
The functions all return a matrix with ng rows and n1+n2 columns, except for MAsim.real, where the default is to return a matrix of the same dimensions as xdat. The group membership of each column is given by its column name. The matrix additionally has the attribute DE, a logical vector specifying for each gene whether or not it was assumed to be differentially expressed in the simulation.
Examples
# Load the package that provides MAsim, EOC and OCshow
library(OCplus)
# Small examples only
sim1 = MAsim(ng=1000, n=10, p0=0.8)
sim2 = MAsim.var(ng=1000, n1=15, n2=5, p0=0.8)
sim3 = MAsim.smyth(ng=1000, n=10, p0=0.8)
# Assess FDR; the column names carry the group labels
eoc1 = EOC(sim1, colnames(sim1), plot=FALSE)
eoc2 = EOC(sim2, colnames(sim2), plot=FALSE)
eoc3 = EOC(sim3, colnames(sim3), plot=FALSE)
# Show the three results in a 2 x 2 layout, then all together
par(mfrow=c(2,2))
plot(eoc1)
plot(eoc2)
plot(eoc3)
OCshow(eoc1, eoc2, eoc3)
# The truth will make you fret: compare FDR calls with the simulated DE status
table(eoc1$FDR<0.1, attr(sim1, "DE"))