rTranscriptData(simone)
rTranscriptData() belongs to the R package simone
Simulation of artificial transcriptomic data
Translator: 生物统计家园网 (biostatistic.net), robot LoveR
----------Description----------
Simulates a Gaussian sample that mimics transcriptomic data, according to a given network, either steady-state or time-course data. When several networks are given, multiple samples are generated.
----------Usage----------
rTranscriptData(n,
                graph,
                ...,
                mu = rep(0, p),
                sigma = 0.1)
----------Arguments----------
Argument: n
integer or vector of integers giving the sample size of each task
Argument: graph
a simone.network object, typically generated by either rNetwork or coNetwork
Argument: ...
additional simone.network objects, in case of multiple sample generation
Argument: mu
if the network(s) is (are) directed, mu is the offset of the VAR(1) model that is used to generate the time-course data; if undirected, mu is the offset of the Gaussian vector.
Argument: sigma
standard deviation of the noise term used in the simulation process
----------Details----------
If the network is directed, time-course data are simulated according to a VAR(1) model. If the network is undirected, steady-state data are generated by simulating an independent, identically distributed sample of a Gaussian vector.
In both cases, samples are generated on the basis of Θ, as provided by graph$Theta.
If the network is directed, samples are generated according to the following VAR(1) process:
$X_0 \sim \mathcal{N}(0, \sigma)$,
$X_t = \mu + \Theta X_{t-1} + \varepsilon_t$, for all $t = 1, \dots, n$,
$\varepsilon_t \sim \mathcal{N}(0, \sigma)$.
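The VAR(1) recursion for directed networks can be sketched in base R as follows. This is only an illustration of the process described here, not the code used by simone: the helper name simulate_var1 and the toy Theta are hypothetical, sigma is treated as a standard deviation (as stated in the Arguments section), and dropping the initial state X_0 from the returned matrix is an assumption.

```r
# Sketch only: illustrates the VAR(1) recursion, not simone's implementation.
# simulate_var1 is a hypothetical helper name.
simulate_var1 <- function(n, Theta, mu = rep(0, nrow(Theta)), sigma = 0.1) {
  p <- nrow(Theta)
  X <- matrix(NA_real_, nrow = n + 1, ncol = p)
  # initial state X_0, with sigma used as a standard deviation
  X[1, ] <- rnorm(p, mean = 0, sd = sigma)
  for (step in seq_len(n)) {
    eps <- rnorm(p, mean = 0, sd = sigma)                    # noise term
    X[step + 1, ] <- mu + as.vector(Theta %*% X[step, ]) + eps
  }
  # assumption: return only X_1, ..., X_n (one time point per row)
  X[-1, , drop = FALSE]
}

set.seed(42)
# toy 3-gene matrix Theta, chosen arbitrarily for illustration
Theta <- matrix(0, 3, 3)
Theta[2, 1] <- 0.5
Theta[3, 2] <- -0.4
X_var <- simulate_var1(20, Theta)
dim(X_var)
```

Each row of the result is one time point, which matches the "observations in rows, genes in columns" layout of the returned X.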
If the network is undirected, samples are generated according to the following Gaussian vector:
$X_i = \mu + (\Theta^{-1/2})^\top U_i + \varepsilon_i$, for all $i = 1, \dots, n$,
$U_i \sim \mathcal{N}(0, 1)$,
$\varepsilon_i \sim \mathcal{N}(0, \sigma)$.
Numerically, $\Theta^{-1/2}$ is computed with the Cholesky decomposition of the pseudo-inverse of $\Theta$.
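The steady-state case can be sketched in the same spirit. The code below is an assumption-laden illustration rather than simone's source: simulate_gaussian is a hypothetical helper, the toy Theta is chosen positive definite so that chol() succeeds, and the pseudo-inverse is taken with MASS::ginv (MASS ships with standard R installations).

```r
# Sketch only: illustrates the Gaussian-vector simulation, not simone's code.
# simulate_gaussian is a hypothetical helper name.
simulate_gaussian <- function(n, Theta, mu = rep(0, nrow(Theta)), sigma = 0.1) {
  p <- nrow(Theta)
  # upper-triangular R with t(R) %*% R equal to the pseudo-inverse of Theta
  R <- chol(MASS::ginv(Theta))
  U <- matrix(rnorm(n * p), nrow = n, ncol = p)              # rows are U_i
  E <- matrix(rnorm(n * p, sd = sigma), nrow = n, ncol = p)  # rows are eps_i
  # row i of U %*% R is the transpose of t(R) %*% U_i
  sweep(U %*% R, 2, mu, "+") + E
}

set.seed(1)
# toy positive definite Theta, chosen arbitrarily for illustration
Theta_u <- matrix(c( 2.0, -0.5,  0.0,
                    -0.5,  2.0, -0.5,
                     0.0, -0.5,  2.0), nrow = 3, byrow = TRUE)
X_gauss <- simulate_gaussian(5000, Theta_u, sigma = 0)
# with sigma = 0 the empirical covariance should be close to solve(Theta_u)
round(cov(X_gauss) - solve(Theta_u), 2)
```

With sigma = 0 the sample covariance approaches the inverse of Theta, which is why the steady-state example further down can recover Theta by regression.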
----------Value----------
Returns a list comprising:
X: matrix of simulated gene expression data, n observations in rows, genes in columns
tasks: factor indicating the tasks corresponding to the simulated gene expression data in case of multiple networks.
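When several networks are simulated, the rows of X can be separated by task using the tasks factor. The sketch below uses a hypothetical stand-in list (random numbers, not real rTranscriptData output) and assumes that tasks has one entry per row of X, as the multitask example suggests.

```r
# Hypothetical stand-in for the returned list; values are random placeholders.
set.seed(1)
sim <- list(
  X     = matrix(rnorm(40 * 5), nrow = 40, ncol = 5),
  tasks = factor(rep(1:2, each = 20))
)

# split the row indices by task, then extract one matrix per task
per_task <- lapply(split(seq_len(nrow(sim$X)), sim$tasks),
                   function(rows) sim$X[rows, , drop = FALSE])
sapply(per_task, dim)
```

Indexing through the list avoids attach(), so X and tasks never mask objects of the same name in the workspace.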
----------Author(s)----------
J. Chiquet, C. Charbonnier
----------See Also----------
rNetwork, coNetwork.
----------Examples----------
## time-course data generation
##-----------------------------
# generate a directed network
n <- 20
p <- 5
g <- rNetwork(p, pi=5, directed=TRUE)
# generate the data, data2 noisier than data1
data1 <- rTranscriptData(n, g)
data2 <- rTranscriptData(n, g, sigma = 1)
matplot(1:n, data1$X, type = "l", xlab = "time points",
        ylab = "level of expression", col = rainbow(n, start = 2/6, end = 3/6),
        ylim = range(c(data1$X, data2$X)),
        main = "data2 (blue) generated with more noise than data1 (green)")
matlines(1:n, data2$X, type = "l", col = rainbow(n, start = 4/6, end = 5/6))
## steady-state data generation
##-----------------------------
# generate an undirected network
p <- 10
g <- rNetwork(p, pi=10)
data <- rTranscriptData(n=1000,g, sigma=0)
attach(data)
# Inference of Theta (here without dimension problems since p << n)
# each column x holds minus the least-squares coefficients of X[, x]
# regressed on the other variables, with 1 on the diagonal
b <- sapply(1:p, function(x) {
  tmp <- -solve(t(X[, -x]) %*% X[, -x]) %*% t(X[, -x]) %*% X[, x]
  res <- rep(NA, p)
  res[-x] <- tmp
  res[x] <- 1
  return(res)
})
detach(data)
# comparison of theoretical Theta and inferred Theta
par(mfrow=c(1,2))
image(g$Theta, main = "Theoretical Theta")
image(b, main = "Inferred Theta")
## time-course multitask data generation
##--------------------------------------
# start by generating the networks
ancestor <- rNetwork(p=5, pi=5, name="ancestor", directed=TRUE)
child1 <- coNetwork(ancestor, 1, name = "child 1")
child2 <- coNetwork(ancestor, 1, name = "child 2")
# generate the data
n <- c(20,20)
data <- rTranscriptData(n,child1,child2)
attach(data)
par(mfrow=c(2,1))
matplot(1:n[1], X[tasks == 1, ], type = "l", main = "Dataset from child 1",
        xlab = "time points", ylab = "level of expression")
matplot(1:n[2], X[tasks == 2, ], type = "l", main = "Dataset from child 2",
        xlab = "time points", ylab = "level of expression")
detach(data)
When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).