R timecourse package: help documentation for the mb.long() function

loveR · posted 2012-2-26 15:37:36

mb.long(timecourse)
mb.long() belongs to the R package: timecourse

Multivariate Empirical Bayes Statistics for Longitudinal Replicated Developmental Microarray Time Course Data


----------Description----------

Computes the \tilde{T}^2 statistics and/or the MB-statistics of differential expression for longitudinal replicated developmental microarray time course data by multivariate empirical Bayes shrinkage of gene-specific sample variance-covariance matrices.

----------Usage----------

mb.long(object, method = c("1D", "paired", "2D"), type = c("none", "robust"),
times, reps, prior.df = NULL, prior.COV = NULL,
prior.eta = NULL, condition.grp = NULL, rep.grp = NULL, time.grp = NULL,
one.sample = FALSE, ref = NULL, p = 0.02, out.t = FALSE,
tuning = 1.345, HotellingT2.only=TRUE)

----------Arguments----------

Argument: object
Required. An object of class matrix, MAList, marrayNorm, or ExpressionSet containing log-ratios or log-values of expression for a series of microarrays.

Argument: method
A character string. "1D" is for the one-sample case where the genes of interest are those that change over time. "paired" is for either the one-sample case where the genes of interest are those whose expected temporal profiles do not stay at 0 (for example, cDNA microarrays), or the paired two-sample case where the genes of interest are those with different expected temporal profiles across 2 biological conditions. "2D" is for the independent two-sample case where the genes of interest are those with different expected temporal profiles across 2 biological conditions. The default is "1D".

Argument: type
A character string indicating whether possible outliers should be down-weighted: "none" (the default) or "robust".

Argument: times
Required. A positive integer giving the number of time points.

Argument: reps
Required. A numeric vector or matrix giving the sample sizes for all genes across the biological conditions, with the biological conditions sorted in ascending order. If a matrix, rows represent genes and columns represent biological conditions.

Argument: prior.df
An optional positive value giving the degrees of moderation.

Argument: prior.COV
An optional numeric matrix giving the common covariance matrix toward which the gene-specific sample covariances are smoothed.

Argument: prior.eta
An optional numeric value giving the scale parameter for the covariance matrix of the expected time course profile.

Argument: condition.grp
A numeric or character vector with length equal to the number of arrays, assigning the biological condition group of each array. Required if method="2D".

Argument: rep.grp
An optional numeric or character vector with length equal to the number of arrays, assigning the replicate group of each array.

Argument: time.grp
An optional numeric vector with length equal to the number of arrays, assigning the time point group of each array.

Argument: one.sample
Logical. Is it a one-sample problem? Specify this argument only when method="paired". The default is FALSE, which means it is a paired two-sample problem.

Argument: ref
An optional numeric value or character string specifying the name of the reference biological condition. The default uses the first element of condition.grp. Specify this argument only when method="paired" and one.sample is FALSE.

Argument: p
A numeric value between 0 and 1 giving the assumed proportion of genes that are differentially expressed.

Argument: out.t
Logical. Should the moderated multivariate t-statistics be output? The default is FALSE.

Argument: tuning
The tuning constant for the Huber weight function, with a default of 1.345.

Argument: HotellingT2.only
Logical. Should only the HotellingT2 statistics be output? This should be set to TRUE (the default) when the sample size(s) are the same across genes, in order to reduce computational time.
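
As a rough illustration of the reps argument when sample sizes differ between conditions, the sketch below lays out a reps matrix whose columns follow the ascending (sorted) order of the condition labels. The helper name build.reps is hypothetical and not part of timecourse, and it assumes every gene has the same number of replicates.

```r
# Hypothetical helper (not part of timecourse): build a reps matrix whose
# columns follow the sorted order of the condition labels.
build.reps <- function(condition.grp, times, n.genes) {
  # arrays per condition, sorted by label; each replicate contributes `times` arrays
  n.rep <- table(condition.grp) / times
  matrix(as.integer(n.rep), nrow = n.genes, ncol = length(n.rep),
         byrow = TRUE, dimnames = list(NULL, names(n.rep)))
}

# 3 wildtype replicates and 4 mutant replicates, 5 time points each
trt <- c(rep(c("wildtype", "mutant"), each = 15), rep("mutant", 5))
size <- build.reps(trt, times = 5, n.genes = 500)
size[1, ]  # mutant = 4, wildtype = 3, because "mutant" sorts before "wildtype"
```

The resulting matrix has one row per gene and one column per condition, which is the shape the reps description asks for.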

----------Details----------

This function implements the multivariate empirical Bayes statistics described in Tai and Speed (2004) to rank genes in order of interest from longitudinal replicated developmental microarray time course experiments. Depending on the method used, it calls one of the following functions: mb.1D, mb.paired, or mb.2D.
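
The dispatch on method can be pictured with a small base-R sketch. dispatch.method is a hypothetical stand-in, not the package's source code. It assumes the method value is resolved with match.arg(), which would be consistent with the examples on this page passing abbreviations such as method="2".

```r
# Hypothetical stand-in for the dispatch step (not timecourse source code).
dispatch.method <- function(method = c("1D", "paired", "2D")) {
  # match.arg() returns the first choice by default and accepts unique prefixes
  method <- match.arg(method)
  switch(method,
         "1D"     = "mb.1D",
         "paired" = "mb.paired",
         "2D"     = "mb.2D")
}

dispatch.method()          # "mb.1D"
dispatch.method("paired")  # "mb.paired"
dispatch.method("2")       # "mb.2D"
```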

The arguments condition.grp, rep.grp, and time.grp, if specified, should each have length equal to the number of arrays. The i-th elements of these three arguments should correspond to the biological condition, replicate, and time, respectively, of the i-th column (array) in the expression value matrix of the input object. By default, the columns of the expression matrix M are assumed to be in ascending order of condition.grp first, then rep.grp, and finally time.grp.
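
A minimal base-R sketch of that default column order: condition varies slowest, then replicate, then time. The labels and sizes are made up for illustration.

```r
# Illustrative labels for 2 conditions x 3 replicates x 5 time points (30 arrays).
# expand.grid() varies its first argument fastest, so time changes fastest here.
design <- expand.grid(time = 1:5, rep = c("A", "B", "C"),
                      condition = c("mt", "wt"), stringsAsFactors = FALSE)
condition.grp <- design$condition
rep.grp       <- design$rep
time.grp      <- design$time

# Columns in the default order are already sorted by condition, replicate, time.
ord <- order(condition.grp, rep.grp, time.grp)
identical(ord, seq_along(ord))  # TRUE
```

If this check fails for a real data set, the columns are not in the default order and the three grouping vectors should be supplied explicitly.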

The arguments one.sample and ref apply to method="paired" only.

When type="robust", the numerator of the \tilde{T}^2 statistic is calculated using the weighted average time course vector(s), where the weight at each data point is determined using Huber's weight function with the default tuning constant 1.345.
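
For reference, Huber's weight function gives weight 1 to residuals within the tuning constant and down-weights larger ones in inverse proportion to their size. The sketch below is a generic base-R illustration with made-up, already standardized residuals. How timecourse standardizes residuals internally is not described on this page and is not reproduced here.

```r
# Generic Huber weight: w(r) = 1 if |r| <= k, otherwise k / |r|.
huber.weight <- function(r, k = 1.345) {
  pmin(1, k / abs(r))
}

r <- c(-0.5, 0, 1.345, 2.69, -5)   # made-up standardized residuals
w <- huber.weight(r)
w   # 1, 1, 1, 0.5, 0.269

# Weighted average of made-up expression values at one time point
y <- c(10.1, 10.0, 10.2, 11.5, 6.0)
sum(w * y) / sum(w)
```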

Warning: When there are only 2 replicates within conditions, type="robust" produces the same rankings as type="none", since with two replicates there is no consensus on gene expression values. Check the output weights for these outliers.

----------Value----------

An object of class MArrayTC.

----------Author(s)----------

Yu Chuan Tai (yuchuan@stat.berkeley.edu)

----------References----------

for replicated microarray time course data. Annals of Statistics 34(5):2387-2412.
Microarrays, U. Nuber (ed.), BIOS Scientific Publishers Limited, Taylor & Francis, 4 Park Square, Milton  Park, Abingdon OX14 4RN, Chapter 20.
Wiley series in probability and mathematical statistics.

----------See Also----------

timecourse Vignette.

----------Examples----------

library(timecourse)
library(MASS)  ## provides mvrnorm(), used in the simulations below
data(fruitfly)
colnames(fruitfly) ## check if arrays are arranged in the default order
gnames <- rownames(fruitfly)
assay <- rep(c("A", "B", "C"), each = 12)
time.grp <- rep(c(1:12), 3)
size <- rep(3, nrow(fruitfly))

out1 <- mb.long(fruitfly, times=12, reps=size, rep.grp = assay, time.grp = time.grp)
summary(out1)
plotProfile(out1, type="b", gnames=gnames, legloc=c(2,15), pch=c("A","B","C"), xlab="Hour")

## Simulate gene expression data
## Note: this simulation is for demonstration purposes only,
## and does not necessarily reflect the real
## features of longitudinal time course data

## one biological condition, 5 time points, 3 replicates
## 500 genes, 10 genes change over time

SS <- matrix(c( 0.01,   -0.0008, -0.003,   0.007,   0.002,
               -0.0008,  0.02,    0.002,  -0.0004, -0.001,
               -0.003,   0.002,   0.03,   -0.0054, -0.009,
                0.007,  -0.0004, -0.0054,  0.02,    0.0008,
                0.002,  -0.001,  -0.009,   0.0008,  0.07), ncol=5)

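sim.Sigma <- function()
{
  ## draw a random 5 x 5 covariance matrix:
  ## sum 10 outer products of multivariate normal draws, then invert
  S <- matrix(rep(0, 25), ncol=5)
  x <- mvrnorm(n=10, mu=rep(0, 5), Sigma=10*SS)
  for (i in 1:10)
    S <- S + crossprod(t(x[i, ]))

  solve(S)
}

sim.data1 <- function(x, indx=1)
{
  ## x is one row of the data matrix; x[1] bounds the baseline expression level
  mu <- rep(runif(1, 8, x[1]), 5)
  ## indx = 1: mean profile changes over time; indx = 0: constant mean profile
  if (indx == 1) res <- as.numeric(t(mvrnorm(n=3, mu=mu+rnorm(5, sd=4), Sigma=sim.Sigma())))
  if (indx == 0) res <- as.numeric(t(mvrnorm(n=3, mu=mu, Sigma=sim.Sigma())))
  res
}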

M1 <- matrix(rep(14,500*15), ncol=15)
M1[1:10,] <- t(apply(M1[1:10,],1,sim.data1))
M1[11:500,] <- t(apply(M1[11:500,],1,sim.data1, 0))

## Which genes are nonconstant?
MB.1D1 <- mb.long(M1, times=5, reps=rep(3, 500))
MB.1D1$percent  # check the percent of moderation

plotProfile(MB.1D1, type="b") # plots the no. 1 gene
plotProfile(MB.1D1, type="b", ranking=10) # plots the no. 10 gene
genenames <- as.character(1:500)
plotProfile(MB.1D1, type="b", gid="8", gnames=genenames) # plots the gene with ID "8"

## Repeat the analysis with possible outliers down-weighted ("r" abbreviates "robust")
MB.1D1.r <- mb.long(M1, type="r", times=5, reps=rep(3, 500))
plotProfile(MB.1D1.r, type="b", gnames=genenames)
plotProfile(MB.1D1.r, type="b", gid="1", gnames=genenames) # plots the gene with ID "1"

## assign the following labels to the columns of M1,
## which is actually the same as the default
## Not Run
trt <- rep("wildtype", 15)
assay <- rep(c("A","B","C"), rep(5,3))
time.grp <- rep(c(0, 1, 3, 4, 6), 3)

## MB.1D2 should give the same results as MB.1D1
#MB.1D2 <- mb.long(M1, times=5, reps=rep(3, 500), condition.grp = trt, rep.grp = assay,
#time.grp=time.grp)

## suppose now the replicates are in this order instead
assay <- rep(c("A","C","B"), rep(5,3))

## then
MB.1D3 <- mb.long(M1, times=5, reps=rep(3, 500), condition.grp = trt, rep.grp = assay, time.grp=time.grp)
MB.1D3$rep.group  # check the replicate and time groups
MB.1D3$time.group

## Now let's simulate another dataset with two biological conditions
## 500 genes also, 10 of them have different expected time course profiles
## between these two biological conditions
## 3 replicates, 5 time points for each condition

sim.data2 <- function(x, indx=1)
{
  mu <- rep(runif(1, 8, x[1]), 5)
  if (indx == 1)
    res <- c(as.numeric(t(mvrnorm(n=3, mu=mu+rnorm(5, sd=5), Sigma=sim.Sigma()))),
             as.numeric(t(mvrnorm(n=3, mu=mu+rnorm(5, sd=3.2), Sigma=sim.Sigma()))))

  if (indx == 0) res <- as.numeric(t(mvrnorm(n=6, mu=mu+rnorm(5, sd=3), Sigma=sim.Sigma())))
  res
}

M2 <- matrix(rep(14,500*30), ncol=30)
M2[1:10,] <- t(apply(M2[1:10,],1,sim.data2))
M2[11:500,] <- t(apply(M2[11:500,],1,sim.data2, 0))

## assume it is a paired two-sample problem
trt <- rep(c("wt","mt"),each=15)
assay <- rep(rep(c("1.2.04","2.4.04","3.5.04"),each=5),2)
size <- matrix(3, nrow=500, ncol=2)
MB.paired <- mb.long(M2, method="paired", times=5, reps=size, condition.grp=trt, rep.grp=assay)
MB.paired$con.group # check the condition, replicate and time groups
MB.paired$rep.group
MB.paired$time.group

plotProfile(MB.paired, type="b")
genenames <- as.character(1:500)
plotProfile(MB.paired, gid="12", type="b", gnames=genenames) # plots the gene with ID "12"

## assume it is an unpaired two-sample problem
assay <- rep(c("1.2.04","2.4.04","3.5.04","5.21.04","7.17.04","8.4.04"),each=5)
MB.2D <- mb.long(M2, method="2", times=5, reps=size, condition.grp=trt, rep.grp=assay)
MB.2D$con.group # check the condition, replicate and time groups
MB.2D$rep.group
MB.2D$time.group

plotProfile(MB.2D, type="b", gnames=genenames) # plot the no. 1 gene

## Now let's simulate another dataset with two biological conditions
## 500 genes also, 10 of them have different expected time course profiles
## between these two biological conditions
## the first condition has 3 replicates, while the second condition has 4 replicates,
## 5 time points for each condition

sim.data3 <- function(x, indx=1)
{
  mu <- rep(runif(1, 8, x[1]), 5)
  if (indx == 1)
    res <- c(as.numeric(t(mvrnorm(n=3, mu=mu+rnorm(5, sd=5), Sigma=sim.Sigma()))),
             as.numeric(t(mvrnorm(n=4, mu=mu+rnorm(5, sd=3.2), Sigma=sim.Sigma()))))

  if (indx == 0) res <- as.numeric(t(mvrnorm(n=7, mu=mu+rnorm(5, sd=3), Sigma=sim.Sigma())))
  res
}

M3 <- matrix(rep(14,500*35), ncol=35)
M3[1:10,] <- t(apply(M3[1:10,],1,sim.data3))
M3[11:500,] <- t(apply(M3[11:500,],1,sim.data3, 0))

assay <- rep(c("1.2.04","2.4.04","3.5.04","5.21.04","7.17.04","9.10.04","12.1.04"),each=5)
trt <- c(rep(c("wildtype","mutant"),each=15),rep("mutant",5))
## Note that "mutant" < "wildtype", so the sample sizes are (4, 3)
size <- matrix(c(4,3), nrow=500, ncol=2, byrow=TRUE)
MB.2D.2 <- mb.long(M3, method="2", times=5, reps=size, rep.grp=assay, condition.grp=trt)
MB.2D.2$con.group # check the condition, replicate and time groups
MB.2D.2$rep.group
MB.2D.2$time.group

plotProfile(MB.2D.2, type="b") # plot the no. 1 gene

When reposting, please credit the source: biostatistic.net (http://www.biostatistic.net).
