R puma package: help documentation for the createContrastMatrix() function

Posted on 2012-2-26 11:32:57
createContrastMatrix(puma)
createContrastMatrix() belongs to R package: puma

                                        Automatically create a contrast matrix from an ExpressionSet and optional design matrix

                                         Translator: 生物统计家园网 robot LoveR

----------Description----------

To appear


----------Usage----------


createContrastMatrix(eset, design=NULL)



----------Arguments----------

Argument: eset
An object of class ExpressionSet.


Argument: design
An optional design matrix (defaults to NULL).


----------Details----------

The puma package has been designed to be as easy to use as possible, while not compromising on power and flexibility. One of the most difficult tasks for many users, particularly those new to microarray analysis, or statistical analysis in general, is setting up design and contrast matrices. The puma package will automatically create such matrices, and we believe the way this is done will suffice for most users' needs.

It is important to recognise that the automatic creation of design and contrast matrices will only happen if appropriate information about the levels of each factor is available for each array in the experimental design. This data should be held in an AnnotatedDataFrame object. The easiest way of doing this is to ensure that the object holding the raw CEL file data (typically an AffyBatch) has an appropriate phenoData slot. This information will then be passed through to any ExpressionSet object created, for example through the use of mmgmos. The phenoData slot of an ExpressionSet object can also be manipulated directly if necessary.

Design and contrast matrices are dependent on the experimental design. The simplest experimental designs have just one factor, and hence the phenoData slot will have a matrix with just one column. In this case, each unique value in that column will be treated as a distinct level of the factor, and hence pumaComb will group arrays according to these levels. If there are just two levels of the factor, e.g. A and B, the contrast matrix will also be very simple, with the only contrast of interest being A vs B. For factors with more than two levels, a contrast matrix will be created which reflects all possible pairwise combinations of levels. For example, if we have three levels A, B and C, the contrasts of interest will be A vs B, A vs C and B vs C. In addition, if the others argument is set to TRUE, the following additional contrasts will be created: A vs other (i.e. A vs B & C), B vs other and C vs other. Note that these additional contrasts are experimental, and not currently recommended for use in calculating differential expression.
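
The pairwise contrasts described above can be sketched in base R. This is only an illustration, not the puma implementation: the sign convention (+1 for the first level, -1 for the second), the "_vs_" column names and the variable names are assumptions made for the example, and the experimental "vs other" contrasts are not shown.

```r
# Illustrative sketch in base R; not the code used inside puma.
levels_of_factor <- c("A", "B", "C")

# Every unordered pair of levels, one pair per column.
pairs <- combn(levels_of_factor, 2)

# One row per level, one column per contrast.
# Assumed convention: +1 marks the first level of the pair, -1 the second.
contrasts <- matrix(
  0,
  nrow = length(levels_of_factor),
  ncol = ncol(pairs),
  dimnames = list(levels_of_factor,
                  paste(pairs[1, ], pairs[2, ], sep = "_vs_"))
)
for (j in seq_len(ncol(pairs))) {
  contrasts[pairs[1, j], j] <- 1
  contrasts[pairs[2, j], j] <- -1
}

print(contrasts)
```

With k levels this gives k(k-1)/2 columns, which is why the number of contrasts grows quickly as levels are added.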

If we now consider the case of two or more factors, things become more complicated. There are now two cases to be considered: factorial experiments, and non-factorial experiments. A factorial experiment is one where all the combinations of the levels of each factor are tested by at least one array (though ideally we would have a number of biological replicates for each combination of factor levels). The estrogen case study from the package vignette is an example of a factorial experiment.

A non-factorial experiment is one where at least one combination of levels is not tested. If we treat the example used in the puma-package help page as a two-factor experiment (with factors “level” and “batch”), we can see that this is not a factorial experiment as we have no array to test the conditions “level=ten” and “batch=B”. We will treat the factorial and non-factorial cases separately in the following sections.
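
As a rough illustration of the distinction, the following base R sketch checks whether every combination of factor levels is covered by at least one array. The helper name is_factorial is hypothetical and not part of puma, and the three-array layout of the "level"/"batch" example is an assumption inferred from the description above.

```r
# Hypothetical helper (not part of puma): TRUE when every combination of
# factor levels occurs in at least one array.
is_factorial <- function(pheno) {
  counts <- table(pheno)  # cross-tabulate all columns of the data frame
  all(counts > 0)
}

# A 2x2 design in which every combination is observed once.
full <- data.frame(fac1 = c("a", "a", "b", "b"),
                   fac2 = c(1, 2, 1, 2))

# Assumed layout of the "level"/"batch" example: no array has
# level = "ten" together with batch = "B".
partial <- data.frame(level = c("twenty", "twenty", "ten"),
                      batch = c("A", "B", "A"))

is_factorial(full)
is_factorial(partial)
```

The first call reports a factorial design and the second does not, because the cross-tabulation contains an empty cell.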

Factorial experiments

For factorial experiments, the design matrix will use all columns from the phenoData slot. This will mean that pumaComb will group arrays according to a combination of the levels of all the factors.
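
Grouping arrays by the combination of all factor levels can be pictured with base R's interaction(). This is a sketch of the idea only, using the same layout as the 2x2 factorial example further down; it makes no claim about how pumaComb builds its groups internally.

```r
# Phenotype table for a balanced 2x2 factorial design with 8 arrays.
pheno <- data.frame(fac1 = c("a", "a", "a", "a", "b", "b", "b", "b"),
                    fac2 = c(1, 1, 2, 2, 1, 1, 2, 2))

# One label per array, formed by combining the levels of all factors.
group <- interaction(pheno, sep = ".", drop = TRUE)

# Number of arrays (replicates) in each combined group.
table(group)
```

Each of the four combined groups contains two arrays, so a design matrix built on these groups would have four columns.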

Non-factorial designs

For non-factorial designed experiments, we will simply ignore columns (right to left) from the phenoData slot until we have a factorial design or a single factor. We can see this in the example used in the puma-package help page. Here we have ignored the “batch” factor, and modelled the experiment as a single-factor experiment (with that single factor being “level”).
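
The right-to-left column dropping can be sketched as follows. The function name reduce_to_factorial is hypothetical and the code is an approximation of the rule as described, not the puma source.

```r
# Hypothetical sketch (not the puma source): drop phenotype columns from the
# right until the remaining columns form a factorial design, or only a
# single column is left.
reduce_to_factorial <- function(pheno) {
  while (ncol(pheno) > 1 && !all(table(pheno) > 0)) {
    pheno <- pheno[, -ncol(pheno), drop = FALSE]
  }
  pheno
}

# No arrays exist for fac1 = "b" together with fac2 = 2, so fac2 is dropped.
pheno <- data.frame(fac1 = c("a", "a", "a", "a", "b", "b"),
                    fac2 = c(1, 1, 2, 2, 1, 1))
names(reduce_to_factorial(pheno))
```

Because columns are discarded from the right, the order of the columns in the phenotype table decides which factors survive; placing the factor of main interest first keeps it in the model.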


----------Value----------

The result is a matrix. See the code below for an example.


----------Author(s)----------


Richard D. Pearson



----------See Also----------

Related methods: createDesignMatrix and pumaDE


----------Examples----------


library(puma)     # provides createContrastMatrix(), mmgmos() and the eset_mmgmos data set
library(Biobase)  # provides the ExpressionSet class and pData()

# This is a simple example based on a real data set. Note that this is an
# "unbalanced" design: the "level" factor has two replicates of the "twenty"
# condition, but only one replicate of the "ten" condition. Also note that the
# second factor, "batch", is not used in the design or contrast matrices, as we
# don't have every combination of the levels of "level" and "batch" (there is
# no array for level=ten and batch=B).

# Next 4 lines commented out to save time in package checks, and saved version used
# if (require(affydata)) {
#         data(Dilution)
#         eset_mmgmos <- mmgmos(Dilution)
# }
data(eset_mmgmos)
createContrastMatrix(eset_mmgmos)

# The following shows a set of 15 synthetic data sets with increasing complexity.
# We first create the data sets, then look at the contrast matrices.

# single 2-level factor
eset1 <- new("ExpressionSet", exprs=matrix(0,100,4))
pData(eset1) <- data.frame("class"=c(1,1,2,2))

# single 2-level factor - unbalanced design
eset2 <- new("ExpressionSet", exprs=matrix(0,100,4))
pData(eset2) <- data.frame("class"=c(1,2,2,2))

# single 3-level factor
eset3 <- new("ExpressionSet", exprs=matrix(0,100,6))
pData(eset3) <- data.frame("class"=c(1,1,2,2,3,3))

# single 4-level factor
eset4 <- new("ExpressionSet", exprs=matrix(0,100,8))
pData(eset4) <- data.frame("class"=c(1,1,2,2,3,3,4,4))

# 2x2 factorial
eset5 <- new("ExpressionSet", exprs=matrix(0,100,8))
pData(eset5) <- data.frame("fac1"=c("a","a","a","a","b","b","b","b"), "fac2"=c(1,1,2,2,1,1,2,2))

# 2x2 factorial - unbalanced design
eset6 <- new("ExpressionSet", exprs=matrix(0,100,10))
pData(eset6) <- data.frame("fac1"=c("a","a","a","b","b","b","b","b","b","b"), "fac2"=c(1,2,2,1,1,1,2,2,2,2))

# 3x2 factorial
eset7 <- new("ExpressionSet", exprs=matrix(0,100,12))
pData(eset7) <- data.frame("fac1"=c("a","a","a","a","b","b","b","b","c","c","c","c"), "fac2"=c(1,1,2,2,1,1,2,2,1,1,2,2))

# 2x3 factorial
eset8 <- new("ExpressionSet", exprs=matrix(0,100,12))
pData(eset8) <- data.frame(
        "fac1"=c("a","a","a","a","a","a","b","b","b","b","b","b")
,        "fac2"=c(1,1,2,2,3,3,1,1,2,2,3,3) )

# 2x2x2 factorial
eset9 <- new("ExpressionSet", exprs=matrix(0,100,8))
pData(eset9) <- data.frame(
        "fac1"=c("a","a","a","a","b","b","b","b")
,        "fac2"=c(1,1,2,2,1,1,2,2)
,        "fac3"=c("X","Y","X","Y","X","Y","X","Y") )

# 3x2x2 factorial
eset10 <- new("ExpressionSet", exprs=matrix(0,100,12))
pData(eset10) <- data.frame(
        "fac1"=c("a","a","a","a","b","b","b","b","c","c","c","c")
,        "fac2"=c(1,1,2,2,1,1,2,2,1,1,2,2)
,        "fac3"=c("X","Y","X","Y","X","Y","X","Y","X","Y","X","Y") )

# 2x3x2 factorial
eset11 <- new("ExpressionSet", exprs=matrix(0,100,12))
pData(eset11) <- data.frame(
        "fac1"=c("a","a","a","a","a","a","b","b","b","b","b","b")
,        "fac2"=c(1,1,2,2,3,3,1,1,2,2,3,3)
,        "fac3"=c("X","Y","X","Y","X","Y","X","Y","X","Y","X","Y") )

# 3x3x2 factorial
eset12 <- new("ExpressionSet", exprs=matrix(0,100,18))
pData(eset12) <- data.frame(
        "fac1"=c("a","a","a","a","a","a","b","b","b","b","b","b","c","c","c","c","c","c")
,        "fac2"=c(1,1,2,2,3,3,1,1,2,2,3,3,1,1,2,2,3,3)
,        "fac3"=c("X","Y","X","Y","X","Y","X","Y","X","Y","X","Y","X","Y","X","Y","X","Y") )

# 2x2x2x2 factorial
eset13 <- new("ExpressionSet", exprs=matrix(0,100,16))
pData(eset13) <- data.frame(
        "fac1"=c("a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b")
,        "fac2"=c(0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1)
,        "fac3"=c(2,2,3,3,2,2,3,3,2,2,3,3,2,2,3,3)
,        "fac4"=c("X","Y","X","Y","X","Y","X","Y","X","Y","X","Y","X","Y","X","Y") )

# "Un-analysable" data set - all arrays are from the same class
eset14 <- new("ExpressionSet", exprs=matrix(0,100,4))
pData(eset14) <- data.frame("class"=c(1,1,1,1))

# "Non-factorial" data set - there are no arrays for fac1="b" and fac2=2.
# In this case only the first factor (fac1) is used.
eset15 <- new("ExpressionSet", exprs=matrix(0,100,6))
pData(eset15) <- data.frame("fac1"=c("a","a","a","a","b","b"), "fac2"=c(1,1,2,2,1,1))

createContrastMatrix(eset1)
createContrastMatrix(eset2)
createContrastMatrix(eset3)
createContrastMatrix(eset4)
createContrastMatrix(eset5)
createContrastMatrix(eset6)
createContrastMatrix(eset7)
createContrastMatrix(eset8)
createContrastMatrix(eset9)
# For the last 4 data sets, the contrast matrices get pretty big, so we'll just show the names of each contrast
colnames(createContrastMatrix(eset10))
colnames(createContrastMatrix(eset11))
# Note that the number of contrasts can rapidly get very large for multi-factorial experiments!
colnames(createContrastMatrix(eset12))
# For this final data set, note that the puma package does not currently create
# interaction terms for data sets with 4 or more factors. E-mail the author if
# you would like to do this.
colnames(createContrastMatrix(eset13))

# "Un-analysable" data set - all arrays are from the same class - gives an error.
# Note that we've commented this out so that we don't get errors which would
# make the package fail the Bioconductor checks!
# createContrastMatrix(eset14)
# "Non-factorial" data set - there are no arrays for fac1="b" and fac2=2.
# In this case only the first factor (fac1) is used.
createContrastMatrix(eset15)


When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

