R package OCplus: help documentation for the tMixture() function

loveR · posted on 2012-2-26 08:03:15

tMixture(OCplus)
tMixture() belongs to the R package OCplus

Fit a mixture of t-distributions

Translator: LoveR, the biostatistic.net robot

Description

For a vector of individual genewise t-statistics, this function fits a mixture of central and non-central t-distributions, with the primary goal of estimating the proportion p0 of non-differentially expressed genes.
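To make the model concrete, the following base-R sketch evaluates the density of such a mixture. The function name `t_mixture_density` is hypothetical and is not part of OCplus, and the degrees of freedom used (n1 + n2 - 2 for a two-sample comparison) are an assumption based on the standard t-test, not something stated on this page.

```r
# Density of a mixture of one central and several noncentral t-distributions.
# p0:    weight of the central (null) component
# p1:    weights of the noncentral components
# delta: noncentrality parameters matching p1
# df:    degrees of freedom shared by all components
t_mixture_density <- function(t, p0, p1, delta, df) {
  stopifnot(length(p1) == length(delta),
            isTRUE(all.equal(p0 + sum(p1), 1)))
  dens <- p0 * dt(t, df = df)                       # central component
  for (j in seq_along(p1)) {
    dens <- dens + p1[j] * dt(t, df = df, ncp = delta[j])  # noncentral components
  }
  dens
}

# Assumed degrees of freedom for two groups of 25 samples each
df <- 25 + 25 - 2
grid <- seq(-6, 6, by = 0.5)
dens <- t_mixture_density(grid, p0 = 0.9, p1 = c(0.05, 0.05),
                          delta = c(-3, 3), df = df)
```

With equal weights and symmetric noncentrality parameters, as in this illustration, the resulting density is symmetric around zero.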

Usage

tMixture(tstat, n1 = 10, n2 = n1, nq, p0, p1, D, delta, paired = FALSE,
      tbreak, ext = TRUE, threshold.delta=0.75, ...)

Arguments

Argument: tstat
the vector of genewise t-statistics

Argument: n1
number of samples in the first group

Argument: n2
number of samples in the second group

Argument: nq
the number of components in the mixture that is fitted

Argument: p0
a starting value for the proportion of non-differentially expressed genes.

Argument: p1
a vector with starting values for the proportions of genes that are differentially expressed with effect size D.

Argument: D
a vector of starting values for the effect sizes of the differentially expressed genes, corresponding to the proportions p1.

Argument: delta
a vector of starting values for the effect sizes of the differentially expressed genes, expressed as non-centrality parameters; this is just a different way of specifying D, though if both are given, delta takes priority.
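This page does not state how D and delta are related. As a rough sketch, the helper below uses the standard t-test relations between a standardized effect size and the noncentrality parameter. The function `effect_to_ncp` is hypothetical, and the formulas are an assumption rather than something taken from the OCplus source.

```r
# Hypothetical helper: convert effect sizes D (in units of the gene-wise
# standard deviation) to noncentrality parameters delta.
# Assumption: standard t-test relations, not verified against OCplus.
effect_to_ncp <- function(D, n1, n2 = n1, paired = FALSE) {
  if (paired) {
    D * sqrt(n1)               # n1 pairs; D relative to the SD of the differences
  } else {
    D / sqrt(1 / n1 + 1 / n2)  # two independent groups
  }
}

delta_two_sample <- effect_to_ncp(c(-1, 1), n1 = 25)
delta_paired     <- effect_to_ncp(c(-1, 1), n1 = 25, paired = TRUE)
```

Under these assumed relations, the same D maps to a larger absolute delta as the sample size grows, which is why a fixed threshold on delta corresponds to different effect sizes in different designs.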

Argument: paired
a logical value indicating whether the t-statistics are two-sample or paired.

Argument: tbreak
either the number of equally spaced bins for tabulating tstat, or the explicit break points for the bins, very much like the argument breaks to function cut; the default value is the square root of the number of genes.

Argument: ext
a logical value indicating whether to extend the bins, i.e. to set the lowest bin limit to -infinity and the largest bin limit to infinity.
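The binning described by tbreak and ext can be sketched in base R as follows. This only illustrates the idea with cut; the exact way tMixture rounds the default number of bins and places the break points is not given here, so the use of ceiling and of the data range is an assumption.

```r
set.seed(1)
tstat <- rt(1000, df = 48)           # stand-in for genewise t-statistics

# Default described above: square root of the number of genes
# (rounding up is an assumption made for this sketch)
nbin <- ceiling(sqrt(length(tstat)))

# Equally spaced break points over the observed range
brk <- seq(min(tstat), max(tstat), length.out = nbin + 1)

# ext = TRUE: extend the outermost bins to -Inf and Inf
brk[1] <- -Inf
brk[length(brk)] <- Inf

counts <- table(cut(tstat, breaks = brk))
```

Extending the outer bins means that every statistic falls into some bin, including the minimum, which a default half-open interval from cut would otherwise drop.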

Argument: threshold.delta
mixture components with an estimated absolute non-centrality parameter delta below this value are considered to be too small for independent estimation; these components and their corresponding p1 are pooled with the null-component and p0, see Details.

Argument: ...
additional arguments that are passed to optim to control the optimization.

Details

The minimum parameter that needs to be specified is nq: if nothing else is given, the proportions are equally distributed between p0 and the p1, and the noncentrality parameters are set up symmetrically around zero, e.g. nq=5 leads to equal proportions of 0.2 and noncentrality parameters -2, -1, 1, and 2. If any of p1, D, or delta is specified, nq is redundant and will be ignored (with a warning). tMixture will in general make a valiant effort to deduce valid starting values from any combination of nq, p0, p1, D, and delta specified by the user, and will complain if that is not possible.
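The default starting values described above can be reproduced with a small base-R sketch. The function `default_start` is hypothetical, and it is restricted to odd nq because the text only describes the symmetric case; how tMixture handles an even nq is not stated here.

```r
# Hypothetical reconstruction of the default starting values for odd nq:
# equal proportions 1/nq and integer noncentrality parameters placed
# symmetrically around zero.
default_start <- function(nq) {
  stopifnot(nq >= 3, nq %% 2 == 1)   # only the symmetric (odd) case is sketched
  k <- (nq - 1) / 2
  list(p0    = 1 / nq,
       p1    = rep(1 / nq, nq - 1),
       delta = c(-(k:1), 1:k))
}

start5 <- default_start(5)
start3 <- default_start(3)
```

For nq=3 this gives p0 = 1/3, p1 = c(1/3, 1/3) and delta = c(-1, 1), which matches the equivalent explicit call shown in the Examples.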

The fitting problem that this function tries to solve is badly conditioned, and the solution will in general depend on the precise set of starting values. Multiple runs from different starting values are usually a good idea. We have found, however, that the model seems fairly robust towards misspecification of the number of components, at least when estimating p0. What happens when too many components are specified is that some of the nominally noncentral t-distributions describing the behaviour of differentially expressed genes are fitted with noncentrality parameters very close to zero, and the true p0 gets spread out between the nominal p0 and the almost-central components. Adding up these different contributions usually gives a similar solution to re-fitting the model with fewer components. The cutoff for the size of non-centrality parameters that can be estimated realistically is specified via threshold.delta, whose default value is based on a small simulation study reported in Pawitan et al. (2005); see Examples. (Note that the AIC can also be helpful in determining the number of components.)
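The pooling step can be illustrated with a few lines of base R. All numbers below are made up for illustration and are not the output of a real fit; only the rule (add the p1 of components with absolute delta below threshold.delta to p0) follows the description above.

```r
# Hypothetical fitted values from a five-component model
p0_raw <- 0.70
p1     <- c(0.03, 0.12, 0.13, 0.02)
delta  <- c(-3.1, -0.2, 0.4, 2.8)
threshold_delta <- 0.75              # default value of threshold.delta

# Components with |delta| below the threshold count as almost central
small  <- abs(delta) < threshold_delta

# Pool their proportions with the null component
p0_est <- p0_raw + sum(p1[small])
```

Here the two almost-central components contribute 0.12 and 0.13, so the pooled estimate is 0.95 while the raw estimate stays at 0.70.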

Value

A list with the following components:

Value: p0.est
the estimated proportion of non-differentially expressed genes, after collapsing components with estimated non-centrality sizes below threshold.delta.

Value: p0.raw
the estimated proportion before collapsing the components.

Value: p1
the estimated proportions of differentially expressed genes corresponding to the effect sizes, relating to p0.raw.

Value: D
effect sizes of the differentially expressed genes in multiples of the gene-by-gene standard deviation.

Value: delta
effect sizes of the differentially expressed genes expressed as the noncentrality parameter of the corresponding noncentral t-distribution.

Value: AIC
the AIC value for the maximum likelihood fit.

Value: opt
the output from optim, giving details about the optimization process.

Author(s)

Y. Pawitan and A. Ploner

References

See Also

tstatistics, EOC, optim

Examples

library(OCplus)  # provides tstatistics() and tMixture()
# We simulate a small example with 5 percent regulated genes and
# a rather large effect size
set.seed(2011)
xdat = matrix(rnorm(50000), nrow=1000)
xdat[1:25, 1:25] = xdat[1:25, 1:25] - 2
xdat[26:50, 1:25] = xdat[26:50, 1:25] + 2
grp = rep(c("Sample A","Sample B"), c(25,25))
# Use a helper function for the test statistics
myt = tstatistics(xdat, grp)$tstat
r1 = tMixture(myt, n1=25, nq=3)
r1

# Equivalently, we could have specified the same set of starting values
# as follows:
# r1 = tMixture(myt, n1=25, p0=1/3, p1=c(1/3, 1/3), delta=c(-1,1))

# Alternative starting value for p0; the other starting values are filled in
r2 = tMixture(myt, n1=25, nq=3, p0=0.80)
r2

# Specification of too many components usually leads to spurious
# noncentral components, as here; note the difference between
# p0.est and p0.raw!
r3 = tMixture(myt, n1=25, nq=5)
r3

# We simulate data in a paired setting
# Note the arrangement of the columns
set.seed(2012)
xdat = matrix(rnorm(50000), nrow=1000)
ndx1 = seq(1,50, by=2)
xdat[1:25, ndx1] = xdat[1:25, ndx1] - 2
xdat[26:50, ndx1] = xdat[26:50, ndx1] + 2
grp = rep(c("Sample A","Sample B"), 25)
# Use a helper function for the test statistics
myt = tstatistics(xdat, grp, paired=TRUE)$tstat
# paired=TRUE tells tMixture that these are paired t-statistics
r1p = tMixture(myt, n1=25, nq=3, paired=TRUE)
r1p
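Since the Details recommend multiple runs from different starting values, the following sketch repeats the first example with several starting values for p0 and collects p0.est and AIC from each fit. It assumes that OCplus is installed and only runs the fits in that case. The particular starting values are arbitrary choices for illustration.

```r
# Arbitrary starting values for p0, chosen only for illustration
p0_starts <- c(0.5, 0.8, 0.95)

if (requireNamespace("OCplus", quietly = TRUE)) {
  library(OCplus)

  # Same simulated two-sample data as in the first example
  set.seed(2011)
  xdat <- matrix(rnorm(50000), nrow = 1000)
  xdat[1:25, 1:25]  <- xdat[1:25, 1:25] - 2
  xdat[26:50, 1:25] <- xdat[26:50, 1:25] + 2
  grp <- rep(c("Sample A", "Sample B"), c(25, 25))
  myt <- tstatistics(xdat, grp)$tstat

  # One fit per starting value; the other starting values are filled in
  fits <- lapply(p0_starts,
                 function(p) tMixture(myt, n1 = 25, nq = 3, p0 = p))

  # Compare the estimates and the AIC across runs
  print(data.frame(p0.start = p0_starts,
                   p0.est   = sapply(fits, function(f) f$p0.est),
                   AIC      = sapply(fits, function(f) f$AIC)))
}
```

If the runs agree closely on p0.est, that is some reassurance that the fit is not an artifact of one particular starting point; large differences suggest trying more starting values or a different number of components.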

Please credit the source when reposting: biostatistic.net (http://www.biostatistic.net).

