R language htSeqTools package: fdrEnrichedCounts() function help documentation

Posted 2012-2-25 22:02:43
fdrEnrichedCounts(htSeqTools)
fdrEnrichedCounts() belongs to the R package: htSeqTools

Posterior probability that a certain number of repeats is higher than expected by chance.

Translator: 生物统计家园网, robot LoveR

----------Description----------

Given a vector of repeat counts (e.g. 100 sequences appearing once, 50 sequences appearing twice, etc.), the function computes, for each number of repeats, the false discovery rate associated with calling that number unusually high.


----------Usage----------


fdrEnrichedCounts(counts,use=1:10,components=0,mc.cores=1)



----------Arguments----------

Argument: counts
Vector of observed frequencies. The vector must have names (the number of repeats each frequency refers to). The tabDuplReads function can be used to build it.
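
As a rough illustration of the expected input shape, a named frequency vector can be built in base R with table(). This is only a sketch: the simulated `nrepeats` values are made up for illustration, and tabDuplReads is not called here.

```r
# Illustrative data: how many times each of 1,000 hypothetical sequences appears
set.seed(1)
nrepeats <- rpois(1000, 1) + 1   # every sequence appears at least once

# table() counts how many sequences appear once, twice, ...
# and names each entry with the corresponding number of repeats
counts <- table(nrepeats)
print(names(counts))
print(counts)
```

The names are what tie each frequency to its number of repeats, which is why an unnamed vector would not be enough.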


Argument: use
Numbers of repeats to use when estimating the null distribution, i.e. the numbers of repeats expected when no unusually high repeats are present. By default the first 10 are used (use=1:10).
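
To make the role of `use` concrete, the sketch below picks out the entries of a named count vector whose repeat number falls in `use`. The counts are made up, and the subsetting is an assumption about what "used when estimating the null distribution" amounts to, not the package's internal code.

```r
# Illustrative named counts: 100 sequences seen once, 50 twice, ...
counts <- c("1" = 100, "2" = 50, "3" = 20, "4" = 8, "5" = 3, "12" = 2)

use <- 1:10
# Keep only the repeat numbers listed in `use` that were actually observed
null_counts <- counts[names(counts) %in% as.character(use)]
print(null_counts)
```

Here the entry for 12 repeats is left out, so a handful of heavily repeated sequences does not influence the null fit.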


Argument: components
Number of negative binomials used to fit the null distribution. The value must be between 0 and 4. If 0 is given (the default shown in the usage above), the optimal number of negative binomials is chosen using the Bayesian information criterion (BIC).


Argument: mc.cores
Number of cores to use for the computations. This parameter is passed on to mclapply.
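
For readers unfamiliar with mclapply, here is a minimal stand-alone call from the parallel package. The squared-number task is purely hypothetical; which computations fdrEnrichedCounts distributes across cores is not stated on this page.

```r
library(parallel)

# Hypothetical per-task work: square each input.
# mc.cores = 1 runs serially and works on every platform.
res <- mclapply(1:4, function(k) k^2, mc.cores = 1)
print(unlist(res))
```

On Windows, mclapply only supports mc.cores = 1, so values above 1 are useful mainly on Linux and macOS.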


----------Details----------

The null distribution is a mixture of n negative binomials, where n is set through the components parameter. If components is equal to 0, the optimal number of negative binomials is chosen using the Bayesian information criterion (BIC). The parameters of the null distribution are estimated from the observations whose number of repeats is listed in the use parameter. If use is 1:10, the null distribution is estimated from sequences that appear 1 time, 2 times, ... or 10 times.
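
The sketch below shows the general idea of fitting a negative binomial to repeat counts and scoring the fit with BIC. It is an illustration under stated assumptions, not the package's procedure: the counts are made up, only a single component is fitted, and a zero-truncated likelihood is assumed because sequences with zero repeats are never observed.

```r
# Illustrative counts for repeats 1..5; the values are made up
counts <- c("1" = 600, "2" = 250, "3" = 100, "4" = 35, "5" = 15)
x <- as.integer(names(counts))

# Negative log-likelihood of a zero-truncated negative binomial (assumption)
negloglik <- function(par) {
  size <- exp(par[1])   # log scale keeps both parameters positive
  mu   <- exp(par[2])
  logp <- dnbinom(x, size = size, mu = mu, log = TRUE) -
    log1p(-dnbinom(0, size = size, mu = mu))
  -sum(counts * logp)
}

fit <- optim(c(0, 0), negloglik)

# BIC = -2 * logLik + (number of parameters) * log(number of observations)
k <- 2
n <- sum(counts)
bic <- 2 * fit$value + k * log(n)
print(bic)
```

With several candidate numbers of components, the same BIC would be computed for each fit and the smallest value retained.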

The false discovery rate for an unusually high number of repeats is computed following an empirical Bayes scheme similar to that in Efron et al. Let f0(x) be the null distribution, f(x) the overall distribution and (1-pi0) the proportion of unusually high repeats. We assume the two-component mixture f(x) = pi0 f0(x) + (1-pi0) f1(x). Essentially, f(x) is estimated from the data, imposing that f(x) must be monotone decreasing after its mode (using isoreg from package stats) to improve the estimate in the tails. Currently pi0 is set to 1, i.e. its maximum possible value, which provides an upper bound for the FDR. The estimated false discovery rate for enrichment is 1-pi0*(1-cumsum(f0(x)))/(1-cumsum(f(x))). A monotone regression (isoreg) is applied to remove small random fluctuations in the estimated FDR and to guarantee that it decreases with x.
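
The monotonicity step can be illustrated in isolation. isoreg() fits a non-decreasing curve, so one common way to obtain a non-increasing fit is to negate the values, fit, and negate back. The numbers are made up, and how the package locates the mode and applies the constraint is not shown on this page.

```r
# Made-up noisy estimate of f(x) beyond its mode: roughly decreasing, with bumps
x <- 1:8
f_raw <- c(0.40, 0.25, 0.27, 0.10, 0.12, 0.05, 0.06, 0.01)

# isoreg() enforces a non-decreasing fit, so fit -f_raw and flip the sign back
f_mono <- -isoreg(x, -f_raw)$yf
print(round(f_mono, 3))
```

Neighbouring values that violate the ordering are pooled into their average, which smooths the bumps while leaving the total unchanged.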


----------Value----------

A data.frame with the following columns:


Column: pdfH0
Vector with the probability function under the null hypothesis of no enrichment.


Column: pdfOverall
Vector with the probability function of the overall (mixture) distribution.


Column: fdrEnriched
Vector with the false discovery rate for calling each number of repeats significantly enriched.


----------Examples----------


# Generate 10^4 sequences repeated once on average, plus 10 sequences repeated about 10 times
nrepeats <- c(rpois(10^4,1),rpois(10,10))
nrepeats <- nrepeats[nrepeats>0]
counts <- table(nrepeats)
xaxis <- barplot(counts) # bar midpoints; observe the second mode around 10
fdrest <- fdrEnrichedCounts(counts,use=1:5,components=1)
cutoff <- xaxis[which(fdrest$fdrEnriched<0.95)[1]]
abline(v=cutoff,col=2)
text(cutoff,counts[1]/2,'cut-off',col=2)
head(fdrest)

When reposting, please credit 生物统计家园网 (http://www.biostatistic.net).


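Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the 生物统计家园网 robot. It is intended for personal reference when learning R only; 生物统计家园 retains the copyright.
Note 2: Because the translation was produced automatically, some inaccuracies are inevitable. Reading it carefully against the original English text helps when learning R.
Note 3: If you find an inaccuracy, please reply to this thread and it will be corrected over time.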