R htSeqTools package: giniCoverage() function help documentation

Posted on 2012-2-25 22:03:04
giniCoverage(htSeqTools)
giniCoverage() belongs to R package: htSeqTools

                                         Compute Gini coefficient.

                                         Translator: 生物统计家园网, robot LoveR

----------Description----------

Calculate the Gini coefficient of high-throughput sequencing aligned reads. The index provides a measure of "inequality" in read coverage, which can be used for quality control purposes (see Details).


----------Usage----------


giniCoverage(sample, mc.cores = 1, mk.plot = FALSE, seqName = "missing", species="missing", chrLengths="missing", numSim="missing")



----------Arguments----------

Argument: sample
A RangedData or RangedDataList object.


Argument: seqName
If sample is a RangedData, the name of the sequence to use in plots.


Argument: mk.plot
Logical. If TRUE, a histogram of the logarithm of the coverage values and a Lorenz curve are plotted.


Argument: mc.cores
If mc.cores is greater than 1, computations are performed in parallel for each element in the IRangesList object.


Argument: chrLengths
An integer array with the lengths of the chromosomes in sample, used for simulations of uniformly distributed reads.


Argument: species
A BSgenome species from which to obtain chromosome lengths for simulations of uniformly distributed reads.


Argument: numSim
Number of simulations to perform in order to find the expected Gini coefficient.


----------Details----------

The Gini coefficient provides a measure of "inequality" in read coverage. It can be used in any sequencing experiment where the goal is to find peaks, i.e. unusual accumulations of reads in some genomic regions, such as ChIP-Seq. Typically these experiments consist of samples of interest (e.g. immunoprecipitated) and controls. The samples of interest should exhibit higher peaks, whereas reads in the controls should be more uniformly distributed. Since the Gini coefficient can be seen as a measure of departure from uniformity, it should take smaller values in the control samples. Because the Gini coefficient depends on the number of reads per sample, a correction is performed by subtracting the Gini index obtained from a sample with uniformly distributed reads.
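The correction described above can be illustrated with a minimal base R sketch. It is not the htSeqTools implementation: the helper names gini() and coverage_vec(), the single 300 bp chromosome, the read width of 39 and the choice of 20 simulations are all assumptions made for illustration.

```r
# Gini coefficient of a non-negative numeric vector (sorted-rank formula).
gini <- function(x) {
  x <- sort(as.numeric(x))
  n <- length(x)
  if (n == 0 || sum(x) == 0) return(NA_real_)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

# Per-base coverage on one chromosome, given read start positions.
# Each read adds +1 at its start and -1 just after its end; the
# cumulative sum of these steps is the coverage.
coverage_vec <- function(starts, read_len, chr_len) {
  starts <- as.integer(starts)
  ends <- pmin(starts + read_len, chr_len + 1L)  # first base after each read
  delta <- tabulate(starts, nbins = chr_len + 1L) -
    tabulate(ends, nbins = chr_len + 1L)
  cumsum(delta)[seq_len(chr_len)]
}

set.seed(1)
chr_len <- 300L   # hypothetical chromosome length
read_len <- 39L   # hypothetical read width

# "IP-like" reads clustered in two peaks, clamped to the chromosome.
ip_starts <- round(c(rnorm(500, 100, 10), rnorm(500, 200, 10)))
ip_starts <- pmin(pmax(ip_starts, 1), chr_len)
# "Control-like" reads with uniformly distributed starts.
bg_starts <- round(runif(1000, 1, chr_len))

g_ip <- gini(coverage_vec(ip_starts, read_len, chr_len))
g_bg <- gini(coverage_vec(bg_starts, read_len, chr_len))

# Expected Gini under uniformity, averaged over simulated samples
# with the same number of reads.
num_sim <- 20
g_unif <- mean(replicate(num_sim, {
  sim_starts <- sample.int(chr_len, length(ip_starts), replace = TRUE)
  gini(coverage_vec(sim_starts, read_len, chr_len))
}))

adj_ip <- g_ip - g_unif   # adjusted Gini, sample of interest
adj_bg <- g_bg - g_unif   # adjusted Gini, control
```

In this sketch the peaked sample gets the larger raw and adjusted values, which is the pattern the paragraph above describes for samples of interest versus controls.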


----------Value----------

If mk.plot==FALSE, the Gini index and the adjusted Gini index for each element of the RangedDataList or RangedData object.

If mk.plot==TRUE, a plot is produced showing a histogram of the logarithm of the coverage values and the Lorenz curve.
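As a rough illustration of what these two plots contain, the base R sketch below builds a Lorenz curve from a small vector of coverage values and recovers the Gini index from the area under it. The helper lorenz(), the coverage values and the decision to drop zero-coverage positions before taking logarithms are assumptions; the package's own plotting code may differ.

```r
# Lorenz curve: cumulative share of total coverage (L) held by the
# lowest-coverage fraction of positions (p).
lorenz <- function(x) {
  x <- sort(as.numeric(x))
  list(p = c(0, seq_along(x) / length(x)),
       L = c(0, cumsum(x) / sum(x)))
}

cov_values <- c(0, 0, 1, 1, 2, 4, 8, 16)  # hypothetical coverage values
lc <- lorenz(cov_values)

# Gini index = 1 - 2 * (area under the Lorenz curve), trapezoid rule.
area <- sum(diff(lc$p) * (head(lc$L, -1) + tail(lc$L, -1)) / 2)
gini_from_lorenz <- 1 - 2 * area

if (interactive()) {
  op <- par(mfrow = c(1, 2))
  # Zero-coverage positions are dropped here because log(0) is -Inf.
  hist(log(cov_values[cov_values > 0]),
       main = "log(coverage)", xlab = "log(coverage)")
  plot(lc$p, lc$L, type = "l", main = "Lorenz curve",
       xlab = "Fraction of positions", ylab = "Fraction of coverage")
  abline(0, 1, lty = 2)  # line of perfect equality
  par(op)
}
```

The further the curve sags below the dashed diagonal, the larger the Gini index.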


----------Methods----------

Analyze a single RangedData object with 'chrLengths' used for simulations ('species' is ignored).

Analyze a single RangedData object with chromosome lengths for simulations taken from BSgenome 'species' (package must be installed).

Analyze a single RangedData object with 'chrLengths' used as chromosome lengths in simulations.

Analyze all RangedData objects from sample (RangedDataList) with chromosome lengths for simulations taken as the largest end position of reads in each chromosome of all samples.

Analyze all RangedData objects from sample (RangedDataList) with 'chrLengths' used as chromosome lengths in simulations ('species' is ignored).

Analyze all RangedData objects from sample (RangedDataList) with chromosome lengths for simulations taken from BSgenome 'species' (package must be installed).

Analyze all RangedData objects from sample (RangedDataList) with 'chrLengths' used as chromosome lengths in simulations.

Analyze all RangedData objects from sample (RangedDataList) with chromosome lengths for simulations taken as the largest end position of reads in each chromosome of sample.
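When neither 'chrLengths' nor 'species' is given, the methods above fall back to the largest read end position per chromosome. A minimal base R sketch of that fallback, using plain data frames in place of RangedData objects (the data, column names and variable names are hypothetical), might look like this:

```r
# Two hypothetical samples; each row is one aligned read of width 39.
sample_a <- data.frame(chr = c("chr1", "chr1", "chr2"),
                       start = c(10L, 250L, 5L),
                       width = 39L)
sample_b <- data.frame(chr = c("chr2", "chr2"),
                       start = c(90L, 40L),
                       width = 39L)

# Pool the reads of all samples and compute inclusive end positions.
reads <- do.call(rbind, list(sample_a, sample_b))
reads$end <- reads$start + reads$width - 1L

# Chromosome length estimate: largest read end on each chromosome.
chr_lengths <- tapply(reads$end, reads$chr, max)
```

An estimate of this kind can only undershoot the true chromosome length, so supplying 'chrLengths' or a BSgenome 'species' is the safer choice when that information is available.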


----------Author(s)----------



Camille Stephan-Otto




----------References----------

http://en.wikipedia.org/wiki/Gini_coefficient

----------See Also----------

ssdCoverage for another measure of inequality in coverage.


----------Examples----------


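library(htSeqTools)  # provides giniCoverage() and ssdCoverage()

set.seed(1)
# Sample of interest: reads of width 39 clustered in two peaks near positions 100 and 200
peak1 <- round(rnorm(500,100,10))
peak1 <- RangedData(IRanges(peak1,peak1+38),space='chr1')
peak2 <- round(rnorm(500,200,10))
peak2 <- RangedData(IRanges(peak2,peak2+38),space='chr1')
ip <- rbind(peak1,peak2)
# Control: read starts spread uniformly over positions 1 to 300
bg <- runif(1000,1,300)
bg <- RangedData(IRanges(bg,bg+38),space='chr1')

# Compute both inequality measures for each sample
rdl <- RangedDataList(ip,bg)
ssdCoverage(rdl)
giniCoverage(rdl)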

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

