R GenomicRanges package: help documentation for the countGenomicOverlaps() function

loveR · posted on 2012-2-25 19:28:30

countGenomicOverlaps(GenomicRanges)
R package containing countGenomicOverlaps(): GenomicRanges

                                    Count Read Hits in Genomic Features

                                       Translator: LoveR, the 生物统计家园网 (biostatistic.net) robot

----------Description----------

Count read hits per exon or transcript and resolve multi-hit reads.

----------Usage----------

  ## S4 method for signature 'GRangesList,GRangesList'
countGenomicOverlaps(
query, subject,
type = c("any", "start", "end", "within", "equal"),
resolution = c("none", "divide", "uniqueDisjoint"),
ignore.strand = FALSE, splitreads = TRUE, ...)

----------Arguments----------

Argument: query
A GRangesList, or a GRanges, of genomic features. These are the annotations that define the genomic regions and will often be the result of calling "exonsBy" or "transcriptsBy" on a TranscriptDb object. If a GRangesList is provided, each top-level list element represents a "super" feature such as a gene, and each row within it is a "sub" feature such as an exon or transcript. When query is a GRanges, all rows are considered to be of the same level (e.g., all genes, all exons or all transcripts).

Argument: subject
A GRangesList, GRanges, or GappedAlignments representing the data (e.g., reads). List structures as the subject are used to represent reads with multiple parts (i.e., gaps in the CIGAR). When a GappedAlignments is provided, it is coerced to a GRangesList object. If any of the reads in the GappedAlignments have gaps, the corresponding top-level element of the GRangesList will contain multiple ranges. When subject is a GRanges, it is assumed that all reads are simple and do not have multiple parts.

Argument: type
See findOverlaps in the IRanges package for a description of this argument.

Argument: resolution
A character(1) string, one of "none", "divide", or "uniqueDisjoint". These rule sets are used to distribute read hits when multiple queries are hit by the same subject.

"none": No conflict resolution is performed. All subjects that hit more than 1 query are dropped.

"divide": The hit from a single subject is divided equally among all queries that were hit. If a subject hits 4 queries, each query is assigned 1/4 of a hit.

"uniqueDisjoint": Queries hit by a common subject are partitioned into disjoint intervals. Any regions that are shared between the queries are discarded. If the read overlaps exactly one of the remaining unique disjoint regions, the hit is assigned to that feature. If the read overlaps both or none of the regions, no hit is assigned. Therefore, unlike the divide option, uniqueDisjoint does not resolve multi-hit conflicts in all situations.
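
The "uniqueDisjoint" rule can be illustrated with a minimal base-R sketch for two queries. This is not the GenomicRanges implementation: the helper names positions() and assign_unique_disjoint() are hypothetical, ranges are assumed to be closed integer intervals written as c(start, end), and strand is ignored.

```r
# Illustrative sketch of the "uniqueDisjoint" rule for two queries.
# Helper names are hypothetical; they are not part of GenomicRanges.
positions <- function(rng) seq(rng[1], rng[2])

assign_unique_disjoint <- function(read, q1, q2) {
  # Discard the shared positions: keep what belongs to exactly one query.
  unique1 <- setdiff(positions(q1), positions(q2))
  unique2 <- setdiff(positions(q2), positions(q1))
  hits1 <- length(intersect(positions(read), unique1)) > 0
  hits2 <- length(intersect(positions(read), unique2)) > 0
  # Assign the hit only when exactly one unique region is overlapped.
  if (hits1 && !hits2) "q1" else if (hits2 && !hits1) "q2" else NA_character_
}

q1 <- c(1, 10)
q2 <- c(6, 15)  # positions 6..10 are shared and therefore discarded

assign_unique_disjoint(c(3, 7), q1, q2)   # overlaps only q1's unique part: "q1"
assign_unique_disjoint(c(4, 12), q1, q2)  # overlaps both unique parts: NA
assign_unique_disjoint(c(7, 9), q1, q2)   # overlaps only the shared part: NA
```

The last two calls show why this rule leaves some multi-hit reads unassigned.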

Argument: ignore.strand
A logical value indicating whether strand should be ignored when matching. With the default, FALSE, strand is considered.

Argument: splitreads
A logical value indicating whether split reads should be included.

Argument: ...
Additional arguments, perhaps used by methods defined on this generic.

----------Details----------

The countGenomicOverlaps methods use the findOverlaps function in conjunction with a resolution method to identify overlaps and to resolve subjects (reads) that match multiple queries (annotation regions). The usual type argument of findOverlaps specifies the type of overlap. The resolution argument selects the method used to resolve the conflict when a subject hits more than 1 query. Here the term "hit" means an overlap identified by findOverlaps.
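
As a rough illustration of how "none" and "divide" distribute hits among queries, the following base-R sketch works on a logical hit matrix. It is not the package's code: resolve_hits() and the matrix layout (hits[i, j] is TRUE when read j overlaps query i) are assumptions made for this example.

```r
# Illustrative sketch only; resolve_hits() is a hypothetical helper.
# hits[i, j] is TRUE when subject (read) j overlaps query i.
resolve_hits <- function(hits, resolution = c("none", "divide")) {
  resolution <- match.arg(resolution)
  n_hit <- colSums(hits)  # number of queries hit by each read
  weight <- switch(resolution,
    none   = ifelse(n_hit == 1, 1, 0),          # multi-hit reads are dropped
    divide = ifelse(n_hit > 0, 1 / n_hit, 0))   # each hit is shared equally
  # Scale every column by its read's weight, then total per query.
  rowSums(hits * rep(weight, each = nrow(hits)))
}

hits <- matrix(c(TRUE,  FALSE, FALSE,
                 TRUE,  TRUE,  FALSE,
                 FALSE, TRUE,  TRUE),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("A", "B", "C"), c("r1", "r2", "r3")))

resolve_hits(hits, "none")    # A = 0,   B = 0, C = 1
resolve_hits(hits, "divide")  # A = 0.5, B = 1, C = 1.5
```

With "divide" the counts sum to the number of reads (3), whereas "none" keeps only the single read that hit exactly one query.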

The primary difference between the handling of split reads and simple reads (i.e., no gap in the CIGAR) is the portion of a hit that each split-read fragment can contribute. Every read, whether simple or split, has an overall value of 1 to contribute to the queries it hits. For a split read, this value is further divided by the number of fragments in the read. For example, if a split read has 3 fragments (i.e., two gaps in the CIGAR), each fragment has a value of 1/3 to contribute to the query it hits. As with simple reads, depending on the resolution chosen, this value may be divided, fully assigned or discarded.
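
The fragment arithmetic can be sketched in base R for the "divide" case. This is only an illustration under stated assumptions: split_read_divide() is a hypothetical helper, and the input is assumed to be a list with one element per fragment, naming the queries that fragment overlaps.

```r
# Illustrative sketch only; split_read_divide() is a hypothetical helper.
# frag_hits: one element per fragment of a single split read, each naming
# the queries that fragment overlaps.
split_read_divide <- function(frag_hits, queries) {
  counts <- setNames(numeric(length(queries)), queries)
  frag_value <- 1 / length(frag_hits)  # the read's value of 1, split by fragment
  for (hit in frag_hits) {
    if (length(hit) == 0) next         # fragment overlaps no query
    # A fragment hitting several queries shares its value equally.
    counts[hit] <- counts[hit] + frag_value / length(hit)
  }
  counts
}

# A split read with 3 fragments: two hit Q1, the third hits Q2 and Q3.
frag_hits <- list("Q1", "Q1", c("Q2", "Q3"))
split_read_divide(frag_hits, c("Q1", "Q2", "Q3"))
# Q1 = 2/3, Q2 = 1/6, Q3 = 1/6
```

The counts sum to 1, so the read is never counted more than once however many fragments or queries are involved.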

More detailed examples can be found in the countGenomicOverlaps vignette.

----------Value----------

A vector of counts

----------Author(s)----------

Valerie Obenchain and Martin Morgan

----------Examples----------

## Not run:
library(GenomicRanges)

## Helpers that build "+" strand ranges on chr1 and chr2.
rng1 <- function(s, w)
    GRanges(seqnames="chr1", IRanges(s, width=w), strand="+")

rng2 <- function(s, w)
    GRanges(seqnames="chr2", IRanges(s, width=w), strand="+")

query <- GRangesList(A=rng1(1000, 500),
                  B=rng2(2000, 900),
                  C=rng1(c(3000, 3600), c(500, 300)),
                  D=rng2(c(7000, 7500), c(600, 300)),
                  E1=rng1(4000, 500), E2=rng1(c(4300, 4500), c(400, 400)),
                  F=rng2(3000, 500),
                  G=rng1(c(5000, 5600), c(500, 300)),
                  H1=rng1(6000, 500), H2=rng1(6600, 400))

subj <- GRangesList(a=rng1(1400, 500),
                  b=rng2(2700, 100),
                  c=rng1(3400, 300),
                  d=rng2(7100, 600),
                  e=rng1(4200, 500),
                  f=rng2(c(3100, 3300), 50),
                  g=rng1(c(5400, 5600), 50),
                  h=rng1(c(6400, 6600), 50))

## Overlap type = "any"
none <- countGenomicOverlaps(query, subj,
                           type="any", resolution="none")
divide <- countGenomicOverlaps(query, subj,
                           type="any", resolution="divide")
uniqueDisjoint <- countGenomicOverlaps(query, subj, type="any",
                                    resolution="uniqueDisjoint")
data.frame(none = none,
         divide = divide,
         uniqDisj = uniqueDisjoint)

## Split read with 4 fragments:
splitreads <- GRangesList(c(rng1(c(3000, 3200, 4000), 100), rng1(5400, 300)))
## Unlist both the splitreads and the query to see that
## - read fragments 1 and 2 both hit query 3
## - read fragment 3 hits query 7
## - read fragment 4 hits queries 11 and 12
findOverlaps(unlist(query), unlist(splitreads))

## Use countGenomicOverlaps to avoid double counting.
## Because this read has 4 parts, each part contributes a count of 1/4.
## When resolution="none" only reads that hit a single region are counted.
split_none <- countGenomicOverlaps(query, splitreads, type="any",
                                 resolution="none")
## When resolution="divide" all reads are counted by dividing their count
## evenly between the regions they hit. Region 3 of the query was hit
## by two read fragments, each contributing a count of 1/4. Region 7 was hit
## by one fragment contributing a count of 1/4. Regions 11 and 12 were both
## hit by the same fragment (fragment 4), so they share (i.e., "divide") the
## single 1/4 hit that fragment had to contribute.
split_divide <- countGenomicOverlaps(query, splitreads,
                                 type="any", resolution="divide")

data.frame(none = split_none,
         divide = split_divide)

## End(Not run)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

