R package iSeq: help documentation for the mergetag() function

loveR · posted 2012-2-25 22:34:40

mergetag(iSeq)
R package containing mergetag(): iSeq

Aggregate sequence tags into dynamic genomic windows/bins and count the number of tags

Translated by: LoveR, the 生物统计家园网 (biostatistic.net) robot

----------Description----------

Aggregates sequence tags into genomic windows/bins whose lengths change dynamically within limits specified by the user, and counts the number of tags falling in each window/bin.

----------Usage----------

mergetag(chip,control,maxlen=80,minlen=10,ntagcut=10)

----------Arguments----------

Argument: chip
An n by 3 matrix or data frame whose rows correspond to sequence tags. chip[,1] contains the chromosome IDs. chip[,2] contains the genomic positions of the sequence tags matched to the reference genome; each position must be an integer. To infer the true binding sites accurately, we suggest using the middle position of each tag as its position on the chromosome. chip[,3] contains the direction indicators of the sequence tags. The user can use essentially any symbols to represent the forward and reverse chains. mergetag represents the chain directions by the integers 1 and 2, obtained with as.numeric(as.factor(chip[,3])), so the user should know which direction each integer refers to. For example, if the forward and reverse chains are represented by 'F' and 'R', then chains 1 and 2 refer to the forward and reverse chains, respectively. In the output, the tag counts are summarized separately for chains 1 and 2 (see the Value section for details).
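
The factor-based coding of the direction column can be checked in base R before calling mergetag. The following is a minimal sketch on made-up data: the object names (toy_chip, tag_start, tag_end), the column names, and the assumed 25-bp tag length are illustrative only and are not part of iSeq.

```r
# Toy tags (hypothetical values): start/end coordinates and strand symbols.
tag_start <- c(100L, 250L, 400L, 900L)
tag_end   <- tag_start + 24L            # assumed 25-bp tags
strand    <- c("F", "R", "R", "F")

# Use the integer midpoint of each tag as its genomic position.
tag_mid <- (tag_start + tag_end) %/% 2L

toy_chip <- data.frame(chr = c("chr1", "chr1", "chr1", "chr2"),
                       pos = tag_mid,
                       strand = strand,
                       stringsAsFactors = FALSE)

# The conversion described in the help text: factor levels are sorted,
# so "F" is coded as 1 and "R" as 2.
chain <- as.numeric(as.factor(toy_chip[, 3]))
print(levels(as.factor(toy_chip[, 3])))
print(chain)
```

Printing the factor levels first is a cheap way to confirm which symbol ends up as chain 1 and which as chain 2 for a given data set.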

Argument: control
An n by 3 matrix or data frame. The column names of control must be the same as the column names of chip.
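
The column-name requirement is easy to verify up front. The helper below, check_tag_inputs, is a hypothetical convenience function written for this sketch (it is not an iSeq function), and the toy data frames are made up.

```r
# Hypothetical helper, not part of iSeq: check the documented input shape.
check_tag_inputs <- function(chip, control) {
  stopifnot(ncol(chip) == 3, ncol(control) == 3)
  if (!identical(colnames(chip), colnames(control))) {
    stop("column names of 'control' must match those of 'chip'")
  }
  invisible(TRUE)
}

toy_chip <- data.frame(chr = "chr1", pos = c(112L, 262L), strand = c("F", "R"))
toy_ctrl <- data.frame(chr = "chr1", pos = 500L, strand = "R")
check_tag_inputs(toy_chip, toy_ctrl)

# A control with different column names can be renamed to match chip.
bad_ctrl <- data.frame(chrom = "chr1", position = 500L, dir = "R")
colnames(bad_ctrl) <- colnames(toy_chip)
check_tag_inputs(toy_chip, bad_ctrl)
```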

Argument: maxlen
The maximum length of the genomic window/bin into which sequence tags are aggregated.

Argument: minlen
The minimum length of the genomic window/bin into which sequence tags are aggregated.

Argument: ntagcut
The tag count cutoff that triggers a change in bin size. Suppose L_i and C_i are the length and tag count of bin i, respectively. If C_i >= ntagcut, the length of bin i+1 will be max(L_i/2, minlen); if C_i < ntagcut, the length of bin i+1 will be min(2*L_i, maxlen). In other words, the bin size decreases or increases by a factor of 2 but stays between minlen and maxlen. The user should therefore choose maxlen = (2^n)*minlen.
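
The bin-length rule can be traced by hand. The sketch below reads the rule as a clamp (halve the length but never go below minlen, double it but never exceed maxlen), which is an interpretation consistent with the definitions of minlen and maxlen rather than a copy of the package's internal code. The function names, the made-up tag counts, and the choice of maxlen as the first bin length are all assumptions made for illustration.

```r
# Hypothetical helper, not an iSeq function: length of bin i+1 given bin i.
next_bin_length <- function(len, count, maxlen = 80, minlen = 10, ntagcut = 10) {
  if (count >= ntagcut) {
    max(len / 2, minlen)   # dense bin: halve, but not below minlen
  } else {
    min(2 * len, maxlen)   # sparse bin: double, but not above maxlen
  }
}

# Check that maxlen equals minlen times a non-negative power of two.
is_power_of_two_ratio <- function(maxlen, minlen) {
  n <- log2(maxlen / minlen)
  n >= 0 && isTRUE(all.equal(n, round(n)))
}

counts <- c(2, 15, 30, 12, 0, 1, 3)  # made-up tag counts for consecutive bins
len <- 80                            # assumed starting length (maxlen)
lens <- numeric(length(counts))
for (i in seq_along(counts)) {
  lens[i] <- len                     # length used for bin i
  len <- next_bin_length(len, counts[i])
}
print(lens)
print(is_power_of_two_ratio(80, 10))
```

With the default values, 80 = (2^3)*10, so repeated halving from maxlen lands exactly on minlen.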

----------Value----------

A data frame with rows corresponding to the bins and columns corresponding to the following:

Column: chr
Chromosome IDs.

Column: gstart
The start position of the bin.

Column: gend
The end position of the bin.

Column: ct12
For one-sample analysis, where only the ChIP data are available, ct12 = ipct1 + ipct2. For two-sample analysis, where both the ChIP and control data are available, ct12 = max(ipct1+ipct2-conct1-conct2, 0).

Column: ipct1
The number of sequence tags for chain 1 of the ChIP data.

Column: ipct2
The number of sequence tags for chain 2 of the ChIP data.

Column: conct1
The number of sequence tags for chain 1 of the control data.

Column: conct2
The number of sequence tags for chain 2 of the control data.
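
The two definitions of ct12 can be reproduced from the per-chain counts. This is a small sketch with made-up counts for three bins; it illustrates the arithmetic only and does not call iSeq.

```r
# Made-up per-bin counts for chains 1 and 2 (ChIP and control).
ipct1  <- c(5, 0, 12)
ipct2  <- c(3, 1, 9)
conct1 <- c(1, 2, 4)
conct2 <- c(0, 3, 2)

# One-sample analysis: total ChIP tags over both chains.
ct12_one_sample <- ipct1 + ipct2

# Two-sample analysis: control-subtracted total, truncated at zero per bin.
ct12_two_sample <- pmax(ipct1 + ipct2 - conct1 - conct2, 0)

print(ct12_one_sample)
print(ct12_two_sample)
```

pmax is used instead of max so that the truncation at zero is applied bin by bin rather than to the whole vector.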

----------Author(s)----------

Qianxing Mo moq@mskcc.org

----------References----------

data analysis. Biostatistics, Advance Access published September 13, 2011. doi:10.1093/biostatistics/kxr029

----------See Also----------

iSeq1, iSeq2, peakreg, plotreg

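----------Examples----------

library(iSeq)   # provides mergetag() and the nrsf example data
data(nrsf)
# Pool the three ChIP data sets and the three mock (control) data sets
chip = rbind(nrsf$chipFC1592,nrsf$chipFC1862,nrsf$chipFC2002)
mock = rbind(nrsf$mockFC1592,nrsf$mockFC1862,nrsf$mockFC2002)

# Aggregate tags into dynamic bins; here maxlen = 80 = (2^3)*minlen
tagct = mergetag(chip=chip,control=mock,maxlen=80,minlen=10,ntagcut=10)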

Please credit the source when reposting: 生物统计家园网 (http://www.biostatistic.net).

