export-tracks(rtracklayer)
export-tracks() belongs to the R package rtracklayer
Export tracks
----------Description----------
These functions output RangedData instances in various formats.
----------Usage----------
export.gff(object, con, version = c("1", "2", "3"),
           source = "rtracklayer", append = FALSE, ...)
export.gff1(object, con, ...)
export.gff2(object, con, ...)
export.gff3(object, con, ...)
export.bed(object, con, variant = c("base", "bedGraph", "bed15"),
           color = NULL, append = FALSE, ...)
export.bed15(object, con, expNames = NULL, ...)
export.bedGraph(object, con, ...)
export.wig(object, con,
           dataFormat = c("auto", "variableStep", "fixedStep"), ...)
export.ucsc(object, con,
            subformat = c("auto", "gff1", "wig", "bed", "bed15", "bedGraph"),
            append = FALSE, ...)
## not yet supported on Windows
export.bw(object, con,
          dataFormat = c("auto", "variableStep", "fixedStep", "bedGraph"),
          seqlengths = GenomicRanges::seqlengths(object), compress = TRUE, ...)
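Several arguments in the usage above (version, variant, dataFormat, subformat) are declared with a vector of allowed choices. In R, such a default is conventionally resolved with match.arg(), which picks the first choice when the argument is omitted. The sketch below illustrates that idiom with a hypothetical function, pick_version. That rtracklayer resolves its choices this way internally is an assumption.

```r
# Hypothetical stand-in for an exporter; not the rtracklayer implementation.
pick_version <- function(version = c("1", "2", "3")) {
  # match.arg() returns the first choice when the argument is not supplied
  # and raises an error for a value outside the declared choices.
  match.arg(version)
}

pick_version()     # argument omitted: first choice is used
pick_version("3")  # explicit choice
```

Under this idiom the order of the choice vector determines the default, which matches the documented default of "1" for version.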
----------Arguments----------
Argument: object
The object to export, such as a RangedData, or anything coercible to a RangedData. If it is a UCSCData, the track line information is output. In the case of export.bed15, export.bedGraph, export.wig, and export.ucsc, a RangedDataList object with possibly multiple tracks is supported.
Argument: con
The connection to which the object is exported.
Argument: version
The GFF version, either "1", "2" or "3" (default is "1").
Argument: source
The source of the GFF information (GFF output only).
Argument: variant
Which variant of BED lines to output; for internal use, not for the user.
Argument: color
Recycled vector of colors, as interpreted by col2rgb, for BED features. If NULL, the color column in the featureData is used, if any.
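To make the color argument concrete, the following base-R sketch recycles a color vector to one entry per feature and converts it with col2rgb into the "r,g,b" strings that a BED itemRgb column holds. The helper to_item_rgb is hypothetical and only illustrates the conversion; it is not the code rtracklayer runs.

```r
# Hypothetical helper: turn R colors into BED-style "r,g,b" strings.
to_item_rgb <- function(color, n) {
  color <- rep(color, length.out = n)   # recycle to one color per feature
  rgb <- grDevices::col2rgb(color)      # 3 x n integer matrix: red, green, blue
  unname(apply(rgb, 2, paste, collapse = ","))
}

# Two colors recycled over three features.
to_item_rgb(c("red", "#0000FF"), 3)
```

Any color specification that col2rgb accepts (names, hex strings, palette indices) can be passed the same way.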
Argument: dataFormat
The format of the data lines for WIG tracks, see references. The "auto" format uses the most efficient format possible.
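One way to read "most efficient format possible" is in terms of the constraints of the WIG line types: fixedStep needs a constant feature width and constant spacing, variableStep needs only a constant width, and anything else has to fall back to bedGraph-style lines. The sketch below encodes that reading for features on a single sequence. The function choose_wig_format is hypothetical, and the actual selection logic in rtracklayer may differ.

```r
# Hypothetical sketch of an "auto" choice for one sequence; an assumption,
# not the rtracklayer implementation.
choose_wig_format <- function(start, width) {
  same_width <- length(unique(width)) == 1L
  steps <- diff(start)
  regular <- length(steps) > 0L && length(unique(steps)) == 1L
  if (same_width && regular) {
    "fixedStep"      # constant span and constant step
  } else if (same_width) {
    "variableStep"   # constant span, irregular positions
  } else {
    "bedGraph"       # varying widths need explicit start/end per line
  }
}

choose_wig_format(start = c(1, 11, 21), width = c(10, 10, 10))
choose_wig_format(start = c(1, 11, 31), width = c(10, 10, 10))
choose_wig_format(start = c(1, 11, 21), width = c(10, 5, 10))
```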
Argument: subformat
The format of the tracks within the UCSC container. If "auto", the type is determined from the track line. If object is not a UCSCData, this essentially means "wig" or "bedGraph" (depending on the density) if there is a numeric score, else "bed".
Argument: expNames
Names of the columns in object that hold the experimental data. Defaults to all column names, unless object is a UCSCData, in which case the expNames field is taken from the track line, if it exists.
Argument: seqlengths
The lengths of each sequence in object. If seqinfo(object) is missing sequence lengths, an attempt is made to retrieve the sequence lengths from an installed BSgenome package or UCSC, as long as there is a matching genome identifier.
Argument: append
Logical, whether to append the output to the connection.
Argument: compress
Logical, indicating whether to compress the bigWig output.
Argument: ...
For export.gff1, export.gff2 and export.gff3: arguments to pass to export.gff. For export.bed: arguments to pass to methods. For export.bed15, export.bedGraph and export.wig: arguments to pass to export.ucsc. For export.ucsc: arguments to pass to export.subformat or to set on the slots of the TrackLine subclass corresponding to subformat.
----------Details----------
The following is some advice for choosing a file format.
GFF: The General Feature Format is meant to represent any set of genomic features, with application-specific columns represented as “attributes”. There are three principal versions (1, 2, and 3). This is a good format for interoperating with other genomic tools. UCSC supports GFF1, but it needs to be encapsulated in the UCSC metaformat,
BED: The Browser Extended Display format is for displaying tracks in a genome browser, in particular UCSC. There are many options to control the appearance of the track, see GraphTrackLine. To output a track line when object is not a UCSCData,
Bed15: An extension of BED with 15 columns, Bed15 is meant to represent data from microarray experiments. Multiple samples/columns are supported, and the data is displayed as a compact heatmap. With 15 columns per feature, this format is probably too verbose for e.g. ChIP-seq coverage (use multiple WIG
bedGraph: A variant of BED that represents experimental data more compactly than BED and especially Bed15, although only one sample is supported. The data is displayed as a bar or line graph. For dense data, WIG is preferred.
WIG: The Wiggle format is meant for storing dense numerical data, such as the coverage from a ChIP-seq experiment. The data is displayed as a bar or line graph.
In summary, BED is usually best for displaying qualitative features or sparse quantitative features (like ChIP-seq peaks), while WIG is usually best for displaying dense data like coverage.
In general, columns in the RangedData are mapped to the column of the same name in the track format. For example, a column named “itemRgb” will be mapped to the corresponding column in BED-formatted output, while it is ignored for other formats. Missing values are mapped between NA in R and the format-specific missing value indicator, usually “.”. The following describes how the RangedData object is mapped to each track format. Default values for columns are given in parentheses.
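The mapping between NA and the “.” indicator can be illustrated in base R. The helpers to_field and from_field are hypothetical names used only for this sketch, which assumes a numeric column such as “score”.

```r
# Hypothetical helpers illustrating the NA <-> "." convention.
to_field <- function(x) {
  out <- as.character(x)
  out[is.na(x)] <- "."   # NA in R becomes "." in the file
  out
}

from_field <- function(x) {
  x[x == "."] <- NA      # "." in the file becomes NA in R
  as.numeric(x)
}

score <- c(1.5, NA, 3)
to_field(score)
from_field(to_field(score))
```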
GFF: Maps columns named “source” (“rtracklayer”), “feature” (“sequence”), “score” (“.”), “strand” (“.”), “frame” (“.”), and (version 1 only) “group” (seqname). In GFF versions 2 and 3, extra columns are mapped to attributes.
BED: Maps columns named “name” (“.”), “score” (“.”), “strand” (“.”), “thickStart” (start), “thickEnd” (end), “itemRgb” (“0,0,0”), “blockSizes”, and “blockStarts”. Note that the BED field “blockCount” is derived automatically. The intervals specified by “thickStart”, “thickEnd” and “blockStarts” are 0-based, half-open as in BED. Note that this is different from the chromosome start/end stored in the Ranges object (1-based, closed). The “itemRgb” column should be specified in a format understood by col2rgb.
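The difference between the two coordinate conventions is a shift of the start position only. The sketch below converts 1-based, closed intervals (as stored in a Ranges object) to the 0-based, half-open form used by BED. The function to_bed_interval is a hypothetical helper for illustration.

```r
# Hypothetical helper: 1-based closed [start, end] -> 0-based half-open [start - 1, end)
to_bed_interval <- function(start, end) {
  data.frame(chromStart = start - 1L, chromEnd = end)
}

bed <- to_bed_interval(start = c(1L, 101L), end = c(100L, 150L))
bed
# The width is unchanged: chromEnd - chromStart equals end - start + 1.
bed$chromEnd - bed$chromStart
```

Because “thickStart”, “thickEnd” and “blockStarts” are written as given, values for those columns should already be in the 0-based, half-open convention.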
Bed15: In addition to the behavior for BED above, encodes columns named by the expNames parameter into the fields “expCount”, “expIds” and “expScores”.
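As a rough picture of this encoding, the sketch below packs a matrix of per-experiment values into the three fields. It assumes that “expIds” holds 0-based experiment indices and that both list fields are comma-separated, following the UCSC microarray BED15 layout. The function encode_bed15 is hypothetical and is not the rtracklayer implementation.

```r
# Hypothetical sketch: one row per feature, one column per experiment.
encode_bed15 <- function(scores) {
  n <- ncol(scores)
  data.frame(
    expCount  = n,                                        # number of experiments
    expIds    = paste(seq_len(n) - 1L, collapse = ","),   # assumed 0-based ids
    expScores = unname(apply(scores, 1, paste, collapse = ",")),
    stringsAsFactors = FALSE
  )
}

scores <- matrix(c(0.5, 1, -1.2, 2), nrow = 2,
                 dimnames = list(NULL, c("sampleA", "sampleB")))
fields <- encode_bed15(scores)
fields
```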
bedGraph: The “score” column is used for the quantitative values.
WIG: The “score” column is used for the quantitative values.
The graph formats do not encode a strand. Thus, when targeting the UCSC format, if a track contains features from multiple strands, one track will be output for each strand. The string "m", "p" or "NA" is appended to the base track name for the minus, plus and NA/* strand, respectively.
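The per-strand splitting can be sketched in base R on a plain data frame. The function split_by_strand is hypothetical. Joining the suffix directly to the base name without a separator is an assumption, since the text only says that the string is appended.

```r
# Hypothetical sketch: split features into one track per strand.
split_by_strand <- function(features, base_name) {
  suffix <- unname(c("-" = "m", "+" = "p")[as.character(features$strand)])
  suffix[is.na(suffix)] <- "NA"          # covers both NA and "*" strands
  if (length(unique(suffix)) < 2L) {
    # a single strand: keep one track under the base name
    return(stats::setNames(list(features), base_name))
  }
  tracks <- split(features, suffix)
  names(tracks) <- paste0(base_name, names(tracks))  # separator is an assumption
  tracks
}

features <- data.frame(start = c(1, 5, 9, 20),
                       strand = c("+", "-", "*", NA),
                       stringsAsFactors = FALSE)
tracks <- split_by_strand(features, "peaks")
names(tracks)
```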
----------Value----------
If con is missing, a character vector containing the string output; otherwise nothing is returned.
----------Author(s)----------
Michael Lawrence
----------References----------
----------See Also----------
See export for the high-level interface to these
----------Examples----------
library(rtracklayer)
dummy <- file() # dummy file connection for demo
track <- import(system.file("tests", "bed.wig", package = "rtracklayer"))
## output a track as GFF2
export.gff(track, dummy, version = "2")
## equivalently
export.gff2(track, dummy)
## output as WIG string in variableStep format
## (con is omitted so that the string is returned; see Value)
wig <- export.wig(track, dataFormat = "variableStep")
## output multiple tracks in UCSC meta-format
track2 <- import(system.file("tests", "v1.gff", package = "rtracklayer"))
## output to WIG
library(IRanges) # for the RangedDataList() constructor
export.ucsc(RangedDataList(track, track2), dummy, subformat = "wig")
close(dummy) # release the demo connection