R language, Rsamtools package: VcfInput() help documentation

Posted on 2012-2-26 13:30:27
VcfInput(Rsamtools)
VcfInput() belongs to the R package: Rsamtools

Operations on 'VCF' or 'BCF' (variant call) files.

Translator: 生物统计家园网 (biostatistic.net), bot LoveR

----------Description----------

Import, coerce, or index variant call files in text or binary format.


----------Usage----------



scanBcfHeader(file, ...)
## S4 method for signature 'character'
scanBcfHeader(file, ...)

scanBcf(file, ...)
## S4 method for signature 'character'
scanBcf(file, index = file, ..., param=ScanBcfParam())

asBcf(file, dictionary, destination, ...,
      overwrite=FALSE, indexDestination=TRUE)
## S4 method for signature 'character'
asBcf(file, dictionary, destination, ...,
      overwrite=FALSE, indexDestination=TRUE)

indexBcf(file, ...)
## S4 method for signature 'character'
indexBcf(file, ...)

scanVcfHeader(file, ...)
## S4 method for signature 'character'
scanVcfHeader(file, ...)

scanVcf(file, ..., param)
## S4 method for signature 'character,ANY'
scanVcf(file, ..., param)
## S4 method for signature 'character,missing'
scanVcf(file, ..., param)
## S4 method for signature 'connection,missing'
scanVcf(file, ..., param)

unpackVcf(x, hdr, ..., info=TRUE, geno=TRUE)
## S4 method for signature 'list,missing'
unpackVcf(x, hdr, ..., info=TRUE, geno=TRUE)
## S4 method for signature 'list,character'
unpackVcf(x, hdr, ..., info=TRUE, geno=TRUE)
## S4 method for signature 'list,TabixFile'
unpackVcf(x, hdr, ..., info=TRUE, geno=TRUE)



----------Arguments----------

Argument: file
For scanBcf and scanBcfHeader, the character() file name of the "VCF" or "BCF" file to be processed, or an instance of class BcfFile. For scanVcf and scanVcfHeader, the character() file name, TabixFile, or connection (file() or bgzip()) of the "VCF" file to be processed.


Argument: index
The character() file name(s) of the "BCF" index to be processed.


Argument: dictionary
A character vector of the unique "CHROM" names in the VCF file.


Argument: destination
The character(1) file name of the location where the BCF output file will be created. For asBcf, give this without the ".bcf" file suffix.


Argument: param
An instance of ScanBcfParam or ScanVcfParam that determines which records are parsed and which "INFO" and "GENO" information is returned.


Argument: ...
Additional arguments, e.g., for the scanBcfHeader,character-method, the mode of the BcfFile.


Argument: overwrite
A logical(1) indicating whether the destination can be overwritten if it already exists.


Argument: indexDestination
A logical(1) indicating whether the created destination file should also be indexed.


Argument: x
A list() resulting from scanVcf.


Argument: hdr
A character(1) or TabixFile instance from which scanVcfHeader can extract information on the structure of INFO and FORMAT specifications.


Argument: info, geno
For non-"missing" methods of unpackVcf, a logical(1) indicating whether the "INFO" or "GENO" fields of x should be expanded. If TRUE, then scanVcfHeader(hdr) is consulted for the description of INFO and / or FORMAT fields. For the "missing" method of unpackVcf, either a logical(1) (in which case the corresponding field is not unpacked, regardless of value) or a DataFrame or data.frame with row names corresponding to field elements and with columns Number and Type as defined in the VCF specification at the URL below. Usually, these are obtained from scanVcfHeader on the same file as was used to parse the data passed as argument x.
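
The data.frame form of info and geno described above (row names for field IDs, columns Number and Type) can be pictured with a small base R sketch. This is only an illustration under stated assumptions: the header lines are invented, the helper field() is hypothetical, and the exact object returned by scanVcfHeader may differ.

```r
# Made-up VCF header lines, for illustration only.
header <- c(
  "##fileformat=VCFv4.1",
  "##INFO=<ID=DP,Number=1,Type=Integer,Description=\"Total depth\">",
  "##INFO=<ID=AF,Number=A,Type=Float,Description=\"Allele frequency\">",
  "##FORMAT=<ID=GT,Number=1,Type=String,Description=\"Genotype\">"
)

# Keep only the INFO declarations.
info_lines <- grep("^##INFO=", header, value = TRUE)

# Hypothetical helper: pull the value of `key` out of each declaration.
field <- function(lines, key) {
  sub(paste0(".*[<,]", key, "=([^,>]*).*"), "\\1", lines)
}

# Row names are the field IDs; columns are Number and Type.
info <- data.frame(
  Number = field(info_lines, "Number"),
  Type = field(info_lines, "Type"),
  row.names = field(info_lines, "ID"),
  stringsAsFactors = FALSE
)
info
```

A table of this shape tells an unpacking step how many values each field holds and how to convert them.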


----------Details----------

Most users will use the vcf* functions; the bcf* functions are restricted to the GENO fields supported by "bcftools" (see the documentation at the URL below). The argument param allows portions of the file to be input, but requires that the file be BCF, or bgzip-compressed and indexed as a TabixFile.

scanVcf with param="missing" and file="character" or file="connection" scans the entire file. With file="connection", an argument n indicates the number of lines of the VCF file to input; a connection that is open at the beginning of the call remains open and is advanced by n lines at the end of the call, providing a convenient way to stream through large VCF files.
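
The streaming behaviour described above (an open connection advanced by n lines per call) can be seen with base R alone. The sketch below uses readLines as a stand-in and does not call scanVcf; the records are made up, and the chunk size n = 2 is arbitrary.

```r
# Toy tab-delimited records standing in for VCF body lines (made up).
vcf_lines <- c(
  "chr1\t100\t.\tA\tG",
  "chr1\t200\t.\tC\tT",
  "chr1\t300\t.\tG\tA",
  "chr2\t150\t.\tT\tC",
  "chr2\t250\t.\tA\tT"
)

con <- textConnection(vcf_lines)  # the connection is already open
n <- 2L
chunk_sizes <- integer()

repeat {
  # Each call reads at most n lines and leaves the connection
  # positioned after the last line read.
  chunk <- readLines(con, n = n)
  if (length(chunk) == 0L) break
  chunk_sizes <- c(chunk_sizes, length(chunk))
}
close(con)

chunk_sizes  # lines consumed per iteration
```

Because the connection keeps its position between calls, each iteration sees only the next block of lines, so memory use is bounded by n rather than by file size.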

The INFO field of the scanned VCF file is returned as a single "packed" vector, as in the VCF file. The GENO field is returned as a list of matrices; each matrix corresponds to a field as defined in the FORMAT field of the VCF header. Each matrix has as many rows as were scanned in the VCF file, and as many columns as there are samples. As with the INFO field, the elements of the matrix are "packed". INFO and GENO are returned packed to facilitate manipulation, e.g., selecting particular rows or samples in a consistent manner across elements.
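
To see why packed matrices make consistent selection easy, here is a minimal base R sketch. The geno list is hypothetical (invented values and sample names s1 and s2) and only mimics the shape described above: one character matrix per FORMAT field, rows for records, columns for samples.

```r
# Hypothetical packed GENO list: values are made up.
geno <- list(
  GT = matrix(c("0/1", "1/1", "0/0", "0/1", "0/0", "1/1"),
              nrow = 3, dimnames = list(NULL, c("s1", "s2"))),
  PL = matrix(c("10,0,20", "30,5,0", "0,8,40", "12,0,25", "0,9,50", "60,7,0"),
              nrow = 3, dimnames = list(NULL, c("s1", "s2")))
)

# The same row and sample selection is applied to every field at once,
# because all matrices share the records x samples layout.
keep_rows <- c(1, 3)
subset_geno <- lapply(geno, function(m) m[keep_rows, "s2", drop = FALSE])
subset_geno
```

Each entry stays a single string while packed, so a multi-valued field such as PL can be subset with ordinary matrix indexing before any parsing takes place.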

unpackVcf processes the INFO and / or GENO fields, typically using the information encoded in the header and extracted by consulting scanVcfHeader. The INFO or FORMAT specification includes a field Number. When this is an integer value, the corresponding INFO or GENO is unpacked as a matrix or array. For fields with variable numbers of elements ("A", "G", "."), the unpacked data is a list of vectors (for INFO) or a list of lists of vectors (for GENO), with the outer list corresponding to rows in the scanned VCF, the inner list of GENO corresponding to samples, and the inner vector corresponding to sub-elements of the element.
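
The difference between fixed-Number and variable-Number fields can be sketched in base R. This is not the implementation of unpackVcf; the values are invented, and the dimension order of the array (records x samples x sub-elements) is an assumption made for the illustration.

```r
# Packed INFO values (made up). AF has Number=A, so its length varies.
info_af <- c("0.5", "0.25,0.75", "0.1")

# Variable Number: one numeric vector per record, held in a list.
af <- lapply(strsplit(info_af, ",", fixed = TRUE), as.numeric)

# Packed GENO field with Number=3: records x samples, "a,b,c" strings.
pl_packed <- matrix(c("10,0,20", "30,5,0", "12,0,25", "60,7,0"),
                    nrow = 2, dimnames = list(NULL, c("s1", "s2")))

# Fixed Number: split every entry, then reshape into an array.
parts <- strsplit(as.vector(pl_packed), ",", fixed = TRUE)
values <- as.integer(unlist(parts))
# Values arrive as sub-element, then record, then sample; aperm
# reorders the dimensions to records x samples x sub-elements.
pl <- aperm(array(values, dim = c(3L, nrow(pl_packed), ncol(pl_packed))),
            c(2, 3, 1))
dimnames(pl) <- list(NULL, colnames(pl_packed), NULL)

af
pl[1, "s2", ]  # the three PL values for record 1, sample s2
```

A rectangular array works only when every entry has the same number of sub-elements, which is why variable-length fields fall back to lists.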


----------Value----------

scanVcfHeader / scanBcfHeader returns a list, with one element for each file named in file. Each element of the list is itself a list containing three elements. The reference element is a character() vector with names of reference sequences. The sample element is a character() vector of names of samples. The header element is a character() vector of the header lines (preceded by "##") present in the VCF file.

scanVcf / scanBcf returns a list, with one element per file. Each list has elements corresponding to the columns of the VCF specification: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, GENO.

The GENO element is itself a list, with elements corresponding to those defined in the VCF file header. For scanVcf, elements of GENO are returned as a matrix of records x samples; if the description of the element in the file header indicates multiplicity other than 1 (e.g., variable number for "A", "G", or "."), then each entry in the matrix is a character string with comma-delimited sub-entries.

asBcf creates a binary BCF file from a text VCF file.

indexBcf creates an index into the BCF file.

unpackVcf returns a list of the same form as scanVcf, but with INFO and / or GENO elements unpacked to matrix or list elements as appropriate.


----------Author(s)----------



Martin Morgan <mtmorgan@fhcrc.org>.




----------References----------

specification.
information on the portion of the specification implemented by bcftools.
samtools.

----------See Also----------

BcfFile, TabixFile


----------Examples----------


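library(Rsamtools)
## Locate the example BCF file shipped with Rsamtools
fl <- system.file("extdata", "ex1.bcf", package="Rsamtools")
## Header: reference sequences, sample names, and "##" header lines
scanBcfHeader(fl)
## Read all records
bcf <- scanBcf(fl)
## value: list-of-lists
str(bcf[1:8])
## GENO fields available, then the first few PL entries
names(bcf[["GENO"]])
str(head(bcf[["GENO"]][["PL"]]))
## Run the examples from the BcfFile help page
example(BcfFile)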

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

