R segmentSeq package: help documentation for the readMethods() functions

loveR · posted 2012-2-26 14:00:50

readMethods(segmentSeq)
R package containing readMethods(): segmentSeq

Functions for processing files of various formats into an 'alignmentData' object.

Translator: LoveR, the robot of 生物统计家园网 (biostatistic.net)

----------Description----------

These functions take alignment files of various formats and produce an object (see Details) describing the alignment of sequencing tags from different libraries. At present, BAM and text files are supported.

----------Usage----------

readGeneric(files, dir = ".", replicates, libnames, chrs, chrlens, cols,
         header = TRUE, gap = 200, polyLength, estimationType = "quantile",
         verbose = TRUE, ...)

readBAM(files, dir = ".", replicates, libnames, chrs, chrlens, countID = NULL,
      gap = 200, polyLength, estimationType = "quantile", verbose = TRUE)

----------Arguments----------

Argument: files
Filenames of the files to be read in.

Argument: dir
Directory (or directories) in which the files can be found.

Argument: replicates
A vector defining the replicate structure of the group. If and only if the ith library is a replicate of the jth library, then @replicates[i] == @replicates[j]. This argument may be given in any form but will be stored as a factor.

Argument: libnames
Names of the libraries defined by the file names.

Argument: chrs
A character vector defining (a selection of) the chromosome names used in the alignment files.

Argument: chrlens
Lengths of the chromosomes to which the alignments were made.

Argument: cols
A named character vector which describes which column of the input files contains which data. See Details.

Argument: countID
A (two-character) string used by the BAM file to identify the "counts" of individual sequenced reads; that is, how many times a given read appears in the sequenced library. If NULL, it is assumed that the data are redundant (see Details).

Argument: header
Do the input files have a header line? Defaults to TRUE. See Details.

Argument: gap
The maximum gap between aligned tags that should be allowed in constructing potential segments. See findChunks.

Argument: polyLength
If given, an integer value N defining the length of (approximate) homopolymers which will be removed from the data. If a tag contains a run of N+1 bases of which at least N are identical, it will be removed. If not given, all data are used.

Argument: estimationType
The estimationType that will be used by the baySeq function getLibsizes to infer the library sizes of the samples.

Argument: verbose
Should processing information be displayed? Defaults to TRUE.

Argument: ...
Additional parameters to be passed to read.table. In particular, the "sep" and "skip" arguments may be useful.
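
One way to read the polyLength rule is as a sliding-window check: a tag is flagged when some window of N+1 positions contains at least N identical bases. The base R sketch below illustrates that reading only. The helper name has_approx_homopolymer, the example tags and the choice N = 8 are hypothetical, and this is not the code that segmentSeq itself runs.

```r
# Illustrative check for an "approximate homopolymer": any window of N + 1
# positions in which at least N bases are identical.
# (Hypothetical helper, not part of segmentSeq.)
has_approx_homopolymer <- function(tag, N) {
  bases <- strsplit(tag, "")[[1]]
  width <- N + 1
  if (length(bases) < width) return(FALSE)  # too short to hold such a window
  for (start in seq_len(length(bases) - width + 1)) {
    window <- bases[start:(start + width - 1)]
    if (max(table(window)) >= N) return(TRUE)
  }
  FALSE
}

# Invented example tags.
tags <- c("ACGTACGTACGT", "AAAAAAAATAAA", "ACGGGGGGGGTA", "ACGTTTTACGTA")

# TRUE for tags that would be kept under this reading of polyLength = 8.
keep <- !vapply(tags, has_approx_homopolymer, logical(1), N = 8)
tags[keep]
```

Under this reading, a tag shorter than N+1 bases is never flagged, because it cannot contain a window of the required width.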

----------Details----------

readBAM: This function takes a set of BAM files and generates the 'alignmentData' object from these. If a character string for "countID" is given, the function assumes the data are non-redundant and that "countID" identifies the count data (i.e., how many times each read appears in the sequenced library) in each BAM file. If "countID" is NULL, then it is assumed that the data are redundant, and the count data are inferred from the file.

readGeneric: The purpose of this function is to take a set of plain text files and produce an 'alignmentData' object. The function uses read.table to read in the columns of data in the files, so by default columns are separated by any white space. Alternative separators can be used by passing the appropriate value for 'sep' to read.table.
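
As a small illustration of the separator behaviour described above, the following base R sketch writes a hypothetical comma-separated alignment file to a temporary location and reads it back with read.table. The file contents and object names are invented for illustration. readGeneric itself is not called here; a 'sep' value given to it would be forwarded to read.table through '...'.

```r
# Hypothetical comma-separated alignment file (contents invented for illustration).
csv_file <- tempfile(fileext = ".txt")
writeLines(c("chr,tag,count,start,end,strand",
             "Chr1,ACGTACGTACGTACGTACGTA,3,101,121,+",
             "Chr1,TTGACCATGCATGCATGACCA,1,450,470,-"),
           csv_file)

# With the default white-space separator, each line is read as a single field.
default_read <- read.table(csv_file, header = TRUE, stringsAsFactors = FALSE)
ncol(default_read)

# Passing sep = "," splits the six columns as intended.
comma_read <- read.table(csv_file, header = TRUE, sep = ",",
                         stringsAsFactors = FALSE)
ncol(comma_read)
```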

The files may contain columns with column names 'chr', 'tag', 'count', 'start', 'end', 'strand', in which case the "cols" argument can be omitted and "header" set to TRUE. If this is the case, there is no requirement for all the files to have the same ordering of columns (although all must have these column names).
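
The point about column ordering can be seen with plain read.table: when columns are selected by name, two files with the same headers in different orders end up in the same layout. The sketch below uses two invented files and a hypothetical helper, read_by_name. It mimics the idea in base R and does not show how readGeneric handles headers internally.

```r
# Two hypothetical files with the same column names in different orders.
file_a <- tempfile(fileext = ".txt")
file_b <- tempfile(fileext = ".txt")
writeLines(c("chr tag count start end strand",
             "Chr1 ACGTACGTACGTACGTACGTA 3 101 121 +"), file_a)
writeLines(c("strand end start count tag chr",
             "- 470 450 1 TTGACCATGCATGCATGACCA Chr1"), file_b)

wanted <- c("chr", "tag", "count", "start", "end", "strand")

# Read one file and return its columns in a fixed order, selected by name.
read_by_name <- function(path) {
  dat <- read.table(path, header = TRUE, stringsAsFactors = FALSE)
  stopifnot(all(wanted %in% names(dat)))
  dat[, wanted]
}

combined <- rbind(read_by_name(file_a), read_by_name(file_b))
combined
```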

Alternatively, the columns of data in the input files can be specified by the "cols" argument in the form of a named character vector; e.g., 'cols = c(chr = 1, tag = 2, count = 3, start = 4, end = 5, strand = 6)' would cause the function to assume that the first column contains the chromosome information, the second column contains the tag information, etc. If "cols" is specified then information in the header is ignored. If "cols" is missing and "header" is FALSE, then it is assumed that the data take the form described in the example above.
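
To make the "cols" mapping concrete, the base R sketch below applies a mapping of the same shape to a hypothetical headerless file whose columns are in a different order from the default. The file contents are invented, and the selection step only imitates the mapping described above; how readGeneric applies "cols" internally is not shown in this document.

```r
# Hypothetical headerless file with columns in the order:
# start, end, chr, count, tag, strand.
raw_file <- tempfile(fileext = ".txt")
writeLines(c("101 121 Chr1 3 ACGTACGTACGTACGTACGTA +",
             "450 470 Chr1 1 TTGACCATGCATGCATGACCA -"), raw_file)

# Names say what each column holds; values give its position in the file.
cols <- c(chr = 3, tag = 5, count = 4, start = 1, end = 2, strand = 6)

raw <- read.table(raw_file, header = FALSE, stringsAsFactors = FALSE)
dat <- raw[, cols]          # pick the columns in the order given by 'cols'
names(dat) <- names(cols)   # label them with the expected column names
dat
```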

The 'tag', 'count' and 'strand' columns may optionally be omitted from either the file column headers or the "cols" argument. If the 'tag' column is omitted, then the data will not account for duplicated sequences when estimating the number of counts in loci. If the 'count' column is omitted, the 'readGeneric' function will assume that the file contains the alignments of each copy of each sequence tag, rather than an aggregated alignment of each unique sequence. The unique alignments will be identified and the number of sequence tags aligning to each position will be calculated. If 'strand' is omitted, the strand will simply be ignored.
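
The difference between redundant data (one row per sequenced copy) and aggregated data (one row per unique alignment with a count) can be sketched in base R as follows. The data frame is invented, and aggregate() is used here purely to illustrate the idea of collapsing copies into counts; it is not a description of how readGeneric computes them.

```r
# Hypothetical redundant data: one row per sequenced copy, no 'count' column.
redundant <- data.frame(
  chr    = c("Chr1", "Chr1", "Chr1", "Chr1", "Chr2"),
  tag    = c("ACGTACGTACGTACGTACGTA", "ACGTACGTACGTACGTACGTA",
             "ACGTACGTACGTACGTACGTA", "TTGACCATGCATGCATGACCA",
             "TTGACCATGCATGCATGACCA"),
  start  = c(101, 101, 101, 450, 900),
  end    = c(121, 121, 121, 470, 920),
  strand = c("+", "+", "+", "-", "+"),
  stringsAsFactors = FALSE
)

# Give every copy a count of 1, then sum over identical alignments so that
# each unique alignment carries the number of copies observed.
redundant$count <- 1
unique_alignments <- aggregate(count ~ chr + tag + start + end + strand,
                               data = redundant, FUN = sum)
unique_alignments
```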

----------Value----------

An alignmentData object.

----------Author(s)----------

Thomas J. Hardcastle

----------See Also----------

alignmentData

----------Examples----------

# Load the package that provides readGeneric.

library(segmentSeq)

# Define the chromosome lengths for the genome of interest.

chrlens <- c(2e6, 1e6)

# Define the files containing sample information.

datadir <- system.file("extdata", package = "segmentSeq")
libfiles <- c("SL9.txt", "SL10.txt", "SL26.txt", "SL32.txt")

# Establish the library names and replicate structure.

libnames <- c("SL9", "SL10", "SL26", "SL32")
replicates <- c(1,1,2,2)

# Process the files to produce an 'alignmentData' object.

alignData <- readGeneric(files = libfiles, dir = datadir,
                         replicates = replicates, libnames = libnames,
                         chrs = c(">Chr1", ">Chr2"), chrlens = chrlens,
                         gap = 100)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

