R chopsticks package: help documentation for the read.snps.long() function

loveR · posted 2012-2-25 15:00:17

read.snps.long(chopsticks)
read.snps.long() belongs to the R package: chopsticks

                                    Read SNP data in long format

                                       Translator: LoveR, the Biostatistic.net robot

----------Description----------

Reads SNP data organized in free format as one call per line. Apart from the one-call-per-line requirement, there is considerable flexibility: multiple input files can be read, the input fields can be in any order on the line, and irrelevant fields can be skipped. The samples and SNPs to be read must be pre-specified; they define the rows and columns of an output object of class "snp.matrix".

----------Usage----------

read.snps.long(files, sample.id = NULL, snp.id = NULL, female = NULL,
            fields = c(sample = 1, snp = 2, genotype = 3, confidence = 4),
            codes = c("0", "1", "2"), threshold = 0.9, lower = TRUE,
            sep = " ", comment = "#", skip = 0, simplify = c(FALSE,FALSE),
            verbose = FALSE, every = 1000)
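
To make the signature above concrete, here is a minimal sketch of one possible call. The file contents, sample identifiers (s1, s2) and SNP names (rs1 to rs3) are invented for illustration, and the call is only attempted if the chopsticks package happens to be installed; it is not taken from the package's own examples.

```r
# Build a tiny long-format file: one call per line, space-separated.
# Assumed column layout: sample, snp, genotype, confidence (hypothetical data).
calls <- data.frame(
  sample     = rep(c("s1", "s2"), each = 3),
  snp        = rep(c("rs1", "rs2", "rs3"), times = 2),
  genotype   = c("0", "1", "2", "1", "1", "0"),
  confidence = c(0.99, 0.95, 0.97, 0.50, 0.92, 0.99),
  stringsAsFactors = FALSE
)
long_file <- tempfile(fileext = ".txt")
write.table(calls, long_file, sep = " ", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Only attempt the read when chopsticks is available.
if (requireNamespace("chopsticks", quietly = TRUE)) {
  snps <- chopsticks::read.snps.long(
    long_file,
    sample.id = c("s1", "s2"),            # rows of the result
    snp.id    = c("rs1", "rs2", "rs3"),   # columns of the result
    fields    = c(sample = 1, snp = 2, genotype = 3, confidence = 4),
    codes     = c("0", "1", "2"),
    threshold = 0.9, lower = TRUE, sep = " "
  )
  print(snps)
}
```

The identifiers are listed in the order in which they vary in the file, which the Details section requires.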

----------Arguments----------

Argument: files
A character vector giving the names of the input files

Argument: sample.id
A character vector giving the identifiers of the samples to be read

Argument: snp.id
A character vector giving the names of the SNPs to be read

Argument: female
If the SNPs are on the X chromosome and the data are to be read as such, this logical vector (of the same length as sample.id) should specify whether each sample was from a female subject

Argument: fields
An integer vector with named elements specifying the positions of the required fields in the input record. The fields are identified by the names sample and snp for the sample and SNP identifier fields, confidence for a call confidence score (if present), and either genotype if genotype calls occur as a single field, or allele1 and allele2 if the two alleles are coded in different fields

Argument: codes
Either the single string "nucleotide", denoting coding in terms of nucleotides (A, C, G or T, case insensitive), or a character vector giving genotype or allele codes (see below)

Argument: threshold
A numerical value for the calling threshold on the confidence score

Argument: lower
If TRUE, then threshold represents a lower bound. Otherwise it is an upper bound

Argument: sep
The delimiting character separating fields in the input record

Argument: comment
A character denoting that any remaining input on a line is to be ignored

Argument: skip
An integer value specifying how many lines are to be skipped at the beginning of each data file

Argument: simplify
If TRUE, sample and SNP identifying strings will be shortened by removal of any common leading or trailing sequences when they are used as row and column names of the output snp.matrix

Argument: verbose
If TRUE, a progress report is generated each time a further block of lines has been read; the size of the block is given by the every argument

Argument: every
See verbose
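
As an illustration of the fields argument described above, the following sketch shows what the named positions refer to when the two alleles sit in separate columns and an irrelevant leading column is skipped. The record layout and values are hypothetical, and the base-R splitting only mimics the idea; it is not the package's parser.

```r
# Hypothetical record layout: plate, SNP name, sample id, allele 1, allele 2.
record <- "plateA rs123 NA12878 A G"

# Named positions in the style of the fields argument; the plate column
# (position 1) is simply not mentioned and would therefore be skipped.
fields <- c(sample = 3, snp = 2, allele1 = 4, allele2 = 5)

# Base-R illustration of which token each name points at.
tokens <- strsplit(record, " ", fixed = TRUE)[[1]]
picked <- setNames(tokens[fields], names(fields))
print(picked)
```

With a layout like this, one would presumably pass fields together with codes = "nucleotide", since the alleles are written as nucleotide letters.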

----------Details----------

If nucleotide coding is not used, the codes argument should be a character array giving the valid codes. For genotype coding of autosomal SNPs, this should be an array of length 3 giving the codes for the three genotypes, in the order homozygous (AA), heterozygous (AB), homozygous (BB). All other codes will be treated as "no call". The default codes are "0", "1", "2". For X SNPs, males are assumed to be coded as homozygous, unless two additional codes are supplied (representing the AY and BY genotypes). For allele coding, the codes array should be of length 2 and should specify the codes for the two alleles. Again, any other code is treated as "missing" and, for X SNPs, males should be coded either as homozygous or by omission of the second allele.
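
The genotype coding rule above, combined with the threshold and lower arguments, can be sketched in plain R as follows. The helper decode_calls is hypothetical and only imitates the described behaviour for autosomal genotype codes; whether the real function treats a score exactly equal to the threshold as passing is an assumption here, so the example avoids boundary values.

```r
# Hypothetical helper: map genotype codes to AA/AB/BB, treating unknown
# codes and low-confidence calls as "no call" (NA).
decode_calls <- function(genotype, confidence,
                         codes = c("0", "1", "2"),
                         threshold = 0.9, lower = TRUE) {
  labels  <- c("AA", "AB", "BB")
  decoded <- labels[match(genotype, codes)]  # NA for unrecognised codes
  # lower = TRUE: threshold is a lower bound; otherwise an upper bound.
  confident <- if (lower) confidence >= threshold else confidence <= threshold
  decoded[!confident] <- NA
  decoded
}

result <- decode_calls(
  genotype   = c("0", "1", "2", "N", "1"),
  confidence = c(0.99, 0.95, 0.97, 0.99, 0.50)
)
print(result)
```

Here the fourth call is dropped because "N" is not among the valid codes, and the fifth because its confidence falls below the threshold.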

Although the function allows data for the X chromosome to be read directly into an object of class "X.snp.matrix", it will often be preferable to read such data as a "snp.matrix" (i.e. as autosomal) and to coerce it to an object of class "X.snp.matrix" later, using as(..., "X.snp.matrix") or new("X.snp.matrix", ..., female=...).

The vectors sample.id and snp.id must be in the same order as they vary in the input file(s), and this ordering must be consistent. However, there is no requirement that either SNP or sample should vary fastest; this is detected from the input. Each file may represent a separate sample or SNP, in which case the appropriate .id argument can be omitted and the row or column names are taken from the file names.
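
One way to satisfy the ordering requirement above is to derive both identifier vectors from the file itself, in order of first appearance. The sketch below uses invented identifiers and plain base R; it illustrates the idea and is not how the package detects the ordering internally.

```r
# Hypothetical sample and SNP columns, as they appear line by line in a file.
samples <- c("s1", "s1", "s1", "s2", "s2", "s2")
snps    <- c("rs1", "rs2", "rs3", "rs1", "rs2", "rs3")

# unique() keeps order of first appearance, matching the order in the file.
sample.id <- unique(samples)
snp.id    <- unique(snps)

# Which identifier changes between consecutive lines, i.e. varies fastest?
fastest <- if (samples[1] == samples[2]) "snp" else "sample"

print(sample.id)
print(snp.id)
print(fastest)
```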

----------Value----------

An object of class "snp.matrix" or "X.snp.matrix".

----------Note----------

The function will read gzipped files.
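
Since gzipped input is accepted, a long-format file can be written compressed from R. The sketch below uses only base R connections and invented data; the resulting path could then be passed in the files argument.

```r
# Hypothetical long-format calls: sample, snp, genotype, confidence.
calls <- c("s1 rs1 0 0.99", "s1 rs2 1 0.95")

# Write the calls to a gzip-compressed text file.
gz_path <- tempfile(fileext = ".txt.gz")
con <- gzfile(gz_path, open = "wt")
writeLines(calls, con)
close(con)

# Read the file back through a gzfile connection to check the round trip.
con <- gzfile(gz_path, open = "rt")
round_trip <- readLines(con)
close(con)
print(round_trip)
```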

This function has replaced an earlier version, which was much less flexible. Because not all features have been fully tested, the older version has been retained as read.snps.long.old.

----------Author(s)----------

David Clayton <david.clayton@cimr.cam.ac.uk>

----------See Also----------

read.HapMap.data, read.snps.pedfile, read.snps.chiamo, read.snps.long

When reposting, please credit the source: Biostatistic.net (http://www.biostatistic.net).

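Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the Biostatistic.net robot. It is intended only as a personal reference for learning R; Biostatistic.net retains the copyright.
Note 2: Because the translation was produced automatically, inaccuracies are inevitable. Carefully comparing the Chinese and English text when using it can help with learning R.
Note 3: If you find an inaccuracy, please reply below this post and we will revise the text over time.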
