recodeSNPs(scrime)
R package: scrime
Recoding of SNP Values
----------Description----------
Recodes the values used to specify the genotypes of the SNPs to other values. Such a recoding might be required before using other functions contained in this package.
----------Usage----------
recodeSNPs(mat, first.ref = FALSE, geno = 1:3, snp.in.col = FALSE)
----------Arguments----------
Argument: mat
a matrix or data frame consisting of character strings of length 2 that specify the genotypes of the SNPs. Each of these character strings must be a combination of the letters A, T, C, and G. Missing values can be specified by "NN" or NA. Depending on snp.in.col, it is assumed that each row of mat represents a SNP and each column an array (snp.in.col = FALSE), or vice versa (snp.in.col = TRUE).
Argument: first.ref
does the first letter in the string coding the heterozygous genotype always stand for the more frequent allele? E.g., does "CC" code for the homozygous reference genotype if the genotypes of a SNP are coded by "CC", "CG", and "GG"? If TRUE, the value made up only of this first letter is set to geno[1], and the value made up only of the second letter is set to geno[3]. If FALSE, it is evaluated rowwise which of the homozygous genotypes has the higher frequency; the more frequently occurring value is set to geno[1], and the other to geno[3].
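The two modes of first.ref can be sketched for a single SNP in base R. This is only an illustration of the rule described above, not the code used by scrime: the helper recode_one_snp is hypothetical, and it assumes that the heterozygous genotype is mapped to geno[2] and that a SNP has exactly one heterozygous code.

```r
# Hypothetical helper (not part of scrime): recode the genotypes of ONE SNP.
recode_one_snp <- function(x, first.ref = FALSE, geno = 1:3) {
  x[!is.na(x) & x == "NN"] <- NA        # treat "NN" as a missing value
  a1 <- substr(x, 1, 1)                 # first letter of each genotype
  a2 <- substr(x, 2, 2)                 # second letter of each genotype
  is_het <- !is.na(x) & a1 != a2
  is_hom <- !is.na(x) & a1 == a2
  if (first.ref) {
    # Take the allele order from the heterozygous code, e.g. "CG" -> C, G.
    het <- unique(x[is_het])
    stopifnot(length(het) == 1)         # assumption: one heterozygous code
    alleles <- c(substr(het, 1, 1), substr(het, 2, 2))
  } else {
    # Order the homozygous genotypes by how often they occur.
    counts <- sort(table(x[is_hom]), decreasing = TRUE)
    alleles <- substr(names(counts), 1, 1)
  }
  out <- rep(NA, length(x))
  out[is_hom & a1 == alleles[1]] <- geno[1]
  out[is_het] <- geno[2]                # assumption: heterozygous -> geno[2]
  if (length(alleles) > 1) out[is_hom & a1 == alleles[2]] <- geno[3]
  out
}

snp <- c("CC", "CG", "GG", "GG", "GG", "NN", NA)
recode_one_snp(snp)                     # "GG" is more frequent, so it gets geno[1]
recode_one_snp(snp, first.ref = TRUE)   # "CC" gets geno[1] because the het code is "CG"
```

With this toy vector the two calls assign geno[1] to different homozygous genotypes, which is the practical difference between the two settings.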
Argument: geno
a numeric or character vector of length 3 giving the three values that should be used to recode the genotypes. By default, geno = 1:3, which is the coding required by, e.g., rowChisqStats or pamCat.
Argument: snp.in.col
does each column of mat correspond to a SNP (and each row to an array)? If FALSE, it is assumed that each row represents a SNP, and each column an array.
----------Value----------
A matrix of the same size as mat containing the recoded genotypes. Missing values are coded as NA.
----------See Also----------
recodeAffySNP, snp2bin
----------Examples----------
# Generate an example data set consisting of 10 rows and 12 columns,
# where it is assumed that each row corresponds to a SNP.
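library(scrime)
mat <- matrix("", 10, 12)
# Draw one genotype per cell (3 rows x 12 columns = 36, 2 rows x 12 columns = 24)
# so that no values are recycled across columns.
mat[c(1, 4, 6),] <- sample(c("AA", "AT", "TT"), 36, TRUE)
mat[c(2, 3, 10),] <- sample(c("CC", "CG", "GG"), 36, TRUE)
mat[c(5, 8),] <- sample(c("GG", "GT", "TT"), 24, TRUE)
mat[c(7, 9),] <- sample(c("AA", "AC", "CC"), 24, TRUE)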
mat
# Recode the SNPs
recodeSNPs(mat)
# Recode the SNPs by assuming that the first letter in
# the heterozygous genotype refers to the major allele.
recodeSNPs(mat, first.ref = TRUE)
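The remaining two arguments, geno and snp.in.col, can be tried in the same way. The following is a minimal sketch that assumes scrime is installed and uses only the signature shown under Usage; the object names snps, res_rows, and res_cols are made up for illustration, and the toy data are deterministic so that every SNP shows all three genotypes.

```r
# Deterministic toy data: 2 SNPs (rows) genotyped on 12 arrays (columns).
snps <- rbind(
  snp1 = rep(c("AA", "AT", "TT"), times = c(6, 4, 2)),
  snp2 = rep(c("CC", "CG", "GG"), times = c(2, 4, 6))
)

if (requireNamespace("scrime", quietly = TRUE)) {
  # Use the codes 0, 1, 2 instead of the default 1, 2, 3.
  res_rows <- scrime::recodeSNPs(snps, geno = 0:2)
  # Transposed input: SNPs in columns, arrays in rows.
  res_cols <- scrime::recodeSNPs(t(snps), geno = 0:2, snp.in.col = TRUE)
  print(res_rows)
  print(res_cols)
}
```

The 0/1/2 coding is a common choice when the result is passed to code outside this package, whereas the default 1:3 is what the Arguments section names for rowChisqStats and pamCat.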