snpgdsIBDMoM(SNPRelate)
snpgdsIBDMoM() belongs to the R package SNPRelate
PLINK method of moments (MoM) for Identity-By-Descent (IBD) analysis
Translator: 生物统计家园网, robot LoveR
----------Description----------
Calculates the three IBD coefficients for pairs of non-inbred individuals using the PLINK method of moments (MoM).
----------Usage----------
snpgdsIBDMoM(gdsobj, sample.id=NULL, snp.id=NULL, autosome.only=TRUE,
remove.monosnp=TRUE, maf=NaN, missing.rate=NaN, allele.freq=NULL,
kinship=FALSE, kinship.constraint=FALSE, num.thread=1, verbose=TRUE)
----------Arguments----------
Argument: gdsobj
a gdsclass object, as defined in the gdsfmt package
Argument: sample.id
a vector of sample IDs specifying the selected samples; if NULL, all samples are used
Argument: snp.id
a vector of SNP IDs specifying the selected SNPs; if NULL, all SNPs are used
Argument: autosome.only
if TRUE, use autosomal SNPs only
Argument: remove.monosnp
if TRUE, remove monomorphic SNPs
Argument: maf
use only SNPs with minor allele frequency >= maf; if NaN, no MAF threshold is applied
Argument: missing.rate
use only SNPs with missing rate <= missing.rate; if NaN, no missing-rate threshold is applied
Argument: allele.freq
the allele frequencies to use; if NULL, the allele frequencies are estimated from gdsobj using the specified samples
Argument: kinship
if TRUE, output the estimated kinship coefficients
Argument: kinship.constraint
if TRUE, constrain the IBD coefficients ($k_0, k_1, k_2$) to the genealogical region ($2 k_0 k_1 >= k_2^2$)
Argument: num.thread
the number of CPU cores used
Argument: verbose
if TRUE, show information
----------Details----------
The PLINK IBD estimator is a moment estimator, and it is computationally efficient relative to the MLE method. In the PLINK method of moments, a correction factor based on allele counts is used to adjust for sampling. However, if allele frequencies are specified, no correction factor is applied, since the specified allele frequencies are assumed to be known without sampling error.
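The moment idea can be sketched in base R for a single pair when the allele frequencies are treated as known, so no sampling correction is needed. This is a simplified illustration of the estimator described in the PLINK paper, not the SNPRelate implementation; the helper name `mom_ibd_pair`, the 0/1/2 genotype coding, and the toy data are assumptions, and the bounding of estimates to [0, 1] is left out.

```r
# Hypothetical helper: uncorrected moment estimates of (k0, k1, k2) for one pair.
# g1, g2: genotypes coded as counts (0, 1, 2) of a reference allele, NA = missing.
# p: reference allele frequencies, assumed known and strictly between 0 and 1.
mom_ibd_pair <- function(g1, g2, p) {
  ok <- !is.na(g1) & !is.na(g2) & !is.na(p)
  g1 <- g1[ok]; g2 <- g2[ok]; p <- p[ok]
  q <- 1 - p

  # observed numbers of SNPs sharing 0, 1 or 2 alleles identical by state (IBS)
  ibs <- 2 - abs(g1 - g2)
  n0 <- sum(ibs == 0); n1 <- sum(ibs == 1); n2 <- sum(ibs == 2)

  # expected IBS counts given Z = number of alleles shared IBD
  e00 <- sum(2 * p^2 * q^2)                    # IBS0 | Z = 0
  e10 <- sum(4 * p^3 * q + 4 * p * q^3)        # IBS1 | Z = 0
  e11 <- sum(2 * p^2 * q + 2 * p * q^2)        # IBS1 | Z = 1
  e20 <- sum(p^4 + q^4 + 4 * p^2 * q^2)        # IBS2 | Z = 0
  e21 <- sum(p^3 + q^3 + p^2 * q + p * q^2)    # IBS2 | Z = 1
  e22 <- length(p)                             # IBS2 | Z = 2

  # solve the moment equations from IBS0 upwards
  k0 <- n0 / e00
  k1 <- (n1 - k0 * e10) / e11
  k2 <- (n2 - k0 * e20 - k1 * e21) / e22
  c(k0 = k0, k1 = k1, k2 = k2)
}

# toy data (made up): comparing a sample with itself gives k0 = 0, k1 = 0, k2 = 1
g <- c(0, 1, 2, 1, 0, 2)
p <- c(0.3, 0.5, 0.7, 0.4, 0.2, 0.6)
mom_ibd_pair(g, g, p)
```

Because the expected counts for each value of Z sum to the number of SNPs, the three raw estimates always add up to one, although individual values can fall outside [0, 1] before any bounding.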
The minor allele frequency and missing rate of each SNP passed in snp.id are calculated over all the samples in sample.id.
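As a rough illustration of how the maf and missing.rate thresholds act, the per-SNP statistics can be computed from a plain genotype matrix in base R. The matrix layout (SNPs in rows, samples in columns), the 0/1/2 coding, the helper name `snp_stats`, and the threshold values are assumptions made for this sketch; the package computes these quantities internally from the GDS file.

```r
# Toy genotype matrix (made up): rows are SNPs, columns are the selected samples.
# Entries count copies of a reference allele; NA marks a missing genotype.
geno <- matrix(c( 0,  1, 2, NA,
                  2,  2, 2,  2,
                  0,  0, 1,  0,
                 NA, NA, 1,  2),
               nrow = 4, byrow = TRUE)

# Hypothetical helper: minor allele frequency and missing rate per SNP,
# computed over all samples (columns) of the matrix.
snp_stats <- function(geno) {
  ref_freq <- rowMeans(geno, na.rm = TRUE) / 2
  data.frame(maf = pmin(ref_freq, 1 - ref_freq),
             missing_rate = rowMeans(is.na(geno)))
}

stats <- snp_stats(geno)

# keep polymorphic SNPs with MAF >= 0.05 and missing rate <= 0.25
keep <- stats$maf > 0 & stats$maf >= 0.05 & stats$missing_rate <= 0.25
stats[keep, ]
```

Selecting a different set of samples changes the columns of the matrix, and therefore the statistics and the set of SNPs that pass the filters.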
----------Value----------
Returns a list with the following components:
Component: sample.id
the sample IDs used in the analysis
Component: snp.id
the SNP IDs used in the analysis
Component: k0
IBD coefficient: the probability that the pair shares zero alleles IBD
Component: k1
IBD coefficient: the probability that the pair shares one allele IBD
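Since the three IBD probabilities sum to one, k2 follows from k0 and k1, and for non-inbred pairs the kinship coefficient is k1/4 + k2/2. The sketch below applies these standard relations to textbook expected values; the vectors are illustrative inputs, not output of snpgdsIBDMoM.

```r
# Expected IBD coefficients for some standard relationships (illustrative values)
k0 <- c(unrelated = 1, parent_offspring = 0, full_sibs = 0.25, mz_twins = 0)
k1 <- c(unrelated = 0, parent_offspring = 1, full_sibs = 0.50, mz_twins = 0)

# probability of sharing two alleles IBD
k2 <- 1 - k0 - k1

# kinship coefficient for non-inbred pairs
kinship <- k1 / 4 + k2 / 2

data.frame(k0, k1, k2, kinship)
```

The same two lines of arithmetic can be applied element-wise to the k0 and k1 matrices returned by the function.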
----------Author(s)----------
Xiuwen Zheng (zhengx@u.washington.edu)
----------References----------
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ & Sham PC. 2007. PLINK: a toolset for whole-genome association and population-based linkage analysis. American Journal of Human Genetics, 81.
----------See Also----------
snpgdsIBDMLE, snpgdsIBDMLELogLik
----------Examples----------
# load the package (this also attaches gdsfmt, which provides openfn.gds)
library(SNPRelate)
# open an example dataset (HapMap)
genofile <- openfn.gds(snpgdsExampleFileName())
# CEU population
CEU.id <- read.gdsn(index.gdsn(genofile, "sample.id"))[
    read.gdsn(index.gdsn(genofile, c("sample.annot", "pop.group")))=="CEU"]
pibd <- snpgdsIBDMoM(genofile, sample.id=CEU.id, num.thread=2, kinship=TRUE)
names(pibd)
flag <- lower.tri(pibd$k0)
plot(NaN, xlim=c(0,1), ylim=c(0,1), xlab="k0", ylab="k1")
lines(c(0,1), c(1,0), col="red", lty=3)
points(pibd$k0[flag], pibd$k1[flag])
# YRI population
YRI.id <- read.gdsn(index.gdsn(genofile, "sample.id"))[
    read.gdsn(index.gdsn(genofile, c("sample.annot", "pop.group")))=="YRI"]
pibd <- snpgdsIBDMoM(genofile, sample.id=YRI.id, num.thread=2)
flag <- lower.tri(pibd$k0)
plot(NaN, xlim=c(0,1), ylim=c(0,1), xlab="k0", ylab="k1")
lines(c(0,1), c(1,0), col="red", lty=3)
points(pibd$k0[flag], pibd$k1[flag])
# specify the allele frequencies
afreq <- snpgdsSNPRateFreq(genofile, sample.id=YRI.id)$AlleleFreq
aibd <- snpgdsIBDMoM(genofile, sample.id=YRI.id, num.thread=2, allele.freq=afreq)
flag <- lower.tri(aibd$k0)
plot(NaN, xlim=c(0,1), ylim=c(0,1), xlab="k0", ylab="k1")
lines(c(0,1), c(1,0), col="red", lty=3)
points(aibd$k0[flag], aibd$k1[flag])
# analysis on a subset
subibd <- snpgdsIBDMoM(genofile, sample.id=YRI.id[1:25], num.thread=2, allele.freq=afreq)
summary(c(subibd$k0 - aibd$k0[1:25, 1:25]))
# ZERO
summary(c(subibd$k1 - aibd$k1[1:25, 1:25]))
# ZERO
# close the genotype file
closefn.gds(genofile)
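The examples above index the k0 and k1 matrices with lower.tri to visit each pair once. The same trick can turn the matrices into a table with one row per pair; the sketch below uses small made-up matrices in place of real output, and the names `ids` and `pair_table` are hypothetical.

```r
# Made-up stand-ins for the sample.id, k0 and k1 components of a result
ids <- c("S1", "S2", "S3")
k0 <- matrix(c(0.00, 0.25, 1.00,
               0.25, 0.00, 0.90,
               1.00, 0.90, 0.00), nrow = 3, dimnames = list(ids, ids))
k1 <- matrix(c(0.00, 0.50, 0.00,
               0.50, 0.00, 0.10,
               0.00, 0.10, 0.00), nrow = 3, dimnames = list(ids, ids))

# each unordered pair appears exactly once below the diagonal
flag <- lower.tri(k0)
idx <- which(flag, arr.ind = TRUE)

pair_table <- data.frame(ID1 = ids[idx[, "col"]],
                         ID2 = ids[idx[, "row"]],
                         k0 = k0[flag],
                         k1 = k1[flag])
pair_table
```

Both `which(flag, arr.ind = TRUE)` and `k0[flag]` walk the matrix in column-major order, so the IDs and the coefficients stay aligned.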