R SNPRelate package: snpgdsPCA() function help documentation

loveR · posted 2012-9-30 11:15:12

snpgdsPCA(SNPRelate)
snpgdsPCA() belongs to the R package: SNPRelate

                                       Principal Component Analysis (PCA) for SNP genotype data

                                       Translator: LoveR, the 生物统计家园网 robot

----------Description----------

Calculates the eigenvectors and eigenvalues for principal component analysis (PCA) of SNP genotype data in genome-wide association studies (GWAS).

----------Usage----------

snpgdsPCA(gdsobj, sample.id = NULL, snp.id = NULL, autosome.only = TRUE,
remove.monosnp = TRUE, maf = NaN, missing.rate = NaN, eigen.cnt = 32,
num.thread = 1, bayesian = FALSE, need.genmat = FALSE, genmat.only = FALSE,
verbose = TRUE)

----------Arguments----------

Argument: gdsobj
a gdsclass object from the gdsfmt package

Argument: sample.id
a vector of sample IDs specifying the selected samples; if NULL, all samples are used

Argument: snp.id
a vector of SNP IDs specifying the selected SNPs; if NULL, all SNPs are used

Argument: autosome.only
if TRUE, use autosomal SNPs only

Argument: remove.monosnp
if TRUE, remove monomorphic SNPs

Argument: maf
use only SNPs with minor allele frequency (MAF) >= maf; if NaN, no MAF threshold is applied

Argument: missing.rate
use only SNPs with missing rate <= missing.rate; if NaN, no missing-rate threshold is applied

Argument: eigen.cnt
the number of eigenvectors to output; if eigen.cnt <= 0, all eigenvectors are returned

Argument: num.thread
the number of CPU cores used

Argument: bayesian
if TRUE, use Bayesian normalization

Argument: need.genmat
if TRUE, return the genetic covariance matrix

Argument: genmat.only
if TRUE, return the genetic covariance matrix only; the eigenvalues and eigenvectors are not computed

Argument: verbose
if TRUE, show information

----------Details----------

The minor allele frequency and missing rate of each SNP passed in snp.id are calculated over all the samples in sample.id.
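
The filtering rule described here can be illustrated in base R on a toy genotype matrix. This is only a sketch under stated assumptions, not SNPRelate's internal code: genotypes are assumed to be coded 0/1/2 with NA for a missing call, and the names geno, sel.samples, maf.min and miss.max are made up for this illustration.

```r
# Toy genotype matrix: rows = samples, columns = SNPs.
# Assumed coding: 0/1/2 copies of one allele, NA = missing genotype.
geno <- matrix(c(0, 1, 2, NA,
                 0, 0, 2, 1,
                 0, 1, NA, 1,
                 0, 2, 2, 0),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("s", 1:4), paste0("snp", 1:4)))

# Hypothetical sample selection (plays the role of sample.id).
sel.samples <- c("s1", "s2", "s3", "s4")
g <- geno[sel.samples, , drop = FALSE]

# Per-SNP allele frequency over the selected samples, ignoring missing calls.
p <- colMeans(g, na.rm = TRUE) / 2
# Minor allele frequency: the smaller of p and 1 - p.
maf <- pmin(p, 1 - p)
# Missing rate: fraction of selected samples without a genotype call.
miss <- colMeans(is.na(g))

# Hypothetical thresholds, analogous to the maf and missing.rate arguments.
maf.min <- 0.05
miss.max <- 0.25
keep <- maf >= maf.min & miss <= miss.max
names(keep)[keep]   # SNPs that pass both filters
```

Because both statistics are computed only over the selected samples, changing the sample selection can change which SNPs pass the thresholds.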

----------Value----------

Returns a snpgdsPCAClass object, which is a list with the following components:

Component: sample.id
the sample IDs used in the analysis

Component: snp.id
the SNP IDs used in the analysis

Component: eigenval
the eigenvalues

Component: eigenvect
the eigenvectors, a matrix of "# of samples" x "eigen.cnt"

Component: TraceXTX
the trace of the genetic covariance matrix

Component: Bayesian
whether Bayesian normalization was used

Component: genmat
the genetic covariance matrix
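
To make the relationship between a genetic covariance matrix, its eigenvalues and eigenvectors, and its trace concrete, here is a base R sketch on simulated genotypes. It is not SNPRelate's implementation: it assumes a commonly used per-SNP standardization (center at twice the allele frequency, divide by sqrt(2p(1-p))), and SNPRelate's own scaling conventions may differ. All variable names are illustrative.

```r
set.seed(1)
n.samp <- 6    # number of samples (toy size)
n.snp  <- 50   # number of SNPs (toy size)

# Simulate genotypes coded 0/1/2 with per-SNP allele frequencies in [0.2, 0.8].
freq <- runif(n.snp, 0.2, 0.8)
geno <- sapply(freq, function(f) rbinom(n.samp, size = 2, prob = f))

# Drop monomorphic SNPs (no information, and they would cause division by zero).
p <- colMeans(geno) / 2
poly <- p > 0 & p < 1
geno <- geno[, poly, drop = FALSE]
p <- p[poly]

# Center each SNP at 2p and scale by sqrt(2p(1-p)) (assumed normalization).
X <- sweep(geno, 2, 2 * p, "-")
X <- sweep(X, 2, sqrt(2 * p * (1 - p)), "/")

# Sample-by-sample genetic covariance matrix, averaged over SNPs.
genmat <- tcrossprod(X) / ncol(X)

# Eigen-decomposition: values in decreasing order, vectors in columns.
eig <- eigen(genmat, symmetric = TRUE)
eigen.cnt <- 3
eigenval  <- eig$values
eigenvect <- eig$vectors[, seq_len(eigen.cnt), drop = FALSE]

# The trace of the covariance matrix equals the sum of all its eigenvalues.
trace.genmat <- sum(diag(genmat))
all.equal(trace.genmat, sum(eigenval))
dim(eigenvect)   # "# of samples" x eigen.cnt
```

Keeping only the first eigen.cnt columns mirrors the shape documented for eigenvect; the trace gives the total variation against which individual eigenvalues can be compared.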

----------Author(s)----------

Xiuwen Zheng (zhengx@u.washington.edu)

----------References----------

Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D (2006). Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 38, 904-909.

----------See Also----------

snpgdsPCACorr, snpgdsPCASampLoading, snpgdsPCASNPLoading

----------Examples----------

# load SNPRelate (openfn.gds, read.gdsn and index.gdsn come from its dependency gdsfmt)
library(SNPRelate)

# open an example dataset (HapMap)
genofile <- openfn.gds(snpgdsExampleFileName())

# run the PCA on two threads
RV <- snpgdsPCA(genofile, num.thread=2)

# read the population label of each sample
pop <- read.gdsn(index.gdsn(genofile, c("sample.annot", "pop.group")))

# plot the first two principal components, colored by population
plot(RV$eigenvect[,2], RV$eigenvect[,1], col=as.integer(factor(pop)),
    xlab="PC 2", ylab="PC 1")
legend("topleft", legend=levels(factor(pop)), pch="o", col=1:4)

# run again, also returning the genetic covariance matrix
RV <- snpgdsPCA(genofile, need.genmat=TRUE)
names(RV)
# [1] "sample.id" "snp.id" "eigenval"  "eigenvect" "TraceXTX"  "Bayesian"  "genmat"

# close the genotype file
closefn.gds(genofile)
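
A common follow-up to this example is to report how much of the total variation each principal component accounts for. The helper below is a sketch and not part of SNPRelate: the name pc_percent is hypothetical, it drops NA entries defensively, and it assumes that dividing by the sum of the available eigenvalues is the appropriate scaling, which may not hold for every SNPRelate version or when only some eigenvalues are returned.

```r
# Hypothetical helper: percentage of total variation per component.
# eigenval: numeric vector of eigenvalues (for instance RV$eigenval above).
pc_percent <- function(eigenval) {
    stopifnot(is.numeric(eigenval))
    total <- sum(eigenval, na.rm = TRUE)   # ignore entries that are NA
    100 * eigenval / total
}

# Toy eigenvalues, made up for illustration only.
ev <- c(4, 3, 2, 1)
round(pc_percent(ev), 1)
```

With a real result one might call pc_percent(RV$eigenval) and label the plot axes with the first two percentages.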

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

