
R SNPRelate package: snpgdsLDpruning() function help documentation

Posted on 2012-9-30 11:14:48
snpgdsLDpruning(SNPRelate)
snpgdsLDpruning() belongs to the R package: SNPRelate

Linkage Disequilibrium (LD) based SNP pruning

Translator: 生物统计家园网 bot LoveR

----------Description----------

Recursively removes SNPs within a sliding window, based on pairwise linkage disequilibrium.


----------Usage----------


snpgdsLDpruning(gdsobj, sample.id = NULL, snp.id = NULL, autosome.only = TRUE,
        remove.monosnp = TRUE, maf = NaN, missing.rate = NaN,
        method = c("composite", "r", "dprime", "corr"), slide.max.bp = 500000,
        slide.max.n = NA, ld.threshold = 0.2, num.thread = 1, verbose = TRUE)



----------Arguments----------

Argument: gdsobj
a gdsclass object from the gdsfmt package


Argument: sample.id
a vector of sample IDs specifying the selected samples; if NULL, all samples are used


Argument: snp.id
a vector of SNP IDs specifying the selected SNPs; if NULL, all SNPs are used


Argument: autosome.only
if TRUE, use autosomal SNPs only


Argument: remove.monosnp
if TRUE, remove monomorphic SNPs


Argument: maf
use only SNPs with minor allele frequency ">= maf"; if NaN, no MAF threshold is applied


Argument: missing.rate
use only SNPs with missing rate "<= missing.rate"; if NaN, no missing-rate threshold is applied


Argument: method
one of "composite", "r", "dprime", "corr"; see Details


Argument: slide.max.bp
the maximum number of base pairs in the sliding window


Argument: slide.max.n
the maximum number of SNPs in the sliding window


Argument: ld.threshold
the LD threshold


Argument: num.thread
the number of CPU cores to use


Argument: verbose
if TRUE, show information


----------Details----------

The minor allele frequency and missing rate of each SNP passed in snp.id are calculated over all the samples in sample.id.
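To make the two filters concrete, here is a minimal base-R sketch of how a minor allele frequency and a missing rate could be computed per SNP over a set of samples. The toy matrix `geno`, the 0/1/2 coding and the thresholds 0.05 and 0.1 are assumptions made for illustration; this is not the package's internal code.

```r
# Toy genotype matrix: rows = samples, columns = SNPs.
# Entries count copies of allele A (0, 1 or 2); NA marks a missing call.
geno <- matrix(c(0, 1, 2, NA,    # snp1: one missing genotype
                 1, 1, 0, 2,     # snp2: complete and polymorphic
                 2, 2, 2, 2),    # snp3: monomorphic
               nrow = 4,
               dimnames = list(paste0("s", 1:4), paste0("snp", 1:3)))

# Missing rate: fraction of samples without a genotype call, per SNP
missing_rate <- colMeans(is.na(geno))

# Frequency of allele A over the non-missing samples, then the minor allele frequency
freq_a <- colMeans(geno, na.rm = TRUE) / 2
maf <- pmin(freq_a, 1 - freq_a)

# Keep SNPs with MAF >= 0.05 and missing rate <= 0.1 (illustrative thresholds)
keep <- colnames(geno)[maf >= 0.05 & missing_rate <= 0.1]
print(keep)
```

In this toy data only snp2 passes both filters: snp1 has a missing rate of 0.25 and snp3 has a MAF of 0.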

Four methods can be used to calculate linkage disequilibrium values: "composite" for the LD composite measure, "r" for r square, "dprime" for D', and "corr" for the correlation coefficient. The method "corr" is equivalent to "composite" when SNP genotypes are coded as 0 = BB, 1 = AB, 2 = AA. The LD value used is the absolute value of the chosen measure.
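As a rough illustration of the "corr" idea, the sketch below takes the absolute Pearson correlation between two SNPs whose genotypes are coded 0 = BB, 1 = AB, 2 = AA. The helper `ld_corr` and the toy vectors are hypothetical names introduced here; the package computes its LD measures internally and may differ in detail.

```r
# Genotypes coded as the number of A alleles: 0 = BB, 1 = AB, 2 = AA
snp1 <- c(0, 1, 2, 1, 0, 2, 1, 0)
snp2 <- c(0, 1, 2, 2, 0, 2, 1, 1)   # mostly tracks snp1
snp3 <- 2 - snp1                    # exact mirror image of snp1

# Hypothetical helper: absolute genotype correlation, ignoring missing pairs
ld_corr <- function(x, y) abs(cor(x, y, use = "pairwise.complete.obs"))

ld_12 <- ld_corr(snp1, snp2)   # strong positive correlation
ld_13 <- ld_corr(snp1, snp3)   # correlation of -1, so the absolute value is 1

# With ld.threshold = 0.2, both pairs would count as being in LD
print(c(ld_12 = ld_12, ld_13 = ld_13) > 0.2)
```

Taking the absolute value means that a strongly negative correlation is treated the same way as a strongly positive one when compared against ld.threshold.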

It is useful to generate a pruned subset of SNPs that are in approximate linkage equilibrium with each other. The function snpgdsLDpruning recursively removes SNPs within a sliding window based on the pairwise genotypic correlation. SNP pruning is conducted chromosome by chromosome, since SNPs on one chromosome can be considered independent of SNPs on the other chromosomes.

The pruning algorithm on a chromosome is described as follows (n is the total number of SNPs on that chromosome):

1) Randomly select a starting position i, and let the current SNP set S = { i };

2) For each right position j from i+1 to n: if the LD between j and any k in S exceeds ld.threshold, where j and k are both in the sliding window, then skip j; otherwise, let S be S + { j };

3) For each left position j from i-1 to 1: if the LD between j and any k in S exceeds ld.threshold, where j and k are both in the sliding window, then skip j; otherwise, let S be S + { j };

4) Output S, the final selection of SNPs.
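The four steps above can be sketched in base R as follows. This is only an illustration under stated assumptions: `prune_chromosome` is a hypothetical function, LD is taken as the absolute genotype correlation, the window is defined by base-pair distance alone (no slide.max.n limit), and the `start` argument is added so the otherwise random starting position can be fixed.

```r
# Greedy LD pruning on one chromosome (illustrative sketch).
# geno: samples x SNPs matrix of 0/1/2 genotypes; pos: base-pair position per SNP.
prune_chromosome <- function(geno, pos, ld_threshold = 0.2,
                             slide_max_bp = 500000, start = NULL) {
  n <- ncol(geno)
  if (is.null(start)) start <- sample.int(n, 1)    # step 1: random starting SNP
  selected <- start

  # TRUE if SNP j exceeds the threshold with any selected SNP inside its window
  in_ld <- function(j) {
    near <- selected[abs(pos[selected] - pos[j]) <= slide_max_bp]
    if (length(near) == 0) return(FALSE)
    ld <- abs(cor(geno[, j], geno[, near, drop = FALSE],
                  use = "pairwise.complete.obs"))
    any(ld > ld_threshold, na.rm = TRUE)
  }

  right <- if (start < n) (start + 1):n else integer(0)   # step 2: scan right
  left  <- if (start > 1) (start - 1):1 else integer(0)   # step 3: scan left
  for (j in c(right, left)) {
    if (!in_ld(j)) selected <- c(selected, j)
  }
  sort(selected)                                          # step 4: output S
}

# Toy data: SNP 2 duplicates SNP 1, SNP 3 is uncorrelated with SNP 1,
# and SNP 4 duplicates SNP 1 but lies outside the 500 kb window.
a <- c(0, 1, 2, 1, 0, 2, 1, 0)
geno <- cbind(a, a, c(0, 2, 1, 0, 2, 1, 1, 1), a)
pos <- c(100, 200, 300, 1000000)

from_first  <- prune_chromosome(geno, pos, start = 1)
from_second <- prune_chromosome(geno, pos, start = 2)
print(from_first)
print(from_second)
```

Starting from SNP 1 keeps SNPs 1, 3 and 4, while starting from SNP 2 keeps SNPs 2, 3 and 4. The selected set therefore depends on the starting position, which is why results can differ between runs when the start is chosen at random.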


----------Value----------

Returns a list of SNP IDs stratified by chromosome.


----------Author(s)----------


Xiuwen Zheng <zhengx@u.washington.edu>



----------References----------



(ed): Mathematical Evolutionary Theory. Princeton, NJ: Princeton University Press, 1989.

----------See Also----------

snpgdsLDMat, snpgdsLDpair


----------Examples----------


# load the package that provides snpgdsLDpruning()
library(SNPRelate)

# open an example dataset (HapMap)
genofile <- openfn.gds(snpgdsExampleFileName())

# prune SNPs with the default settings; the result is a list with one element per chromosome
snpset <- snpgdsLDpruning(genofile)
names(snpset)
#  [1] "chr1"  "chr2"  "chr3"  "chr4"  "chr5"  "chr6"  "chr7"  "chr8"  "chr9"
# [10] "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17" "chr18"
# ......
head(snpset$chr1)
# [1] 1 2 3 4 5 6

# get SNP ids
snp.id <- unlist(snpset)

# close the genotype file
closefn.gds(genofile)

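When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).


Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the 生物统计家园网 bot. It is intended only as a personal reference for learning R, and 生物统计家园 retains the copyright.
Note 2: Because the translation is automatic, inaccuracies are unavoidable. Compare the Chinese and English text carefully when using it.
Note 3: If you find an inaccuracy, please reply to this thread and we will revise it over time.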