R VariantAnnotation package: SIFTDb-class help documentation

loveR · posted 2012-2-26 15:57:37

SIFTDb-class(VariantAnnotation)
SIFTDb-class belongs to the R package: VariantAnnotation

                                    SIFTDb objects

                                       Translator: LoveR, the translation bot of biostatistic.net

----------Description----------

The SIFTDb class is a container for storing a connection to a SIFT SQLite database.

----------Details----------

SIFT is a sequence homology-based tool that sorts intolerant from tolerant amino acid substitutions and predicts whether an amino acid substitution in a protein will have a phenotypic effect. SIFT is based on the premise that protein evolution is correlated with protein function. Positions important for function should be conserved in an alignment of the protein family, whereas unimportant positions should appear diverse in an alignment.

SIFT uses multiple alignment information to predict tolerated and deleterious substitutions for every position of the query sequence. The procedure can be outlined in the following steps:

search for similar sequences

choose closely related sequences that may share similar function to the query sequence

obtain the alignment of the chosen sequences

calculate normalized probabilities for all possible substitutions from the alignment
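
The last step can be illustrated with a toy calculation in base R. This is only a sketch under stated assumptions, not the SIFT implementation: the alignment column is made up, raw observed frequencies are used, and the normalization (dividing by the frequency of the most common residue) is an assumed simplification. The real tool is more involved.

```r
# Toy alignment column: the residue seen at one position in each of ten
# aligned sequences (made-up values for illustration only).
column <- c("L", "L", "L", "I", "L", "V", "L", "L", "I", "L")

# The 20 standard amino acids, one-letter codes.
amino_acids <- strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]]

# Observed frequency of every amino acid at this position.
freq <- table(factor(column, levels = amino_acids)) / length(column)

# Assumed normalization: scale so the most frequent residue gets 1.
normalized <- as.numeric(freq) / max(freq)
names(normalized) <- amino_acids

# Inspect a conserved residue, two rarer ones, and one never observed.
round(normalized[c("L", "I", "V", "D")], 3)
```

Residues that never appear in the column get a value of 0 here, which is why real scoring schemes typically smooth the counts before normalizing.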

Positions with normalized probabilities less than 0.05 are predicted to be deleterious; those greater than or equal to 0.05 are predicted to be tolerated.
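
The 0.05 cutoff can be written as a one-line rule in base R. The scores below are hypothetical placeholders rather than real SIFT output, and the variable names are chosen only for this sketch.

```r
# Hypothetical normalized probabilities (placeholders, not real SIFT output).
scores <- c(0.00, 0.03, 0.05, 0.42, 1.00)

# Below 0.05 is called deleterious; 0.05 and above is called tolerated.
prediction <- ifelse(scores < 0.05, "deleterious", "tolerated")

data.frame(score = scores, prediction = prediction)
```

Note that a score of exactly 0.05 falls on the tolerated side of the boundary.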

----------Methods----------

In the code below, x is a SIFTDb object.

metadata(x): Returns x's metadata in a data frame.

cols(x): Returns the names of the columns that can be used to subset the data.

keys(x): Returns the names of the keys that can be used to subset the data rows. The key values are rsIDs.

select(x, keys = NULL, cols = NULL, ...): Returns a subset of the data defined by the character vectors keys and cols. If no keys are supplied, all rows are returned. If no cols are supplied, all columns are returned. For column descriptions see ?SIFTDbColumns.
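
The NULL defaults of select() can be mimicked on an ordinary data frame. The sketch below does not touch the package: select_sketch() is a hypothetical helper written for this illustration, and the SCORE and PREDICTION values in the stand-in table are invented placeholders, not values from the SIFT database.

```r
# Stand-in table; SCORE and PREDICTION are invented placeholder values.
sift_tbl <- data.frame(
  RSID       = c("rs2142947", "rs17970171", "rs8692231", "rs3026284"),
  SCORE      = c(0.01, 0.20, 0.04, 0.75),
  PREDICTION = c("deleterious", "tolerated", "deleterious", "tolerated"),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the documented defaults:
# keys = NULL keeps every row, cols = NULL keeps every column.
select_sketch <- function(tbl, keys = NULL, cols = NULL) {
  rows <- if (is.null(keys)) seq_len(nrow(tbl)) else which(tbl$RSID %in% keys)
  if (is.null(cols)) cols <- names(tbl)
  tbl[rows, cols, drop = FALSE]
}

# Two rows, two columns.
select_sketch(sift_tbl, keys = c("rs2142947", "rs3026284"), cols = c("RSID", "SCORE"))

# No keys and no cols: the whole table comes back.
select_sketch(sift_tbl)
```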

----------Author(s)----------

Valerie Obenchain <vobencha@fhcrc.org>

----------References----------

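SIFT Home: http://sift.jcvi.org/
Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc. 2009;4(7):1073-81.
Ng PC, Henikoff S. Predicting the Effects of Amino Acid Substitutions on Protein Function. Annu Rev Genomics Hum Genet. 2006;7:61-80.
Ng PC, Henikoff S. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003 Jul 1;31(13):3812-4.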

----------Examples----------

library(SIFT.Hsapiens.dbSNP132)

## metadata
metadata(SIFT.Hsapiens.dbSNP132)

## available rsIDs
head(keys(SIFT.Hsapiens.dbSNP132))

## for column descriptions see ?SIFTDbColumns
cols(SIFT.Hsapiens.dbSNP132)

## subset on keys and cols
rsids <- c("rs2142947", "rs17970171", "rs8692231", "rs3026284")
subst <- c("RSID", "PREDICTION", "SCORE")
select(SIFT.Hsapiens.dbSNP132, keys=rsids, cols=subst)
select(SIFT.Hsapiens.dbSNP132, keys=rsids[1:2])

