fpSim(ChemmineR)
fpSim() belongs to R package: ChemmineR
PubChem Fingerprint Search
Translator: 生物统计家园网 robot LoveR
----------Description----------
Computes structure similarities from PubChem fingerprints for pairwise comparisons, similarity searching and clustering.
----------Usage----------
fpSim(x, y)
----------Arguments----------
Argument: x
vector containing binary fingerprint data. It must have the same length as y (if y is a vector) or as each row of y (if y is a matrix).
Argument: y
vector or matrix containing binary fingerprint data.
----------Details----------
The function computes the Tanimoto coefficients for pairwise comparisons of binary fingerprints. The coefficient is defined as c/(a+b+c), which is the number of "on-bits" shared by the fingerprints of two compounds divided by the number of "on-bits" in their union. The variable c is the number of "on-bits" common to both compounds, while a and b are the numbers of "on-bits" that are unique to one or the other compound, respectively.
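The definition above can be checked by hand on two short bit vectors. The following is a minimal base R sketch, not ChemmineR's implementation; the helper name `tanimoto` and the toy fingerprints `fp1` and `fp2` are assumptions introduced only for illustration.

```r
# Illustrative helper (hypothetical, not part of ChemmineR):
# Tanimoto coefficient of two equal-length 0/1 vectors.
tanimoto <- function(x, y) {
  stopifnot(length(x) == length(y))
  c_common <- sum(x == 1 & y == 1)  # c: on-bits set in both fingerprints
  a_only   <- sum(x == 1 & y == 0)  # a: on-bits set only in x
  b_only   <- sum(x == 0 & y == 1)  # b: on-bits set only in y
  c_common / (a_only + b_only + c_common)
}

# Toy fingerprints (made up for this example)
fp1 <- c(1, 1, 0, 1, 0, 0)
fp2 <- c(1, 0, 0, 1, 1, 0)
tanimoto(fp1, fp2)  # c = 2, a = 1, b = 1, so 2 / (1 + 1 + 2) = 0.5
```

In this sketch two fingerprints with no on-bits at all yield NaN (0/0); how fpSim treats that case is not stated in this page.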
----------Value----------
Returns a numeric vector with Tanimoto coefficients as values and compound identifiers as names.
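To make the shape of the return value concrete, here is a small base R sketch of a query-versus-matrix comparison that yields a named numeric vector. It is only a stand-in under stated assumptions: `tanimoto_search`, the matrix `db` and the compound names `cmpA` to `cmpC` are hypothetical and do not come from ChemmineR.

```r
# Hypothetical stand-in for the query-vs-database case (not ChemmineR code).
# x: 0/1 query vector; y: 0/1 matrix with one fingerprint per row.
tanimoto_search <- function(x, y) {
  stopifnot(is.matrix(y), length(x) == ncol(y))
  common     <- as.vector(y %*% x)              # c per row: shared on-bits
  union_bits <- rowSums(y) + sum(x) - common    # a + b + c per row
  setNames(common / union_bits, rownames(y))    # names = compound identifiers
}

# Toy fingerprint database (made up for this example)
db <- rbind(cmpA = c(1, 1, 0, 1, 0, 0),
            cmpB = c(1, 0, 0, 1, 1, 0),
            cmpC = c(0, 0, 1, 0, 0, 1))
res <- tanimoto_search(db["cmpA", ], db)
res  # cmpA = 1, cmpB = 0.5, cmpC = 0
```

The row names of the matrix carry over as the names of the result, which is the property the Value section describes.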
----------Note----------
Limitation: PubChem fingerprints need to be provided, such as those in PubChem's SD files.
----------Author(s)----------
Thomas Girke
----------See Also----------
Functions: fp2bit
----------Examples----------
## Load ChemmineR and the PubChem SDFset sample
library(ChemmineR)
data(sdfsample); sdfset <- sdfsample
cid(sdfset) <- sdfid(sdfset)
## Convert base 64 encoded fingerprints to character vector (type=1) or binary matrix (type=2)
fpset <- fp2bit(x=sdfset, type=1)
fpset <- fp2bit(x=sdfset, type=2)
## Pairwise compound structure comparison
fpSim(x=fpset[1,], y=fpset[2,])
## Structure similarity searching: x is the query and y is the fingerprint database
fpSim(x=fpset[1,], y=fpset)
## Compute fingerprint-based Tanimoto similarity matrix
simMA <- sapply(rownames(fpset), function(x) fpSim(x=fpset[x,], fpset))
## Hierarchical clustering; hclust() expects distances, so convert similarities with 1-simMA
hc <- hclust(as.dist(1-simMA), method="single")
## Plot hierarchical clustering tree
plot(as.dendrogram(hc), edgePar=list(col=4, lwd=2), horiz=TRUE)
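Because hclust() works on dissimilarities, a Tanimoto similarity matrix is usually turned into a distance matrix (1 minus similarity) before clustering. The sketch below shows this in base R on a toy matrix and then cuts the tree into groups with cutree(). The matrix `sim`, its values and the 0.5 cutoff are made up for illustration and do not come from the sample data.

```r
# Toy similarity matrix standing in for simMA (values invented for illustration)
ids <- c("cmpA", "cmpB", "cmpC")
sim <- matrix(c(1.0, 0.8, 0.1,
                0.8, 1.0, 0.2,
                0.1, 0.2, 1.0),
              nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

# Convert similarity to distance, then cluster with single linkage
hc_toy <- hclust(as.dist(1 - sim), method = "single")

# Cut the tree at distance 0.5, i.e. group compounds linked at similarity > 0.5
groups <- cutree(hc_toy, h = 0.5)
groups  # cmpA and cmpB share a cluster; cmpC is on its own
```

The cutoff is a judgment call that depends on the data set; the value used here only serves to separate the toy compounds.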
Please credit the source when reposting: 生物统计家园网 (http://www.biostatistic.net).