R language ChemmineR package: cmp.similarity() function help documentation

loveR · posted 2012-2-25 14:42:39

cmp.similarity(ChemmineR)
cmp.similarity() belongs to R package: ChemmineR

                                    Compute similarity between two compounds using their descriptors

                                       Translator: LoveR, the 生物统计家园网 (biostatistic.net) robot

----------Description----------

Given descriptors for two compounds, 'cmp.similarity' returns the similarity measure between them.

----------Usage----------

cmp.similarity(a, b, mode = 1, worst = 0)

----------Arguments----------

Argument: a
Descriptor of the first compound.

Argument: b
Descriptor of the second compound.

Argument: mode
Mode used when computing the similarity. See Details below.

Argument: worst
The worst similarity value you are expecting. If 'cmp.similarity' finds that the upper bound of the similarity is worse than this value, it returns 0 and can potentially save some computation.

----------Details----------

'cmp.similarity' uses descriptor information generated by 'cmp.parse' and 'cmp.parse1'. Basically, a descriptor is a vector of numbers. The vector represents the set of descriptors of the compound's structural fragments. Similarity is measured with the Tanimoto coefficient.
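To make the Tanimoto coefficient concrete, here is a minimal base R sketch. It assumes each descriptor can be treated as a set of unique numbers; the function name 'tanimoto' and the sample vectors are made up for illustration and are not part of ChemmineR.

```r
# Hypothetical helper, not part of ChemmineR: Tanimoto coefficient of two
# descriptor vectors, each treated as a set of unique numbers.
tanimoto <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  shared <- length(intersect(a, b))  # fragments present in both compounds
  total <- length(union(a, b))       # fragments present in either compound
  if (total == 0) return(0)          # assumption: two empty descriptors score 0
  shared / total
}

# Made-up descriptor vectors
a <- c(101, 205, 307, 411)
b <- c(205, 307, 512)
tanimoto(a, b)  # 2 shared fragments out of 5 distinct ones: 0.4
```

Identical descriptors give 1 and descriptors with no shared fragment give 0, which matches the 0 to 1 range stated in the Value section.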

'cmp.similarity' supports 3 different modes. In mode 1, the normal Tanimoto coefficient is used. Mode 2 uses the size of the descriptor intersection divided by the size of the smaller descriptor, mainly to deal with compounds that vary a lot in size. Mode 3 is similar to mode 2, except that it raises the similarity to the power of 3 to penalize small values. When mode is 0, 'cmp.similarity' selects mode 1 or mode 3 based on the size difference between the two descriptors.
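The three explicit modes can be sketched in base R as follows. This is an illustration of the description above, not the package's implementation: 'similarity_by_mode' is a hypothetical name, descriptors are assumed to behave as sets of unique numbers, and mode 0 is left out because the text does not say how large a size difference triggers mode 3.

```r
# Hypothetical illustration of modes 1-3; not ChemmineR's own code.
similarity_by_mode <- function(a, b, mode = 1) {
  a <- unique(a)
  b <- unique(b)
  if (length(a) == 0 || length(b) == 0) return(0)  # assumption for empty input
  shared <- length(intersect(a, b))
  smaller <- min(length(a), length(b))
  switch(as.character(mode),
    "1" = shared / length(union(a, b)),  # plain Tanimoto coefficient
    "2" = shared / smaller,              # intersection over smaller descriptor
    "3" = (shared / smaller)^3,          # mode 2 cubed, penalizing small values
    stop("this sketch only covers modes 1, 2 and 3"))
}

# Made-up descriptors that differ a lot in size
small <- c(1, 2, 3, 50)
large <- 1:12
similarity_by_mode(small, large, mode = 1)  # 3 / 13
similarity_by_mode(small, large, mode = 2)  # 3 / 4 = 0.75
similarity_by_mode(small, large, mode = 3)  # 0.75^3 = 0.421875
```

With these inputs mode 1 scores low because the union is dominated by the larger descriptor, while modes 2 and 3 only ask how much of the smaller descriptor is covered.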

When 'cmp.similarity' is used to search for compounds with a threshold similarity value, or for clustering with a cutoff distance, the threshold similarity or cutoff distance can be used to choose a 'worst' value. 'cmp.similarity' can compute an upper bound of the similarity cheaply; by comparing this upper bound to the 'worst' value, it can skip the full computation when the similarity is certain to fall below 'worst' and would therefore be useless to the caller.
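The early exit can be sketched in base R as below. The text does not say which upper bound the package computes, so this sketch assumes one standard bound for the Tanimoto coefficient: the intersection is never larger than the smaller descriptor and the union is never smaller than the larger one, so the similarity cannot exceed the ratio of the two sizes. The name 'tanimoto_with_worst' and the sample data are hypothetical.

```r
# Hypothetical sketch of the 'worst' shortcut; the bound used here is an
# assumption, not necessarily the one implemented in ChemmineR.
tanimoto_with_worst <- function(a, b, worst = 0) {
  a <- unique(a)
  b <- unique(b)
  sizes <- c(length(a), length(b))
  if (max(sizes) == 0) return(0)          # assumption for empty descriptors
  upper_bound <- min(sizes) / max(sizes)  # needs only the two sizes
  if (upper_bound < worst) return(0)      # cannot reach 'worst': skip the set operations
  length(intersect(a, b)) / length(union(a, b))
}

# Made-up descriptors: a small query against a much larger candidate
query <- 1:3
candidate <- 1:20
tanimoto_with_worst(query, candidate)               # full computation: 3 / 20 = 0.15
tanimoto_with_worst(query, candidate, worst = 0.3)  # bound 0.15 < 0.3, returns 0
```

In a threshold search, passing the threshold as 'worst' means candidates of very different size are rejected from their lengths alone.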

----------Value----------

Returns a numeric value between 0 and 1 giving the similarity.

----------Author(s)----------

Y. Eddie Cao, Li-Chang Cheng

----------References----------

in 2D fragment-based similarity searching: comparison of structural descriptors and similarity coefficients", in J Chem Inf Comput Sci.

----------See Also----------

cmp.parse1, cmp.parse

----------Examples----------

## Load the package
library(ChemmineR)

## Load sample SD file
# data(sdfsample); sdfset <- sdfsample

## Generate atom pair descriptor database for searching
# apset <- sdf2ap(sdfset)

## Load the same atom pair sample data set provided by the library
data(apset)

## Compute the similarity between two compounds
cmp.similarity(apset[1], apset[2])

## Search the apset database with a query compound
cmp.search(apset, apset[1], type=3, cutoff = 0.3)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).