R sets package: similarity() function help documentation

loveR · posted 2012-9-30 01:34:30

similarity(sets)
similarity() belongs to R package: sets

                                    Similarity and Dissimilarity Functions

                                       Translator: LoveR, the 生物统计家园网 robot

----------Description----------

Similarities and dissimilarities for (generalized) sets.

----------Usage----------

set_similarity(x, y, method = "Jaccard")
gset_similarity(x, y, method = "Jaccard")
cset_similarity(x, y, method = "Jaccard")

set_dissimilarity(x, y, method = c("Jaccard", "Manhattan", "Euclidean", "L1", "L2"))
gset_dissimilarity(x, y, method = c("Jaccard", "Manhattan", "Euclidean", "L1", "L2"))
cset_dissimilarity(x, y, method = c("Jaccard", "Manhattan", "Euclidean", "L1", "L2"))

----------Arguments----------

Argument: x, y
Two (generalized/customizable) sets.

Argument: method
Character string specifying the proximity method (see below).

----------Details----------

For two generalized sets X and Y, the Jaccard similarity is |X intersect Y| / |X U Y|, where |.| denotes the cardinality for generalized sets (sum of memberships). The Jaccard dissimilarity is 1 minus the similarity.
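
The Jaccard definition can be sketched in base R for the simple case of fuzzy sets with one membership value per element. This is only an illustration under stated assumptions: fuzzy sets are stored as named numeric vectors, intersection and union are taken as the element-wise minimum and maximum of the memberships, and `membership()` and `jaccard_fuzzy()` are hypothetical helpers, not functions of the sets package.

```r
# Hypothetical helper: membership of each universe element in s (0 if absent).
membership <- function(s, universe) {
  m <- s[universe]        # indexing by name gives NA for elements not in s
  m[is.na(m)] <- 0
  unname(m)
}

# Hypothetical helper: Jaccard similarity |X intersect Y| / |X U Y|,
# with cardinality taken as the sum of memberships.
jaccard_fuzzy <- function(a, b) {
  universe <- union(names(a), names(b))
  ma <- membership(a, universe)
  mb <- membership(b, universe)
  sum(pmin(ma, mb)) / sum(pmax(ma, mb))
}

A <- c(a = 0.3, b = 0.7, c = 0.9)
B <- c(c = 0.2, d = 0.4, e = 0.5)

sim <- jaccard_fuzzy(A, B)   # 0.2 / 2.8
dis <- 1 - sim               # Jaccard dissimilarity
```

Only "c" is shared, so the intersection has cardinality min(0.9, 0.2) = 0.2, while the union sums the larger membership of each of the five elements to 2.8.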

The L1 (or Manhattan) and L2 (or Euclidean) dissimilarities are defined as follows. For two fuzzy multisets A and B on a given universe X with elements x, let M_A(x) and M_B(x) be functions returning the memberships of an element x in sets A and B, respectively. The memberships are returned in standard form, i.e. as an infinite vector of decreasing membership values, e.g. (1, 0.3, 0, 0, ...). Let M_A(x)_i and M_B(x)_i denote the ith components of these membership vectors. Then the L1 distance is defined as:

    d_1(A, B) = \sum_{x \in X} \sum_{i=1}^{\infty} |M_A(x)_i - M_B(x)_i|

and the L2 distance as:

    d_2(A, B) = \sqrt{\sum_{x \in X} \sum_{i=1}^{\infty} |M_A(x)_i - M_B(x)_i|^2}
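
The L1 and L2 distances over membership vectors in standard form can be sketched in base R as follows. This is a minimal illustration, not the sets package implementation: fuzzy multisets are assumed to be stored as named lists of numeric membership vectors, and `standard_form()` and `membership_diffs()` are hypothetical helpers introduced for this example. Because the trailing components of a standard-form vector are all zero, padding both vectors to the longer of the two lengths is enough.

```r
# Hypothetical helper: sort memberships in decreasing order, pad with zeros.
standard_form <- function(m, len) {
  m <- sort(m, decreasing = TRUE)
  c(m, numeric(len - length(m)))
}

# Hypothetical helper: component-wise differences M_A(x)_i - M_B(x)_i
# collected over every element x of the universe.
membership_diffs <- function(a, b) {
  universe <- union(names(a), names(b))
  unlist(lapply(universe, function(x) {
    # For a list, [[ with an unknown name returns NULL: no memberships.
    ma <- if (is.null(a[[x]])) numeric(0) else a[[x]]
    mb <- if (is.null(b[[x]])) numeric(0) else b[[x]]
    len <- max(length(ma), length(mb))
    standard_form(ma, len) - standard_form(mb, len)
  }))
}

A <- list(a = c(0.3, 0.7), b = 0.1, c = 0.9)
B <- list(c = 0.2, d = c(0.4, 0.5), e = 0.8)

d  <- membership_diffs(A, B)
l1 <- sum(abs(d))        # L1 (Manhattan) distance
l2 <- sqrt(sum(d^2))     # L2 (Euclidean) distance
```

With these inputs the absolute differences are 0.7, 0.3, 0.1, 0.7, 0.5, 0.4 and 0.8, so the L1 distance is 3.5 and the L2 distance is sqrt(2.13).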

----------Value----------

A numeric value (similarity or dissimilarity, as specified).

----------Source----------

T. Matthe, R. De Caluwe, G. de Tre, A. Hallez, J. Verstraete, M. Leman, O. Cornelis, D. Moelants, and J. Gansemans (2006), Similarity Between Multi-valued Thesaurus Attributes: Theory and Application in Multimedia Systems, Flexible Query Answering Systems, Lecture Notes in Computer Science, Springer, 331–342.

K. Mizutani, R. Inokuchi, and S. Miyamoto (2008), Algorithms of Nonlinear Document Clustering Based on Fuzzy Multiset Model, International Journal of Intelligent Systems, 23, 176–198.

----------See Also----------

set.

----------Examples----------

library(sets)

# Ordinary (crisp) sets
A <- set("a", "b", "c")
B <- set("c", "d", "e")
set_similarity(A, B)
set_dissimilarity(A, B)

# Fuzzy sets: one membership value per element
A <- gset(c("a", "b", "c"), c(0.3, 0.7, 0.9))
B <- gset(c("c", "d", "e"), c(0.2, 0.4, 0.5))
gset_similarity(A, B, "Jaccard")
gset_dissimilarity(A, B, "Jaccard")
gset_dissimilarity(A, B, "L1")
gset_dissimilarity(A, B, "L2")

# Fuzzy multisets: an element may carry several membership values
A <- gset(c("a", "b", "c"), list(c(0.3, 0.7), 0.1, 0.9))
B <- gset(c("c", "d", "e"), list(0.2, c(0.4, 0.5), 0.8))
gset_similarity(A, B, "Jaccard")
gset_dissimilarity(A, B, "Jaccard")
gset_dissimilarity(A, B, "L1")
gset_dissimilarity(A, B, "L2")

Please credit the source when reposting: 生物统计家园网 (http://www.biostatistic.net).

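Notes:
Note 1: To make learning easier, this document was translated by LoveR, the 生物统计家园网 robot. It is intended only as a personal reference for learning R; 生物统计家园 retains the copyright.
Note 2: Because the translation is automatic, inaccuracies are unavoidable. Carefully comparing the Chinese and English text can help with learning R.
Note 3: If you find an inaccuracy, please reply below this post; we will revise the document over time.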
