
R sets package: help documentation for the similarity() functions

Posted on 2012-9-30 01:34:30
similarity(sets)
R package containing similarity(): sets

Similarity and Dissimilarity Functions

Chinese translation by: 生物统计家园网 robot LoveR

----------Description----------

Similarities and dissimilarities for (generalized) sets.


----------Usage----------


set_similarity(x, y, method = "Jaccard")
gset_similarity(x, y, method = "Jaccard")
cset_similarity(x, y, method = "Jaccard")

set_dissimilarity(x, y, method = c("Jaccard", "Manhattan", "Euclidean", "L1", "L2"))
gset_dissimilarity(x, y, method = c("Jaccard", "Manhattan", "Euclidean", "L1", "L2"))
cset_dissimilarity(x, y, method = c("Jaccard", "Manhattan", "Euclidean", "L1", "L2"))



----------Arguments----------

Argument: x, y
Two (generalized/customizable) sets.


Argument: method
Character string specifying the proximity method (see Details).
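
The dissimilarity functions list several choices as the default for `method`. In R this usually signals the `match.arg()` idiom, where omitting the argument selects the first choice. The sketch below illustrates that idiom with a hypothetical function `pick_method()`. It is an assumption about how such a default is commonly resolved, not the package's actual source.

```r
# Hypothetical function illustrating the match.arg() idiom.
# It is NOT taken from the sets package.
pick_method <- function(method = c("Jaccard", "Manhattan", "Euclidean", "L1", "L2")) {
  # With the argument omitted, match.arg() returns the first choice;
  # otherwise it validates the supplied string against the choices.
  match.arg(method)
}

default_method  <- pick_method()      # argument omitted
explicit_method <- pick_method("L2")  # argument supplied
```

Under this idiom, a string that is not one of the listed choices raises an error instead of silently falling back to a default.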


----------Details----------

For two generalized sets X and Y, the Jaccard similarity is |X ∩ Y| / |X ∪ Y|, where |.| denotes the cardinality for generalized sets (the sum of memberships). The Jaccard dissimilarity is 1 minus the similarity.
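
The Jaccard definition above can be sketched in base R without the sets package. The sketch assumes a fuzzy set is stored as a named numeric vector of memberships, and that intersection and union take the element-wise minimum and maximum of memberships. The helpers `pad()` and `jaccard_sim()` are hypothetical names introduced here for illustration.

```r
# Hypothetical helpers, not part of the sets package.
# A fuzzy set is represented as a named numeric vector of memberships.

# Extend a membership vector to a common universe; absent elements get 0.
pad <- function(m, elems) {
  out <- setNames(numeric(length(elems)), elems)
  out[names(m)] <- m
  out
}

# Jaccard similarity: sum of element-wise minima over sum of maxima.
jaccard_sim <- function(a, b) {
  elems <- union(names(a), names(b))
  ma <- pad(a, elems)
  mb <- pad(b, elems)
  sum(pmin(ma, mb)) / sum(pmax(ma, mb))
}

# Crisp sets {a, b, c} and {c, d, e}: one shared element out of five.
crisp_sim <- jaccard_sim(c(a = 1, b = 1, c = 1), c(c = 1, d = 1, e = 1))

# Fuzzy sets: intersection cardinality 0.2, union cardinality 2.8.
FA <- c(a = 0.3, b = 0.7, c = 0.9)
FB <- c(c = 0.2, d = 0.4, e = 0.5)
fuzzy_sim <- jaccard_sim(FA, FB)
fuzzy_dis <- 1 - fuzzy_sim   # Jaccard dissimilarity
```

If both sets are empty, the ratio is 0/0 and this sketch returns `NaN`. How the package handles that case is not stated in the text above.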

The L1 (or Manhattan) and L2 (or Euclidean) dissimilarities are defined as follows. For two fuzzy multisets A and B on a given universe X with elements x, let M_A(x) and M_B(x) be functions returning the memberships of an element x in A and B, respectively. The memberships are returned in standard form, i.e., as an infinite vector of decreasing membership values, e.g. (1, 0.3, 0, 0, ...). Let M_A(x)_i and M_B(x)_i denote the ith components of these membership vectors. Then the L1 distance is defined as:

d_1(A, B) = \sum_{x \in X} \sum_{i=1}^{\infty} |M_A(x)_i - M_B(x)_i|

and the L2 distance as:

d_2(A, B) = \sqrt{\sum_{x \in X} \sum_{i=1}^{\infty} |M_A(x)_i - M_B(x)_i|^2}
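
The L1 and L2 distances described above can be sketched in base R. The sketch assumes a fuzzy multiset is stored as a named list with one vector of membership values per element. The "infinite" standard-form vectors are truncated to the longer of the two vectors being compared, since all later components are zero. The names `standard_form()`, `membership_diffs()`, `l1_dist()` and `l2_dist()` are hypothetical and not part of the sets package.

```r
# Hypothetical helpers, not part of the sets package.
# A fuzzy multiset is a named list: element -> vector of membership values.

# Standard form: memberships sorted in decreasing order, zero-padded to `len`.
standard_form <- function(m, len) {
  m <- sort(m, decreasing = TRUE)
  c(m, numeric(len - length(m)))
}

# Component-wise differences M_A(x)_i - M_B(x)_i over the whole universe.
membership_diffs <- function(a, b) {
  elems <- union(names(a), names(b))
  unlist(lapply(elems, function(e) {
    ma <- a[[e]]
    mb <- b[[e]]
    if (is.null(ma)) ma <- numeric(0)  # element absent from a
    if (is.null(mb)) mb <- numeric(0)  # element absent from b
    len <- max(length(ma), length(mb))
    standard_form(ma, len) - standard_form(mb, len)
  }))
}

l1_dist <- function(a, b) sum(abs(membership_diffs(a, b)))
l2_dist <- function(a, b) sqrt(sum(membership_diffs(a, b)^2))

# Illustrative data with multiple memberships per element.
MA <- list(a = c(0.3, 0.7), b = 0.1, c = 0.9)
MB <- list(c = 0.2, d = c(0.4, 0.5), e = 0.8)
d1 <- l1_dist(MA, MB)  # 0.7 + 0.3 + 0.1 + 0.7 + 0.5 + 0.4 + 0.8
d2 <- l2_dist(MA, MB)  # square root of the summed squared differences
```

The values here come from this sketch's own representation. They are not asserted to match the output of `gset_dissimilarity()`.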


----------Value----------

A numeric value (similarity or dissimilarity, as specified).


----------Source----------

T. Matthe, R. De Caluwe, G. de Tre, A. Hallez, J. Verstraete, M. Leman, O. Cornelis, D. Moelants, and J. Gansemans (2006), Similarity Between Multi-valued Thesaurus Attributes: Theory and Application in Multimedia Systems, Flexible Query Answering Systems, Lecture Notes in Computer Science, Springer, 331–342.

K. Mizutani, R. Inokuchi, and S. Miyamoto (2008), Algorithms of Nonlinear Document Clustering Based on Fuzzy Multiset Model, International Journal of Intelligent Systems, 23, 176–198.


----------See Also----------

set.


----------Examples----------


library(sets)

# Crisp sets sharing one element ("c")
A <- set("a", "b", "c")
B <- set("c", "d", "e")
set_similarity(A, B)
set_dissimilarity(A, B)

# Fuzzy sets: one membership value per element
A <- gset(c("a", "b", "c"), c(0.3, 0.7, 0.9))
B <- gset(c("c", "d", "e"), c(0.2, 0.4, 0.5))
gset_similarity(A, B, "Jaccard")
gset_dissimilarity(A, B, "Jaccard")
gset_dissimilarity(A, B, "L1")
gset_dissimilarity(A, B, "L2")

# Fuzzy multisets: an element may carry several membership values
A <- gset(c("a", "b", "c"), list(c(0.3, 0.7), 0.1, 0.9))
B <- gset(c("c", "d", "e"), list(0.2, c(0.4, 0.5), 0.8))
gset_similarity(A, B, "Jaccard")
gset_dissimilarity(A, B, "Jaccard")
gset_dissimilarity(A, B, "L1")
gset_dissimilarity(A, B, "L2")
