
R Language GLAD Package: Chinese Help Documentation for the hclustglad() Function (Chinese-English Parallel Text)

Posted on 2012-02-25 20:38:40
hclustglad(GLAD)
hclustglad() belongs to the R package: GLAD

                                        Hierarchical Clustering

Translator: robot LoveR, 生物统计家园网 (biostatistic.net)

Description

Hierarchical cluster analysis on a set of dissimilarities and methods for analyzing it.


Usage


hclustglad(d, method = "complete", members=NULL)



Arguments

d: a dissimilarity structure as produced by dist.


method: the agglomeration method to be used. This should be (an unambiguous abbreviation of) one of "ward", "single", "complete", "average", "mcquitty", "median" or "centroid".


members: NULL, or a vector with length the size of d.


Details

This function performs a hierarchical cluster analysis using a set of dissimilarities for the n objects being clustered. Initially, each object is assigned to its own cluster and then the algorithm proceeds iteratively, at each stage joining the two most similar clusters, continuing until there is just a single cluster. At each stage distances between clusters are recomputed by the Lance–Williams dissimilarity update formula according to the particular clustering method being used.
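As background (this formula is not spelled out on the original page, but it is the standard Lance–Williams form), when clusters i and j are merged, the distance from the new cluster to any other cluster k is updated as

    d(i \cup j, k) = \alpha_i d(i,k) + \alpha_j d(j,k) + \beta d(i,j) + \gamma |d(i,k) - d(j,k)|

where the coefficients depend on the chosen method: for example, single linkage uses \alpha_i = \alpha_j = 1/2, \beta = 0, \gamma = -1/2 (reducing the update to min(d(i,k), d(j,k))), while complete linkage uses \gamma = +1/2 (the maximum).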

A number of different clustering methods are provided. Ward's minimum variance method aims at finding compact, spherical clusters.  The complete linkage method finds similar clusters. The single linkage method (which is closely related to the minimal spanning tree) adopts a "friends of friends" clustering strategy. The other methods can be regarded as aiming for clusters with characteristics somewhere between the single and complete link methods.
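A minimal sketch (not part of the original help page; it assumes the GLAD package is loaded so that hclustglad is available) comparing three of the linkage methods on the same dissimilarities:

data(USArrests)
d <- dist(USArrests)
hc.single   <- hclustglad(d, method = "single")    # "friends of friends" chaining
hc.complete <- hclustglad(d, method = "complete")  # compact, similar clusters
hc.ward     <- hclustglad(d, method = "ward")      # minimum-variance, spherical clusters
opar <- par(mfrow = c(1, 3))
plot(hc.single,   main = "single")
plot(hc.complete, main = "complete")
plot(hc.ward,     main = "ward")
par(opar)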

If members!=NULL, then d is taken to be a dissimilarity matrix between clusters instead of dissimilarities between singletons and members gives the number of observations per cluster. This way the hierarchical cluster algorithm can be "started in the middle of the dendrogram", e.g., in order to reconstruct the part of the tree above a cut (see examples). Dissimilarities between clusters can be efficiently computed (i.e., without hclustglad itself) only for a limited number of distance/linkage combinations, the simplest one being squared Euclidean distance and centroid linkage. In this case the dissimilarities between the clusters are the squared Euclidean distances between cluster means.

In hierarchical cluster displays, a decision is needed at each merge to specify which subtree should go on the left and which on the right. Since, for n observations there are n-1 merges, there are 2^{(n-1)} possible orderings for the leaves in a cluster tree, or dendrogram. The algorithm used in hclustglad is to order the subtree so that the tighter cluster is on the left (the last, i.e. most recent, merge of the left subtree is at a lower value than the last merge of the right subtree). Single observations are the tightest clusters possible, and merges involving two observations place them in order by their observation sequence number.
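A small sketch (again assuming GLAD is loaded) of how this ordering surfaces in the fitted object; USArrests has n = 50 rows, so there are 49 merges:

data(USArrests)
hc <- hclustglad(dist(USArrests))
length(hc$height)  # n - 1 = 49 merge heights
hc$order           # leaf permutation used by plot(); branches do not cross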


Value

An object of class hclust which describes the tree produced by the clustering process. The object is a list with components:


merge: an n-1 by 2 matrix. Row i of merge describes the merging of clusters at step i of the clustering. If an element j in the row is negative, then observation -j was merged at this stage. If j is positive then the merge was with the cluster formed at the (earlier) stage j of the algorithm. Thus negative entries in merge indicate agglomerations of singletons, and positive entries indicate agglomerations of non-singletons (see the sketch after this list).


height: a set of n-1 non-decreasing real values. The clustering height: that is, the value of the criterion associated with the clustering method for the particular agglomeration.


order: a vector giving the permutation of the original observations suitable for plotting, in the sense that a cluster plot using this ordering and matrix merge will not have crossings of the branches.


labels: labels for each of the objects being clustered.


call: the call which produced the result.


method: the cluster method that has been used.


dist.method: the distance that has been used to create d (only returned if the distance object has a "method" attribute).
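A minimal sketch (assuming GLAD is loaded) of reading these components back from a fitted object:

data(USArrests)
hc <- hclustglad(dist(USArrests), method = "average")
hc$merge[1:3, ]  # first three merges: negative = singleton, positive = earlier step
hc$height[1:3]   # criterion value at each of those merges
hc$method        # the agglomeration method that was used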


Author(s)



The hclustglad function is based on an algorithm contributed to STATLIB by F. Murtagh.




References

Everitt, B. (1974). Cluster Analysis. London: Heinemann Educ. Books.
Hartigan, J. A. (1975). Clustering Algorithms. New York: Wiley.
Sneath, P. H. A. and Sokal, R. R. (1973). Numerical Taxonomy. San Francisco: Freeman.
Anderberg, M. R. (1973). Cluster Analysis for Applications. Academic Press: New York.
Gordon, A. D. (1999). Classification. Second Edition. London: Chapman and Hall / CRC.
Murtagh, F. (1985). "Multidimensional Clustering Algorithms", in COMPSTAT Lectures 4. Wuerzburg: Physica-Verlag (for algorithmic details of the algorithms used).

Examples


data(USArrests)
hc <- hclustglad(dist(USArrests), "ave")
plot(hc)
plot(hc, hang = -1)

## Do the same with centroid clustering and squared Euclidean distance,
## cut the tree into ten clusters and reconstruct the upper part of the
## tree from the cluster centers.
hc <- hclustglad(dist(USArrests)^2, "cen")
memb <- cutree(hc, k = 10)
cent <- NULL
for(k in 1:10){
  cent <- rbind(cent, colMeans(USArrests[memb == k, , drop = FALSE]))
}
hc1 <- hclustglad(dist(cent)^2, method = "cen", members = table(memb))
opar <- par(mfrow = c(1, 2))
plot(hc,  labels = FALSE, hang = -1, main = "Original Tree")
plot(hc1, labels = FALSE, hang = -1, main = "Re-start from 10 clusters")
par(opar)

When reposting, please credit: 生物统计家园网 (http://www.biostatistic.net).


Notes:
Note 1: To make learning easier, this document was machine-translated by the biostatistic.net robot LoveR. It is intended only as a personal reference for learning the R language; biostatistic.net retains the copyright.
Note 2: Because it is an automatic machine translation, inaccuracies are inevitable. When using it, compare the Chinese and English content carefully; working through both can help with learning R.
Note 3: If you find inaccuracies, please reply to this thread and we will revise the document over time.