R language: agnes() function help documentation

loveR · posted 2012-2-16 18:23:49

agnes(cluster)
R package containing agnes(): cluster

 Agglomerative Nesting (Hierarchical Clustering)

 Translator: LoveR, the translation bot of 生物统计家园网

Description

Computes an agglomerative hierarchical clustering of the dataset.

Usage

agnes(x, diss = inherits(x, "dist"), metric = "euclidean",
 stand = FALSE, method = "average", par.method,
 keep.diss = n < 100, keep.data = !diss)

Arguments

Argument: x
Data matrix or data frame, or dissimilarity matrix, depending on the value of the diss argument. For a matrix or data frame, each row corresponds to an observation and each column to a variable. All variables must be numeric. Missing values (NAs) are allowed. For a dissimilarity matrix, x is typically the output of daisy or dist. A vector of length n*(n-1)/2 is also allowed (where n is the number of observations) and is interpreted in the same way as the output of those functions. In this case missing values (NAs) are not allowed.
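The three accepted forms of x can be compared side by side. The following is only a sketch, assuming the cluster package and its agriculture dataset are available; the variable names ag_raw, ag_dist and ag_daisy are made up for illustration.

```r
library(cluster)

# Small numeric data frame: rows are observations, columns are variables.
data(agriculture)

# (1) Raw data: agnes() computes the dissimilarities itself.
ag_raw <- agnes(agriculture)

# (2) A "dist" object: diss defaults to TRUE because inherits(x, "dist") is TRUE.
ag_dist <- agnes(dist(agriculture))

# (3) A daisy() dissimilarity object is accepted in the same way.
ag_daisy <- agnes(daisy(agriculture))

# All three use Euclidean distances and average linkage here, so the
# agglomerative coefficients should agree.
c(raw = ag_raw$ac, dist = ag_dist$ac, daisy = ag_daisy$ac)
```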

Argument: diss
Logical flag: if TRUE (the default for dist or dissimilarity objects), x is assumed to be a dissimilarity matrix. If FALSE, x is treated as a matrix of observations by variables.

Argument: metric
Character string specifying the metric used to calculate dissimilarities between observations. The currently available options are "euclidean" and "manhattan". Euclidean distances are the root of the sum of squared differences, and Manhattan distances are the sum of absolute differences. If x is already a dissimilarity matrix, this argument is ignored.
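The two metrics can be checked by hand on a pair of observations. This is a minimal sketch with made-up numbers; it uses stats::dist() only as a cross-check and does not call agnes() itself.

```r
# Two observations with three variables each (made-up numbers).
x <- rbind(a = c(1, 2, 3),
           b = c(4, 6, 3))

diff_ab <- x["a", ] - x["b", ]
euclid  <- sqrt(sum(diff_ab^2))   # root of the sum of squared differences: 5
manhat  <- sum(abs(diff_ab))      # sum of absolute differences: 7

# The same values as computed by stats::dist()
c(by_hand = euclid, dist = as.numeric(dist(x, method = "euclidean")))
c(by_hand = manhat, dist = as.numeric(dist(x, method = "manhattan")))
```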

Argument: stand
Logical flag: if TRUE, the measurements in x are standardized before the dissimilarities are calculated. Measurements are standardized for each variable (column) by subtracting the variable's mean value and dividing by the variable's mean absolute deviation. If x is already a dissimilarity matrix, this argument is ignored.
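The standardization described for stand = TRUE can be reproduced by hand. The sketch below assumes the agriculture dataset from the cluster package; the helper standardize() is hypothetical and only mirrors the description above.

```r
library(cluster)
data(agriculture)

# Standardize one column: subtract the mean, then divide by the
# mean absolute deviation (hypothetical helper, not part of cluster).
standardize <- function(v) {
  centered <- v - mean(v)
  centered / mean(abs(centered))
}

agri_std <- as.data.frame(lapply(agriculture, standardize))
rownames(agri_std) <- rownames(agriculture)

ag_manual <- agnes(agri_std)                   # standardized by hand
ag_stand  <- agnes(agriculture, stand = TRUE)  # standardized by agnes()

# The merge heights should agree if both standardizations match.
all.equal(ag_manual$height, ag_stand$height)
```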

Argument: method
Character string defining the clustering method. The six methods implemented are "average" ([unweighted pair-]group average method, UPGMA), "single" (single linkage), "complete" (complete linkage), "ward" (Ward's method), "weighted" (weighted average linkage) and its generalization "flexible", which uses (a constant version of) the Lance-Williams formula and the par.method argument. The default is "average".
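One way to get a feel for the methods is to run several of them on the same data and compare the agglomerative coefficients. This is a sketch only, assuming the votes.repub dataset from the cluster package; "flexible" is left out because it additionally needs par.method.

```r
library(cluster)
data(votes.repub)

methods <- c("average", "single", "complete", "ward", "weighted")

# Agglomerative coefficient for each linkage method on the same data.
ac <- sapply(methods, function(m) agnes(votes.repub, method = m)$ac)
round(ac, 3)
```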

Argument: par.method
If method == "flexible", a numeric vector of length 1, 3, or 4; see the Details section.

Arguments: keep.diss, keep.data
Logicals indicating whether the dissimilarities and/or the input data x should be kept in the result. Setting these to FALSE can give much smaller results and hence even save memory allocation time.
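The effect of the two flags can be inspected on the returned object. This is a sketch assuming the votes.repub dataset; the names full and slim are illustrative, and the reported sizes will vary by platform.

```r
library(cluster)
data(votes.repub)

full <- agnes(votes.repub, keep.diss = TRUE,  keep.data = TRUE)
slim <- agnes(votes.repub, keep.diss = FALSE, keep.data = FALSE)

# Which optional components carry a value in each result?
c(diss_kept = !is.null(full$diss), data_kept = !is.null(full$data))
c(diss_kept = !is.null(slim$diss), data_kept = !is.null(slim$data))

# Approximate memory footprint of each result
print(object.size(full), units = "Kb")
print(object.size(slim), units = "Kb")
```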

Details

agnes is fully described in chapter 5 of Kaufman and Rousseeuw (1990). Compared to other agglomerative clustering methods such as hclust, agnes has the following features: (a) it yields the agglomerative coefficient (see agnes.object), which measures the amount of clustering structure found; and (b) apart from the usual tree it also provides the banner, a novel graphical display (see plot.agnes).
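Both features can be tried on a fitted object. The sketch below assumes the votes.repub dataset and uses the which.plots argument of plot.agnes to select one display at a time.

```r
library(cluster)
data(votes.repub)

agn <- agnes(votes.repub)

# (a) Agglomerative coefficient: a value between 0 and 1, where larger
#     values suggest more clustering structure.
agn$ac

# (b) The two graphical displays
plot(agn, which.plots = 1)   # banner
plot(agn, which.plots = 2)   # dendrogram (the usual tree)
```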

The agnes algorithm constructs a hierarchy of clusterings. At first, each observation is a small cluster by itself. Clusters are merged until only one large cluster remains, which contains all the observations. At each stage the two nearest clusters are combined to form one larger cluster.
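The sequence of merges is recorded in the result and can be inspected directly. This is a sketch assuming the agriculture dataset; it relies on the merge component described in agnes.object.

```r
library(cluster)
data(agriculture)

agn <- agnes(agriculture)
n <- nrow(agriculture)

# n observations are merged in n - 1 steps, one row per step.
nrow(agn$merge)

# Negative entries denote single observations, positive entries refer to
# the cluster formed at that earlier step. The first step always joins
# two single observations.
head(agn$merge)
```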

For method="average", the distance between two clusters is the average of the dissimilarities between the points in one cluster and the points in the other cluster. For method="single", we use the smallest dissimilarity between a point in the first cluster and a point in the second cluster (nearest neighbor method). For method="complete", we use the largest dissimilarity between a point in the first cluster and a point in the second cluster (furthest neighbor method).
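The three between-cluster distances can be computed by hand for two small clusters. This is a minimal base-R sketch with made-up one-dimensional points; the cluster memberships A and B are chosen arbitrarily for illustration.

```r
# Five points on a line (made-up values)
pts <- matrix(c(0, 1, 5, 6, 9), ncol = 1)
d <- as.matrix(dist(pts))

A <- 1:2   # cluster A = points 0 and 1
B <- 3:5   # cluster B = points 5, 6 and 9

# All pairwise dissimilarities between a point in A and a point in B
between <- d[A, B]

c(average  = mean(between),   # average linkage
  single   = min(between),    # nearest neighbor: 4
  complete = max(between))    # furthest neighbor: 9
```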

The method = "flexible" allows (and requires) more details: the Lance-Williams formula specifies how dissimilarities are computed when clusters are agglomerated (equation (32) in K. & R., p.237). If clusters C_1 and C_2 are agglomerated into a new cluster, the dissimilarity between their union and another cluster Q is given by

D(C_1 ∪ C_2, Q) = α_1 * D(C_1, Q) + α_2 * D(C_2, Q) + β * D(C_1, C_2) + γ * |D(C_1, Q) - D(C_2, Q)|,

where the four coefficients (α_1, α_2, β, γ) are specified by the vector par.method:

If par.method is of length 1, say = α, par.method is extended to give the "Flexible Strategy" (K. & R., p.236 f) with Lance-Williams coefficients (α_1 = α_2 = α, β = 1 - 2α, γ = 0). If it is of length 3, γ = 0 is used.
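The expansion of par.method and the resulting update can be sketched in a few lines. This assumes the usual form of the Lance-Williams update, α_1 D(C_1, Q) + α_2 D(C_2, Q) + β D(C_1, C_2) + γ |D(C_1, Q) - D(C_2, Q)|; the helpers expand_par() and lance_williams() are hypothetical and are not functions of the cluster package.

```r
# Expand par.method of length 1, 3 or 4 into (alpha_1, alpha_2, beta, gamma).
expand_par <- function(par.method) {
  switch(as.character(length(par.method)),
    "1" = c(par.method, par.method, 1 - 2 * par.method, 0),
    "3" = c(par.method, 0),
    "4" = par.method,
    stop("par.method must have length 1, 3 or 4"))
}

# Dissimilarity between the union of C_1 and C_2 and another cluster Q,
# given D(C_1, Q), D(C_2, Q) and D(C_1, C_2).
lance_williams <- function(d1q, d2q, d12, par.method) {
  p <- expand_par(par.method)
  p[1] * d1q + p[2] * d2q + p[3] * d12 + p[4] * abs(d1q - d2q)
}

lance_williams(4, 6, 3, 0.5)                     # 5: plain average of 4 and 6
lance_williams(4, 6, 3, c(0.5, 0.5, 0, -0.5))    # 4: the minimum (single linkage)
lance_williams(4, 6, 3, c(0.5, 0.5, 0,  0.5))    # 6: the maximum (complete linkage)
```

The last two calls illustrate why care is needed with longer par.method vectors: different coefficient choices reproduce quite different linkage rules.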

Care and expertise are probably needed when using method = "flexible", particularly when par.method is specified with length greater than one. The weighted average (method="weighted") is the same as method="flexible", par.method = 0.5.
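The stated equivalence can be checked numerically. This is a sketch assuming the agriculture dataset; it compares only the merge heights of the two fits.

```r
library(cluster)
data(agriculture)

w <- agnes(agriculture, method = "weighted")
f <- agnes(agriculture, method = "flexible", par.method = 0.5)

# If the two methods coincide, the merge heights should agree.
all.equal(w$height, f$height)
```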

Value

An object of class "agnes" representing the clustering. See agnes.object for details.
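A returned object can be explored and, if flat groups are needed, converted for use with cutree(). This is a sketch assuming the agriculture dataset; the choice of k = 2 is arbitrary.

```r
library(cluster)
data(agriculture)

agn <- agnes(agriculture)

class(agn)    # includes "agnes"
names(agn)    # components documented in agnes.object

# Convert to an "hclust" object and cut the tree into 2 groups.
groups <- cutree(as.hclust(agn), k = 2)
table(groups)
```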

Background

Cluster analysis divides a dataset into groups (clusters) of observations that are similar to each other.

Hierarchical methods like agnes, diana, and mona construct a hierarchy of clusterings, with the number of clusters ranging from one to the number of observations.

Partitioning methods like pam, clara, and fanny require that the number of clusters be given by the user.
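The practical difference between the two families shows up in how the number of clusters is handled. The sketch below assumes the agriculture dataset; the values of k are arbitrary.

```r
library(cluster)
data(agriculture)

# Hierarchical: no number of clusters is needed up front; the tree can be
# cut afterwards at any level.
hier <- agnes(agriculture)
sizes <- sapply(2:4, function(k) length(unique(cutree(as.hclust(hier), k = k))))
sizes

# Partitioning: the number of clusters k must be supplied by the user.
part <- pam(agriculture, k = 2)
table(part$clustering)
```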

References

Kaufman, L. and Rousseeuw, P.J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York.
Struyf, A., Hubert, M. and Rousseeuw, P.J. (1996). Clustering in an Object-Oriented Environment. Journal of Statistical Software, 1. http://www.stat.ucla.edu/journals/jss/
Struyf, A., Hubert, M. and Rousseeuw, P.J. (1997). Integrating Robust Clustering Techniques in S-PLUS, Computational Statistics and Data Analysis, 26, 17–37.
Lance, G.N. and Williams, W.T. (1966). A General Theory of Classificatory Sorting Strategies, I. Hierarchical Systems. Computer J. 9, 373–380.

See Also

agnes.object, daisy, diana, dist, hclust, plot.agnes, twins.object.

Examples

library(cluster)
data(votes.repub)
agn1 <- agnes(votes.repub, metric = "manhattan", stand = TRUE)
agn1
plot(agn1)

op <- par(mfrow=c(2,2))
agn2 <- agnes(daisy(votes.repub), diss = TRUE, method = "complete")
plot(agn2)
agnS <- agnes(votes.repub, method = "flexible", par.method = 0.6)
plot(agnS)
par(op)

data(agriculture)
## Plot similar to Figure 7 in ref
## Not run: plot(agnes(agriculture), ask = TRUE)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

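Notes:
Note 1: To make learning easier, this document was translated by LoveR, the translation bot of 生物统计家园网. It is intended only as a personal reference for learning R; 生物统计家园 retains the copyright.
Note 2: Because the translation was produced automatically by a bot, inaccuracies are unavoidable; carefully comparing the Chinese and English text of the original post can help when learning R.
Note 3: If you find an inaccuracy, please reply below this post and we will revise it over time.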
