R language: help documentation for the clusplot.default() function

Posted on 2012-2-16 17:55:44
clusplot.default(cluster)
R package containing clusplot.default(): cluster

Bivariate Cluster Plot (clusplot) Default Method

Translator: 生物统计家园网 robot LoveR

----------Description----------

Creates a bivariate plot visualizing a partition (clustering) of the data. All observations are represented by points in the plot, using principal components or multidimensional scaling. Around each cluster an ellipse is drawn.


----------Usage----------


## Default S3 method:
clusplot(x, clus, diss = FALSE, cor = TRUE, stand = FALSE,
          lines = 2, shade = FALSE, color = FALSE,
          labels= 0, plotchar = TRUE,
          col.p = "dark green", col.txt = col.p,
          col.clus = if(color) c(2, 4, 6, 3) else 5, cex = 1, cex.txt = cex,
          span = TRUE,
          add = FALSE,
          xlim = NULL, ylim = NULL,
          main = paste("CLUSPLOT(", deparse(substitute(x)),")"),
          sub = paste("These two components explain",
             round(100 * var.dec, digits = 2), "% of the point variability."),
          verbose = getOption("verbose"),
          ...)



----------Arguments----------

Argument: x
matrix or data frame, or dissimilarity matrix, depending on the value of the diss argument. In the case of a matrix (or matrix-like object), each row corresponds to an observation, and each column corresponds to a variable. All variables must be numeric. Missing values (NAs) are allowed. They are replaced by the median of the corresponding variable. When some variables or some observations contain only missing values, the function stops with a warning message. In the case of a dissimilarity matrix, x is the output of daisy or dist or a symmetric matrix. Also, a vector of length n*(n-1)/2 is allowed (where n is the number of observations), and will be interpreted in the same way as the output of the above-mentioned functions. Missing values (NAs) are not allowed.
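As a rough illustration of the two input forms described for x, the sketch below passes the same data once as a data frame and once as a dist object. The iris data and the choice of three clusters are assumptions made for this example only; pdf(NULL) is used merely to avoid opening a graphics window.

```r
library(cluster)

pdf(NULL)                              # null device; drop this line to see the plots

iris.x <- iris[, 1:4]                  # observations in rows, numeric variables in columns
cl     <- pam(iris.x, 3)$clustering    # one cluster number per observation

# Form 1: observations-by-variables input (diss = FALSE is the default)
res.data <- clusplot(iris.x, cl)

# Form 2: dissimilarity input; dist() stores the n*(n-1)/2 pairwise distances
iris.d   <- dist(iris.x)
res.diss <- clusplot(iris.d, cl, diss = TRUE)

n <- nrow(iris.x)
length(iris.d) == n * (n - 1) / 2      # the lower triangle only

invisible(dev.off())
```

Both calls return the invisible list described in the Value section, so the results can be inspected even when no plot is shown.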


Argument: clus
a vector of length n representing a clustering of x. For each observation the vector lists the number or name of the cluster to which it has been assigned. clus is often the clustering component of the output of pam, fanny or clara.
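A minimal sketch of where such a vector typically comes from, assuming the iris measurements and three clusters (both chosen only for illustration):

```r
library(cluster)

iris.x <- iris[, 1:4]

# the $clustering component of each partitioning method is a valid 'clus'
cl.pam   <- pam(iris.x, 3)$clustering
cl.clara <- clara(iris.x, 3)$clustering
cl.fanny <- fanny(iris.x, 3)$clustering   # nearest crisp clustering of the fuzzy result

# each vector has one entry per observation
sapply(list(pam = cl.pam, clara = cl.clara, fanny = cl.fanny), length)
```

Any of these vectors can be passed as the second argument of clusplot together with iris.x.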


Argument: diss
logical indicating if x will be considered as a dissimilarity matrix or a matrix of observations by variables (see the x argument above).


Argument: cor
logical flag, only used when working with a data matrix (diss = FALSE). If TRUE, then the variables are scaled to unit variance.


Argument: stand
logical flag: if TRUE, then the representations of the n observations in the 2-dimensional plot are standardized.


Argument: lines
integer, one of 0, 1, 2, used to obtain an idea of the distances between ellipses. The distance between two ellipses E1 and E2 is measured along the line connecting the centers m1 and m2 of the two ellipses. In case E1 and E2 overlap on the line through m1 and m2, no line is drawn. Otherwise, the result depends on the value of lines:

lines = 0, no distance lines will appear on the plot;

lines = 1, the line segment between m1 and m2 is drawn;

lines = 2, a line segment between the boundaries of E1 and E2 is drawn (along the line connecting m1 and m2).
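The effect of lines on the returned value can be checked with a short sketch. The iris data and k = 3 are assumptions for the example; the Distances component is the one described in the Value section of this page.

```r
library(cluster)

pdf(NULL)                              # null device; drop this line to see the plots

iris.x <- iris[, 1:4]
cl     <- pam(iris.x, 3)$clustering

r0 <- clusplot(iris.x, cl, lines = 0)  # no distance lines
r1 <- clusplot(iris.x, cl, lines = 1)  # segments between ellipse centers
r2 <- clusplot(iris.x, cl, lines = 2)  # segments between ellipse boundaries

r0$Distances                           # NA when lines = 0
dim(r2$Distances)                      # k by k when lines is 1 or 2

invisible(dev.off())
```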


Argument: shade
logical flag: if TRUE, then the ellipses are shaded in relation to their density. The density is the number of points in the cluster divided by the area of the ellipse.


Argument: color
logical flag: if TRUE, then the ellipses are colored with respect to their density. With increasing density, the colors are light blue, light green, red and purple. To see these colors on the graphics device, an appropriate color scheme should be selected (we recommend a white background).


Argument: labels
integer code, currently one of 0, 1, 2, 3, 4 and 5:

labels = 0, no labels are placed in the plot;

labels = 1, points and ellipses can be identified in the plot (see identify);

labels = 2, all points and ellipses are labelled in the plot;

labels = 3, only the points are labelled in the plot;

labels = 4, only the ellipses are labelled in the plot;

labels = 5, the ellipses are labelled in the plot, and points can be identified.

The levels of the vector clus are taken as labels for the clusters. The labels of the points are the rownames of x if x is matrix-like. Otherwise (diss = TRUE), x is a vector, and point labels can be attached to x as a "Labels" attribute (attr(x,"Labels")), as is done for the output of daisy. A possible names attribute of clus will not be taken into account.
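To make the "Labels" attribute concrete, here is a small sketch using the votes.repub data that also appears in the Examples section. Only the non-interactive label codes (3 and 4) are shown; pdf(NULL) is there just to avoid opening a window.

```r
library(cluster)

pdf(NULL)                              # null device; drop this line to see the plots

data(votes.repub)
votes.diss <- daisy(votes.repub)
votes.clus <- pam(votes.diss, 2, diss = TRUE)$clustering

# daisy() keeps the row names of its input as the "Labels" attribute,
# which clusplot uses as point labels when diss = TRUE
head(attr(votes.diss, "Labels"))

clusplot(votes.diss, votes.clus, diss = TRUE, labels = 3, cex.txt = 0.6)  # label points only
clusplot(votes.diss, votes.clus, diss = TRUE, labels = 4)                 # label ellipses only

invisible(dev.off())
```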


Argument: plotchar
logical flag: if TRUE, then the plotting symbols differ for points belonging to different clusters.


Argument: span
logical flag: if TRUE, then each cluster is represented by the ellipse with smallest area containing all its points. (This is a special case of the minimum volume ellipsoid.)
If FALSE, the ellipse is based on the mean and covariance matrix of the same points. While this is faster to compute, it often yields a much larger ellipse. There are also some special cases: When a cluster consists of only one point, a tiny circle is drawn around it. When the points of a cluster fall on a straight line, span=FALSE draws a narrow ellipse around it and span=TRUE gives the exact line segment.
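The difference between the two kinds of ellipse can be approximated outside clusplot. The sketch below is not the function's internal code: it assumes the iris data, takes the first two principal component scores as plot coordinates, uses ellipsoidhull() from the cluster package for the smallest-area ellipse, and scales the mean/covariance ellipse so that it just contains all points of the cluster. The helper ellipse.area is a name introduced here for illustration.

```r
library(cluster)

iris.x <- iris[, 1:4]
cl  <- pam(iris.x, 3)$clustering
xy  <- princomp(iris.x, cor = TRUE)$scores[, 1:2]   # 2-d coordinates
pts <- xy[cl == 1, ]                                # points of the first cluster

# area of the ellipse {p : (p - loc)' cov^{-1} (p - loc) <= d2} in two dimensions
ellipse.area <- function(cov, d2) pi * sqrt(det(cov * d2))

# analogue of span = TRUE: (approximately) smallest ellipse spanning the points
hull      <- ellipsoidhull(pts)
area.hull <- ellipse.area(hull$cov, hull$d2)

# analogue of span = FALSE: mean and covariance, scaled to reach the farthest point
ctr      <- colMeans(pts)
S        <- cov(pts)
area.cov <- ellipse.area(S, max(mahalanobis(pts, ctr, S)))

c(min.area = area.hull, mean.cov = area.cov)
```

Comparing the two numbers gives a feel for how much larger the covariance-based ellipse can be for a given cluster.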


Argument: add
logical indicating if ellipses (and labels if labels is true) should be added to an already existing plot. If false, neither a title nor a sub title (see sub) is written.


Argument: col.p
color code(s) used for the observation points.


Argument: col.txt
color code(s) used for the labels (if labels >= 2).


Argument: col.clus
color code for the ellipses (and their labels); only one if color is false (as per default).


Argument: cex, cex.txt
character expansion (size), for the point symbols and point labels, respectively.


Argument: xlim, ylim
numeric vectors of length 2, giving the x- and y- ranges as in plot.default.


Argument: main
main title for the plot; by default, one is constructed.


Argument: sub
sub title for the plot; by default, one is constructed.


Argument: verbose
a logical indicating if there should be extra diagnostic output; mainly for "debugging".


Argument: ...
Further graphical parameters may also be supplied, see par.


----------Details----------

clusplot uses the functions princomp and cmdscale. These functions are data reduction techniques. They will represent the data in a bivariate plot. Ellipses are then drawn to indicate the clusters. The further layout of the plot is determined by the optional arguments.
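The reduction step can be sketched directly with the two functions named above. This is an approximation under assumptions (iris data, correlation-based principal components, Euclidean distances for the scaling), not a copy of what clusplot does internally.

```r
iris.x <- iris[, 1:4]

# data matrix case: first two principal components (cor = TRUE scales to unit variance)
pc       <- princomp(iris.x, cor = TRUE)
xy.pca   <- pc$scores[, 1:2]
var.expl <- sum(pc$sdev[1:2]^2) / sum(pc$sdev^2)   # share of variability in two components

# dissimilarity case: classical multidimensional scaling to two dimensions
xy.mds <- cmdscale(dist(iris.x), k = 2)

round(100 * var.expl, 2)
dim(xy.pca); dim(xy.mds)
```

The proportion computed here corresponds to the kind of figure reported in the default sub title ("These two components explain ... % of the point variability").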


----------Value----------

An invisible list with components:


Component: Distances
When lines is 1 or 2 we obtain a k by k matrix (k is the number of clusters). The element in [i,j] is the distance between ellipse i and ellipse j.
If lines = 0, then the value of this component is NA.


Component: Shading
A vector of length k (where k is the number of clusters), containing the amount of shading per cluster. Let y be a vector where element i is the ratio between the number of points in cluster i and the area of ellipse i. When the cluster i is a line segment, y[i] and the density of the cluster are set to NA. Let z be the sum of all the elements of y without the NAs. Then we put shading = y/z * 37 + 3.
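A worked instance of the shading formula, with made-up cluster sizes and ellipse areas (all numbers below are hypothetical, chosen only to show the arithmetic):

```r
# hypothetical cluster sizes and ellipse areas; the fourth cluster is a line segment
n.points <- c(50, 62, 38, 5)
areas    <- c(1.2, 4.5, 3.1, NA)

y <- n.points / areas          # density per cluster; NA for the line segment
z <- sum(y, na.rm = TRUE)      # sum of the densities, ignoring NAs

shading <- y / z * 37 + 3      # formula from the help text
round(shading, 2)

# the non-NA entries of y/z sum to 1, so the shading values sum to 37 + 3 * 3
sum(shading, na.rm = TRUE)
```

Each shading value therefore lies between 3 and 40, with denser clusters receiving larger values.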


----------Side Effects----------

a visual display of the clustering is plotted on the current graphics device.


----------Note----------

When we have 4 or fewer clusters, then color=TRUE gives every cluster a different color. When there are more than 4 clusters, clusplot uses the function pam to cluster the densities into 4 groups such that ellipses with nearly the same density get the same color. col.clus specifies the colors used.

The col.p and col.txt arguments, added for R, are recycled to have length equal to the number of observations. If col.p has more than one value, using color = TRUE can be confusing because of a mix of point and ellipse colors.
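The grouping of densities for more than four clusters can be imitated as follows. This is a sketch of the idea only: the density values are invented, and the way groups are mapped to colors (lowest mean density first) is an assumption that may differ from the ordering clusplot uses internally.

```r
library(cluster)

dens     <- c(0.8, 0.9, 2.5, 2.6, 5.0, 9.7)    # hypothetical densities of six ellipses
col.clus <- c(2, 4, 6, 3)                       # default colors when color = TRUE

# cluster the one-dimensional densities into four groups with pam
grp <- pam(matrix(dens, ncol = 1), 4)$clustering

# assumed mapping: rank the groups by mean density, lowest group gets col.clus[1]
grp.rank    <- rank(tapply(dens, grp, mean))
ellipse.col <- col.clus[grp.rank[as.character(grp)]]

data.frame(density = dens, group = grp, color = ellipse.col)
```

Ellipses with nearly equal density (here the first two and the middle two) end up in the same group and so share a color.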


----------References----------

Pison, G., Struyf, A. and Rousseeuw, P.J. (1999) Displaying a Clustering with CLUSPLOT, Computational Statistics and Data Analysis, 30, 381-392. A version of this is available as technical report from http://www.agoras.ua.ac.be/abstract/Disclu99.htm
Kaufman, L. and Rousseeuw, P.J. (1990) Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York.
Struyf, A., Hubert, M. and Rousseeuw, P.J. (1997) Integrating Robust Clustering Techniques in S-PLUS, Computational Statistics and Data Analysis, 26, 17-37.

----------See Also----------

princomp, cmdscale, pam, clara, daisy, par, identify, cov.mve, clusplot.partition.


----------Examples----------


library(cluster)

## plotting votes.diss (dissimilarity) in a bivariate plot and
## partitioning into 2 clusters
data(votes.repub)
votes.diss <- daisy(votes.repub)
pamv <- pam(votes.diss, 2, diss = TRUE)
clusplot(pamv, shade = TRUE)
## is the same as
votes.clus <- pamv$clustering
clusplot(votes.diss, votes.clus, diss = TRUE, shade = TRUE)

clusplot(pamv, col.p = votes.clus, labels = 4)# color points and label ellipses
# "simple" cheap ellipses: larger than minimum volume:
# here they are *added* to the previous plot:
clusplot(pamv, span = FALSE, add = TRUE, col.clus = "midnightblue")

## a work-around for setting a small label size:
clusplot(votes.diss, votes.clus, diss = TRUE)
op <- par(new=TRUE, cex = 0.6)
clusplot(votes.diss, votes.clus, diss = TRUE,
         axes=FALSE,ann=FALSE, sub="", col.p=NA, col.txt="dark green", labels=3)
par(op)
## MM: This should now be as simple as
clusplot(votes.diss, votes.clus, diss = TRUE, labels = 3, cex.txt = 0.6)


if(interactive()) { #  uses identify() *interactively* :
  clusplot(votes.diss, votes.clus, diss = TRUE, shade = TRUE, labels = 1)
  clusplot(votes.diss, votes.clus, diss = TRUE, labels = 5)# ident. only points
}

## plotting iris (data frame) in a 2-dimensional plot and partitioning
## into 3 clusters.
data(iris)
iris.x <- iris[, 1:4]
cl3 <- pam(iris.x, 3)$clustering
op <- par(mfrow= c(2,2))
clusplot(iris.x, cl3, color = TRUE)
U <- par("usr")
## zoom in :
rect(0,-1, 2,1, border = "orange", lwd=2)
clusplot(iris.x, cl3, color = TRUE, xlim = c(0,2), ylim = c(-1,1))
box(col="orange",lwd=2); mtext("sub region", font = 4, cex = 2)
##  or zoom out :
clusplot(iris.x, cl3, color = TRUE, xlim = c(-4,4), ylim = c(-4,4))
mtext("`super' region", font = 4, cex = 2)
rect(U[1],U[3], U[2],U[4], lwd=2, lty = 3)

# reset graphics
par(op)

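When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).


Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the robot of 生物统计家园网. It is intended only as a personal reference for studying R; 生物统计家园 retains the copyright.
Note 2: Because the translation was produced automatically by a robot, some inaccuracies are unavoidable. Carefully comparing the Chinese and English content while reading can help with learning R.
Note 3: If you find inaccuracies, please reply below this post and we will revise them over time.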