
R seriation package: dissplot() function help documentation in Chinese (Chinese-English parallel text)

Posted on 2012-9-30 01:29:41
dissplot(seriation)
dissplot() belongs to the R package: seriation

                                        Dissimilarity Plot

                                         Translator: LoveR, the translation robot of 生物统计家园网

----------Description----------

Visualizes a dissimilarity matrix using seriation and matrix shading. Entries with lower dissimilarities (higher similarity) are plotted darker. Such a plot can be used to uncover hidden structure in the data.

The plot can also be used to visualize cluster quality (see Ling 1973). Objects belonging to the same cluster are displayed in consecutive order. The placement of clusters and the order within each cluster are obtained by a seriation algorithm that tries to place large similarities (small dissimilarities) close to the diagonal. Compact clusters are visible as dark squares (low dissimilarity) on the diagonal of the plot. Additionally, a silhouette plot (Rousseeuw 1987) is added. This visualization is similar to CLUSION (see Strehl and Ghosh 2003) but allows arbitrary seriation algorithms to be used.
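
As a rough illustration of the idea described above (not the package's own implementation), the following base R sketch groups objects by cluster label and shades the reordered dissimilarity matrix so that small dissimilarities appear dark. The use of hclust()/cutree() to obtain labels and the gray palette are assumptions made for this sketch; dissplot() additionally seriates the clusters and the objects within them.

```r
# Minimal base R sketch: shade a dissimilarity matrix grouped by cluster.
d <- dist(iris[-5])

# Assumption for illustration: 3 clusters from average-link hierarchical clustering.
labels <- cutree(hclust(d, method = "average"), k = 3)

# Put objects of the same cluster in consecutive order.
ord <- order(labels)
m <- as.matrix(d)[ord, ord]
n <- nrow(m)

# Palette from black (low dissimilarity) to white (high dissimilarity).
pal <- gray.colors(100, start = 0, end = 1)

# Reverse the columns so the diagonal runs from top-left to bottom-right.
image(seq_len(n), seq_len(n), m[, n:1], col = pal,
      axes = FALSE, xlab = "", ylab = "",
      main = "Dissimilarity matrix grouped by cluster")
```

Compact clusters should show up as dark blocks along the diagonal, which is the pattern the text above describes.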


----------Usage----------


dissplot(x, labels = NULL, method = NULL, control = NULL, options = NULL)   



----------Arguments----------

Argument: x
an object of class dist.


Argument: labels
NULL or an integer vector with one entry per row/column of x, giving the cluster membership of each object in x as consecutive integers starting with one. The labels are used to reorder the matrix.
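
Cluster assignments often come with arbitrary codes, so a small recoding step may be needed before they can serve as labels. A minimal sketch, assuming hypothetical raw assignments stored in a character vector named raw:

```r
# Hypothetical raw cluster assignments with arbitrary codes.
raw <- c("b", "a", "c", "a", "b")

# Recode to consecutive integers starting at 1 (levels are sorted: a = 1, b = 2, c = 3).
labels <- as.integer(factor(raw))
labels
# 2 1 3 1 2

# Sanity check: the codes used are exactly 1..k.
stopifnot(identical(sort(unique(labels)), seq_len(max(labels))))
```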


Argument: method
a list with up to three elements or a single character string. Use a single character string to apply the same algorithm to reorder the clusters (inter cluster seriation) as well as the objects within each cluster (intra cluster seriation).

If separate algorithms for inter and intra cluster seriation are required, method can be a list of two named elements (inter_cluster and intra_cluster), each containing the name of the respective seriation method. See seriate.dist for available algorithms.

Set method to NA to plot the matrix as is (no or only coarse seriation). For intra cluster reordering, the special method silhouette width is available; objects within clusters are then ordered by silhouette width (from silhouette plots). If no method is given, the default method of seriate.dist is used.
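
The three forms of method described above can be written down as follows. This is only a sketch of the argument values; the method names "tsp" and "silhouette width" are the ones mentioned in this document, and whether a given name is accepted depends on the installed version of seriation.

```r
# One algorithm for both steps: a single character string.
method_single <- "tsp"

# Separate algorithms: a named list.
method_split <- list(
  inter_cluster = "tsp",               # order of the clusters
  intra_cluster = "silhouette width"   # order of objects within each cluster
)

# Plot the matrix as is.
method_none <- NA

# Intended use (requires the seriation package; d is a dist object and
# l a vector of cluster labels):
# res <- dissplot(d, l, method = method_split)
```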

The third list element (named aggregation) controls how inter cluster dissimilarities are computed from the given dissimilarity matrix. The choices are "avg" (average pairwise dissimilarities; average-link), "min" (minimal pairwise dissimilarities; single-link), "max" (maximal pairwise dissimilarities; complete-link), and "Hausdorff" (pairs up each point from one cluster with the most similar point from the other cluster and then uses the largest dissimilarity of paired up points).
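
To make the four aggregation choices concrete, here is a small base R sketch on two hypothetical clusters of points on a line. It is not the package's internal code. In particular, the "Hausdorff" line assumes the symmetric variant, in which the pairing is done in both directions; the description above does not say which direction is used.

```r
# Two hypothetical clusters of points on a line.
a <- c(0, 1)
b <- c(4, 6)

# Pairwise dissimilarities between the clusters (rows: a, columns: b).
block <- abs(outer(a, b, "-"))

inter <- c(
  avg = mean(block),   # average-link
  min = min(block),    # single-link
  max = max(block),    # complete-link
  # Assumption: symmetric Hausdorff, pairing each point with its most
  # similar point in the other cluster, in both directions.
  Hausdorff = max(apply(block, 1, min), apply(block, 2, min))
)
inter
# avg = 4.5, min = 3, max = 6, Hausdorff = 5
```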


Argument: control
a list of control options passed on to the seriation algorithm. In case of two different seriation algorithms, control can contain a list of two named elements (inter_cluster and intra_cluster), each containing a list with the control options for the respective algorithm.


Argument: options
a list with options for plotting the matrix. The list can contain the following elements:

plot: a logical indicating whether a plot should be produced. If FALSE, the returned object can be plotted later using the function plot, which takes a list of plotting options as its second argument (see Value below).

cluster_labels: a logical indicating whether to display cluster labels in the plot.

averages: a logical vector of length two. The first element controls the upper triangle and the second element the lower triangle of the plot. FALSE displays the original dissimilarity between objects, TRUE displays cluster-wise average dissimilarities, and NA leaves the triangle white (default: c(FALSE, TRUE), i.e., the lower triangle displays averages).

lines: a logical indicating whether to draw lines to separate clusters.

flip: a logical indicating whether the clusters are displayed on the diagonal from north-west to south-east (FALSE; default) or from north-east to south-west (TRUE).

silhouettes: a logical indicating whether to include a silhouette plot (see Rousseeuw, 1987).

threshold: a numeric. If used, only distances below the threshold are displayed.

col: colors used for the image plot. If col is a single number, it specifies the number of different shades used in the plot (default: 100 shades of gray using the HCL colorspace). If col is not a single number, it is expected to be a full color palette, and the options hue and power are ignored.

hue: color in [0,360] taken from the HCL color wheel. NA indicates gray scale (default: gray scale).

power: determines how luminance (and chroma for color) should be increased with dissimilarity (1 = linear, 2 = quadratic, etc.).

colorkey: a logical indicating whether to place a color key below the plot.

main: title for the plot.

lines_col: color used for the lines to separate clusters.

newpage: a logical indicating whether to start the plot on a new page (see grid.newpage in package grid).

pop: a logical indicating whether to pop the created viewports (see package grid).

gp: an object of class gpar containing graphical parameters (see gpar in package grid).
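
The sketch below assembles an options list from the element names listed above and illustrates two of them in base R. The helper shade_palette() is hypothetical and not part of seriation; how the package maps power to luminance internally is not stated here, so the t^power rule is an assumption. The thresholding step likewise only mimics the documented effect.

```r
# An options list using element names from the list above.
opts <- list(
  main = "Example",
  averages = c(FALSE, TRUE),  # upper: original values, lower: cluster averages
  threshold = 1.5,
  col = 10,
  hue = 260,
  power = 2,
  newpage = FALSE
)

# Hypothetical helper (not part of seriation): gray shades whose luminance
# grows with dissimilarity as t^power, for t in [0, 1].
shade_palette <- function(n, power = 1) {
  t <- seq(0, 1, length.out = n)
  hcl(h = 0, c = 0, l = 100 * t^power)
}

linear <- shade_palette(100, power = 1)
quadratic <- shade_palette(100, power = 2)

# Thresholding sketch: hide dissimilarities at or above the threshold.
m <- as.matrix(dist(iris[c(1, 2, 51, 101), -5]))
m[m >= opts$threshold] <- NA
m
```

With a higher power, mid-range dissimilarities stay darker for longer, which makes the contrast between very similar and moderately similar objects less pronounced and the contrast near the high end stronger.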


----------Value----------

An invisible object of class cluster_proximity_matrix with the following elements:

order: NULL or integer vector giving the order used to plot x.

cluster_order: NULL or integer vector giving the order of the clusters as plotted.

method: vector of character strings indicating the seriation methods used for plotting x.

k: NULL or integer scalar giving the number of clusters generated.

description: a data.frame containing information (label, size, average intra-cluster dissimilarity and the average silhouette) for the clusters as displayed in the plot (from top/left to bottom/right).

This object can be used for plotting via plot(x, options = NULL, ...), where x is the object and options contains a list with plotting options (see above).
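
To give a feel for what the description element summarizes, the following sketch computes a similar per-cluster table by hand. It is an approximation under stated assumptions, not the package's code: the labels come from hclust()/cutree(), the column names are made up for this example, and the silhouette column is only added if the cluster package is installed.

```r
d <- dist(iris[-5])

# Assumption for illustration: labels from average-link hierarchical clustering.
labels <- cutree(hclust(d, method = "average"), k = 3)
m <- as.matrix(d)
ids <- sort(unique(labels))

# Average dissimilarity between distinct objects of one cluster.
avg_within <- function(k) {
  block <- m[labels == k, labels == k, drop = FALSE]
  mean(block[lower.tri(block)])
}

description <- data.frame(
  label = ids,
  size = as.vector(table(labels)),
  avg_dissimilarity = sapply(ids, avg_within)
)

# Average silhouette width per cluster, if the cluster package is available.
if (requireNamespace("cluster", quietly = TRUE)) {
  sil <- cluster::silhouette(labels, d)
  description$avg_silhouette <-
    as.vector(tapply(sil[, "sil_width"], sil[, "cluster"], mean))
}
description
```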


----------References----------

Ling, R.F. (1973): A computer generated aid for cluster analysis. Communications of the ACM, 16(6), 355-361.
Rousseeuw, P.J. (1987): Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20(1), 53-65.
Strehl, A. and Ghosh, J. (2003): Relationship-based clustering and visualization for high-dimensional data mining. INFORMS Journal on Computing, 15(2), 208-230.


----------See Also----------

dist (in package stats), package grid and seriate.


----------Examples----------


## load the packages used below (grid provides the viewport functions)
library("seriation")
library("grid")

data("iris")
d <- dist(iris[-5])

## plot original matrix
res <- dissplot(d, method = NA)

## plot reordered matrix using the nearest insertion algorithm (from tsp)
res <- dissplot(d, method = "tsp",
    options = list(main = "Seriation (TSP)"))

## cluster with pam (we know iris has 3 clusters)
library("cluster")
l <- pam(d, 3, cluster.only = TRUE)

## we use a grid layout to place several plots on a page
grid.newpage()
pushViewport(viewport(layout = grid.layout(nrow = 2, ncol = 2),
    gp = gpar(fontsize = 8)))
pushViewport(viewport(layout.pos.row = 1, layout.pos.col = 1))

## visualize the clustering
res <- dissplot(d, l, method = "chen",
    options = list(main = "PAM + Seriation (Chen) - standard",
    newpage = FALSE))

popViewport()
pushViewport(viewport(layout.pos.row = 1, layout.pos.col = 2))

## more visualization options
## color: use 10 shades of blue (hue = 260)
plot(res, options = list(main = "PAM + Seriation (Chen) - blue, only avg.",
    col = 10, hue = 260, averages = c(TRUE, TRUE), newpage = FALSE))

popViewport()
pushViewport(viewport(layout.pos.row = 2, layout.pos.col = 1))

## threshold and cubic scale to highlight differences
plot(res, options = list(main = "PAM + Seriation (Chen) - threshold",
    threshold = 1.5, power = 3, newpage = FALSE))

popViewport()
pushViewport(viewport(layout.pos.row = 2, layout.pos.col = 2))

## use custom (logistic) scale
plot(res, options = list(main = "PAM + Seriation (Chen) - logistic scale",
    col = hcl(c = 0, l = (plogis(seq(0, 10, length.out = 100),
        location = 2, scale = 1/2, log = FALSE)) * 100),
        newpage = FALSE))

popViewport(2)

## the reordered_cluster_dissimilarity_matrix object
res
names(res)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).


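Notes:
Note 1: To make learning easier, this document was translated by LoveR, the robot of 生物统计家园网. It is intended only as a personal reference for learning R; 生物统计家园 reserves the copyright.
Note 2: Because the translation was produced automatically, inaccuracies are inevitable. Carefully comparing the Chinese and English text while reading can help when learning R.
Note 3: If you find an inaccuracy, please reply to this post and we will revise it over time.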