revisedsil(RSKC)
revisedsil() belongs to the R package RSKC
The revised silhouette
----------Description----------
This function returns a revised silhouette plot, the cluster centers in weighted squared Euclidean distances, and a matrix containing the weighted squared Euclidean distances between each case and each cluster center. Missing values are adjusted for.
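To make "weighted squared Euclidean distance" concrete, here is a minimal base R sketch. It does not call RSKC; the toy matrix, the weights and the names `centers` and `wdist` are made up for illustration. It scales each feature by the square root of its weight, so that the ordinary squared Euclidean distance in the scaled space equals the weighted squared distance.

```r
# Toy data: 6 cases, 3 features (hypothetical values for illustration).
d <- matrix(c(0, 0, 5,
              1, 0, 9,
              0, 1, 2,
              4, 4, 7,
              5, 4, 1,
              4, 5, 3), nrow = 6, byrow = TRUE)
W <- c(0.5, 0.5, 0)        # feature weights; the third feature gets weight 0
C <- c(1, 1, 1, 2, 2, 2)   # class labels

# Keep the features with nonzero weight and scale them by sqrt(weight).
keep <- W != 0
trans.d <- sweep(d[, keep, drop = FALSE], 2, sqrt(W[keep]), FUN = "*")

# Cluster centers in the scaled space (one row per cluster).
centers <- t(sapply(sort(unique(C)), function(k)
  colMeans(trans.d[C == k, , drop = FALSE])))

# N by ncl matrix of squared distances from every case to every center.
wdist <- sapply(seq_len(nrow(centers)), function(k)
  rowSums(sweep(trans.d, 2, centers[k, ], FUN = "-")^2))
```

Each row of `wdist` corresponds to a case and each column to a cluster, which is the same layout as the matrix of distances described above.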
----------Usage----------
revisedsil(d,W,C,out=NULL,CASEofINT=out,col1="black",CASEofINT2=NULL,col2="red",print.plot=TRUE)
----------Arguments----------
Argument: d
A numerical data matrix, N by p, where N is the number of cases and p is the number of features.
Argument: W
A positive real vector of weights of length p.
Argument: C
An integer vector of class labels of length N.
Argument: out
A vector of the case indices that should be excluded from the calculation of cluster centers. In RSKC, cluster centers are calculated without the cases that have the largest 100*alpha % weighted squared Euclidean distances to their closest cluster centers. To obtain the cluster centers from RSKC output, set out = <RSKCoutput>$oW.
Argument: CASEofINT
A vector of the case indices that appear in the revised silhouette plot. The revised silhouette widths of these indices are colored in col1.
Argument: col1
See CASEofINT.
Argument: CASEofINT2
A vector of the case indices that appear in the revised silhouette plot. The indices are colored in col2.
Argument: col2
See CASEofINT2.
Argument: print.plot
If TRUE, the revised silhouette is plotted.
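The effect of the out argument on the cluster centers can be illustrated with a small base R sketch. It does not call RSKC; the one-feature data and the names `lab`, `centers.all` and `centers.trim` are hypothetical. It uses the same relabeling trick as the package's own example: cases listed in out are moved to a dummy class so that they match no real cluster when the means are taken.

```r
# Hypothetical one-feature data with an obvious outlier (case 4).
x <- matrix(c(1, 2, 3, 100, 10, 11, 12), ncol = 1)
C <- c(1, 1, 1, 1, 2, 2, 2)   # class labels
out <- 4                       # case index to exclude from the centers

# Give excluded cases a dummy label (ncl + 1) that matches no real cluster.
ncl <- max(C)
lab <- C
lab[out] <- ncl + 1

# Centers with every case, and centers without the excluded case.
centers.all  <- sapply(seq_len(ncl), function(k) mean(x[C == k, 1]))
centers.trim <- sapply(seq_len(ncl), function(k) mean(x[lab == k, 1]))
```

Here the center of cluster 1 moves from 26.5 to 2 once case 4 is excluded, while the center of cluster 2 is unchanged.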
----------Value----------
<table summary="R valueblock">
<tr valign="top"><td>trans.mu</td> <td> Cluster centers in the reduced weighted dimension. See the examples for more detail. </td></tr>
<tr valign="top"><td>WdisC</td> <td> N by ncl matrix, where ncl is the prespecified number of clusters. It contains the weighted distance between each case and every cluster center. See the examples for more detail. </td></tr>
<tr valign="top"><td>sil.order</td> <td> Silhouette values of each case, in the order of the case index. </td></tr>
<tr valign="top"><td>sil.i</td> <td> Silhouette values of the cases, sorted in decreasing order within clusters. The corresponding case indices are in obs.i. </td></tr>
</table>
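The relation between a distance matrix and the two silhouette orderings can be sketched in base R. This is an illustration under an assumption, not the package's code: it assumes the revised silhouette width of a case is (b - a) / max(a, b), where a is the weighted squared distance to the case's own cluster center and b is the distance to the nearest other center. The distance matrix is made up, and the names `sil.order`, `sil.i` and `obs.i` are reused here only to mirror the value names above.

```r
# Hypothetical N by ncl matrix of weighted squared distances to centers.
WdisC <- matrix(c(4, 5,
                  1, 9,
                  2, 8,
                  6, 3,
                  9, 1), ncol = 2, byrow = TRUE)
C <- c(1, 1, 1, 2, 2)          # class labels
N <- nrow(WdisC)

# a: distance to the case's own center; b: distance to the nearest other center.
a <- WdisC[cbind(seq_len(N), C)]
b <- sapply(seq_len(N), function(i) min(WdisC[i, -C[i]]))

# Assumed definition of the revised silhouette width, in case-index order.
sil.order <- (b - a) / pmax(a, b)

# Sort cases by cluster, then by decreasing width within each cluster.
obs.i <- order(C, -sil.order)
sil.i <- sil.order[obs.i]
```

With these numbers `obs.i` is 2, 3, 1, 5, 4: case 1 sits closest to the boundary of cluster 1 and is therefore listed last within that cluster.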
----------Author(s)----------
Yumi Kondo <y.kondo@stat.ubc.ca>
----------References----------
Yumi Kondo (2011), Robustification of the sparse K-means clustering algorithm, MSc thesis, University of British Columbia. http://hdl.handle.net/2429/37093
----------Examples----------
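library(RSKC)
# Small simulation function: 60 cases, f features, 3 classes of 20 cases.
# The first 50 features are shifted by +mu (class 1) and -mu (class 2).
sim <- function(mu, f) {
  D <- matrix(rnorm(60 * f), 60, f)
  D[1:20, 1:50] <- D[1:20, 1:50] + mu
  D[21:40, 1:50] <- D[21:40, 1:50] - mu
  return(D)
}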
### output trans.mu ###
p <- 200; ncl <- 3
# simulate a 60 by p data matrix with 3 classes
d <- sim(2, p)
# run RSKC
re <- RSKC(d, ncl, L1 = 2, alpha = 0.05)
# cluster centers in weighted squared Euclidean distances, from revisedsil
sil.mu <- revisedsil(d, re$weights, re$labels, out = re$oW, print.plot = FALSE)$trans.mu
# the same calculation by hand: scale each retained feature by sqrt(weight)
trans.d <- sweep(d[, re$weights != 0], 2, sqrt(re$weights[re$weights != 0]), FUN = "*")
# move the trimmed cases to a dummy class so they are left out of the means
class <- re$labels; class[re$oW] <- ncl + 1
MEANs <- matrix(NA, ncl, ncol(trans.d))
for (i in 1:ncl) MEANs[i, ] <- colMeans(trans.d[class == i, , drop = FALSE])
# the two results coincide (compared up to floating-point tolerance)
all.equal(sil.mu, MEANs, check.attributes = FALSE)
### output WdisC ###
p <- 200; ncl <- 3; N <- 60
# generate a 60 by p data matrix with 3 classes
d <- sim(2, p)
# run RSKC
re <- RSKC(d, ncl, L1 = 2, alpha = 0.05)
si <- revisedsil(d, re$weights, re$labels, out = re$oW, print.plot = FALSE)
si.mu <- si$trans.mu
si.wdisc <- si$WdisC
# scale each retained feature by sqrt(weight)
trans.d <- sweep(d[, re$weights != 0], 2, sqrt(re$weights[re$weights != 0]), FUN = "*")
# squared distance from every case to each cluster center
WdisC <- matrix(NA, N, ncl)
for (i in 1:ncl) WdisC[, i] <- rowSums(scale(trans.d, center = si.mu[i, ], scale = FALSE)^2)
# WdisC and si.wdisc coincide (compared up to floating-point tolerance)
all.equal(WdisC, si.wdisc, check.attributes = FALSE)