HierarchicalSparseCluster.permute(sparcl)
R package for HierarchicalSparseCluster.permute(): sparcl
Choose tuning parameter for sparse hierarchical clustering
----------Description----------
The tuning parameter controls the L1 bound on w, the feature weights. A permutation approach is used to select the tuning parameter.
----------Usage----------
HierarchicalSparseCluster.permute(x, nperms = 10, wbounds = NULL,
  dissimilarity = c("squared.distance", "absolute.value"),
  standardize.arrays = FALSE)
----------Arguments----------
Argument: x
An n x p data matrix, with n observations and p features.
Argument: nperms
The number of permutations to perform.
Argument: wbounds
The sequence of tuning parameters to consider. Each tuning parameter is an L1 bound on w, the feature weights. If NULL, a default sequence is used. If non-NULL, each value should be greater than 1.
Argument: dissimilarity
How should dissimilarity be computed? The options are "squared.distance" (the default) and "absolute.value".
Argument: standardize.arrays
Should the arrays first be standardized? Default is FALSE.
----------Details----------
Let $d_{ii'j}$ denote the dissimilarity between observations i and i' along feature j.
Sparse hierarchical clustering seeks a p-vector of weights w (one per feature) and an n x n matrix U that solve $\max_{U,w} \sum_j w_j \sum_{i,i'} d_{ii'j} U_{ii'}$ subject to $\|w\|_2 \le 1$, $\|w\|_1 \le s$, $w_j \ge 0$, $\sum_{i,i'} U_{ii'}^2 \le 1$, where s is a value for the L1 bound on w. Let O(s) denote the objective function with tuning parameter s, i.e. $O(s) = \sum_j w_j \sum_{i,i'} d_{ii'j} U_{ii'}$.
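To make the notation concrete, the following minimal sketch evaluates the objective for one feasible pair (w, U) in base R. It does not use sparcl and does not perform the optimization. The toy data, the equal weights, and the choice of U proportional to the weighted dissimilarity matrix are assumptions made for illustration. The dissimilarity is taken to be the squared difference along each feature, which is assumed to correspond to the "squared.distance" option.

```r
# Illustrative sketch only: evaluate sum_j w_j sum_{i,i'} d_{ii'j} U_{ii'}
# for one feasible (w, U). This is not sparcl's implementation.
set.seed(1)
n <- 6
p <- 4
x <- matrix(rnorm(n * p), nrow = n)

# d[i, i', j]: squared difference between observations i and i' along feature j
d <- array(0, dim = c(n, n, p))
for (j in seq_len(p)) {
  d[, , j] <- outer(x[, j], x[, j], function(a, b) (a - b)^2)
}

# Equal weights: ||w||_2 = 1 and ||w||_1 = sqrt(p), so w is feasible when s >= sqrt(p)
w <- rep(1 / sqrt(p), p)

# Weighted dissimilarity matrix: sum_j w_j d_{ii'j}
D_w <- matrix(0, n, n)
for (j in seq_len(p)) {
  D_w <- D_w + w[j] * d[, , j]
}

# For fixed w, scaling D_w to unit Frobenius norm gives a U with sum U^2 = 1
U <- D_w / sqrt(sum(D_w^2))

objective <- sum(D_w * U)
```

With this choice of U the objective equals the Frobenius norm of the weighted dissimilarity matrix, which is why U is tied to the re-weighted dissimilarities in this formulation.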
We permute the data as follows: within each feature, we permute the observations. Using the permuted data, we can run sparse hierarchical clustering with tuning parameter s, yielding the objective function O*(s). If we do this repeatedly we can get a number of O*(s) values.
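The permutation step can be sketched in base R as follows. This is an assumption about how such a permutation might be written, not a copy of sparcl's internal code: each column (feature) is shuffled independently, which keeps every feature's marginal distribution and breaks the association between features.

```r
# Illustrative sketch: permute the observations within each feature (column).
set.seed(1)
x <- matrix(rnorm(20 * 5), ncol = 5)

# sample() applied to each column returns that column in random order
x_perm <- apply(x, 2, sample)
```

Running sparse hierarchical clustering on several such permuted matrices would give the O*(s) values described above.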
Then the Gap statistic is given by $\mathrm{Gap}(s) = \log(O(s)) - \mathrm{mean}(\log(O^*(s)))$. The optimal s is the one that yields the highest Gap statistic. Alternatively, we can choose the smallest s whose Gap statistic is within $\mathrm{sd}(\log(O^*(s)))$ of the largest Gap statistic.
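The two selection rules can be compared with a small base R sketch. The helper name choose_wbound_1sd is hypothetical and not part of sparcl, and the numeric values are made up for illustration. The sketch assumes the standard deviation used for the threshold is the one at the s with the largest Gap statistic; the text above leaves this choice open.

```r
# Hypothetical helper (not part of sparcl): smallest tuning parameter whose
# Gap statistic is within one standard deviation of the largest Gap statistic.
# Assumption: the sd is taken at the tuning parameter with the largest Gap.
choose_wbound_1sd <- function(gaps, sdgaps, wbounds) {
  stopifnot(length(gaps) == length(sdgaps), length(gaps) == length(wbounds))
  best <- which.max(gaps)
  threshold <- gaps[best] - sdgaps[best]
  min(wbounds[gaps >= threshold])
}

# Made-up values, purely for illustration
wbounds <- c(1.5, 2, 3, 4, 5)
gaps    <- c(0.10, 0.30, 0.48, 0.50, 0.45)
sdgaps  <- c(0.02, 0.03, 0.03, 0.04, 0.05)

s_max <- wbounds[which.max(gaps)]                  # rule 1: largest Gap statistic
s_1sd <- choose_wbound_1sd(gaps, sdgaps, wbounds)  # rule 2: one-sd rule
```

With real output, the gaps, sdgaps, and wbounds components of the returned object could be passed to such a helper. The one-sd rule favors a smaller s, and a smaller L1 bound generally gives fewer features non-zero weights.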
----------Value----------
Component: gaps
The gap statistics obtained (one for each of the tuning parameters tried). If O(s) is the objective function evaluated at the tuning parameter s, and O*(s) is the same quantity but for the permuted data, then Gap(s)=log(O(s))-mean(log(O*(s))).
Component: sdgaps
The standard deviation of log(O*(s)), for each value of the tuning parameter s.
Component: nnonzerows
The number of features with non-zero weights, for each value of the tuning parameter.
Component: wbounds
The tuning parameters considered.
Component: bestw
The value of the tuning parameter corresponding to the highest gap statistic.
----------Author(s)----------
Daniela M. Witten and Robert Tibshirani
----------See Also----------
HierarchicalSparseCluster, KMeansSparseCluster, KMeansSparseCluster.permute
----------Examples----------
library(sparcl)
# Generate 2-class data
set.seed(1)
x <- matrix(rnorm(100*200),ncol=200)
y <- c(rep(1,50),rep(2,50))
x[y==1,1:25] <- x[y==1,1:25]+2
# Do tuning parameter selection for sparse hierarchical clustering
perm.out <- HierarchicalSparseCluster.permute(x, wbounds=c(1.5,2:9),
                                              nperms=5)
print(perm.out)
plot(perm.out)
# Perform sparse hierarchical clustering
sparsehc <- HierarchicalSparseCluster(dists=perm.out$dists,
                                      wbound=perm.out$bestw, method="complete")
par(mfrow=c(1,2))
plot(sparsehc)
plot(sparsehc$hc, labels=rep("", length(y)))
print(sparsehc)
# Plot using knowledge of the class labels, to compare the true class
# labels to the clustering obtained
par(mfrow=c(1,1))
ColorDendrogram(sparsehc$hc, y=y, main="My Simulated Data", branchlength=.007)