R package weightedKmeans: fgkm() function help documentation

loveR · posted 2012-10-1 20:58:16

fgkm(weightedKmeans)
fgkm() belongs to the R package weightedKmeans

                                       Feature Group Weighting K-Means for Subspace clustering

                                       Translator: LoveR, the Biostatistics Home (biostatistic.net) robot

----------Description----------

Perform feature group weighting subspace k-means clustering.

----------Usage----------

  fgkm(x, k, strGroup, lambda, eta, maxiter=100, delta=0.000001, maxrestart=10)

----------Arguments----------

Argument: x
Numeric matrix of observations and features.

Argument: k
Target number of clusters.

Argument: strGroup
A string giving the group information, formatted as "0-9:10-19:20-49". Each range lists the first and last feature index of one group, counted from 0, and groups are separated by ":".

Argument: lambda
Parameter of the feature weight distribution.

Argument: eta
Parameter of the group weight distribution.

Argument: delta
Maximum change allowed between iterations for convergence. Default is 0.000001.

Argument: maxiter
Maximum number of iterations. Default is 100.

Argument: maxrestart
Maximum number of restarts. Default is 10, so that we stand a good chance of getting a full set of clusters. Normally, any empty clusters that result are removed from the result, so we may obtain fewer than k clusters if we don't allow restarts (i.e., maxrestart=0). If < 0, there is no limit on the number of restarts and we are much more likely to get a full set of k clusters.
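
The strGroup format described above can be built from group sizes rather than typed by hand. The following is a minimal sketch in base R; make_str_group() and parse_str_group() are hypothetical helpers written for illustration and are not part of weightedKmeans. It assumes zero-based, inclusive index ranges, as the "0-9:10-19:20-49" example for 50 features suggests.

```r
# Hypothetical helper (not part of weightedKmeans): build a group string
# from the number of features in each group, using zero-based inclusive
# index ranges as in "0-9:10-19:20-49".
make_str_group <- function(sizes) {
  stopifnot(length(sizes) > 0, all(sizes >= 1))
  ends <- cumsum(sizes) - 1            # last zero-based index of each group
  starts <- c(0, head(ends, -1) + 1)   # first zero-based index of each group
  paste(paste(starts, ends, sep = "-"), collapse = ":")
}

# Hypothetical helper: expand a group string into one group label per feature.
parse_str_group <- function(str_group) {
  groups <- strsplit(str_group, ":", fixed = TRUE)[[1]]
  bounds <- lapply(strsplit(groups, "-", fixed = TRUE), as.integer)
  sizes <- vapply(bounds, function(b) b[2] - b[1] + 1L, integer(1))
  rep(seq_along(bounds), times = sizes)
}

str_group <- make_str_group(c(10, 10, 30))
str_group
table(parse_str_group(str_group))
```

Parsing the string back into per-feature labels is a quick way to confirm that the ranges cover every column of x exactly once before calling fgkm().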

----------Details----------

The feature group weighting k-means clustering algorithm is an extension of ewkm, which is itself a soft subspace clustering method.

The algorithm weights subspaces at both the feature-group level and the individual-feature level.

Always check the number of iterations, the number of restarts, and the total number of iterations, as they give a good indication of whether the algorithm converged.
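
The checks recommended above can be gathered into one small function. This is only a sketch: check_fit() is a hypothetical helper, and it assumes the fitted object carries the components cluster, iterations and restarts. A mock list stands in for a real fgkm() result so that the code runs without the package installed.

```r
# Hypothetical helper (not part of weightedKmeans): flag warning signs
# in a fitted object with `cluster`, `iterations` and `restarts` components.
check_fit <- function(fit, k, maxiter = 100) {
  c(
    hit_maxiter  = fit$iterations >= maxiter,         # may not have converged
    restarted    = fit$restarts > 0,                  # a cluster became empty
    fewer_than_k = length(unique(fit$cluster)) < k    # empty clusters removed
  )
}

# Mock results, used here only so the sketch runs without the package.
bad_fit  <- list(cluster = c(1, 1, 2, 2, 2), iterations = 100, restarts = 2)
good_fit <- list(cluster = c(1, 2, 3, 3, 1), iterations = 12, restarts = 0)

check_fit(bad_fit, k = 3)
check_fit(good_fit, k = 3)
```

If any flag is TRUE, it is worth rerunning with a smaller k or a larger maxiter before interpreting the clusters.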

As with any distance-based algorithm, be sure to rescale your numeric data so that large values do not bias the clustering. A quick rescaling method to use is scale().
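
To illustrate why rescaling matters, the sketch below builds two made-up features on very different scales and standardizes them with base R's scale(). The column names and distributions are assumptions chosen only for the demonstration.

```r
set.seed(1)

# Two hypothetical features on very different scales.
raw <- cbind(income = rnorm(100, mean = 50000, sd = 10000),
             age    = rnorm(100, mean = 40, sd = 10))

# Without rescaling, distances are dominated by the income column.
apply(raw, 2, sd)

# scale() centers each column to mean 0 and rescales it to sd 1.
x <- scale(raw)
round(colMeans(x), 10)
apply(x, 2, sd)
```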

----------Value----------

Returns an object of classes "kmeans" and "fgkm", compatible with other functions that work with kmeans objects, such as the print method. The object is a list with the following components in addition to the components of the kmeans object:

Component: cluster
A vector of integers (from 1:k) indicating the cluster to which each point is allocated.

Component: centers
A matrix of cluster centers.

Component: featureWeight
A matrix of weights recording the relative importance of each feature for each cluster.

Component: groupWeight
A matrix of group weights recording the relative importance of each feature group for each cluster.

Component: iterations
The number of iterations before termination. Check this to see whether maxiter was reached. If so, the algorithm may not be converging, and the resulting clustering may not be particularly good.

Component: restarts
The number of times the clustering restarted because a cluster disappeared, that is, because one or more of the k cluster centers had no observations associated with it. A number greater than zero indicates that the algorithm is not converging on a clustering for the given k. It is recommended that k be reduced.

Component: totalIterations
The total number of iterations over all restarts.

Component: totolCost
The total cost calculated in the cost function.
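
One way to read a per-cluster weight matrix such as featureWeight is to list the highest-weighted features for each cluster. The sketch below uses a mock matrix and a hypothetical helper, top_features(). It assumes one row per cluster and one column per feature; that orientation is not stated in this page, so check dim() on a real result first.

```r
# Mock weight matrix: 2 clusters (rows) by 4 features (columns).
# The row/column orientation is an assumption for this illustration.
W <- matrix(c(0.10, 0.60, 0.20, 0.10,
              0.45, 0.10, 0.10, 0.35),
            nrow = 2, byrow = TRUE,
            dimnames = list(paste0("cluster", 1:2), paste0("f", 0:3)))

# Hypothetical helper: names of the n highest-weighted features per cluster.
top_features <- function(W, n = 2) {
  sapply(rownames(W),
         function(r) names(sort(W[r, ], decreasing = TRUE))[seq_len(n)],
         simplify = FALSE)
}

top <- top_features(W, n = 2)
top
```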

----------Author(s)----------

Longfei Xiao <lf.xiao@siat.ac.cn>

----------References----------

clustering of high-dimensional data, Pattern Recognition(2011), doi:10.1016/j.patcog.2011.06.004

----------See Also----------

kmeans, ewkm

----------Examples----------

# Load the package that provides fgkm() and the fgkm.sample data.
library(weightedKmeans)

# The data fgkm.sample has 600 objects and 50 dimensions.
# Scale the data before clustering.
x <- scale(fgkm.sample)

# Group information is formatted as below.
# Each group is separated by ':'.
strGroup <- "0-9:10-19:20-49"

# Use the fgkm algorithm with k = 3, lambda = 3, eta = 1.
myfgkm <- fgkm(x, 3, strGroup, 3, 1)

# You can print the clustering result now.
myfgkm$cluster
myfgkm$featureWeight
myfgkm$groupWeight
myfgkm$iterations
myfgkm$restarts
myfgkm$totiters
myfgkm$totss

# Use a cluster validation method from package 'clv'.
library(clv)

# real.cluster is the real class label of the data 'fgkm.sample'.
real.cluster <- rep(1:3, each=200)

# std.ext() returns four values SS, SD, DS, DD.
std <- std.ext(as.integer(myfgkm$cluster), real.cluster)

# Rand index
clv.Rand(std)

# Jaccard coefficient
clv.Jaccard(std)
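
For readers without the clv package, the two validation indices can be sketched in base R. The helpers below are hypothetical and are not the clv implementation. They assume the usual pair-counting definitions: SS counts pairs placed together in both labelings, DD pairs separated in both, and SD and DS pairs together in only one of the two.

```r
# Hypothetical helper: count object pairs by agreement between two labelings.
pair_counts <- function(a, b) {
  stopifnot(length(a) == length(b))
  idx <- combn(length(a), 2)              # all unordered pairs of objects
  same_a <- a[idx[1, ]] == a[idx[2, ]]
  same_b <- b[idx[1, ]] == b[idx[2, ]]
  c(SS = sum(same_a & same_b),  SD = sum(same_a & !same_b),
    DS = sum(!same_a & same_b), DD = sum(!same_a & !same_b))
}

# Rand index: share of pairs on which the two labelings agree.
rand_index <- function(n) unname((n["SS"] + n["DD"]) / sum(n))

# Jaccard coefficient: like Rand, but ignoring pairs separated in both.
jaccard_index <- function(n) unname(n["SS"] / (n["SS"] + n["SD"] + n["DS"]))

found <- c(1, 1, 2, 2)   # toy clustering result
truth <- c(1, 1, 1, 2)   # toy true labels
n <- pair_counts(found, truth)
n
rand_index(n)      # 0.5
jaccard_index(n)   # 0.25
```

Both indices equal 1 when the clustering matches the true labels exactly, up to a renaming of the clusters.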

When reposting, please credit the source: Biostatistics Home (http://www.biostatistic.net).

