R package scrime: help documentation for the gknn() function

loveR · posted 2012-9-29 23:02:18

gknn(scrime)
gknn() belongs to the R package: scrime

Generalized k Nearest Neighbors

Translator: LoveR, the 生物统计家园网 robot

----------Description----------

Predicts the classes of new observations with k Nearest Neighbors, based on a user-specified distance measure.

----------Usage----------

gknn(data, cl, newdata, nn = 5, distance = NULL, use.weights = FALSE, ...)

----------Arguments----------

Argument: data
a numeric matrix in which each row represents an observation and each column a variable. If distance is "smc", "cohen" or "pcc", the values in data must be integers between 1 and n.cat, where n.cat is the maximum number of levels any of the variables can take. Missing values are allowed.

Argument: cl
a numeric vector of length nrow(data) giving the class labels of the observations represented by the rows of data. cl must consist of integers between 1 and n.cl, where n.cl is the number of groups.

Argument: newdata
a numeric matrix in which each row represents a new observation whose class label should be predicted, and each column contains the same variable as the corresponding column of data.

Argument: nn
an integer specifying the number of nearest neighbors used to classify the new observations.

Argument: distance
a character string naming the distance measure used to identify the nn nearest neighbors. Must be one of "smc", "cohen", "pcc", "euclidean", "maximum", "manhattan", "canberra", or "minkowski". If NULL, an ad hoc check determines whether the data seem to be categorical: if so, distance is set to "smc"; otherwise, it is set to "euclidean".
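
To give a feel for what a matching-based distance on categorical data looks like, here is a minimal base-R sketch. It assumes that the simple matching distance is the proportion of variables on which two observations disagree, and that variables missing in either observation are skipped. smc_dist is a hypothetical helper written for this illustration, not the implementation behind scrime's smc.

```r
# Hypothetical helper (not part of scrime): simple matching distance
# between two categorical observations coded as integers.
smc_dist <- function(x, y) {
  ok <- !is.na(x) & !is.na(y)   # assumption: skip variables missing in either vector
  mean(x[ok] != y[ok])          # share of comparable variables that differ
}

a <- c(1, 2, 3, 1, NA)
b <- c(1, 2, 1, 1, 2)
smc_dist(a, b)  # 1 of the 4 comparable variables differs: 0.25
```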

Argument: use.weights
logical; should the votes of the nearest neighbors be weighted by the reciprocal of their distances to the new observation when its class is predicted?
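
The effect of reciprocal-distance weighting can be sketched in base R as follows. The helper vote and the toy distances are assumptions made for this illustration only and do not reproduce gknn's internal code. Note that a distance of exactly zero would give an infinite weight in this simple version.

```r
# Hypothetical helper (not part of scrime): majority vote among the
# nn nearest neighbors, optionally weighted by 1 / distance.
vote <- function(d, cl, nn = 3, use.weights = FALSE) {
  idx <- order(d)[seq_len(nn)]                  # indices of the nn smallest distances
  w <- if (use.weights) 1 / d[idx] else rep(1, nn)
  scores <- tapply(w, cl[idx], sum)             # total (weighted) votes per class
  as.integer(names(scores)[which.max(scores)])
}

d  <- c(0.1, 1.0, 1.2, 3.0)  # distances from one new observation to 4 training rows
cl <- c(1, 2, 2, 1)          # class labels of those training rows

vote(d, cl, nn = 3)                      # unweighted: class 2 wins by 2 votes to 1
vote(d, cl, nn = 3, use.weights = TRUE)  # weighted: class 1 wins (10 vs. 1/1 + 1/1.2)
```

In this toy case the single very close neighbor outweighs the two more distant ones once the weights are switched on.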

Argument: ...
further arguments for the distance measure. For example, if distance = "minkowski", then p can also be specified, see dist. If distance = "pcc", then version can also be specified, see pcc.
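
As a small illustration of what the power p changes, the following sketch calls the base R function dist directly on two toy points (the matrix x is made up for this example). With p = 1 the Minkowski distance coincides with the Manhattan distance, and with p = 2 with the Euclidean distance.

```r
# Two toy observations with two variables each.
x <- rbind(c(0, 0), c(3, 4))

d1 <- dist(x, method = "minkowski", p = 1)  # |3| + |4| = 7, same as "manhattan"
d2 <- dist(x, method = "minkowski", p = 2)  # sqrt(3^2 + 4^2) = 5, same as "euclidean"
d1
d2
```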

----------Value----------

The predicted classes of the new observations.

----------Author(s)----------

Holger Schwender, holger.schwender@udo.edu

----------References----------

Schwender, H. (2007). Statistical Analysis of Genotype and Gene Expression Data.

----------See Also----------

knncatimpute, smc, pcc

----------Examples----------

# Using the example from the function knn in the package class.

library(class)   # provides knn()
library(scrime)  # provides gknn()
data(iris3)
train <- rbind(iris3[1:25,,1], iris3[1:25,,2], iris3[1:25,,3])
test <- rbind(iris3[26:50,,1], iris3[26:50,,2], iris3[26:50,,3])
cl <- c(rep(2, 25), rep(1, 25), rep(1, 25))

knn.out <- knn(train, test, as.factor(cl), k = 3, use.all = FALSE)
gknn.out <- gknn(train, cl, test, nn = 3)

# Both applications lead to the same predictions.

knn.out == gknn.out

# gknn also allows distance measures other than the Euclidean
# distance, e.g., the Manhattan distance.

gknn(train, cl, test, nn = 3, distance = "manhattan")

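When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

Notes:
Note 1: To make learning easier, this document was translated by LoveR, the 生物统计家园网 robot. It is intended only as a personal reference for learning R; 生物统计家园 retains the copyright.
Note 2: Because the translation is automatic, inaccuracies are unavoidable. Comparing the Chinese and English text carefully while reading can help with learning R.
Note 3: If you find an inaccuracy, please reply at the end of this thread, and we will revise the text over time.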
