R package scrime: help documentation for the knncatimpute() function

loveR · posted on 2012-9-29 23:02:28

knncatimpute(scrime)
R package containing knncatimpute(): scrime

                                    Missing Value Imputation with kNN

                                       Translator: LoveR, the 生物统计家园网 (biostatistic.net) robot

----------Description----------

Imputes missing values in a matrix composed of categorical variables using k nearest neighbors (kNN).

----------Usage----------

knncatimpute(x, dist = NULL, nn = 3, weights = TRUE)

----------Arguments----------

Argument: x
A numeric matrix containing missing values. All non-missing values must be integers between 1 and n.cat, where n.cat is the maximum number of levels that the categorical variables in x can take. To replace the missing values of an observation using its k nearest observations, each row must represent an observation and each column a variable. To impute the missing values of a variable using its k nearest variables, each row must correspond to a variable and each column to an observation.
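
As a minimal sketch of the input format described for x, assuming the raw data are stored as character labels, the categories can be recoded to integers between 1 and n.cat in base R. The genotype-style labels and the names raw, lev, x_obs and x_var are made up for illustration and are not part of scrime.

```r
# Hypothetical raw data: 4 observations (rows) x 3 categorical variables (columns).
raw <- matrix(c("AA", "AB", NA,
                "BB", "AB", "AA",
                "AA", NA,   "BB",
                "AB", "AB", "AA"),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("obs", 1:4), paste0("var", 1:3)))

# One shared level set, so the same integer means the same category everywhere.
lev <- c("AA", "AB", "BB")

# Recode to integers 1..n.cat (here n.cat = 3); NA stays NA.
x_obs <- matrix(match(raw, lev), nrow = nrow(raw), dimnames = dimnames(raw))

# With rows = observations, the neighbors are observations.
# Transposing gives rows = variables, so the neighbors are variables.
x_var <- t(x_obs)
```

Either x_obs or x_var would then be the kind of matrix this argument expects, depending on whether neighbors should be observations or variables.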

Argument: dist
Either a character string naming the distance measure or a distance matrix. If a character string, dist must be "smc", "cohen", or "pcc". If a matrix, dist must be symmetric and have the same number of rows as x; both its upper and its lower triangle must contain the distances, and its row and column names must equal the row names of x. If NULL, dist = "smc" is used.
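
The requirements on a user-supplied distance matrix can be illustrated with a small base R sketch. The helper mismatch() below is hypothetical: it is a hand-rolled mismatch proportion chosen for illustration, and it is not guaranteed to agree with what the package computes for dist = "smc".

```r
# Toy integer-coded matrix with named rows (values and names are illustrative).
x <- matrix(c(1, 2, 3, 1,
              1, 2, NA, 1,
              3, 3, 2, 2),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), NULL))

# Hypothetical helper: proportion of mismatching entries between two rows,
# using only the positions where both rows are observed.
mismatch <- function(u, v) {
  ok <- !is.na(u) & !is.na(v)
  mean(u[ok] != v[ok])
}

n <- nrow(x)
d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- mismatch(x[i, ], x[j, ])  # fills both triangles
  }
}

# Checks mirroring the stated requirements on a distance matrix.
stopifnot(isSymmetric(d),
          nrow(d) == nrow(x),
          identical(rownames(d), rownames(x)),
          identical(colnames(d), rownames(x)))
```

A matrix built this way has the shape and names the argument description asks for, so it could be supplied as dist = d.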

Argument: nn
An integer specifying k, i.e. the number of nearest neighbors used to impute the missing values.

Argument: weights
Logical: should weighted kNN be used to impute the missing values? If TRUE, the vote of each nearest neighbor is weighted by the reciprocal of its distance to the observation or variable whose missing values are being replaced.
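
To see what the reciprocal-distance weighting changes, here is a small base R sketch of a single vote. The neighbor categories and distances are invented for illustration; this is a sketch of the voting idea only, not the package's internal code, and it does not handle a distance of exactly zero.

```r
# Hypothetical neighbors of one incomplete observation: the category each
# neighbor has in the missing position, and its distance to that observation.
neighbor_cat  <- c(1, 2, 2)
neighbor_dist <- c(0.1, 0.4, 0.5)

# Unweighted vote: every neighbor counts once, so the most frequent category wins.
unweighted <- tapply(rep(1, length(neighbor_cat)), neighbor_cat, sum)
vote_plain <- as.integer(names(which.max(unweighted)))

# Weighted vote: each neighbor counts 1 / distance, so close neighbors dominate.
weighted <- tapply(1 / neighbor_dist, neighbor_cat, sum)
vote_weighted <- as.integer(names(which.max(weighted)))
```

Here the plain vote picks category 2 (two neighbors against one), while the weighted vote picks category 1, because the single closest neighbor contributes a weight of 10 against 2.5 + 2 = 4.5.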

----------Value----------

A matrix of the same size as x in which all missing values have been imputed.

----------Author(s)----------

Holger Schwender, holger.schwender@udo.edu

----------References----------

Schwender, H. (2007). Statistical Analysis of Genotype and Gene Expression Data.

----------See Also----------

knncatimputeLarge, gknn, smc, pcc

----------Examples----------

# Load the package that provides knncatimpute().
library(scrime)

# Generate a data set consisting of 200 rows and 50 columns
# in which the values are integers between 1 and 3.
# Afterwards, remove 20 of the values at random.

mat <- matrix(sample(3, 10000, TRUE), 200)
mat[sample(10000, 20)] <- NA

# Replace the missing values (defaults: nn = 3, dist = "smc").

mat2 <- knncatimpute(mat)

# Replace the missing values using the 5 nearest neighbors
# and Cohen's Kappa.

mat3 <- knncatimpute(mat, nn = 5, dist = "cohen")

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

