R yaImpute package: ann() function help documentation

loveR · posted 2012-10-2 07:29:23

ann(yaImpute)
R package containing ann(): yaImpute

Approximate nearest neighbor search routines

Translator: LoveR, the 生物统计家园 (biostatistic.net) site robot

----------Description----------

Given a set of reference data points S, ann constructs a kd-tree or box-decomposition tree (bd-tree) for efficient k-nearest neighbor searches.

----------Usage----------

ann(ref, target, k=1, eps=0.0, tree.type="kd",
    search.type="standard", bucket.size=1, split.rule="sl_midpt",
    shrink.rule="simple", verbose=TRUE, ...)

----------Arguments----------

Argument: ref
an n x d matrix containing the reference point set S. Each row in ref corresponds to a point in d-dimensional space.

Argument: target
an m x d matrix containing the points for which the k nearest neighbor reference points are sought.

Argument: k
the number of nearest neighbors to find. The default is k=1.

Argument: eps
the approximation error bound, where eps>=0. The distance from a target to the i-th reported nearest neighbor is at most (1+eps) times the distance to the true i-th nearest neighbor. The bound applies to the true (not squared) distance. The default value of eps=0 is an exact search.
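
As a rough illustration of this bound, the sketch below computes the largest distance an approximate search may report for a neighbor whose true distance is known. The helper name eps_bound and the distances are made up for illustration and are not part of yaImpute.

```r
# Hypothetical helper (not part of yaImpute): upper bound on the distance of
# the reported i-th neighbor, given the true i-th neighbor distance.
eps_bound <- function(true_dist, eps = 0) {
  stopifnot(eps >= 0, all(true_dist >= 0))
  (1 + eps) * true_dist
}

true_dist <- c(1.0, 2.0, 4.0)    # made-up true neighbor distances
eps_bound(true_dist, eps = 0)    # exact search: bound equals the true distances
eps_bound(true_dist, eps = 0.5)  # approximate search: up to 50% farther
```

Larger values of eps loosen the bound, which is what lets the tree search stop early.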

Argument: tree.type
the data structure to build: kd-tree or bd-tree, specified by the quoted key words kd and bd, respectively. A brute force search can be specified with the quoted key word brute. If brute is specified, then all subsequent arguments are ignored. The default is the kd-tree.

Argument: search.type
either standard or priority search in the kd-tree or bd-tree, specified by the quoted key words standard and priority, respectively. The default is the standard search.

Argument: bucket.size
the maximum number of reference points in the leaf nodes. The default is 1.

Argument: split.rule
the strategy for the recursive splitting of nodes that hold more points than the bucket size. The splitting rule applies to both the kd-tree and bd-tree. Splitting rule options are the quoted key words:

standard - standard kd-tree
midpt - midpoint
fair - fair-split
sl_midpt - sliding-midpoint (default)
sl_fair - sliding fair-split

See the supporting documentation, referenced below, for a thorough description and discussion of these splitting rules.

Argument: shrink.rule
applies only to the bd-tree and is an additional strategy (beyond the splitting rule) for the recursive partitioning of nodes. This argument is ignored if tree.type is specified as kd. Shrinking rule options are the quoted key words:

none - equivalent to the kd-tree
simple - simple shrink (default)
centroid - centroid shrink

See the supporting documentation, referenced below, for a thorough description and discussion of these shrinking rules.
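
The tree, splitting, and shrinking options described above can be combined in a single call. The following is a minimal sketch on random, purely illustrative data; the specific option values are arbitrary choices, and the call is skipped when the yaImpute package is not installed.

```r
# Sketch: bd-tree search with non-default splitting and shrinking rules.
# The data are random and purely illustrative.
set.seed(1)
ref    <- matrix(runif(200), ncol = 2)  # 100 reference points in 2-D
target <- matrix(runif(20),  ncol = 2)  # 10 target points
k      <- 3

res <- NULL
if (requireNamespace("yaImpute", quietly = TRUE)) {
  res <- yaImpute::ann(ref = ref, target = target, k = k,
                       tree.type = "bd", search.type = "priority",
                       bucket.size = 5, split.rule = "fair",
                       shrink.rule = "centroid", verbose = FALSE)
  print(dim(res$knnIndexDist))  # one row per target point
}
```

Switching tree.type to kd in this call would cause shrink.rule to be ignored, as noted above.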

Argument: verbose
if TRUE, search progress is printed to the screen. The default is TRUE.

Argument: ...
currently no additional arguments.

----------Details----------

The ann function calls portions of the Approximate Nearest Neighbor (ANN) Library, written by David M. Mount. All of the ann function arguments are detailed in the ANN Programming Manual found at http://www.cs.umd.edu/~mount/ANN.

----------Value----------

An object of class ann, which is a list with some or all of the following tags:

Component: knnIndexDist
an m x 2k matrix. Each row corresponds to a point in target. Columns 1:k hold the ref matrix row indices of the nearest neighbors: column 1 holds the row index of the first nearest neighbor and column k holds that of the k-th nearest neighbor. Columns k+1:2k hold the Euclidean distance from the target to each of the k nearest neighbors indexed in columns 1:k.
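
To make the column layout concrete, here is a small base R sketch that splits a matrix with this layout into its index and distance halves. The helper split_knn and the toy matrix are hypothetical and not part of yaImpute.

```r
# Hypothetical helper (not part of yaImpute): split an m x 2k matrix laid out
# as described above into separate index and distance matrices.
split_knn <- function(knn_index_dist, k) {
  stopifnot(is.matrix(knn_index_dist), ncol(knn_index_dist) == 2 * k)
  list(
    index = knn_index_dist[, 1:k, drop = FALSE],
    dist  = knn_index_dist[, (k + 1):(2 * k), drop = FALSE]
  )
}

# Made-up result for m = 2 targets and k = 2 neighbors.
toy <- matrix(c(5, 9, 0.1, 0.4,
                2, 7, 0.3, 0.8), nrow = 2, byrow = TRUE)
parts <- split_knn(toy, k = 2)
parts$index[1, ]  # ref rows of the neighbors of target 1
parts$dist[1, ]   # corresponding distances
```

With a real result, the same split would presumably be applied to the knnIndexDist and k components of the returned list.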

Component: searchTime
total search time, not including data structure construction, etc.

Component: k
as defined in the ann function call.

Component: eps
as defined in the ann function call.

Component: tree.type
as defined in the ann function call.

Component: search.type
as defined in the ann function call.

Component: bucket.size
as defined in the ann function call.

Component: split.rule
as defined in the ann function call.

Component: shrink.rule
as defined in the ann function call.

----------Author(s)----------

Andrew O. Finley, finleya@msu.edu

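----------Examples----------

## Make a few bivariate normal classes.
## rmvn draws n samples from a multivariate normal with mean mu and covariance V.
rmvn <- function(n, mu=0, V = matrix(1))
{
  p <- length(mu)
  ## V must be a p x p matrix.
  if(any(is.na(match(dim(V),p))))
    stop("Dimension problem!")
  ## Cholesky factor of V, used to correlate the standard normal draws.
  D <- chol(V)
  matrix(rnorm(n*p), ncol=p) %*% D + rep(mu,rep(n,p))
}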

m <- 10000

## Class 1.
mu.1 <- c(20, 40)
V.1 <- matrix(c(-5,1,0,5),2,2); V.1 <- V.1%*%t(V.1)
c.1 <- cbind(rmvn(m, mu.1, V.1), rep(1, m))

## Class 2.
mu.2 <- c(30, 60)
V.2 <- matrix(c(4,2,0,2),2,2); V.2 <- V.2%*%t(V.2)
c.2 <- cbind(rmvn(m, mu.2, V.2), rep(2, m))

## Class 3.
mu.3 <- c(15, 60)
V.3 <- matrix(c(5,5,0,5),2,2); V.3 <- V.3%*%t(V.3)
c.3 <- cbind(rmvn(m, mu.3, V.3), rep(3, m))

c.all <- rbind(c.1, c.2, c.3)
max.x <- max(c.all[,1]); min.x <- min(c.all[,1])
max.y <- max(c.all[,2]); min.y <- min(c.all[,2])

## Check them out.
plot(c.1[,1], c.1[,2], xlim=c(min.x, max.x), ylim=c(min.y, max.y),
   pch=19, cex=0.5,
   col="blue", xlab="Variable 1", ylab="Variable 2")
points(c.2[,1], c.2[,2], pch=19, cex=0.5, col="green")
points(c.3[,1], c.3[,2], pch=19, cex=0.5, col="red")

## Take a reference sample.
n <- 2000
ref <- c.all[sample(1:nrow(c.all), n),]

## Compare search times.
k <- 10
## Do a simple brute force search.
brute <- ann(ref=ref[,1:2], target=c.all[,1:2],
         tree.type="brute", k=k, verbose=FALSE)
print(brute$searchTime)

## Do an exact kd-tree search.
kd.exact <- ann(ref=ref[,1:2], target=c.all[,1:2],
            tree.type="kd", k=k, verbose=FALSE)
print(kd.exact$searchTime)

## Do an approximate kd-tree search.
kd.approx <- ann(ref=ref[,1:2], target=c.all[,1:2],
               tree.type="kd", k=k, eps=100, verbose=FALSE)
print(kd.approx$searchTime)

## Takes too long to calculate for this many targets.
## Compare overall accuracy of the exact vs. approximate search
##knn.mode <- function(knn.indx, ref){
##  x <- ref[knn.indx,]
##  as.numeric(names(sort(as.matrix(table(x))[,1],
##                      decreasing=TRUE))[1])
##}
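
The commented-out helper above picks the most frequent value among a target's neighbors. A minimal runnable sketch of the same idea follows, assuming a hypothetical knn_mode that takes a vector of neighbor row indices and a separate vector of class labels; the function signature and the toy data are illustrative and not from the original example.

```r
# Hypothetical helper: majority class among the neighbors of one target.
# knn_idx: row indices into the reference set; labels: class label per ref row.
knn_mode <- function(knn_idx, labels) {
  counts <- table(labels[knn_idx])
  # which.max() returns the first maximum, so ties go to the smallest label.
  as.numeric(names(counts)[which.max(counts)])
}

labels  <- c(1, 1, 2, 3, 2, 2)  # made-up class labels for six reference points
knn_idx <- c(1, 3, 5, 6)        # made-up neighbor indices for one target
knn_mode(knn_idx, labels)
```

In the example above, the labels would come from the third column of ref, and the indices from one row of columns 1:k of knnIndexDist.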

