distmc(Rlof)
distmc() belongs to the R package: Rlof
Distance Matrix Computation with multi-threads
Translator: 生物统计家园网 (biostatistic.net), robot LoveR
----------Description----------
This function is similar to dist() in the stats package, with additional support for multi-threading.
----------Usage----------
distmc(x, method = "euclidean", diag = FALSE, upper = FALSE, p = 2)
----------Arguments----------
Argument: x
a numeric matrix, data frame or "dist" object.
Argument: method
the distance measure to be used. This must be one of "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski". Any unambiguous substring can be given.
Argument: diag
logical value indicating whether the diagonal of the distance matrix should be printed by print.dist.
Argument: upper
logical value indicating whether the upper triangle of the distance matrix should be printed by print.dist.
Argument: p
The power of the Minkowski distance.
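The arguments above can be tried out with a small sketch. It uses stats::dist() rather than distmc() so that it runs without Rlof installed. Because distmc() is documented here as similar to dist(), the same arguments are assumed to behave the same way, but this sketch does not verify that. The matrix `m` is made up for illustration.

```r
# Three points in the plane; rows are observations.
m <- matrix(c(0, 0,
              3, 4,
              6, 8), ncol = 2, byrow = TRUE)

# Minkowski distance with p = 2 coincides with the Euclidean distance.
d1 <- dist(m, method = "minkowski", p = 2)

# "eu" is an unambiguous substring of "euclidean".
d2 <- dist(m, method = "eu")
stopifnot(isTRUE(all.equal(as.numeric(d1), as.numeric(d2))))

# "m" could mean maximum, manhattan or minkowski, so it is rejected.
ambiguous <- tryCatch(dist(m, method = "m"),
                      error = function(e) conditionMessage(e))

# diag and upper only affect how the object is printed.
print(d2, diag = TRUE, upper = TRUE)
```

If Rlof is installed, replacing dist() with distmc() in the calls above is the natural next experiment.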
----------Details----------
Available distance measures are (written for two vectors x and y):
euclidean: Usual distance between the two vectors (2 norm), sqrt(sum((x_i - y_i)^2)).
maximum: Maximum distance between two components of x and y (supremum norm).
manhattan: Absolute distance between the two vectors (1 norm).
canberra: sum(|x_i - y_i| / |x_i + y_i|). Terms with zero numerator and denominator are omitted from the sum and treated as if the values were missing.
This is intended for non-negative values (e.g. counts): taking the absolute value of the denominator is a 1998 R modification to avoid negative distances.
binary: (aka asymmetric binary): The vectors are regarded as binary bits, so non-zero elements are "on" and zero elements are "off". The distance is the proportion of bits in which only one is on amongst those in which at least one is on.
minkowski: The p norm, the pth root of the sum of the pth powers of the differences of the components.
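As a worked illustration of the measures listed above, the following sketch computes each one by hand for two made-up vectors and compares the result with stats::dist(). It uses dist() instead of distmc() so that it runs without Rlof; agreement with distmc() is assumed from the description, not checked. The vectors `x`, `y` and the power `p` are illustrative choices with no position where both are zero, so no Canberra term is omitted.

```r
x <- c(1, 0, 3, 0)
y <- c(2, 5, 1, 4)
p <- 3

# Each measure written out directly from its definition.
manual <- c(
  euclidean = sqrt(sum((x - y)^2)),
  maximum   = max(abs(x - y)),
  manhattan = sum(abs(x - y)),
  canberra  = sum(abs(x - y) / abs(x + y)),
  # bits where exactly one vector is "on", over bits where at least one is
  binary    = sum(xor(x != 0, y != 0)) / sum(x != 0 | y != 0),
  minkowski = sum(abs(x - y)^p)^(1 / p)
)

# The same measures from dist(), applied to a two-row matrix.
from_dist <- sapply(names(manual), function(m) {
  as.numeric(dist(rbind(x, y), method = m, p = p))
})

print(cbind(manual, from_dist))
```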
Missing values are allowed, and are excluded from all computations involving the rows within which they occur. Further, when Inf values are involved, all pairs of values are excluded when their contribution to the distance gave NaN or NA.
If some columns are excluded in calculating a Euclidean, Manhattan, Canberra or Minkowski distance, the sum is scaled up proportionally to the number of columns used. If all pairs are excluded when calculating a particular distance, the value is NA.
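The scaling rule for missing values is easiest to see on a tiny example. This sketch uses stats::dist(), on the assumption that distmc() follows the same rule as described above. The matrix `m` is made up for illustration.

```r
# Row a has a missing value in column 3, so only columns 1 and 2 are used.
m <- rbind(a = c(1, 2, NA),
           b = c(2, 4, 6))

# Manhattan: raw sum over the 2 usable columns, scaled up by 3 / 2.
d_man      <- dist(m, method = "manhattan")
manual_man <- (abs(1 - 2) + abs(2 - 4)) * 3 / 2

# Euclidean: the sum of squares is scaled before taking the square root.
d_euc      <- dist(m, method = "euclidean")
manual_euc <- sqrt(((1 - 2)^2 + (2 - 4)^2) * 3 / 2)

# If no column is usable for a pair, the distance is NA.
d_na <- dist(rbind(c(NA, 1), c(1, NA)))

print(c(manhattan = as.numeric(d_man), euclidean = as.numeric(d_euc)))
```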
The "dist" method of as.matrix() and as.dist() can be used for conversion between objects of class "dist" and conventional distance matrices.
as.dist() is a generic function. Its default method handles objects inheriting from class "dist", or coercible to matrices using as.matrix(). Support for classes representing distances (also known as dissimilarities) can be added by providing an as.matrix() or, more directly, an as.dist method for such a class.
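The conversion in both directions can be sketched as follows. The object is built with stats::dist() so that the example runs without Rlof. Since distmc() is documented as returning class "dist", the same conversions are assumed to apply to its result.

```r
# Distances between the first four iris flowers (numeric columns only).
m <- as.matrix(iris[1:4, 1:4])
d <- dist(m)

# "dist" object -> conventional square, symmetric distance matrix.
full <- as.matrix(d)

# Square matrix -> "dist" object again (only the lower triangle is kept).
back <- as.dist(full)

stopifnot(isTRUE(all.equal(as.numeric(d), as.numeric(back))))
print(round(full, 3))
```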
----------Value----------
distmc returns an object of class "dist".
The lower triangle of the distance matrix stored by columns in a vector, say do. If n is the number of observations, i.e., n <- attr(do, "Size"), then for i < j ≤ n, the dissimilarity between (row) i and j is do[n*(i-1) - i*(i-1)/2 + j-i]. The length of the vector is n*(n-1)/2, i.e., of order n^2.
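The index formula above can be checked against the full matrix. This sketch builds the "dist" object with stats::dist(); the storage layout is assumed to be the same for objects returned by distmc(), since both are of class "dist". The helper `idx()` is a hypothetical name introduced here for illustration.

```r
m  <- as.matrix(iris[1:5, 1:4])
do <- dist(m)
n  <- attr(do, "Size")

# Position of the (i, j) dissimilarity, i < j <= n, in the stored vector.
idx <- function(i, j, n) n * (i - 1) - i * (i - 1) / 2 + j - i

full  <- as.matrix(do)
pairs <- t(combn(n, 2))   # every (i, j) with i < j, one pair per row

# Compare the formula-based lookup with the full matrix for every pair.
ok <- apply(pairs, 1, function(ij) {
  isTRUE(all.equal(do[idx(ij[1], ij[2], n)], full[ij[1], ij[2]]))
})

stopifnot(all(ok), length(do) == n * (n - 1) / 2)
```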
The object has the following attributes (besides "class" equal to "dist"):
Size: integer, the number of observations in the dataset.
Labels: optional, the labels, if any, of the observations of the dataset.
Diag, Upper: logicals, corresponding to the arguments diag and upper above, specifying how the object should be printed.
call: optional, the call used to create the object.
method: optional, the distance measure used; resulting from distmc(), the (match.arg()ed) method argument.
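The attributes can be inspected with attr(). The sketch below uses stats::dist() so that it runs without Rlof. An object returned by distmc() is assumed to carry the same attributes, as listed above, though that is not verified here. The labelled matrix `m` is made up for illustration.

```r
# Row names become the Labels attribute.
m <- rbind(a = c(1, 2),
           b = c(4, 6),
           c = c(7, 10))
d <- dist(m, method = "manhattan", diag = TRUE)

attr(d, "Size")     # number of observations
attr(d, "Labels")   # observation labels, if any
attr(d, "Diag")     # printing flag taken from the diag argument
attr(d, "Upper")    # printing flag taken from the upper argument
attr(d, "method")   # full name of the distance measure used
```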
----------References----------
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Mardia, K. V., Kent, J. T. and Bibby, J. M. (1979) Multivariate Analysis. Academic Press.
Borg, I. and Groenen, P. (1997) Modern Multidimensional Scaling. Theory and Applications. Springer.
----------See Also----------
dist() in the stats package
----------Examples----------
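library(Rlof)                          # provides distmc()
data(iris)
df <- data.frame(iris[-5])             # drop the non-numeric Species column
dist.data <- distmc(df, 'manhattan')   # Manhattan distances between all rows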