R language: unique() function help documentation

loveR · posted on 2012-2-16 21:01:15

unique(base)
unique() belongs to the R package: base

                                    Extract Unique Elements

                                       Translator: LoveR, robot of 生物统计家园网

----------Description----------

unique returns a vector, data frame or array like x but with duplicate elements/rows removed.

----------Usage----------

unique(x, incomparables = FALSE, ...)

## Default S3 method:
unique(x, incomparables = FALSE, fromLast = FALSE, ...)

## S3 method for class 'matrix'
unique(x, incomparables = FALSE, MARGIN = 1,
   fromLast = FALSE, ...)

## S3 method for class 'array'
unique(x, incomparables = FALSE, MARGIN = 1,
   fromLast = FALSE, ...)

----------Arguments----------

Argument: x
a vector or a data frame or an array or NULL.

Argument: incomparables
a vector of values that cannot be compared. FALSE is a special value, meaning that all values can be compared, and may be the only value accepted for methods other than the default. It will be coerced internally to the same type as x.

Argument: fromLast
logical indicating if duplication should be considered from the last, i.e., the last (or rightmost) of identical elements will be kept. This only matters for names or dimnames.

Argument: ...
arguments for particular methods.

Argument: MARGIN
the array margin to be held fixed: a single integer.
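
As a small illustration of the fromLast argument, the sketch below uses a made-up matrix m whose first and third rows are identical; the row names a, b and c are invented for the example. The set of unique rows should be the same either way, and only the surviving dimnames differ.

```r
# Hypothetical 3 x 2 matrix: rows "a" and "c" hold the same values.
m <- rbind(a = c(1, 2), b = c(3, 4), c = c(1, 2))

first_kept <- unique(m)                   # keeps the first of the identical rows
last_kept  <- unique(m, fromLast = TRUE)  # keeps the last of the identical rows

rownames(first_kept)  # "a" "b"
rownames(last_kept)   # "b" "c"
```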

----------Details----------

This is a generic function with methods for vectors, data frames and arrays (including matrices).

The array method calculates for each element of the dimension specified by MARGIN if the remaining dimensions are identical to those for an earlier element (in row-major order). This would most commonly be used for matrices to find unique rows (the default) or columns (with MARGIN = 2).
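
A minimal sketch of the matrix case described above, using an invented matrix m in which rows 1 and 3 are identical and columns 1 and 3 are identical:

```r
# Hypothetical matrix: rows 1 and 3 match, and so do columns 1 and 3.
m <- rbind(c(1, 2, 1),
           c(3, 4, 3),
           c(1, 2, 1))

unique_rows <- unique(m)              # MARGIN = 1 (default): drop duplicated rows
unique_cols <- unique(m, MARGIN = 2)  # MARGIN = 2: drop duplicated columns

dim(unique_rows)  # 2 3
dim(unique_cols)  # 3 2
```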

Note that unlike the Unix command uniq this omits duplicated and not just repeated elements/rows. That is, an element is omitted if it is equal to any previous element and not just if it is equal to the immediately previous one. (For the latter, see rle.)
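
The difference from uniq can be seen on a short invented vector, where the last "a" is a duplicate but not an immediate repeat:

```r
x <- c("a", "a", "b", "a")

u <- unique(x)      # "a" "b": the final "a" equals an earlier element, so it is dropped
r <- rle(x)$values  # "a" "b" "a": only the immediately repeated "a" is collapsed

u
r
```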

Missing values are regarded as equal, but NaN is not equal to NA_real_. Character strings are regarded as equal if they are in different encodings but would agree when translated to UTF-8.
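
A brief sketch of the missing-value rule, on an invented double vector that mixes NA and NaN:

```r
x <- c(NA, NaN, NA, NaN, 1)  # NA is stored as NA_real_ in this double vector
u <- unique(x)

u          # NA NaN 1: the NAs collapse together, the NaNs collapse together
is.nan(u)  # FALSE TRUE FALSE: NA and NaN stay distinct
```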

Values in incomparables will never be marked as duplicated. This is intended to be used for a fairly small set of values and will not be efficient for a very large set.
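
To make the effect of incomparables concrete, here is a small sketch with the default method; the vector and the choice of 5 as the incomparable value are invented for illustration.

```r
x <- c(3, 3, 5, 5, 7)

plain  <- unique(x)                     # 3 5 7
incomp <- unique(x, incomparables = 5)  # 3 5 5 7: 5 is never marked as duplicated

plain
incomp
```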

When used on a data frame with more than one column, or an array or matrix when comparing dimensions of length greater than one, this tests for identity of character representations. This will catch people who unwisely rely on exact equality of floating-point numbers!
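
The sketch below illustrates the floating-point pitfall with invented data. How unique() treats the unrounded data frame depends on how rows are compared, which is assumed here to possibly vary between R versions, so no result is shown for that case. Rounding to an explicit tolerance first (10 digits is an arbitrary choice) is one way to avoid relying on either outcome.

```r
# 0.1 * 3 and 0.3 print alike but are not exactly equal as doubles.
x <- c(0.3, 0.1 * 3)
n_vec <- length(unique(x))  # 2: the vector method compares the values exactly

# Hypothetical data frame with that column plus a second one.
df <- data.frame(a = x, b = c(1, 1))

# Round before de-duplicating so that the comparison is explicit.
df_rounded <- transform(df, a = round(a, 10))
n_rows <- nrow(unique(df_rounded))  # 1

n_vec
n_rows
```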

Character strings will be compared as byte sequences if any input is marked as "bytes".

----------Value----------

For a vector, an object of the same type as x, but with only one copy of each duplicated element. No attributes are copied (so the result has no names).
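
For instance, with an invented named vector, the names do not survive:

```r
x <- c(a = 1, b = 2, c = 1)
u <- unique(x)

u         # 1 2
names(u)  # NULL
```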

For a data frame, a data frame is returned with the same columns but possibly fewer rows (and with row names from the first occurrences of the unique rows).
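
A small sketch of the data frame case; the column names g and v and the row names r1 to r3 are made up for the example.

```r
df <- data.frame(g = c("x", "y", "x"), v = c(1, 2, 1),
                 row.names = c("r1", "r2", "r3"))
u <- unique(df)

rownames(u)  # "r1" "r2": row r3 repeats r1 and is dropped
names(u)     # "g" "v": the columns are unchanged
```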

A matrix or array is subsetted by [, drop = FALSE], so dimensions and dimnames are copied appropriately, and the result always has the same number of dimensions as x.
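
As an illustration of the drop = FALSE behaviour, a matrix whose rows are all identical (invented here) still comes back as a matrix, not a plain vector:

```r
m <- matrix(1, nrow = 3, ncol = 2)  # three identical rows
u <- unique(m)

dim(u)        # 1 2
is.matrix(u)  # TRUE
```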

----------Warning----------

Using this for lists is potentially slow, especially if the elements are not atomic vectors (see vector) or differ only in their attributes. In the worst case it is O(n^2).
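
The default method does accept lists, as in the small invented example below; the warning above concerns speed on long lists, which this sketch does not attempt to measure.

```r
x <- list(1:2, "a", 1:2, list(b = 3))
u <- unique(x)

length(u)  # 3: the second 1:2 is dropped
```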

----------References----------

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

----------See Also----------

duplicated, which gives a logical vector indicating which elements are duplicates.
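
A short sketch of how duplicated relates to unique, on an invented vector: which() turns the logical result into positions, and dropping the flagged elements reproduces unique(x).

```r
x <- c(3, 5, 3, 8, 5)

dup <- duplicated(x)  # FALSE FALSE TRUE FALSE TRUE
pos <- which(dup)     # 3 5: positions of the later copies

same <- identical(x[!dup], unique(x))  # TRUE
same
```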

rle, which is the equivalent of the Unix uniq -c command.
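
To show the uniq -c analogy, the sketch below tabulates run lengths next to run values for an invented vector; the column names count and value are chosen only for display.

```r
x <- c("a", "a", "b", "a")
runs <- rle(x)

# One row per run of consecutive equal values, with its length.
counts <- data.frame(count = runs$lengths, value = runs$values)
counts
```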

----------Examples----------

x <- c(3:5, 11:8, 8 + 0:5)
(ux <- unique(x))
(u2 <- unique(x, fromLast = TRUE)) # different order
stopifnot(identical(sort(ux), sort(u2)))

length(unique(sample(100, 100, replace=TRUE)))
## approximately 100(1 - 1/e) = 63.21

unique(iris)

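When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the 生物统计家园网 robot. It is intended only for personal reference when learning R; 生物统计家园 retains the copyright.
Note 2: Because the translation was produced automatically by a robot, some inaccuracies are unavoidable. Carefully comparing the Chinese and English text while reading can help in learning R.
Note 3: If you find an inaccuracy, please reply below this post and we will revise the document over time.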
