R seriation package: help documentation for the criterion() function

loveR · posted on 2012-9-30 01:29:27

criterion(seriation)
criterion() belongs to the R package: seriation

Criterion for a loss/merit function for data given a permutation

Translator: LoveR, the 生物统计家园网 robot

----------Description----------

Compute the value of different loss functions L and merit functions M for data given a permutation.

----------Usage----------

criterion(x, order = NULL, method = NULL)

----------Arguments----------

Argument: x
an object of class dist or a matrix (currently no criteria are implemented for arrays).

Argument: order
an object of class ser_permutation suitable for x. If NULL, the identity permutation is used.

Argument: method
a character vector with the names of the criteria to be employed, or NULL (default), in which case all available criteria are used.

----------Details----------

For a symmetric dissimilarity matrix D with elements d(i,j), where i, j = 1 … p, the aim is generally to place low distance values close to the diagonal. The following criteria for judging the quality of a given permutation of the objects in a dissimilarity matrix are currently implemented:

"Gradient_raw", "Gradient_weighted": Gradient measures (Hubert et al. 2001).

A symmetric dissimilarity matrix in which the values in all rows and columns only increase when moving away from the main diagonal is called a perfect anti-Robinson matrix (Robinson 1951). A suitable merit measure which quantifies how far a matrix diverges from the anti-Robinson form is

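M(D) = ∑_{i<k<j} f(d_{ij}, d_{ik}) + ∑_{i<k<j} f(d_{ij}, d_{kj})

where f(z, y) defines how a satisfied or violated gradient condition for the object triple (i, k, j) is counted. In both sums z = d_{ij} is the dissimilarity between the two outer objects and y is the dissimilarity between the middle object k and one of the outer objects, so the gradient condition is satisfied when z > y.

Hubert et al. (2001) suggest two functions. The first function is given by:

f(z, y) = sign(z - y) = +1 if z > y; 0 if z = y; -1 if z < y

It results in the raw number of triples satisfying the gradient constraints minus the number of triples violating them.

The second function is defined as:

f(z, y) = |z - y| sign(z - y) = z - y

It weights each satisfied or violated constraint by the difference between the two dissimilarities.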
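The gradient measures can be sketched in a few lines of base R. This is an illustration only, not the code used inside the seriation package: the function name `gradient_measure` and its `weighted` argument are hypothetical, and the sketch assumes the convention that for every triple i < k < j the outer dissimilarity d_ij is compared with d_ik and with d_kj, scoring +1 when d_ij is larger and -1 when it is smaller.

```r
# Gradient measure of a symmetric dissimilarity matrix (illustrative sketch).
# weighted = FALSE counts satisfied minus violated constraints;
# weighted = TRUE sums the signed differences instead.
gradient_measure <- function(D, weighted = FALSE) {
  D <- as.matrix(D)
  p <- nrow(D)
  if (p < 3) return(0)
  # z is the outer dissimilarity d_ij, y is d_ik or d_kj
  f <- function(z, y) if (weighted) z - y else sign(z - y)
  total <- 0
  for (i in 1:(p - 2)) {
    for (k in (i + 1):(p - 1)) {
      for (j in (k + 1):p) {
        total <- total + f(D[i, j], D[i, k]) + f(D[i, j], D[k, j])
      }
    }
  }
  total
}

# Points on a line, taken in their natural order, give a perfect
# anti-Robinson matrix, so every one of the 8 constraints is satisfied.
D <- as.matrix(dist(c(1, 2, 4, 7)))
perm <- c(3, 1, 4, 2)

gradient_measure(D)                    # 8
gradient_measure(D, weighted = TRUE)   # 20
gradient_measure(D[perm, perm])        # -5
```

The triple loop is cubic in the number of objects, which is fine for a demonstration but not for large matrices.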

"AR_events", "AR_deviations": Anti-Robinson events (Chen 2002). An even simpler loss function can be created in the same way as the gradient measures above by concentrating on violations only:

L(D) = ∑_{i<k<j} f(d_{ij}, d_{ik}) + ∑_{i<k<j} f(d_{ij}, d_{kj})

To count only the violations we use

f(z, y) = I(z, y) = 1 if z < y; 0 otherwise

I(\cdot) is an indicator function returning 1 only for violations. Chen (2002) presented a formulation for an equivalent loss function, called the violations anti-Robinson events, and also introduced a weighted version of the loss function, resulting in

f(z, y) = |y - z| I(z, y)
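A minimal base R sketch of the anti-Robinson loss follows. The function name `ar_loss` and its `weighted` argument are hypothetical and not part of the seriation package. The sketch assumes that, for a triple i < k < j, a violation occurs whenever the outer dissimilarity d_ij is smaller than d_ik or smaller than d_kj.

```r
# Anti-Robinson loss of a symmetric dissimilarity matrix (illustrative sketch).
# weighted = FALSE counts violations; weighted = TRUE sums their magnitudes.
ar_loss <- function(D, weighted = FALSE) {
  D <- as.matrix(D)
  p <- nrow(D)
  if (p < 3) return(0)
  # z is the outer dissimilarity d_ij, y is d_ik or d_kj
  f <- function(z, y) {
    violated <- z < y
    if (weighted) abs(y - z) * violated else as.numeric(violated)
  }
  total <- 0
  for (i in 1:(p - 2)) {
    for (k in (i + 1):(p - 1)) {
      for (j in (k + 1):p) {
        total <- total + f(D[i, j], D[i, k]) + f(D[i, j], D[k, j])
      }
    }
  }
  total
}

D <- as.matrix(dist(c(1, 2, 4, 7)))   # perfect anti-Robinson form
perm <- c(3, 1, 4, 2)                 # a deliberately poor order

ar_loss(D)                                # 0
ar_loss(D[perm, perm])                    # 6
ar_loss(D[perm, perm], weighted = TRUE)   # 17
```

Because ties count neither as satisfied nor as violated, the loss is zero exactly when no constraint is strictly violated.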

"Path_length": Hamiltonian path length (Caraux and Pinloche 2005).

The order of the objects in a dissimilarity matrix corresponds to a path through a graph where each node represents an object and is visited exactly once, i.e., a Hamiltonian path. The length of the path is defined as the sum of the edge weights, i.e., the dissimilarities between consecutive objects.

The length of the Hamiltonian path is equal to the value of the minimal span loss function (as used by Chen 2002). Both notions are related to the traveling salesperson problem (TSP).

If order is not unique or there are non-finite distance values, NA is returned.
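The path length is simple enough to compute by hand, which makes it a useful check. The sketch below uses only base R; `path_length` is a hypothetical helper, not a function of the seriation package, and it takes a plain integer vector as the order rather than a ser_permutation object.

```r
# Hamiltonian path length: sum of dissimilarities between consecutive objects.
path_length <- function(D, order = NULL) {
  D <- as.matrix(D)
  if (is.null(order)) order <- seq_len(nrow(D))   # identity permutation
  # mirror the documented behaviour: NA for a non-unique order or
  # non-finite distances
  if (anyDuplicated(order) > 0 || any(!is.finite(D))) return(NA_real_)
  steps <- cbind(order[-length(order)], order[-1])  # consecutive pairs
  sum(D[steps])
}

D <- dist(c(1, 2, 4, 7))
path_length(D)                 # 1 + 2 + 3 = 6
path_length(D, c(3, 1, 4, 2))  # 3 + 6 + 5 = 14
path_length(D, c(1, 1, 2, 3))  # NA, the order repeats an object
```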

"Inertia": Inertia criterion (Caraux and Pinloche 2005).

Measures the moment of inertia of the dissimilarity values around the diagonal as

I(D) = ∑_{i=1}^{p} ∑_{j=1}^{p} d(i,j) |i-j|^2

|i-j| is used as a measure of the distance to the diagonal and d(i,j) gives the weight. This criterion gives higher weight to values farther away from the diagonal. It is a merit measure: its value increases with the quality of the arrangement.
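As a rough illustration, the inertia can be computed in base R by weighting each dissimilarity with its squared distance |i-j| from the diagonal and summing over the whole matrix. The helper `inertia` is hypothetical, the squared weight is an assumption of this sketch, and the order is a plain integer vector.

```r
# Inertia: sum over all cells of d(i,j) * |i - j|^2 (illustrative sketch).
inertia <- function(D, order = NULL) {
  D <- as.matrix(D)
  if (!is.null(order)) D <- D[order, order]
  p <- nrow(D)
  rank_gap <- abs(outer(seq_len(p), seq_len(p), "-"))  # |i - j| for each cell
  sum(D * rank_gap^2)
}

D <- dist(c(1, 2, 4, 7))
inertia(D)                 # 184
inertia(D, c(3, 1, 4, 2))  # 96, the scrambled order scores lower
```

The natural order of points on a line pushes the large dissimilarities far from the diagonal, so it receives the higher (better) value.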

"Least_squares": Least squares criterion (Caraux and Pinloche 2005).

The sum of squared deviations between the dissimilarities and the rank differences (in the matrix) between two elements:

L(D) = ∑_{i=1}^{p} ∑_{j=1}^{p} (d(i,j) - |i-j|)^2

Note that if the Euclidean distance is used to calculate D from a data matrix X, ordering the elements of X by their projection onto the first principal component of X minimizes this criterion. The least squares criterion is related to unidimensional scaling.
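The least squares criterion compares each dissimilarity with the rank difference |i-j| of the two objects. A base R sketch, assuming the sum runs over the full matrix and using the hypothetical helper name `least_squares`:

```r
# Least squares loss: sum over all cells of (d(i,j) - |i - j|)^2.
least_squares <- function(D, order = NULL) {
  D <- as.matrix(D)
  if (!is.null(order)) D <- D[order, order]
  p <- nrow(D)
  rank_gap <- abs(outer(seq_len(p), seq_len(p), "-"))  # |i - j| for each cell
  sum((D - rank_gap)^2)
}

D <- dist(c(1, 2, 4, 7))
least_squares(D)                 # 48
least_squares(D, c(3, 1, 4, 2))  # 96, a worse (higher) loss
```

Since the dissimilarities are compared with raw rank differences, the value depends on the scale of D, so it is only meaningful for comparing orders of the same matrix.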

For a general matrix X = (x_{ij}), i = 1 … m and j = 1 … n, the following loss/merit functions are currently implemented:

"ME": Measure of Effectiveness (McCormick 1972).

The measure of effectiveness (ME) for a matrix X is defined as

ME(X) = 1/2 ∑_{i=1}^{m} ∑_{j=1}^{n} x_{i,j} (x_{i,j-1} + x_{i,j+1} + x_{i-1,j} + x_{i+1,j})

with, by convention,

x_{0,j} = x_{m+1,j} = x_{i,0} = x_{i,n+1} = 0

ME is a merit measure, i.e., a higher ME indicates a better arrangement. Maximizing ME is the objective of the bond energy algorithm (BEA).
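The measure of effectiveness multiplies each entry by the sum of its four direct neighbors and halves the total, treating everything outside the matrix as zero. The sketch below implements that reading in base R with a zero border; `measure_of_effectiveness` is a hypothetical name, not the package's internal function.

```r
# Measure of effectiveness: half the sum of each entry times the sum of its
# left, right, upper and lower neighbours (illustrative sketch).
measure_of_effectiveness <- function(X) {
  X <- as.matrix(X)
  m <- nrow(X)
  n <- ncol(X)
  P <- matrix(0, m + 2, n + 2)        # zero border encodes the convention
  P[2:(m + 1), 2:(n + 1)] <- X
  rows <- 2:(m + 1)
  cols <- 2:(n + 1)
  neighbours <- P[rows, cols - 1, drop = FALSE] + P[rows, cols + 1, drop = FALSE] +
    P[rows - 1, cols, drop = FALSE] + P[rows + 1, cols, drop = FALSE]
  0.5 * sum(X * neighbours)
}

# Two clean blocks of ones versus the same rows interleaved
clustered <- rbind(c(1, 1, 0, 0),
                   c(1, 1, 0, 0),
                   c(0, 0, 1, 1),
                   c(0, 0, 1, 1))
scrambled <- clustered[c(1, 3, 2, 4), ]

measure_of_effectiveness(clustered)  # 8
measure_of_effectiveness(scrambled)  # 4
```

For a 0/1 matrix the value equals the number of horizontally or vertically adjacent pairs of ones, which is why the block arrangement scores higher.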

"Moore_stress", "Neumann_stress": Stress (Niermann 2005).

Stress measures the conciseness of the presentation of a matrix/table and can be seen as a purity function which compares the values in a matrix/table with their neighbors. The stress measure used here is computed as the sum of squared distances of each matrix entry from its adjacent entries. The following types of neighborhoods are available:

Moore: comprises the eight adjacent entries.

Neumann: comprises the four adjacent entries.

The major difference between the Moore and the Neumann neighborhood is that for the latter the contributions of row and column permutations to stress are independent and thus can be optimized independently.
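The two neighborhoods can be compared with a small base R sketch. It assumes that stress sums, for every entry, the squared differences to each neighbor that lies inside the matrix, so each adjacent pair is counted once from either side. The helper name `stress` and its `type` argument are hypothetical.

```r
# Stress with a Moore (8 neighbours) or Neumann (4 neighbours) neighbourhood.
stress <- function(X, type = c("moore", "neumann")) {
  type <- match.arg(type)
  X <- as.matrix(X)
  m <- nrow(X)
  n <- ncol(X)
  # NA border: neighbours outside the matrix drop out via na.rm = TRUE
  P <- matrix(NA_real_, m + 2, n + 2)
  P[2:(m + 1), 2:(n + 1)] <- X
  offsets <- if (type == "moore") {
    expand.grid(di = -1:1, dj = -1:1)   # includes (0, 0), which adds nothing
  } else {
    data.frame(di = c(-1, 1, 0, 0), dj = c(0, 0, -1, 1))
  }
  total <- 0
  for (r in seq_len(nrow(offsets))) {
    shifted <- P[(2:(m + 1)) + offsets$di[r], (2:(n + 1)) + offsets$dj[r], drop = FALSE]
    total <- total + sum((X - shifted)^2, na.rm = TRUE)
  }
  total
}

clustered <- rbind(c(1, 1, 0, 0),
                   c(1, 1, 0, 0),
                   c(0, 0, 1, 1),
                   c(0, 0, 1, 1))
scrambled <- clustered[c(1, 3, 2, 4), ]

stress(clustered, "neumann")  # 16
stress(scrambled, "neumann")  # 32
stress(clustered, "moore")    # 32
```

Stress is a loss, so the block arrangement with fewer value changes between neighbors gets the lower value. Missing values inside X are silently skipped by this sketch.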

----------Value----------

A named vector of real values.

----------References----------

G. Caraux and S. Pinloche (2005): Permutmatrix: A Graphical Environment to Arrange Gene Expression Profiles in Optimal Linear Order, Bioinformatics, 21(7), 1280–1281.
C.-H. Chen (2002): Generalized association plots: Information visualization via iteratively generated correlation matrices, Statistica Sinica, 12(1), 7–29.
L. Hubert, P. Arabie, and J. Meulman (2001): Combinatorial Data Analysis: Optimization by Dynamic Programming. Society for Industrial Mathematics.
S. Niermann (2005): Optimizing the Ordering of Tables With Evolutionary Computation, The American Statistician, 59(1), 41–46.
W.S. Robinson (1951): A method for chronologically ordering archaeological deposits, American Antiquity, 16, 293–301.
W.T. McCormick, P.J. Schweitzer and T.W. White (1972): Problem decomposition and data reorganization by a clustering technique, Operations Research, 20(5), 993–1009.

----------Examples----------

## load the package that provides seriate() and criterion()
library(seriation)

## create random data and calculate distances
m <- matrix(runif(10), ncol = 2)
d <- dist(m)

## get an order for rows (optimal for the least squares criterion)
o <- seriate(m, method = "PCA", margin = 1)
o

## compare the values for all available criteria
rbind(
  unordered = criterion(d),
  ordered = criterion(d, o)
)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

