seqdistmc(TraMineR)
seqdistmc() belongs to the R package: TraMineR
Multichannel distances between sequences
Translator: 生物统计家园网 robot LoveR
Description
Compute multichannel pairwise distances between sequences. Several metrics are available: optimal matching (OM), the longest common subsequence (LCS), the Hamming distance (HAM) and the Dynamic Hamming distance (DHD).
Usage
seqdistmc(channels, method, norm=FALSE, indel=1, sm=NULL,
with.missing=FALSE, full.matrix=TRUE, link="sum", cval=2,
Arguments
Argument: channels
A list of state sequence objects defined with the seqdef function, each state sequence object corresponding to a "channel".
Argument: method
A character string indicating the metric to be used. One of "OM" (Optimal Matching), "LCS" (Longest Common Subsequence), "HAM" (Hamming distance), "DHD" (Dynamic Hamming distance).
Argument: norm
If TRUE, the computed distances are normalized to account for differences in sequence lengths. Default is FALSE. See details.
Argument: indel
A vector with an insertion/deletion cost for each channel (OM method).
Argument: sm
A list with a substitution-cost matrix for each channel (OM, HAM and DHD methods) or a list of method names for generating the substitution costs (see seqsubm).
Argument: with.missing
Must be set to TRUE when sequences contain non-deleted gaps (missing values) or when channels are of different lengths. See details.
Argument: full.matrix
If TRUE (default), the full distance matrix is returned. If FALSE, an object of class dist is returned.
Argument: link
Method used to combine ("link") the substitution costs across channels. One of "sum" or "mean". The default, "sum", sums the substitution costs.
Argument: cval
Substitution cost for the "CONSTANT" matrix; see seqsubm.
Argument: miss.cost
Substitution cost for missing values; see seqsubm.
Argument: cweight
A vector of channel weights. Default is 1 (same weight for each channel).
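To make the interplay of link and cweight concrete, here is a minimal arithmetic sketch for a single substitution. The per-channel costs are made-up numbers, and the rule that "mean" divides the weighted sum by the total weight is an assumption made for illustration, not something stated on this page.

```r
# Hypothetical costs of one substitution in each channel (illustrative values)
channel_cost <- c(children = 2, married = 2, left = 1.5)

# Channel weights: weight 2 on the first channel, 1 on the others
cweight <- c(2, 1, 1)

# link = "sum": weighted sum of the per-channel costs
cost_sum <- sum(cweight * channel_cost)

# link = "mean": ASSUMED here to be the weighted sum divided by the total weight
cost_mean <- cost_sum / sum(cweight)

print(c(sum = cost_sum, mean = cost_mean))
```

With equal weights the two links differ only by a constant factor, so they rank sequence pairs identically when the indel costs are scaled the same way.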
Details
The seqdistmc function returns a matrix of multichannel distances between sequences. The available metrics (see the method argument) are optimal matching ("OM"), longest common subsequence ("LCS"), Hamming distance ("HAM") and Dynamic Hamming Distance ("DHD"). See seqdist for more information about distances between sequences.
The seqdistmc function computes a multichannel distance in two steps, following the strategy proposed by Pollock (2007). First, it builds a new sequence object derived from the combination of the sequences of each channel. Second, it derives the substitution-cost matrix by summing (or averaging) the costs of substitution across channels. It then calls seqdist to compute the final matrix.
Normalization may be useful when dealing with sequences that are not all of the same length. For details on the applied normalization, see seqdist.
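The two steps described above can be sketched in base R on a toy two-channel alphabet. This is an illustration of the idea only, not TraMineR's internal code: the helper names make_const and combine_costs, the "+" separator and the constant cost of 2 are all assumptions chosen for the example.

```r
# Toy two-channel setup (illustrative only, not TraMineR internals)
chan_states <- list(child = c("no", "yes"), married = c("no", "yes"))

# Hypothetical helper: constant-cost substitution matrix for one channel
make_const <- function(states, cval = 2) {
  m <- matrix(cval, nrow = length(states), ncol = length(states),
              dimnames = list(states, states))
  diag(m) <- 0
  m
}
sm <- lapply(chan_states, make_const)

# Step 1: the combined alphabet is every combination of channel states
combos <- expand.grid(chan_states, stringsAsFactors = FALSE)
combo_labels <- do.call(paste, c(combos, sep = "+"))

# Step 2: cost between two combined states = sum (or mean) of channel costs
combine_costs <- function(combos, sm, link = c("sum", "mean")) {
  link <- match.arg(link)
  n <- nrow(combos)
  out <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      costs <- vapply(names(sm), function(ch) {
        sm[[ch]][combos[i, ch], combos[j, ch]]
      }, numeric(1))
      out[i, j] <- if (link == "sum") sum(costs) else mean(costs)
    }
  }
  out
}

newsm_sum <- combine_costs(combos, sm, "sum")
newsm_mean <- combine_costs(combos, sm, "mean")
dimnames(newsm_sum) <- dimnames(newsm_mean) <- list(combo_labels, combo_labels)
print(newsm_sum)
```

In this sketch, changing state in both channels at once costs twice as much under "sum" as changing state in one channel, which is the behaviour the combined substitution-cost matrix is meant to capture.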
Value
A matrix of pairwise distances between sequences is returned (or an object of class dist when full.matrix is FALSE).
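As one possible way to use the returned matrix, the sketch below feeds a distance matrix to hierarchical clustering with base R. The 3 x 3 matrix is a made-up stand-in for seqdistmc output, and the choice of average linkage and two groups is an assumption for illustration.

```r
# Toy symmetric distance matrix standing in for seqdistmc() output
ids <- c("s1", "s2", "s3")
d <- matrix(c(0, 2, 6,
              2, 0, 5,
              6, 5, 0),
            nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

# Convert the full matrix to the compact "dist" form
d_dist <- as.dist(d)

# Hierarchical clustering on the distances, then cut into two groups
hc <- hclust(d_dist, method = "average")
groups <- cutree(hc, k = 2)
print(groups)
```

Here s1 and s2 are the closest pair, so they end up in the same group while s3 forms its own.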
References
See Also
Examples
## Load the package and the biofam example data set
library(TraMineR)
data(biofam)
## Building one channel per type of event: left, children or married
bf <- as.matrix(biofam[, 10:25])
children <- bf == 4 | bf == 5 | bf == 6
married <- bf == 2 | bf == 3 | bf == 6
left <- bf == 1 | bf == 3 | bf == 5 | bf == 6
## Building sequence objects
child.seq <- seqdef(children)
marr.seq <- seqdef(married)
left.seq <- seqdef(left)
## Using transition rates to compute substitution costs on each channel
mcdist <- seqdistmc(channels=list(child.seq, marr.seq, left.seq),
    method="OM", sm=list("TRATE", "TRATE", "TRATE"))
## Using a weight of 2 for the children channel and specifying substitution costs
smatrix <- list()
smatrix[[1]] <- seqsubm(child.seq, method="CONSTANT")
smatrix[[2]] <- seqsubm(marr.seq, method="CONSTANT")
smatrix[[3]] <- seqsubm(left.seq, method="TRATE")
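## NOTE: the end of this call is cut off in the source; the arguments on the
## second line are assumed from the comment above (weight 2 on children)
mcdist2 <- seqdistmc(channels=list(child.seq, marr.seq, left.seq),
    method="OM", sm=smatrix, cweight=c(2, 1, 1))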
Please credit the source when reposting: 生物统计家园网 (http://www.biostatistic.net).