seqient(TraMineR)
seqient() belongs to the R package: TraMineR
Within sequence entropies
Translator: 生物统计家园网 robot LoveR
Description
Computes normalized or non-normalized within-sequence entropies.
Usage
seqient(seqdata, norm=TRUE, base=exp(1), with.missing=FALSE)
Arguments
Argument: seqdata
a sequence object as returned by the seqdef function.
Argument: norm
logical: should the entropy be normalized? TRUE by default (see Details).
Argument: base
real positive value: base of the logarithm used in the entropy formula (see Details). If the entropy is normalized (norm=TRUE), its value is the same whatever the base. Default is exp(1), i.e., the natural logarithm is used.
Argument: with.missing
logical: if TRUE, the missing state (gap in sequences) is handled as an additional state when computing the state distribution in the sequence.
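As an illustration of what with.missing changes, the following base-R sketch computes the state distribution of a single sequence with and without counting gaps as a state. It does not call TraMineR; coding gaps as NA and taking proportions over the non-missing positions when gaps are ignored are assumptions made for this sketch, not statements about the package internals.

```r
# One sequence with two gaps, coded here as NA (an assumption for this sketch)
x <- c("A", "A", NA, "B", "B", NA, "A", "B")

# Gaps ignored: distribution over the 6 observed positions
p_ignored <- table(x) / sum(!is.na(x))
p_ignored

# Gaps counted as an extra state: distribution over all 8 positions
p_counted <- table(x, useNA = "always") / length(x)
p_counted
```

Counting the gap as a state adds a third category to the distribution, so the entropy computed from it generally differs from the one computed on observed states only.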
Details
The seqient function returns the Shannon entropy of each sequence in seqdata. The entropy of a sequence is computed using the formula
\[ h(p_1, \ldots, p_s) = -\sum_{i=1}^{s} p_i \log(p_i) \]
where s is the size of the alphabet and p_i is the proportion of occurrences of the i-th state in the considered sequence. With the default base=exp(1), log is the natural logarithm, i.e., the logarithm in base e. The entropy can be interpreted as the "uncertainty" of predicting the states in a given sequence. If all states in the sequence are the same, the entropy is equal to 0. The maximum entropy for a sequence of length 12 with an alphabet of 4 states is 1.386294, i.e., log(4), and is attained when each of the four states appears 3 times.
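The two boundary cases mentioned above can be checked with a few lines of base R. This is a minimal sketch of the Shannon entropy of a single sequence, not the TraMineR implementation; shannon_entropy is a hypothetical helper name introduced here for illustration.

```r
# Shannon entropy of one sequence of states (base R only, no TraMineR needed).
# `shannon_entropy` is a hypothetical helper, not a TraMineR function.
shannon_entropy <- function(states, base = exp(1)) {
  # proportions p_i of the states that occur in the sequence
  p <- as.numeric(table(states)) / length(states)
  -sum(p * log(p, base = base))
}

balanced <- rep(c("A", "B", "C", "D"), each = 3)  # length 12, each state 3 times
constant <- rep("A", 12)                          # a single repeated state

shannon_entropy(balanced)  # equals log(4), about 1.386294
shannon_entropy(constant)  # 0
```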
Normalization can be requested with the norm=TRUE option, in which case the returned value is the entropy divided by the entropy of the alphabet. The latter is an upper bound for the entropy of sequences made from this alphabet. It is exactly the maximal entropy when the sequence length is a multiple of the alphabet size. The value of the normalized entropy is independent of the chosen logarithm base.
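The normalization described above can be sketched in base R as follows. The helper normalized_entropy is hypothetical and written only for this illustration; it takes the entropy of the alphabet to be log(s) for an alphabet of s states, which is the entropy of the uniform distribution over the alphabet.

```r
# Hypothetical helper: sequence entropy divided by log(s), the alphabet entropy
normalized_entropy <- function(states, alphabet, base = exp(1)) {
  p <- as.numeric(table(factor(states, levels = alphabet))) / length(states)
  p <- p[p > 0]                      # absent states contribute 0 to the sum
  h <- -sum(p * log(p, base = base))
  h / log(length(alphabet), base = base)
}

alphabet <- c("A", "B", "C", "D")
x <- c("A", "A", "B", "C", "C", "C", "D", "A", "B", "B", "A", "C")  # length 12

# The normalized value does not depend on the logarithm base
same_value <- all.equal(normalized_entropy(x, alphabet, base = exp(1)),
                        normalized_entropy(x, alphabet, base = 2))
same_value

# Length 12 is a multiple of 4: the upper bound is attained (value 1)
h12 <- normalized_entropy(rep(alphabet, each = 3), alphabet)
h12

# Length 10 is not a multiple of 4: even the most balanced split
# (3, 3, 2, 2) stays strictly below 1
h10 <- normalized_entropy(rep(alphabet, times = c(3, 3, 2, 2)), alphabet)
h10
```

The last case shows why the alphabet entropy is only an upper bound in general: a perfectly uniform distribution is impossible when the sequence length is not a multiple of the alphabet size.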
Value
A vector with one entropy value for each sequence in seqdata; the vector length is equal to the number of sequences.
See Also
seqstatd for the entropy of the transversal state distributions by positions in the sequence.
Examples
library(TraMineR)
data(actcal)
## Build a state sequence object from columns 13 to 24
actcal.seq <- seqdef(actcal, 13:24)
## Summarize and plot a histogram
## of the within-sequence entropy
actcal.ient <- seqient(actcal.seq)
summary(actcal.ient)
hist(actcal.ient)
## Examples using the with.missing argument
data(ex1)
ex1.seq <- seqdef(ex1, 1:13, weights=ex1$weights)
seqient(ex1.seq)
seqient(ex1.seq, with.missing=TRUE)
Please credit the source when reprinting: 生物统计家园网 (http://www.biostatistic.net).