count(seqinr)
count() is part of the R package seqinr
Composition of dimer/trimer/etc oligomers
----------Description----------
Counts the number of times each dimer, trimer, or longer oligomer occurs in a sequence. Note that oligomers overlap by default.
----------Usage----------
count(seq, wordsize, start = 0, by = 1, freq = FALSE, alphabet = s2c("acgt"), frame = start)
----------Arguments----------
Argument: seq
a vector of single characters.
Argument: wordsize
an integer giving the size of the word (n-mer) to count.
Argument: start
an integer (0, 1, 2, ...) giving the starting position to consider in the sequence. The default value 0 means that counting starts at the first nucleotide of the sequence.
Argument: by
an integer giving the window step; defaults to 1.
Argument: freq
if TRUE, relative word frequencies (summing to 1) are returned instead of counts.
Argument: alphabet
a vector of single characters used to build the oligomer set.
Argument: frame
a synonym for start.
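To make the interplay of start and by concrete, here is a minimal base-R sketch of which window positions would be visited. The helper window_starts is hypothetical and not part of seqinr; it also assumes that a window is only counted when it fits entirely inside the sequence, which this page does not state explicitly.

```r
# Hypothetical helper (not part of seqinr): 1-based start positions of the
# windows visited in a sequence of length n.
window_starts <- function(n, wordsize, start = 0, by = 1) {
  first <- start + 1          # start = 0 means the first character
  last  <- n - wordsize + 1   # assumption: a window must fit entirely
  if (first > last) return(integer(0))
  seq.int(from = first, to = last, by = by)
}

window_starts(9, wordsize = 3, start = 0, by = 3)  # codons in frame 0
window_starts(9, wordsize = 3, start = 1, by = 3)  # codons in frame 1
window_starts(5, wordsize = 2)                     # overlapping dinucleotides
```

With by = 1 consecutive windows overlap; setting by equal to wordsize gives non-overlapping words such as codons in a given reading frame.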
----------Details----------
count counts the occurrences of all words by moving a window of length wordsize along the sequence. The window step is controlled by the argument by. start controls the starting position in the sequence for the count.
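The sliding-window idea described above can be sketched in base R as follows. This is an illustration only, not seqinr's actual implementation: the function name count_sketch is hypothetical, and the handling of windows at the end of the sequence and of characters outside the alphabet (silently dropped here) are assumptions.

```r
# Hypothetical re-implementation sketch; not the seqinr source code.
count_sketch <- function(chars, wordsize, start = 0, by = 1,
                         alphabet = c("a", "c", "g", "t")) {
  # All possible words, first letter varying slowest: aa, ac, ag, at, ca, ...
  grid <- expand.grid(rep(list(alphabet), wordsize), stringsAsFactors = FALSE)
  all_words <- do.call(paste0, unname(rev(as.list(grid))))

  # 1-based start positions of the windows that fit inside the sequence
  first <- start + 1
  last <- length(chars) - wordsize + 1
  starts <- if (first > last) integer(0) else seq.int(first, last, by = by)

  # Paste the characters of each window into one word
  words <- vapply(starts,
                  function(i) paste(chars[i:(i + wordsize - 1)], collapse = ""),
                  character(1))

  # Fixed factor levels keep words that never occur (count 0);
  # words with characters outside the alphabet become NA and are dropped
  table(factor(words, levels = all_words))
}

a <- strsplit("acgggtacggtcccatcgaa", "")[[1]]
count_sketch(a, 2)
```

Using a factor with fixed levels is one simple way to obtain an entry for every possible oligomer; a production implementation would likely avoid building each word with paste for speed.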
----------Value----------
This function returns a table whose dimnames are all the possible oligomers. All oligomers are returned, even if absent from the sequence.
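As a small illustration of the shape of such a result, the base-R snippet below builds a count table by hand for a made-up two-letter alphabet and converts it to relative frequencies. The data are invented for the example, and dividing by the total is an assumption about what freq = TRUE amounts to, based only on the statement that the frequencies sum to 1.

```r
# Invented counts over the alphabet {a, b}; the word "ba" never occurs
# but still gets an entry because it is listed in the factor levels.
counts <- table(factor(c("aa", "ab", "ab", "bb"),
                       levels = c("aa", "ab", "ba", "bb")))

# Relative frequencies: each count divided by the total number of words
freqs <- counts / sum(counts)

counts
freqs
```

Keeping zero entries means that results from different sequences have the same length and names, so they can be compared or combined position by position.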
----------Author(s)----------
D. Charif, J.R. Lobry with suggestions from Gabriel Valiente, Stefanie Hartmann and Christian Gautier
----------See Also----------
table for the class of the returned object. See rho and
----------Examples----------
a <- s2c("acgggtacggtcccatcgaa")
##
## To count dinucleotide occurrences in sequence a:
##
count(a, wordsize = 2)
##
## To count trinucleotide occurrences in sequence a, with start = 2:
##
count(a, wordsize = 3, start = 2)
##
## To count dinucleotide relative frequencies in sequence a:
##
count(a, wordsize = 2, freq = TRUE)
##
## To count dinucleotides in codon positions III-I in a coding sequence:
##
alldinuclIIIpI <- s2c("NNaaNatNttNtgNgtNtcNctNtaNagNggNgcNcgNgaNacNccNcaNN")
resIIIpI <- count(alldinuclIIIpI, wordsize = 2, start = 2, by = 3)
stopifnot(all( resIIIpI == 1))
##
## Simple sanity check:
##
alldinucl <- "aattgtctaggcgacca"
stopifnot(all(count(s2c(alldinucl), 2) == 1))
alldiaa <- "aaxxzxbxvxyxwxtxsxpxfxmxkxlxixhxgxexqxcxdxnxrxazzbzvzyzwztzszpzfzmzkzlzizhzgzezqzczdznzrzabbvbybwbtbsbpbfbmbkblbibhbgbebqbcbdbnbrbavvyvwvtvsvpvfvmvkvlvivhvgvevqvcvdvnvrvayywytysypyfymykylyiyhygyeyqycydynyryawwtwswpwfwmwkwlwiwhwgwewqwcwdwnwrwattstptftmtktltithtgtetqtctdtntrtasspsfsmskslsishsgsesqscsdsnsrsappfpmpkplpiphpgpepqpcpdpnprpaffmfkflfifhfgfefqfcfdfnfrfammkmlmimhmgmemqmcmdmnmrmakklkikhkgkekqkckdknkrkallilhlglelqlcldlnlrlaiihigieiqicidiniriahhghehqhchdhnhrhaggegqgcgdgngrgaeeqecedenereaqqcqdqnqrqaccdcncrcaddndrdannrnarra"
stopifnot(all(count(s2c(alldiaa), 2, alphabet = s2c("arndcqeghilkmfpstwyvbzx")) == 1))
##
## Example with dinucleotide count in the complete Human mitochondrion genome:
##
humanMito <- read.fasta(file = system.file("sequences/humanMito.fasta", package = "seqinr"))
##
## Get the dinucleotide count:
##
dinu <- count(humanMito[[1]], 2)
##
## Put the results in a 4 X 4 array:
##
dinu2 <- dinu
dim(dinu2) <- c(4, 4)
nucl <- s2c("ACGT")
dimnames(dinu2) <- list(paste(nucl, "-3\'", sep = ""), paste("5\'-", nucl, sep = ""))
##
## Show that CpG and GpT dinucleotides are depleted:
##
mosaicplot(t(dinu2), shade = TRUE,
  main = "Dinucleotide XpY frequencies in the Human\nmitochondrion complete genome",
  xlab = "First nucleotide: Xp",
  ylab = "Second nucleotide: pY", las = 1, cex = 1)
mtext("Note the depletion in CpG and GpT dinucleotides", side = 1, line = 3)