matchPWM(Biostrings)
matchPWM() belongs to R package: Biostrings
PWM creating, matching, and related utilities
Translator: 生物统计家园网 robot LoveR
----------Description----------
Position Weight Matrix (PWM) creating, matching, and related utilities for DNA data. (PWMs for amino acid sequences are not supported.)
----------Usage----------
PWM(x, type = c("log2probratio", "prob"),
prior.params = c(A=0.25, C=0.25, G=0.25, T=0.25))
matchPWM(pwm, subject, min.score="80%", ...)
countPWM(pwm, subject, min.score="80%", ...)
PWMscoreStartingAt(pwm, subject, starting.at=1)
## Utility functions for basic manipulation of the Position Weight Matrix
maxWeights(x)
minWeights(x)
maxScore(x)
minScore(x)
unitScale(x)
## S4 method for signature 'matrix'
reverseComplement(x, ...)
----------Arguments----------
Argument: x
For PWM: a rectangular character vector or rectangular DNAStringSet object ("rectangular" means that all elements have the same number of characters) with no IUPAC ambiguity letters, or a Position Frequency Matrix represented as an integer matrix with row names containing at least A, C, G and T (typically the result of a call to consensusMatrix). For maxWeights, minWeights, maxScore, minScore, unitScale and reverseComplement: a Position Weight Matrix represented as a numeric matrix with row names A, C, G and T.
Argument: type
The type of Position Weight Matrix, either "log2probratio" or "prob". See Details section for more information.
Argument: prior.params
A positive numeric vector, which represents the parameters of the Dirichlet conjugate prior, with names A, C, G, and T. See Details section for more information.
Argument: pwm
A Position Weight Matrix represented as a numeric matrix with row names A, C, G and T.
Argument: subject
A DNAString, XStringViews or MaskedDNAString object for matchPWM and countPWM. A DNAString object for PWMscoreStartingAt.
Argument: min.score
The minimum score for counting a match. Can be given as a character string containing a percentage (e.g. "85%") of the highest possible score or as a single number.
Argument: starting.at
An integer vector specifying the starting positions of the Position Weight Matrix relative to the subject.
Argument: ...
Additional arguments for methods.
----------Details----------
The PWM function uses a multinomial model with a Dirichlet conjugate prior to calculate the estimated probability of base b at position i. As mentioned in the Arguments section, prior.params supplies the parameters for the DNA bases A, C, G, and T in the Dirichlet prior. These values result in a position independent initial estimate of the probabilities for the bases to be priorProbs = prior.params/sum(prior.params) and the posterior (data infused) estimate for the probabilities for the bases in each of the positions to be postProbs = (consensusMatrix(x) + prior.params)/(length(x) + sum(prior.params)). When type = "log2probratio", the PWM = unitScale(log2(postProbs/priorProbs)). When type = "prob", the PWM = unitScale(postProbs).
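The formulas in this paragraph can be traced by hand on a small example. The following is a minimal base R sketch, not the Biostrings implementation: the four aligned sites are invented for illustration, and `unit_scale_sketch` is a hypothetical helper that follows the `unitScale` formula as stated in the Value section.

```r
# Hypothetical aligned sites of equal width (invented data).
sites <- c("ACGT", "ACGA", "TCGT", "ACTT")
bases <- c("A", "C", "G", "T")
prior.params <- c(A = 0.25, C = 0.25, G = 0.25, T = 0.25)

# Position frequency matrix: one row per base, one column per position.
chars <- do.call(cbind, strsplit(sites, ""))  # rows = positions, cols = sites
pfm <- sapply(seq_len(nrow(chars)),
              function(i) tabulate(match(chars[i, ], bases), nbins = 4))
rownames(pfm) <- bases

# Prior and posterior probability estimates, as written in the text.
priorProbs <- prior.params / sum(prior.params)
postProbs <- (pfm + prior.params) / (length(sites) + sum(prior.params))

# Hypothetical stand-in for unitScale(), following the documented formula.
unit_scale_sketch <- function(x) {
  min_score <- sum(apply(x, 2, min))
  max_score <- sum(apply(x, 2, max))
  (x - min_score / ncol(x)) / (max_score - min_score)
}

# type = "log2probratio"
pwm_sketch <- unit_scale_sketch(log2(postProbs / priorProbs))
round(pwm_sketch, 2)
```

Because `prior.params` has one entry per row, R's column-wise recycling adds the prior for each base to every position. After scaling, the column maxima sum to 1 and the column minima sum to 0, so any score computed from the matrix falls in the unit interval.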
----------Value----------
A numeric matrix representing the Position Weight Matrix for PWM.
A numeric vector containing the Position Weight Matrix-based scores for PWMscoreStartingAt.
An XStringViews object for matchPWM.
A single integer for countPWM.
A vector containing the max weight for each position in pwm for maxWeights.
A vector containing the min weight for each position in pwm for minWeights.
The highest possible score for a given Position Weight Matrix for maxScore.
The lowest possible score for a given Position Weight Matrix for minScore.
The modified numeric matrix given by (x - minScore(x)/ncol(x))/(maxScore(x) - minScore(x)) for unitScale.
A PWM obtained by reversing the column order in PWM x and by reassigning each row to its complementary nucleotide for reverseComplement.
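The return values described in this section can be mimicked on a plain numeric matrix. The sketch below uses base R only. The `toy` matrix and the `*_sketch` names are hypothetical stand-ins written from the descriptions above, not the Biostrings functions themselves.

```r
# Hypothetical 3-position weight matrix with rows A, C, G, T (invented values).
toy <- matrix(c(0.10, 0.20, 0.30, 0.40,
                0.40, 0.30, 0.20, 0.10,
                0.25, 0.25, 0.25, 0.25),
              nrow = 4, dimnames = list(c("A", "C", "G", "T"), NULL))

# Per-position extremes, and the scores they add up to.
max_weights <- apply(toy, 2, max)
min_weights <- apply(toy, 2, min)
max_score <- sum(max_weights)
min_score <- sum(min_weights)

# Reverse the column order and hand each row to its complementary base:
# new A <- old T, new C <- old G, new G <- old C, new T <- old A.
rc_sketch <- function(x) {
  ans <- x[c("T", "G", "C", "A"), rev(seq_len(ncol(x))), drop = FALSE]
  rownames(ans) <- c("A", "C", "G", "T")
  ans
}

rc_sketch(toy)
```

Applying `rc_sketch` twice returns the original matrix, which is a quick sanity check on the row and column bookkeeping.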
----------Author(s)----------
H. Pages and P. Aboyoun
----------References----------
identification of regulatory elements, Nat Rev Genet., 5(4):276-87.
----------See Also----------
consensusMatrix, matchPattern, reverseComplement, DNAString-class, XStringViews-class
----------Examples----------
## Data setup:
library(Biostrings)
data(HNF4alpha)
library(BSgenome.Dmelanogaster.UCSC.dm3)
chr3R <- Dmelanogaster$chr3R
chr3R
## Create a PWM from a PFM or directly from a rectangular
## DNAStringSet object:
pfm <- consensusMatrix(HNF4alpha)
pwm <- PWM(pfm)  # same as 'PWM(HNF4alpha)'
## Perform some general routines on the PWM:
round(pwm, 2)
maxWeights(pwm)
maxScore(pwm)
reverseComplement(pwm)
## Score the first 5 positions:
PWMscoreStartingAt(pwm, unmasked(chr3R), starting.at=1:5)
## Match the plus strand:
hits <- matchPWM(pwm, chr3R)
nhit <- countPWM(pwm, chr3R)  # same as 'length(hits)'
## Post-calculate the scores of the hits:
scores <- PWMscoreStartingAt(pwm, subject(hits), start(hits))
## Match the minus strand:
matchPWM(reverseComplement(pwm), chr3R)
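To make the scoring and the `min.score` percentage concrete without a genome package, here is a minimal base R sketch on a short character string. It assumes that a window's score is the sum of the weights of its letters at each position, and that a hit is a window scoring at or above the threshold. The matrix, the subject string, and the `*_sketch` helpers are hypothetical and only imitate the behaviour described in the Arguments section.

```r
# Hypothetical PWM favouring the motif "ACG" (invented weights).
toy_pwm <- matrix(c(0.30, 0.00, 0.00, 0.05,
                    0.00, 0.35, 0.00, 0.00,
                    0.05, 0.00, 0.35, 0.00),
                  nrow = 4, dimnames = list(c("A", "C", "G", "T"), NULL))
subject <- "TTACGTACGA"

# Score of the window beginning at each requested start position.
score_starting_at_sketch <- function(pwm, subject, starting.at = 1) {
  subject_letters <- strsplit(subject, "")[[1]]
  width <- ncol(pwm)
  vapply(starting.at, function(s) {
    window <- subject_letters[s:(s + width - 1)]
    # Pick one weight per position: (row of the letter, column of the position).
    sum(pwm[cbind(match(window, rownames(pwm)), seq_len(width))])
  }, numeric(1))
}

# Turn "80%" into an absolute cutoff relative to the highest possible score.
threshold_sketch <- function(pwm, min.score) {
  max_score <- sum(apply(pwm, 2, max))
  if (is.character(min.score)) {
    as.numeric(sub("%$", "", min.score)) / 100 * max_score
  } else {
    min.score
  }
}

starts <- seq_len(nchar(subject) - ncol(toy_pwm) + 1)
scores <- score_starting_at_sketch(toy_pwm, subject, starts)
hit_starts <- starts[scores >= threshold_sketch(toy_pwm, "80%")]
hit_starts
```

Only the two "ACG" windows, at positions 3 and 7, reach the cutoff here. Scanning the minus strand would repeat the same loop with the reverse-complemented matrix.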