R Biostrings package: help documentation for the matchPWM() function

loveR · posted 2012-2-25 13:47:32

matchPWM(Biostrings)
R package containing matchPWM(): Biostrings

                                    PWM creating, matching, and related utilities

                                       Translator: LoveR, the robot of 生物统计家园网

----------Description----------

Utilities for creating and matching Position Weight Matrices (PWMs), plus related helpers, for DNA data. (PWMs for amino acid sequences are not supported.)

----------Usage----------

PWM(x, type = c("log2probratio", "prob"),
prior.params = c(A=0.25, C=0.25, G=0.25, T=0.25))

matchPWM(pwm, subject, min.score="80%", ...)
countPWM(pwm, subject, min.score="80%", ...)
PWMscoreStartingAt(pwm, subject, starting.at=1)

## Utility functions for basic manipulation of the Position Weight Matrix
maxWeights(x)
minWeights(x)
maxScore(x)
minScore(x)
unitScale(x)
## S4 method for signature 'matrix'
reverseComplement(x, ...)

----------Arguments----------

Argument: x
For PWM: a rectangular character vector or rectangular DNAStringSet object ("rectangular" means that all elements have the same number of characters) with no IUPAC ambiguity letters, or a Position Frequency Matrix represented as an integer matrix with row names containing at least A, C, G and T (typically the result of a call to consensusMatrix). For maxWeights, minWeights, maxScore, minScore, unitScale and reverseComplement: a Position Weight Matrix represented as a numeric matrix with row names A, C, G and T.

Argument: type
The type of Position Weight Matrix, either "log2probratio" or "prob". See the Details section for more information.

Argument: prior.params
A positive numeric vector, with names A, C, G, and T, giving the parameters of the Dirichlet conjugate prior. See the Details section for more information.

Argument: pwm
A Position Weight Matrix represented as a numeric matrix with row names A, C, G and T.

Argument: subject
A DNAString, XStringViews or MaskedDNAString object for matchPWM and countPWM. A DNAString object for PWMscoreStartingAt.

Argument: min.score
The minimum score for counting a match. Can be given either as a character string containing a percentage (e.g. "85%") of the highest possible score, or as a single number.

Argument: starting.at
An integer vector specifying the starting positions of the Position Weight Matrix relative to the subject.

Argument: ...
Additional arguments for methods.
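
To make the roles of pwm, subject, starting.at and min.score more concrete, here is a minimal base-R sketch. It does not call Biostrings and is not the package's implementation: the toy weights, the plain character subject, and the helpers score_at() and threshold_from() are all hypothetical names introduced for illustration. It assumes that a window's score is the sum of the weights of its letters, one per PWM column, and that a percentage min.score is taken relative to the highest possible score, as the min.score argument describes.

```r
# Toy PWM: rows A, C, G, T; 3 motif positions (made-up weights).
pwm <- matrix(c(0.30, 0.05, 0.10,
                0.02, 0.25, 0.05,
                0.01, 0.03, 0.40,
                0.07, 0.07, 0.05),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("A", "C", "G", "T"), NULL))

# Hypothetical helper: score of the window of `subject` starting at `start`,
# computed as the sum of one weight per motif position.
score_at <- function(pwm, subject, start) {
  window <- strsplit(substr(subject, start, start + ncol(pwm) - 1L), "")[[1]]
  sum(pwm[cbind(match(window, rownames(pwm)), seq_len(ncol(pwm)))])
}

# Hypothetical helper: turn a min.score such as "80%" into an absolute cutoff
# (assumed to be a percentage of the highest possible score).
threshold_from <- function(min.score, pwm) {
  if (is.numeric(min.score)) return(min.score)
  pct <- as.numeric(sub("%$", "", min.score))
  sum(apply(pwm, 2, max)) * pct / 100
}

subject <- "TTACGACGTT"                       # plain string standing in for a DNAString
starts  <- seq_len(nchar(subject) - ncol(pwm) + 1L)
scores  <- vapply(starts, function(s) score_at(pwm, subject, s), numeric(1))
cutoff  <- threshold_from("80%", pwm)
hit_starts <- starts[scores >= cutoff]        # start positions that count as matches
```

In this toy setup, length(hit_starts) plays the role of the countPWM result, and scores[1:5] corresponds to scoring with starting.at=1:5.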

----------Details----------

The PWM function uses a multinomial model with a Dirichlet conjugate prior to calculate the estimated probability of base b at position i. As mentioned in the Arguments section, prior.params supplies the parameters for the DNA bases A, C, G, and T in the Dirichlet prior. These values give a position-independent initial estimate of the base probabilities, priorProbs = prior.params/sum(prior.params), and a posterior (data-infused) estimate of the base probabilities at each position, postProbs = (consensusMatrix(x) + prior.params)/(length(x) + sum(prior.params)). When type = "log2probratio", the PWM = unitScale(log2(postProbs/priorProbs)). When type = "prob", the PWM = unitScale(postProbs).
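
The formulas above can be traced by hand in base R. The sketch below is an illustration under stated assumptions, not the Biostrings source: the count matrix is made up, nseq stands in for length(x), and unit_scale() is a hypothetical re-implementation of the unitScale formula given in the Value section.

```r
# Toy Position Frequency Matrix: counts from 10 aligned sequences, width 3.
pfm <- matrix(c(8L, 1L, 0L,
                1L, 7L, 1L,
                0L, 1L, 9L,
                1L, 1L, 0L),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("A", "C", "G", "T"), NULL))
nseq <- sum(pfm[, 1])                 # number of sequences; plays the role of length(x)
prior.params <- c(A = 0.25, C = 0.25, G = 0.25, T = 0.25)

# Position-independent prior estimate and per-position posterior estimate.
# The length-4 prior vector is recycled down each column, so row A gets the
# A parameter, row C the C parameter, and so on.
priorProbs <- prior.params / sum(prior.params)
postProbs  <- (pfm + prior.params) / (nseq + sum(prior.params))

# Hypothetical stand-in for unitScale(): rescales so that the lowest possible
# score becomes 0 and the highest possible score becomes 1.
unit_scale <- function(x) {
  min_score <- sum(apply(x, 2, min))
  max_score <- sum(apply(x, 2, max))
  (x - min_score / ncol(x)) / (max_score - min_score)
}

pwm_log  <- unit_scale(log2(postProbs / priorProbs))   # type = "log2probratio"
pwm_prob <- unit_scale(postProbs)                      # type = "prob"
```

Each column of postProbs sums to 1, and after scaling the column maxima of either matrix sum to 1 while the column minima sum to 0.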

----------Value----------

A numeric matrix representing the Position Weight Matrix for PWM.

A numeric vector containing the Position Weight Matrix-based scores for PWMscoreStartingAt.

An XStringViews object for matchPWM.

A single integer for countPWM.

A vector containing the max weight for each position in pwm for maxWeights.

A vector containing the min weight for each position in pwm for minWeights.

The highest possible score for a given Position Weight Matrix for maxScore.

The lowest possible score for a given Position Weight Matrix for minScore.

The modified numeric matrix given by (x - minScore(x)/ncol(x))/(maxScore(x) - minScore(x)) for unitScale.

A PWM obtained by reversing the column order in PWM x and by reassigning each row to its complementary nucleotide for reverseComplement.
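
As a rough illustration of the return values listed above, the following base-R sketch reproduces them on a toy matrix without calling Biostrings. The weights are made up, and the assumption that the highest and lowest possible scores are the sums of the per-position maxima and minima is mine; the package functions may be implemented differently.

```r
# Toy PWM: rows A, C, G, T; 3 motif positions (made-up weights).
pwm <- matrix(c(0.30, 0.05, 0.10,
                0.02, 0.25, 0.05,
                0.01, 0.03, 0.40,
                0.07, 0.07, 0.05),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("A", "C", "G", "T"), NULL))

max_weights <- apply(pwm, 2, max)   # analogue of maxWeights(pwm)
min_weights <- apply(pwm, 2, min)   # analogue of minWeights(pwm)
max_score   <- sum(max_weights)     # analogue of maxScore(pwm), by assumption
min_score   <- sum(min_weights)     # analogue of minScore(pwm), by assumption

# Analogue of reverseComplement(pwm): reverse the column order, then give each
# row to its complementary nucleotide (A <-> T, C <-> G).
rc <- pwm[c("T", "G", "C", "A"), ncol(pwm):1]
rownames(rc) <- c("A", "C", "G", "T")
```

Applying the same reverse-complement step twice returns the original matrix, which is a quick sanity check on the row and column bookkeeping.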

----------Author(s)----------

H. Pages and P. Aboyoun

----------References----------

identification of regulatory elements, Nat Rev Genet., 5(4):276-87.

----------See Also----------

consensusMatrix, matchPattern, reverseComplement, DNAString-class, XStringViews-class

----------Examples----------

  ## Data setup:
  library(Biostrings)  # provides the HNF4alpha data set and the PWM functions
  data(HNF4alpha)
  library(BSgenome.Dmelanogaster.UCSC.dm3)
  chr3R <- Dmelanogaster$chr3R
  chr3R

  ## Create a PWM from a PFM or directly from a rectangular
  ## DNAStringSet object:
  pfm <- consensusMatrix(HNF4alpha)
  pwm <- PWM(pfm)  # same as 'PWM(HNF4alpha)'

  ## Perform some general routines on the PWM:
  round(pwm, 2)
  maxWeights(pwm)
  maxScore(pwm)
  reverseComplement(pwm)

  ## Score the first 5 positions:
  PWMscoreStartingAt(pwm, unmasked(chr3R), starting.at=1:5)

  ## Match the plus strand:
  hits <- matchPWM(pwm, chr3R)
  nhit <- countPWM(pwm, chr3R)  # same as 'length(hits)'

  ## Post-calculate the scores of the hits:
  scores <- PWMscoreStartingAt(pwm, subject(hits), start(hits))

  ## Match the minus strand:
  matchPWM(reverseComplement(pwm), chr3R)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

