
R Biostrings package: help documentation for the matchPWM() function

Posted on 2012-2-25 13:47:32
matchPWM(Biostrings)
matchPWM() belongs to the R package Biostrings

                                        PWM creation, matching, and related utilities

                                         Translator: LoveR, the translation bot of 生物统计家园网

----------Description----------

Utilities for creating and matching Position Weight Matrices (PWMs), plus related helpers, for DNA data. (PWMs for amino acid sequences are not supported.)


----------Usage----------


PWM(x, type = c("log2probratio", "prob"),
    prior.params = c(A=0.25, C=0.25, G=0.25, T=0.25))

matchPWM(pwm, subject, min.score="80%", ...)
countPWM(pwm, subject, min.score="80%", ...)
PWMscoreStartingAt(pwm, subject, starting.at=1)

## Utility functions for basic manipulation of the Position Weight Matrix
maxWeights(x)
minWeights(x)
maxScore(x)
minScore(x)
unitScale(x)
## S4 method for signature 'matrix'
reverseComplement(x, ...)



----------Arguments----------

Argument: x
For PWM: a rectangular character vector or rectangular DNAStringSet object ("rectangular" means that all elements have the same number of characters) with no IUPAC ambiguity letters, or a Position Frequency Matrix represented as an integer matrix with row names containing at least A, C, G and T (typically the result of a call to consensusMatrix).  For maxWeights, minWeights, maxScore, minScore, unitScale and reverseComplement: a Position Weight Matrix represented as a numeric matrix with row names A, C, G and T.


Argument: type
The type of Position Weight Matrix, either "log2probratio" or "prob". See the Details section for more information.


Argument: prior.params
A positive numeric vector, with names A, C, G, and T, giving the parameters of the Dirichlet conjugate prior. See the Details section for more information.


Argument: pwm
A Position Weight Matrix represented as a numeric matrix with row names A, C, G and T.


Argument: subject
For matchPWM and countPWM: a DNAString, XStringViews or MaskedDNAString object.  For PWMscoreStartingAt: a DNAString object.


Argument: min.score
The minimum score for counting a match. Can be given either as a character string containing a percentage (e.g. "85%") of the highest possible score, or as a single number.
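The two accepted forms of min.score can be illustrated with a small base-R sketch. The helper name resolve_min_score is hypothetical and not part of Biostrings; it only assumes, as the description above states, that a percentage is taken relative to the highest possible score.

```r
# Hypothetical helper (not a Biostrings function): turn min.score into an
# absolute threshold, given the highest possible score of the PWM.
resolve_min_score <- function(min.score, max.score) {
  # A plain number is used as-is.
  if (is.numeric(min.score)) return(as.double(min.score))
  # A string such as "80%" is read as a percentage of max.score.
  pct <- as.double(sub("%$", "", min.score))
  max.score * pct / 100
}

resolve_min_score("80%", max.score = 2)  # percentage of the maximum
resolve_min_score(0.9, max.score = 2)    # already an absolute score
```

With a unit-scaled PWM the highest possible score is 1, so "80%" and 0.8 would give the same threshold under this reading.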


Argument: starting.at
An integer vector specifying the starting positions of the Position Weight Matrix relative to the subject.


Argument: ...
Additional arguments for methods.


----------Details----------

The PWM function uses a multinomial model with a Dirichlet conjugate prior to calculate the estimated probability of base b at position i. As mentioned in the Arguments section, prior.params supplies the parameters for the DNA bases A, C, G, and T in the Dirichlet prior. These values give a position-independent initial estimate of the base probabilities, priorProbs = prior.params/sum(prior.params), and a posterior (data-infused) estimate of the base probabilities at each position, postProbs = (consensusMatrix(x) + prior.params)/(length(x) + sum(prior.params)). When type = "log2probratio", the PWM = unitScale(log2(postProbs/priorProbs)). When type = "prob", the PWM = unitScale(postProbs).
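The formulas above can be traced by hand in base R. This is a minimal sketch on made-up sites, not the package's implementation: the input sequences and variable names are assumptions, the frequency matrix is built with tabulate() instead of consensusMatrix(), and the final unitScale step is left out.

```r
# Toy input: four aligned sites of equal width (hypothetical data).
sites <- c("ACGT", "ACGA", "ACTT", "GCGT")
bases <- c("A", "C", "G", "T")

# Position frequency matrix: one row per base, one column per position.
chars <- do.call(rbind, strsplit(sites, ""))
pfm <- vapply(seq_len(ncol(chars)),
              function(j) tabulate(match(chars[, j], bases), nbins = 4L),
              integer(4))
rownames(pfm) <- bases

# Dirichlet prior parameters (the default shown in Usage).
prior.params <- c(A = 0.25, C = 0.25, G = 0.25, T = 0.25)

# Position-independent prior estimate and per-position posterior estimate.
# The length-4 vector is recycled down the rows, so each base gets its own
# prior parameter.
priorProbs <- prior.params / sum(prior.params)
postProbs <- (pfm + prior.params) / (length(sites) + sum(prior.params))

# Log2 probability ratio; the documented PWM additionally applies unitScale.
log2ratio <- log2(postProbs / priorProbs)
round(log2ratio, 2)
```

Each column of postProbs sums to 1, and bases that never occur at a position still get a small non-zero probability from the prior, which keeps the log ratio finite.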


----------Value----------

For PWM: a numeric matrix representing the Position Weight Matrix.

For PWMscoreStartingAt: a numeric vector containing the Position Weight Matrix-based scores.
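To make the notion of a PWM-based score concrete, here is a base-R sketch that scores a plain character string at given start positions by summing, for each column, the weight of the base found in the subject. The function score_starting_at and the toy matrix are hypothetical; the real PWMscoreStartingAt works on DNAString objects, and its handling of windows that run past the end of the subject is not covered here.

```r
# Toy PWM with row names A, C, G, T and three positions (made-up weights).
pwm <- matrix(c(0.30, 0.00, 0.05,
                0.00, 0.40, 0.05,
                0.05, 0.00, 0.30,
                0.05, 0.00, 0.00),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("A", "C", "G", "T"), NULL))

# Hypothetical re-implementation for plain strings: for each start position,
# pick pwm[base, column] for every column and add the weights up.
score_starting_at <- function(pwm, subject, starting.at = 1) {
  chars <- strsplit(subject, "")[[1]]
  w <- ncol(pwm)
  vapply(starting.at, function(s) {
    window <- chars[s:(s + w - 1)]
    sum(pwm[cbind(match(window, rownames(pwm)), seq_len(w))])
  }, numeric(1))
}

score_starting_at(pwm, "ACGTACG", starting.at = 1:3)
```

The window starting at position 1 is "ACG", which picks the largest weight in every column of this toy matrix.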

For matchPWM: an XStringViews object.

For countPWM: a single integer.
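The relationship between matching and counting can be sketched with a naive sliding window in base R. Everything here is illustrative: naive_match_pwm is a hypothetical name, the subject is a plain string, the result is a vector of start positions instead of an XStringViews object, and treating a score equal to the threshold as a hit is an assumption.

```r
# Toy PWM with row names A, C, G, T and three positions (made-up weights).
pwm <- matrix(c(0.30, 0.00, 0.05,
                0.00, 0.40, 0.05,
                0.05, 0.00, 0.30,
                0.05, 0.00, 0.00),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("A", "C", "G", "T"), NULL))

# Hypothetical naive matcher: score every window of width ncol(pwm) and
# return the start positions whose score reaches min.score.
naive_match_pwm <- function(pwm, subject, min.score) {
  chars <- strsplit(subject, "")[[1]]
  w <- ncol(pwm)
  n_windows <- length(chars) - w + 1
  if (n_windows < 1) return(integer(0))
  scores <- vapply(seq_len(n_windows), function(s) {
    rows <- match(chars[s:(s + w - 1)], rownames(pwm))
    sum(pwm[cbind(rows, seq_len(w))])
  }, numeric(1))
  which(scores >= min.score)
}

hits <- naive_match_pwm(pwm, "ACGTACGACG", min.score = 0.8)
hits          # start positions of the matching windows
length(hits)  # the corresponding count
```

In this sketch the count is simply the number of hits, which mirrors the remark in the Examples that countPWM gives the same value as length(hits).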

For maxWeights: a vector containing the maximum weight for each position in pwm.

For minWeights: a vector containing the minimum weight for each position in pwm.

For maxScore: the highest possible score for a given Position Weight Matrix.

For minScore: the lowest possible score for a given Position Weight Matrix.
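The four quantities just described can be reproduced in base R on a toy matrix. This sketch assumes that the highest and lowest possible scores are the sums of the per-position maximum and minimum weights, which follows from a score being a sum of one weight per position; the variable names are illustrative and the weights are made up.

```r
# Toy weight matrix with row names A, C, G, T and three positions.
m <- matrix(c( 1, -1,  0,
              -2,  2,  0,
               0,  0,  3,
              -1, -3, -1),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("A", "C", "G", "T"), NULL))

max_weights <- apply(m, 2, max)  # best weight at each position
min_weights <- apply(m, 2, min)  # worst weight at each position

max_score <- sum(max_weights)    # score of the best possible sequence
min_score <- sum(min_weights)    # score of the worst possible sequence

max_weights
min_weights
c(min = min_score, max = max_score)
```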

For unitScale: the modified numeric matrix given by (x - minScore(x)/ncol(x))/(maxScore(x) - minScore(x)).
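The effect of this formula is easiest to see on a small matrix: after scaling, the lowest possible score becomes 0 and the highest becomes 1. The sketch below re-implements the formula in base R with hypothetical helper names (min_score, max_score, unit_scale) and made-up weights; it is not the package code.

```r
# Toy weight matrix with row names A, C, G, T and three positions.
m <- matrix(c( 1, -1,  0,
              -2,  2,  0,
               0,  0,  3,
              -1, -3, -1),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("A", "C", "G", "T"), NULL))

# Hypothetical helpers: extreme scores as sums of per-column extremes.
min_score <- function(x) sum(apply(x, 2, min))
max_score <- function(x) sum(apply(x, 2, max))

# The formula from the text: shift each column by minScore/ncol, then
# divide by the score range.
unit_scale <- function(x) {
  (x - min_score(x) / ncol(x)) / (max_score(x) - min_score(x))
}

scaled <- unit_scale(m)
c(min = min_score(scaled), max = max_score(scaled))
```

Individual weights can still be negative after scaling; only the sums over the per-column extremes are pinned to 0 and 1.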

For reverseComplement: a PWM obtained by reversing the column order in PWM x and by reassigning each row to its complementary nucleotide.
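The two steps in this description, reversing the columns and swapping each row with its complement, can be written directly in base R. The function reverse_complement_pwm is a hypothetical stand-in for the matrix method and assumes the row names are exactly A, C, G and T; the weights are made up.

```r
# Toy weight matrix with row names A, C, G, T and three positions.
m <- matrix(c( 1, -1,  0,
              -2,  2,  0,
               0,  0,  3,
              -1, -3, -1),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("A", "C", "G", "T"), NULL))

# Hypothetical re-implementation: take the rows in complement order
# (T, G, C, A), reverse the columns, then relabel the rows A, C, G, T.
reverse_complement_pwm <- function(x) {
  rc <- x[c("T", "G", "C", "A"), rev(seq_len(ncol(x))), drop = FALSE]
  rownames(rc) <- c("A", "C", "G", "T")
  rc
}

rc <- reverse_complement_pwm(m)
rc
```

The weight for A at the first position of rc is the weight for T at the last position of m, and applying the function twice returns the original matrix.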


----------Author(s)----------


H. Pages and P. Aboyoun



----------References----------

identification of regulatory elements, Nat Rev Genet., 5(4):276-87.

----------See Also----------

consensusMatrix, matchPattern, reverseComplement, DNAString-class, XStringViews-class


----------Examples----------


  ## Data setup:
  library(Biostrings)
  data(HNF4alpha)
  library(BSgenome.Dmelanogaster.UCSC.dm3)
  chr3R <- Dmelanogaster$chr3R
  chr3R

  ## Create a PWM from a PFM or directly from a rectangular
  ## DNAStringSet object:
  pfm <- consensusMatrix(HNF4alpha)
  pwm <- PWM(pfm)  # same as 'PWM(HNF4alpha)'

  ## Perform some general routines on the PWM:
  round(pwm, 2)
  maxWeights(pwm)
  maxScore(pwm)
  reverseComplement(pwm)

  ## Score the first 5 positions:
  PWMscoreStartingAt(pwm, unmasked(chr3R), starting.at=1:5)

  ## Match the plus strand:
  hits <- matchPWM(pwm, chr3R)
  nhit <- countPWM(pwm, chr3R)  # same as 'length(hits)'

  ## Post-calculate the scores of the hits:
  scores <- PWMscoreStartingAt(pwm, subject(hits), start(hits))

  ## Match the minus strand:
  matchPWM(reverseComplement(pwm), chr3R)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

