calc.fdr(rtfbs)
calc.fdr() belongs to R package: rtfbs
Calculate FDR
Translated by: 生物统计家园网, robot LoveR
----------Description----------
Calculate the False Discovery Rate (FDR) of possible binding sites. This function uses two sets of scores, realSeqsScores and simSeqsScores. realSeqsScores are the scores for the sequences being scanned for binding sites. simSeqsScores are the scores for the simulated sequences. The simulated sequences and simSeqsScores must be produced with the same Markov Model that was used for realSeqsScores.
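As a rough illustration of how two score sets can yield an FDR, the base-R sketch below estimates, for one score cutoff, the number of false positives expected in the real data (simulated hits scaled by sequence length) divided by the number of real hits. This shows the general idea only and is not taken from the rtfbs source; the function name empirical_fdr, its arguments, and the toy numbers are all hypothetical.

```r
# Hypothetical sketch of an empirical FDR estimate; not the rtfbs implementation.
# real_scores / sim_scores: numeric vectors of site scores.
# real_len / sim_len: total number of bases scanned in each set.
empirical_fdr <- function(real_scores, sim_scores, real_len, sim_len, cutoff) {
  real_hits <- sum(real_scores >= cutoff)
  sim_hits  <- sum(sim_scores >= cutoff)
  if (real_hits == 0) return(NA_real_)  # no real sites pass the cutoff
  # False positives expected in the real data, scaled by sequence length
  expected_false <- sim_hits * (real_len / sim_len)
  min(1, expected_false / real_hits)
}

# Invented toy scores, for illustration only
real_scores <- c(2.1, 3.5, 4.0, 5.2, 6.8, 7.1)
sim_scores  <- c(1.0, 2.2, 3.6, 4.1)
fdr_at_4 <- empirical_fdr(real_scores, sim_scores,
                          real_len = 1000, sim_len = 2000, cutoff = 4)
print(fdr_at_4)
```

Scaling by sequence length matters in this sketch because the simulated sequence is usually much longer than the real one, so raw hit counts are not directly comparable.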
----------Usage----------
calc.fdr(realSeqs, realSeqsScores, simSeqs,
simSeqsScores, interval = 0.01)
----------Arguments----------
Argument: realSeqs
MS object containing non-simulated sequences
Argument: realSeqsScores
Feat object obtained from scoring realSeqs
Argument: simSeqs
MS object containing simulated sequences
Argument: simSeqsScores
Feat object obtained from scoring simSeqs
Argument: interval
Numeric value specifying the distance between the score steps at which the FDR will be calculated (a smaller interval gives a finer-grained mapping). If NULL, the FDR is calculated for each unique score.
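To make the two modes of interval concrete, the sketch below builds the list of score cutoffs either on an evenly spaced grid or at every distinct score. It is an assumption about how such a grid could be built, not the rtfbs code; the helper score_cutoffs and the toy scores are hypothetical.

```r
# Hypothetical helper: choose the score cutoffs at which an FDR would be evaluated.
score_cutoffs <- function(scores, interval = 0.01) {
  if (is.null(interval)) {
    return(sort(unique(scores)))  # one cutoff per distinct score
  }
  seq(min(scores), max(scores), by = interval)  # evenly spaced cutoffs
}

scores <- c(1.0, 1.5, 1.5, 3.0)  # invented toy scores
coarse <- score_cutoffs(scores, interval = 0.5)
exact  <- score_cutoffs(scores, interval = NULL)
print(coarse)
print(exact)
```

In this sketch a smaller interval produces more cutoffs and therefore a finer score-to-FDR mapping, at the cost of more evaluations.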
----------Value----------
Data frame with two columns, 'score' and 'FDR', mapping each score to a single FDR. The data frame is sorted by score, if any scores exist.
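One common use of a score-to-FDR mapping like this is to pick the lowest score cutoff that keeps the FDR under a target. The sketch below does that with plain data frame indexing. The data frame fdr_map and its values are invented for illustration, and the variable names are hypothetical; only the column names 'score' and 'FDR' come from the description above.

```r
# Toy data frame with the two columns described above; values are invented.
fdr_map <- data.frame(score = c(2, 4, 6, 8),
                      FDR   = c(0.40, 0.15, 0.05, 0.00))

# Find the lowest score whose FDR is at or below the target.
max_fdr <- 0.10
passing <- fdr_map[fdr_map$FDR <= max_fdr, ]
cutoff  <- if (nrow(passing) > 0) min(passing$score) else NA_real_
print(cutoff)
```

If no row passes the target, this sketch returns NA rather than failing, so the caller can decide whether to relax the target.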
----------Note----------
realSeqsScores and simSeqsScores are both objects returned by score.ms. The same arguments (threshold, conservative, strand) should be used in both calls to score.ms, or the FDR will not be valid.
If calc.fdr returns an FDR of zero for all scores, you can probably increase the number of significant results by re-running score.ms with a lower threshold for both the simulated and the real sequences.
----------See Also----------
score.ms
----------Examples----------
require("rtfbs")
exampleArchive <- system.file("extdata", "NRSF.zip", package="rtfbs")
seqFile <- "input.fas"
unzip(exampleArchive, seqFile)
# Read the FASTA file "input.fas" from the examples into an
# MS (multiple sequences) object
ms <- read.ms(seqFile)
pwmFile <- "pwm.meme"
unzip(exampleArchive, pwmFile)
# Read the Position Weight Matrix (PWM) from the example MEME file
# into a Matrix object
pwm <- read.pwm(pwmFile)
# Build a 3rd-order Markov Model to represent the sequences
# in the MS object "ms". The model will be a list of
# matrices corresponding in size to the order of the
# Markov Model
mm <- build.mm(ms, 3)
# Match the PWM against the sequences provided to find
# possible transcription factor binding sites. A
# Features object is returned, containing the location
# of each possible binding site and an associated score.
# Sites with a negative score are not returned unless
# the threshold is lowered (threshold=-Inf returns every site);
# threshold=-2 is used here.
cs <- score.ms(ms, pwm, mm, threshold=-2)
# Generate a sequence 100000 bases long using the supplied
# Markov Model and random numbers
v <- simulate.ms(mm, 100000)
# Match the PWM against the simulated sequence, using the
# same Markov Model and threshold as for the real sequences.
# A Features object is returned, containing the location
# of each possible binding site and an associated score.
# Any binding sites identified in simulated data are false
# positives and are used to calculate the False Discovery Rate
xs <- score.ms(v, pwm, mm, threshold=-2)
# Calculate the False Discovery Rate for each possible
# binding site in the Features object "cs". Return
# a mapping between each binding site score and the
# associated FDR.
fdr <- calc.fdr(ms, cs, v, xs)
# Print the data frame containing the FDR/score mapping
fdr
When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).