annotatePeakInBatch(ChIPpeakAnno)
annotatePeakInBatch() belongs to R package: ChIPpeakAnno
Obtain the distance to the nearest TSS, miRNA, exon, etc. for a list of peak intervals
Translator: 生物统计家园网 robot LoveR
----------Description----------
Obtain the distance to the nearest TSS, miRNA, exon, etc. for a list of peak locations, leveraging the IRanges and biomaRt packages.
----------Usage----------
annotatePeakInBatch(myPeakList, mart, featureType = c("TSS", "miRNA","Exon"),
AnnotationData,output=c("nearestStart", "overlapping","both"),multiple=c(TRUE,FALSE),
maxgap=0,PeakLocForDistance = c("start", "middle", "end"),
FeatureLocForDistance = c("TSS", "middle","start", "end","geneEnd"), select=c("all", "first","last","arbitrary"))
----------Arguments----------
Argument: myPeakList
A RangedData object of peak intervals; see the examples below.
Argument: mart
Used only if AnnotationData is not supplied. A mart object; see useMart in the biomaRt package for details.
Argument: featureType
Used only if AnnotationData is not supplied. One of TSS, miRNA or Exon.
Argument: AnnotationData
Annotation data obtained from getAnnotation, or a customized annotation of class RangedData containing the additional variable strand (1 or + for the plus strand, -1 or - for the minus strand). Examples: data(TSS.human.NCBI36), data(TSS.mouse.NCBIM37), data(TSS.rat.RGSC3.4) and data(TSS.zebrafish.Zv8). If not supplied, the annotation is obtained from biomaRt automatically using the mart and featureType arguments.
Argument: output
nearestStart (default): output the nearest features, with distance calculated as peak start - feature start (feature end if the feature resides on the minus strand). overlapping: output the features that overlap the peak, with the maximum gap between peak range and feature range given by maxgap. both: output all the nearest features and, in addition, any features that overlap the peak but are not the nearest features.
Argument: multiple
Not applicable when output is nearestStart. TRUE: output multiple overlapping features for each peak. FALSE: output at most one overlapping feature for each peak. This parameter is kept for backward compatibility; please use select instead.
Argument: maxgap
Non-negative integer. Intervals with a separation of maxgap or less are considered to be overlapping.
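The maxgap rule can be illustrated with a minimal base-R sketch. The helper overlaps_with_gap is hypothetical (it is not part of ChIPpeakAnno or IRanges) and assumes closed integer intervals and a literal reading of "separation of maxgap or less"; how IRanges itself treats directly adjacent intervals may differ between versions.

```r
# Hypothetical helper, not the package's implementation.
# Intervals are assumed to be closed integer ranges [start, end].
overlaps_with_gap <- function(start1, end1, start2, end2, maxgap = 0) {
  stopifnot(maxgap >= 0)
  # Number of positions strictly between the two intervals;
  # negative when the intervals share at least one position.
  gap <- pmax(start1, start2) - pmin(end1, end2) - 1
  # Assumption: adjacent intervals (gap == 0) count as overlapping at maxgap = 0.
  gap <= maxgap
}

overlaps_with_gap(100, 200, 251, 300, maxgap = 0)   # 50 positions apart: FALSE
overlaps_with_gap(100, 200, 251, 300, maxgap = 50)  # within the allowed gap: TRUE
overlaps_with_gap(100, 200, 150, 300)               # shares positions 150-200: TRUE
```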
Argument: PeakLocForDistance
Specifies the location on the peak used for calculating distance, e.g., middle uses the middle of the peak to calculate the distance to the feature, and start uses the start of the peak. For compatibility with previous versions, the default is start.
Argument: FeatureLocForDistance
Specifies the location on the feature used for calculating distance: middle uses the middle of the feature; start uses the start of the feature; end uses the end of the feature; TSS uses the start of the feature when it is on the plus strand and the end of the feature when it is on the minus strand; geneEnd uses the end of the feature when it is on the plus strand and the start of the feature when it is on the minus strand. For compatibility with previous versions, the default is TSS.
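How PeakLocForDistance and FeatureLocForDistance pick a reference position can be sketched in base R. The helpers peak_location and feature_location are hypothetical names introduced here for illustration; the rounding of the midpoint is an assumption, and any strand-dependent sign handling the package may apply to the final distance is not modelled.

```r
# Hypothetical helpers illustrating the documented options; not the package's code.
peak_location <- function(start, end, where = c("start", "middle", "end")) {
  where <- match.arg(where)
  switch(where,
         start  = start,
         middle = round((start + end) / 2),  # rounding is an assumption
         end    = end)
}

feature_location <- function(start, end, strand,
                             where = c("TSS", "middle", "start", "end", "geneEnd")) {
  where <- match.arg(where)
  # Strand is encoded as 1 or "+" for plus, -1 or "-" for minus.
  plus <- as.character(strand) %in% c("1", "+")
  switch(where,
         TSS     = ifelse(plus, start, end),   # strand-aware start of the feature
         geneEnd = ifelse(plus, end, start),   # strand-aware end of the feature
         middle  = round((start + end) / 2),
         start   = start,
         end     = end)
}

# Peak [100, 200] against a minus-strand feature [120, 140]:
feature_location(120, 140, strand = -1, where = "TSS")   # 140
# Raw offset with the defaults (peak start, feature TSS), as peak - feature:
peak_location(100, 200) - feature_location(120, 140, strand = -1)   # -40
```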
Argument: select
all may return multiple overlapping peaks, first returns the first overlapping peak, last returns the last overlapping peak, and arbitrary returns one of the overlapping peaks.
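The effect of select on a set of overlap hits can be sketched as follows. The helper pick_hits is hypothetical, and "first" and "last" are assumed to refer to the order in which the hits are supplied; the ordering used inside the package is not stated above.

```r
# Hypothetical helper mimicking the documented select options on a vector of hits.
pick_hits <- function(hits, select = c("all", "first", "last", "arbitrary")) {
  select <- match.arg(select)
  if (length(hits) == 0L) return(hits)  # nothing overlaps: return the empty input
  switch(select,
         all       = hits,
         first     = hits[1L],
         last      = hits[length(hits)],
         arbitrary = hits[sample.int(length(hits), 1L)])  # any single hit
}

hits <- c("f1", "f2", "f3")
pick_hits(hits, "all")    # all three hits
pick_hits(hits, "first")  # "f1"
pick_hits(hits, "last")   # "f3"
```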
----------Value----------
A RangedData object with slot start holding the start position of the peak, slot end holding the end position of the peak, slot space holding the chromosome where the peak is located, and slot rownames holding the id of the peak. In addition, the following variables are included.
Variable: feature
id of the feature, such as an Ensembl gene ID
Variable: insideFeature
upstream: peak resides upstream of the feature; downstream: peak resides downstream of the feature; inside: peak resides inside the feature; overlapStart: peak overlaps with the start of the feature; overlapEnd: peak overlaps with the end of the feature; includeFeature: peak includes the feature entirely
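The six insideFeature categories can be made concrete with a small base-R sketch. The function classify_peak is hypothetical and is not the package's implementation; in particular, the swap of upstream/downstream and overlapStart/overlapEnd on the minus strand, and the tie-breaking when peak and feature coincide exactly, are assumptions. The coordinate pairs are taken from the example data further down and are paired by hand for illustration.

```r
# Hypothetical classifier for one peak against one feature (scalar inputs).
classify_peak <- function(peak_start, peak_end, feat_start, feat_end, strand = "+") {
  plus <- as.character(strand) %in% c("1", "+")
  # Position of the peak relative to the feature in plain coordinates.
  rel <- if (peak_end < feat_start) {
    "before"
  } else if (peak_start > feat_end) {
    "after"
  } else if (peak_start >= feat_start && peak_end <= feat_end) {
    "inside"
  } else if (peak_start <= feat_start && peak_end >= feat_end) {
    "includeFeature"
  } else if (peak_start < feat_start) {
    "overlapLow"
  } else {
    "overlapHigh"
  }
  # Assumption: on the minus strand the strand-dependent labels are swapped.
  switch(rel,
         before      = if (plus) "upstream" else "downstream",
         after       = if (plus) "downstream" else "upstream",
         overlapLow  = if (plus) "overlapStart" else "overlapEnd",
         overlapHigh = if (plus) "overlapEnd" else "overlapStart",
         rel)  # "inside" and "includeFeature" do not depend on strand
}

classify_peak(1543200, 1555199, 1549800, 1550599)      # "includeFeature"
classify_peak(1557200, 1560599, 1554400, 1560799)      # "inside"
classify_peak(1563000, 1565199, 1565000, 1565399)      # "overlapStart"
classify_peak(1569800, 1573799, 1569400, 1571199)      # "overlapEnd"
classify_peak(167889600, 167893599, 167888600, 167888999)  # "downstream"
```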
Variable: distancetoFeature
Distance to the nearest feature, such as the transcription start site. By default, the distance is calculated between the start of the binding site and the TSS, which is the gene start for genes on the forward strand and the gene end for genes on the reverse strand. The user can specify the peak location and feature location used for this calculation (see PeakLocForDistance and FeatureLocForDistance).
Variable: start_position
start position of the feature, such as the gene
Variable: end_position
end position of the feature, such as the gene
Variable: strand
1 or + for the positive strand and -1 or - for the negative strand on which the feature is located
Variable: shortestDistance
The shortest distance from either end of the peak to either end of the feature.
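Read literally, shortestDistance is the minimum over the four end-to-end distances. A minimal base-R sketch, using the hypothetical helper shortest_distance and one peak/feature pair from the example data below (paired by hand for illustration):

```r
# Hypothetical helper: minimum absolute distance between any peak end and any feature end.
shortest_distance <- function(peak_start, peak_end, feat_start, feat_end) {
  pmin(abs(peak_start - feat_start), abs(peak_start - feat_end),
       abs(peak_end - feat_start),   abs(peak_end - feat_end))
}

# Peak [167889600, 167893599] against feature [167888600, 167888999]:
# the closest pair of ends is peak start vs. feature end.
shortest_distance(167889600, 167893599, 167888600, 167888999)  # 601
```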
Variable: fromOverlappingOrNearest
NearestStart: indicates that this feature's start (the feature's end for features on the minus strand) is closest to the peak start; Overlapping: indicates that this feature overlaps with this peak although it is not the nearest feature start
----------Author(s)----------
Lihua Julie Zhu
----------References----------
----------See Also----------
----------Examples----------
if (interactive())
{
## example 1: annotate myPeakList (RangedData) with TSS.human.NCBI36 (RangedData)
data(myPeakList)
data(TSS.human.NCBI36)
annotatedPeak = annotatePeakInBatch(myPeakList[1:6,], AnnotationData=TSS.human.NCBI36)
as.data.frame(annotatedPeak)
## example 2: you have a list of transcription factor binding sites from the literature and
## are interested in determining the extent of the overlap with the list of peaks from
## your experiment. Before calling annotatePeakInBatch, represent
## both datasets as RangedData, where start is the start of the binding site, end is
## the end of the binding site, names is the name of the binding site,
## and space and strand are the chromosome name and strand where the binding site is located.
myexp = RangedData(IRanges(start=c(1543200,1557200,1563000,1569800,167889600,100,1000),
end=c(1555199,1560599,1565199,1573799,167893599,200,1200),
names=c("p1","p2","p3","p4","p5","p6", "p7")),strand=as.integer(1),space=c(6,6,6,6,5,4,4))
literature = RangedData(IRanges(start=c(1549800,1554400,1565000,1569400,167888600,120,800),
end=c(1550599,1560799,1565399,1571199,167888999,140,1400),
names=c("f1","f2","f3","f4","f5","f6","f7")),strand=c(1,1,1,1,1,-1,-1),space=c(6,6,6,6,5,4,4))
annotatedPeak1= annotatePeakInBatch(myexp, AnnotationData = literature)
pie(table(as.data.frame(annotatedPeak1)$insideFeature))
as.data.frame(annotatedPeak1)
### use BED2RangedData or GFF2RangedData to convert BED format or GFF format
### to RangedData before calling annotatePeakInBatch
test.bed = data.frame(cbind(chrom = c("4", "6"), chromStart=c("100", "1000"),
chromEnd=c("200", "1100"), name=c("peak1", "peak2")))
test.rangedData = BED2RangedData(test.bed)
annotatePeakInBatch(test.rangedData, AnnotationData = literature)
test.GFF = data.frame(cbind(seqname = c("chr4", "chr4"), source=rep("Macs", 2),
feature=rep("peak", 2), start=c("100", "1000"), end=c("200", "1100"),
score=c(60, 26), strand=c(1, 1), frame=c(".", 2), group=c("peak1", "peak2")))
test.rangedData = GFF2RangedData(test.GFF)
as.data.frame(annotatePeakInBatch(test.rangedData, AnnotationData = literature))
}