processTilingArray(nucleR)
processTilingArray() belongs to the R package: nucleR
Obtain and clean nucleosome positioning data from tiling arrays
Translator: 生物统计家园网 robot LoveR
----------Description----------
Process and transform microarray data from tiling array nucleosome positioning experiments.
----------Usage----------
processTilingArray(data, exprName, chrPattern, inferLen = 50,
    mc.cores = 1, quiet = FALSE)
----------Arguments----------
Argument: data
ExpressionSet object which contains the data of the tiling array.
Argument: exprName
Name of the sample in the ExpressionSet which contains the ratio between nucleosomal and genomic DNA (if using Starr, the description argument supplied to the getRatio function). If this name is not provided, data is assumed to have only one column.
Argument: chrPattern
Only chromosomes whose names contain the chrPattern string will be selected from the ExpressionSet. Sometimes tiling arrays contain quality-control information that is imported as a chromosome; this argument allows filtering it out. If no value is supplied, all chromosomes will be used.
Argument: inferLen
Maximum length (in basepairs) over which data gaps may be inferred. See Details for further information.
Argument: mc.cores
Number of cores available for parallel data processing.
Argument: quiet
Avoid printing progress information (TRUE | FALSE).
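The chromosome selection described for chrPattern can be pictured with a small base R sketch. The chromosome names below are made up (the control name in particular is hypothetical), and literal substring matching is an assumption; nucleR may match the pattern differently.

```r
# Hypothetical stand-in for the "chr" annotation of an ExpressionSet;
# "AffxCtrl:v1;control" is an invented name for a quality-control entry.
chr <- c("Sc:Oct_2003;chr1", "Sc:Oct_2003;chr1",
         "AffxCtrl:v1;control", "Sc:Oct_2003;chr2")
chrPattern <- "Sc:Oct_2003;chr"

# Assumption: the pattern is treated as a literal substring (fixed = TRUE).
keep <- grepl(chrPattern, chr, fixed = TRUE)

# Chromosomes that survive the filter; the control entry is dropped.
selected <- unique(chr[keep])
print(selected)
```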
----------Details----------
The processing of tiling arrays can be complicated, as many types exist on the market. This function works well with Affymetrix Tiling Arrays in yeast, but has not been tested on other species or platforms.
The main aim is to convert the output of the preprocessing steps (supplied by third-party packages) into a clean genome-wide nucleosome occupancy profile.
Tiling arrays do not usually provide data at one-basepair resolution: one gets one value per probe in the array, each probe covering X basepairs and shifted (tiled) Y basepairs with respect to its neighbours. So one gets a piece of information every Y basepairs.
This function tries to convert this noisy, low-resolution data into a one-basepair signal. Combined with a posterior noise cleaning process, this allows fast recognition of nucleosomes without resorting to large and elaborate statistical machinery such as Hidden Markov Models.
As an example, imagine your array has 20-mer probes and a tiling of 10bp between probes. Starting at position 1 (covering coordinates 1 to 20), the next probe will be at position 10 (covering coordinates 10 to 29). This can be represented as two hybridization intensity values at coordinates 1 and 10. This function will try to infer (using linear interpolation) the values from 2 to 9 using the existing values of the probes at coordinate 1 and coordinate 10.
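The two-probe example can be reproduced with base R's approx(), which interpolates linearly by default. This is only an illustration of the idea, not the package's internal code, and the two intensity values are made up.

```r
# Probe start coordinates from the example, with invented intensity values.
pos   <- c(1, 10)
value <- c(0.8, 1.7)

# Linear interpolation at every coordinate from 1 to 10.
# Coordinates 2 to 9 are inferred; 1 and 10 keep their observed values.
inferred <- approx(x = pos, y = value, xout = 1:10)$y
print(inferred)
```

With these numbers the signal rises by 0.1 per basepair between the two probes.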
The tiling space between adjacent array probes may not be constant, and there may also be regions not covered by the microarray used. With the argument inferLen you can specify the maximum amount of space (in basepairs) over which non-present values may be inferred.
If at some point the uncovered range (gap) between two adjacent probes of the array is greater than the inferLen value, the coordinates between these probes will be set to NA.
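The gap rule can be sketched in base R as follows. The helper infer_signal() is hypothetical and is not part of nucleR. It assumes sorted, unique probe positions on a single chromosome, and it measures the gap as the distance between consecutive probe positions, which may differ from how the package measures it.

```r
# Hypothetical helper: interpolate between probes, but leave NA wherever
# two adjacent probes are more than inferLen basepairs apart.
infer_signal <- function(pos, value, inferLen = 50) {
  coords <- seq(min(pos), max(pos))
  # Linear interpolation over the whole covered range
  out <- approx(x = pos, y = value, xout = coords)$y
  # Assumption: gap = distance between consecutive probe positions
  gaps <- diff(pos)
  for (i in which(gaps > inferLen)) {
    # Blank the coordinates strictly between the two distant probes
    inside <- coords > pos[i] & coords < pos[i + 1]
    out[inside] <- NA
  }
  out
}

# Made-up probes: regular 10bp tiling, then an 80bp hole before the last probe.
pos   <- c(1, 11, 21, 101)
value <- c(1.0, 2.0, 1.0, 3.0)
sig   <- infer_signal(pos, value, inferLen = 50)
```

In this toy input, coordinates 1 to 21 are filled in, coordinates 22 to 100 stay NA because the 80bp gap exceeds inferLen, and coordinate 101 keeps its observed value.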
----------Value----------
RleList with the observed/inferred values for each coordinate.
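To give a feel for the run-length encoding behind an RleList, here is a sketch using base R's rle(). It is a stand-in only: the returned object is an RleList, which is a different class from base rle, and the signal values below are made up.

```r
# Made-up per-basepair signal with repeated values.
x <- c(0, 0, 0, 1.5, 1.5, 2)

# Run-length encoding stores each run as (length, value)
# instead of one value per coordinate.
r <- rle(x)
print(r$lengths)
print(r$values)

# The encoding is lossless: decoding gives back the original vector.
restored <- inverse.rle(r)
```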
----------Warning----------
This function may not cover all kinds of arrays on the market. This package assumes the data has been processed and normalized prior to this processing, using standard microarray packages available for R, such as Starr.
----------Note----------
This function should be suitable for all data objects of kind ExpressionSet coding the annotations "chr" for chromosome and "pos" for position (accessible by pData(data@featureData)) and an expression value (accessible by exprs(data)).
----------Author(s)----------
Oscar Flores <oflores@mmb.pcb.ub.es>
----------See Also----------
ExpressionSet, getRatio
----------Examples----------
## Not run:
# The dataset cannot be provided because of size restrictions.
# This is the code used to get the hybridization ratio with Starr from CEL files:
library("Starr")
TA_parsed = readCelFile(BPMap, CELfiles, CELnames, CELtype, featureData=TRUE, log.it=TRUE)
TA_loess = normalize.Probes(TA_parsed, method="loess")
TA_ratio = getRatio(TA_loess, TA_loess$type=="IP", TA_loess$type=="CONTROL", "myRatio")
# From here, we use nucleR:
library("nucleR")
# Preprocess the array, using the calculated ratio feature we named "myRatio".
# This will also select only those chromosomes with the pattern "Sc:Oct_2003;chr",
# removing control data present in that tiling array.
# Finally, we allow loci not covered by a probe to be inferred from adjacent
# ones, as long as they are separated by 50bp or less.
arr = processTilingArray(TA_ratio, "myRatio", chrPattern="Sc:Oct_2003;chr", inferLen=50)
# From here we can proceed with the analysis:
arr_fft = filterFFT(arr)
arr_pea = peakDetection(arr_fft)
plotPeaks(arr_pea, arr_fft)
# ...
## End(Not run)
When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).