R mosaics package: mosaicsRunAll() function help documentation

Posted on 2012-2-26 01:08:22
mosaicsRunAll(mosaics)
mosaicsRunAll() belongs to the R package: mosaics

Analyze ChIP-seq data using the MOSAiCS framework

Translator: 生物统计家园网 robot LoveR

----------Description----------

Construct bin-level ChIP-seq data from aligned read files of ChIP and control samples, fit the MOSAiCS model, call peaks, and export the peak calling results and diagnostic reports.


----------Usage----------


mosaicsRunAll( chipDir=NULL, chipFileName=NULL, chipFileFormat=NULL,
    controlDir=NULL, controlFileName=NULL, controlFileFormat=NULL,
    binfileDir=NULL, peakDir=NULL, peakFileName=NULL, peakFileFormat=NULL,
    reportSummary=FALSE, summaryDir=NULL, summaryFileName=NULL,
    reportExploratory=FALSE, exploratoryDir=NULL, exploratoryFileName=NULL,
    reportGOF=FALSE, gofDir=NULL, gofFileName=NULL, byChr=FALSE,
    excludeChr=NULL, FDR=0.05, fragLen=200, binSize=fragLen, capping=0,
    analysisType="IO", bgEst=NA, d=0.25,
    signalModel="BIC", maxgap=fragLen, minsize=50, thres=10, parallel=FALSE, nCore=8 )



----------Arguments----------

Argument: chipDir
Directory of the aligned read file of the ChIP sample to be processed.


Argument: chipFileName
Name of the aligned read file of the ChIP sample to be processed.


Argument: chipFileFormat
Format of the aligned read file of the ChIP sample to be processed. Currently, mosaicsRunAll permits the following aligned read file formats: "eland_result" (Eland result), "eland_extended" (Eland extended), "eland_export" (Eland export), "bowtie" (default Bowtie), and "sam" (SAM).


Argument: controlDir
Directory of the aligned read file of the control sample to be processed.


Argument: controlFileName
Name of the aligned read file of the control sample to be processed.


Argument: controlFileFormat
Format of the aligned read file of the control sample to be processed. Currently, mosaicsRunAll permits the following aligned read file formats: "eland_result" (Eland result), "eland_extended" (Eland extended), "eland_export" (Eland export), "bowtie" (default Bowtie), and "sam" (SAM).


Argument: binfileDir
Directory in which to store the processed bin-level files.


Argument: peakDir
Directory in which to store the peak list generated from the analysis.


Argument: peakFileName
Name of the peak list generated from the analysis.


Argument: peakFileFormat
Format of the peak list generated from the analysis. Possible values are "txt", "bed", and "gff".


Argument: reportSummary
Report the summary of model fitting and peak calling? Possible values are TRUE and FALSE. Default is FALSE.


Argument: summaryDir
Directory in which to store the summary report of model fitting and peak calling.


Argument: summaryFileName
Name of the summary report of model fitting and peak calling. The summary report is a text file.


Argument: reportExploratory
Report the exploratory analysis plots? Possible values are TRUE and FALSE. Default is FALSE.


Argument: exploratoryDir
Directory in which to store the exploratory analysis plots.


Argument: exploratoryFileName
Name of the file for the exploratory analysis plots. The exploratory analysis results are exported as PDF.


Argument: reportGOF
Report the goodness of fit (GOF) plots? Possible values are TRUE and FALSE. Default is FALSE.


Argument: gofDir
Directory in which to store the goodness of fit (GOF) plots.


Argument: gofFileName
Name of the file for the goodness of fit (GOF) plots. The GOF plots are exported as PDF.


Argument: byChr
Analyze the ChIP-seq data for each chromosome separately, or analyze it genome-wide? Possible values are TRUE and FALSE: byChr=TRUE means chromosome-wise analysis and byChr=FALSE means genome-wide analysis. Default is FALSE (genome-wide analysis).


Argument: excludeChr
Vector of chromosomes to exclude from the analysis.


Argument: FDR
False discovery rate. Default is 0.05.


Argument: fragLen
Average fragment length. Default is 200.


Argument: binSize
Size of bins. By default, the bin size equals fragLen (average fragment length).
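
To make "bin-level data" concrete, here is a minimal base R sketch that counts read start positions per bin. It is only an illustration: the helper name binCounts is hypothetical, bins are assumed to start at multiples of binSize, and the extension of reads to fragLen that the package's constructBins takes into account is ignored here.

```r
# Hypothetical helper, NOT part of the mosaics package.
# Assumption: bins start at 0, binSize, 2 * binSize, ...
binCounts <- function(starts, binSize = 200) {
    # Start coordinate of the bin that each read start falls into
    binStart <- (starts %/% binSize) * binSize
    counts <- table(binStart)
    data.frame(binStart = as.numeric(names(counts)),
               count = as.vector(counts))
}

bins <- binCounts(c(10, 150, 210, 399, 400), binSize = 200)
print(bins)
# binStart 0, 200, 400 with counts 2, 2, 1
```

A smaller binSize gives finer resolution but sparser counts per bin.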


Argument: capping
Maximum number of reads allowed to start at each nucleotide position. To avoid potential PCR amplification artifacts, the number of reads that can start at a nucleotide position is capped at capping. Capping is not applied if capping is non-positive. Default is 0 (no capping).
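
The capping rule can be sketched in base R as follows. This is an assumption-laden illustration rather than the package's code: capReads is a hypothetical helper that takes a plain vector of read start positions and keeps at most `capping` reads per position.

```r
# Hypothetical helper, NOT part of the mosaics package.
capReads <- function(starts, capping = 0) {
    # Non-positive capping means "no capping", as documented above
    if (capping <= 0) return(starts)
    # Rank of each read among the reads sharing its start position
    rankWithin <- ave(seq_along(starts), starts, FUN = seq_along)
    starts[rankWithin <= capping]
}

capped <- capReads(c(100, 100, 100, 250, 250, 400), capping = 2)
print(capped)
# 100 100 250 250 400
```

With capping = 2, the third read starting at position 100 is dropped, while positions with one or two reads are unaffected.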


Argument: analysisType
Analysis type. Currently, only "IO" is supported.


Argument: bgEst
Parameter that determines the background estimation approach. Possible values are "matchLow" (estimation using bins with low tag counts) and "rMOM" (estimation using the robust method of moments (MOM)). If bgEst is not specified, the method tries to choose the best value for bgEst based on the data provided.


Argument: d
Parameter for estimating the background distribution. Default is 0.25.


Argument: signalModel
Signal model. Possible values are "BIC" (automatic model selection using BIC), "1S" (one-signal-component model), and "2S" (two-signal-component model). Default is "BIC".


Argument: maxgap
Nearby initial peaks are merged if the distance (in bp) between them is less than maxgap. By default, maxgap equals fragLen (average fragment length).


Argument: minsize
An initial peak is removed if its width is narrower than minsize. Default is 50.


Argument: thres
A bin within an initial peak is removed if its ChIP tag count is less than thres. Default is 10.
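
The effect of maxgap and minsize on a set of initial peaks can be sketched in base R. This is only an illustration under stated assumptions, not the algorithm used by mosaicsPeak: mergePeaks is a hypothetical helper, the gap is measured as the next start minus the current end, the width is taken as end - start + 1, and the thres filter is left out.

```r
# Hypothetical helper, NOT part of the mosaics package.
mergePeaks <- function(start, end, maxgap = 200, minsize = 50) {
    if (length(start) == 0) {
        return(data.frame(start = numeric(0), end = numeric(0)))
    }
    ord <- order(start)
    start <- start[ord]
    end <- end[ord]
    outStart <- numeric(0)
    outEnd <- numeric(0)
    curStart <- start[1]
    curEnd <- end[1]
    for (i in seq_along(start)[-1]) {
        if (start[i] - curEnd < maxgap) {
            # Close enough: extend the current merged peak
            curEnd <- max(curEnd, end[i])
        } else {
            # Too far apart: store the current peak and start a new one
            outStart <- c(outStart, curStart)
            outEnd <- c(outEnd, curEnd)
            curStart <- start[i]
            curEnd <- end[i]
        }
    }
    outStart <- c(outStart, curStart)
    outEnd <- c(outEnd, curEnd)
    # Drop merged peaks narrower than minsize
    keep <- (outEnd - outStart + 1) >= minsize
    data.frame(start = outStart[keep], end = outEnd[keep])
}

peaks <- mergePeaks(start = c(100, 350, 1000, 2000),
                    end   = c(299, 549, 1020, 2199),
                    maxgap = 200, minsize = 50)
print(peaks)
# two peaks remain: 100-549 and 2000-2199
```

In this example the first two peaks are 51 bp apart and are merged, and the 21 bp peak at 1000-1020 is removed because it is narrower than minsize.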


Argument: parallel
Utilize multiple CPUs for parallel computing using the "multicore" package? Possible values are TRUE (use "multicore") and FALSE (do not use "multicore"). Default is FALSE (do not use "multicore").


Argument: nCore
Maximum number of CPUs used for the analysis. Default is 8.


----------Details----------

This method implements the workflow for analyzing ChIP-seq data using the MOSAiCS framework. It imports aligned read files of ChIP and control samples, processes them into bin-level files, fits the MOSAiCS model, calls peaks, and exports the peak lists. This method is a wrapper function around constructBins, readBins, mosaicsFit, mosaicsPeak, export, and the methods of the classes BinData, MosaicsFit, and MosaicsPeak.

See the package vignette for an illustration of the workflow and a description of the methods employed and their options. Exploratory analysis plots and goodness of fit (GOF) plots are generated using the plot methods of the classes BinData and MosaicsFit, respectively. See the help of constructBins for details of the options chipFileFormat, controlFileFormat, byChr, fragLen, binSize, and capping. See the help of readBins for details of the option excludeChr. See the help of mosaicsFit for details of the options analysisType, bgEst, and d. See the help of mosaicsPeak for details of the options FDR, signalModel, maxgap, minsize, and thres. See the help of export for details of the option peakFileFormat.

When the data contain multiple chromosomes, parallel computing can be used for faster preprocessing and model fitting if parallel=TRUE and the multicore package is installed. nCore determines the number of CPUs used for parallel computing.
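
The idea of spreading per-chromosome work over several CPUs can be sketched with the parallel package that ships with R, which absorbed the functionality of multicore. This is a hedged illustration, not the package's implementation: perChrCounts and its trivial worker are hypothetical, and the reads are assumed to sit in a data frame with columns chr and start.

```r
library(parallel)

# Hypothetical helper, NOT part of the mosaics package.
perChrCounts <- function(reads, parallel = FALSE, nCore = 8) {
    # One vector of read starts per chromosome
    byChr <- split(reads$start, reads$chr)
    # Placeholder for real per-chromosome work (binning, fitting, ...)
    worker <- function(starts) length(starts)
    if (parallel && .Platform$OS.type != "windows") {
        # Forking is unavailable on Windows, so fall back to lapply there
        res <- mclapply(byChr, worker, mc.cores = nCore)
    } else {
        res <- lapply(byChr, worker)
    }
    unlist(res)
}

reads <- data.frame(chr = c("chr1", "chr1", "chr2"),
                    start = c(10, 20, 30))
counts <- perChrCounts(reads, parallel = FALSE)
print(counts)
# chr1 = 2, chr2 = 1
```

Because the work is split by chromosome, requesting more cores than there are chromosomes brings no further speed-up.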


----------Value----------

Processed bin-level files are exported to the directory specified in binfileDir. If byChr=FALSE (genome-wide analysis), one bin-level file is exported for each of the ChIP and control samples, with the file names [chipFileName]_fragL[fragLen]_bin[binSize].txt and [controlFileName]_fragL[fragLen]_bin[binSize].txt, respectively. If byChr=TRUE (chromosome-wise analysis), bin-level files are exported for each chromosome of each of the ChIP and control samples, with the file names [chrID]_[chipFileName]_fragL[fragLen]_bin[binSize].txt and [chrID]_[controlFileName]_fragL[fragLen]_bin[binSize].txt ([chrID] is the ID of the chromosome that the reads align to).

The peak list generated from the analysis is exported to the directory specified in peakDir with the file name specified in peakFileName.

If reportSummary=TRUE, the summary of model fitting and peak calling is exported to the directory specified in summaryDir with the file name specified in summaryFileName (text file). If reportExploratory=TRUE, the exploratory analysis plots are exported to the directory specified in exploratoryDir with the file name specified in exploratoryFileName (PDF file). If reportGOF=TRUE, the goodness of fit (GOF) plots are exported to the directory specified in gofDir with the file name specified in gofFileName (PDF file).
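
The naming pattern for the bin-level files can be reproduced with a small base R helper, which may be handy for locating the output afterwards. The helper name binFileName is hypothetical and the sketch simply follows the pattern quoted above; it does not call the mosaics package.

```r
# Hypothetical helper, NOT part of the mosaics package.
binFileName <- function(fileName, fragLen = 200, binSize = fragLen,
                        chrID = NULL) {
    # [fileName]_fragL[fragLen]_bin[binSize].txt
    base <- paste0(fileName, "_fragL", fragLen, "_bin", binSize, ".txt")
    # Chromosome-wise files carry a [chrID]_ prefix
    if (is.null(chrID)) base else paste0(chrID, "_", base)
}

genomeWide <- binFileName("STAT1_eland_results.txt")
chrWise <- binFileName("STAT1_eland_results.txt", chrID = "chr1")
print(genomeWide)
# "STAT1_eland_results.txt_fragL200_bin200.txt"
print(chrWise)
# "chr1_STAT1_eland_results.txt_fragL200_bin200.txt"
```

Note that, following the documented pattern, the original file extension stays inside the name and ".txt" is appended at the end.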


----------Author(s)----------


Dongjun Chung, Pei Fen Kuan, Sunduz Keles



----------References----------

"A Statistical Framework for the Analysis of ChIP-Seq Data",  Journal of the American Statistical Association, Vol. 106, pp. 891-903.

----------See Also----------

constructBins, readBins, mosaicsFit, mosaicsPeak, export, BinData, MosaicsFit, MosaicsPeak.


----------Examples----------


## Not run:
# minimal input (without any reports for diagnostics)

mosaicsRunAll(
    chipDir="/scratch/eland/",
    chipFileName="STAT1_eland_results.txt",
    chipFileFormat="eland_result",
    controlDir="/scratch/eland/",
    controlFileName="input_eland_results.txt",
    controlFileFormat="eland_result",
    binfileDir="/scratch/bin/",
    peakDir="/scratch/peak/",
    peakFileName="STAT1_peak_list.txt",
    peakFileFormat="txt" )
   
# generate all reports for diagnostics
   
mosaicsRunAll(
    chipDir="/scratch/eland/",
    chipFileName="STAT1_eland_results.txt",
    chipFileFormat="eland_result",
    controlDir="/scratch/eland/",
    controlFileName="input_eland_results.txt",
    controlFileFormat="eland_result",
    binfileDir="/scratch/bin/",
    peakDir="/scratch/peak/",
    peakFileName="STAT1_peak_list.txt",
    peakFileFormat="txt",
    reportSummary=TRUE,
    summaryDir="/scratch/reports/",
    summaryFileName="mosaics_summary.txt",
    reportExploratory=TRUE,
    exploratoryDir="/scratch/reports/",
    exploratoryFileName="mosaics_exploratory.pdf",
    reportGOF=TRUE,
    gofDir="/scratch/reports/",
    gofFileName="mosaics_GOF.pdf",
    byChr=FALSE,
    FDR=0.05,
    fragLen=200,
    capping=0,
    parallel=FALSE,
    nCore=8 )

## End(Not run)

Please credit the source when reposting: 生物统计家园网 (http://www.biostatistic.net).


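Notes:
Note 1: For the convenience of learners, this document was translated by the 生物统计家园网 robot LoveR. It is intended only for personal reference when learning R; 生物统计家园 retains the copyright.
Note 2: Because the translation was produced automatically, it inevitably contains inaccuracies. Reading it carefully against the English original can help when learning R.
Note 3: If you find an inaccuracy, please reply to this post and we will revise the document over time.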