R language PAnnBuilder package: processData() function help documentation

loveR · posted 2012-2-26 10:19:26

processData(PAnnBuilder)
R package containing processData(): PAnnBuilder

                                    Convert Data Format

                                       Translator: LoveR, robot of the Biostatistics Home site (biostatistic.net)

----------Description----------

Convert data formats with R functions, or generate a Perl program to process the data.

----------Usage----------

getBaseParsers(baseMapType, db=FALSE)

fileMuncher(outName, dataFile, parser, organism)
fileMuncher_DB(dataFile, parser, organism)

writeInput(parser, perlName, organism, dataFile)
writeInputSP(perlName, organism)
writeInputIPI(perlName, organism)
writeInputREFSEQ(perlName, organism)
writeInputBLAST(perlName, organism, dataFile)
writeInputPFAM(perlName, organism)
writeInputINTERPRO(perlName, organism)
writeOutput(parser, perlName)
.callPerl(script, os)

getSrcObjs(srcUrls, organism, built, fromWeb = TRUE)
getBaseData(srcObjs)

splitEntry(dataRow, sep = ";", asNumeric = FALSE)
twoStepSplit(dataRow, entrySep = ";", eleSep = "@", asNumeric = FALSE)
mergeRowByKey(mergeMe, keyCol = 1, sep = ";")

----------Arguments----------

Argument: baseMapType
a character string indicating which database will be parsed. It can be "sp", "trembl", "ipi", "refseq", "equal", "merge", "mppi", "PeptideAtlas", "DBSubLoc", "Pfam", "pfamname", "prositede" or "blast".

Argument: db
a boolean indicating whether to return the parser file for the SQLite-based package.

Argument: outName
a character string giving the output file name of the Perl program.

Argument: dataFile
a character string giving the input file name of the Perl program.

Argument: parser
a character string giving the path of the parser file.

Argument: organism
a character string giving the name of the organism of interest (e.g. "Homo sapiens").

Argument: perlName
a character string giving the name of the Perl program.

Argument: script
a character string giving the name of the Perl program.

Argument: os
a character string giving the operating system (family) of the computer.

Argument: srcUrls
a character string giving the URL of the database of interest.

Argument: built
a character string giving the release/version information of the source data.

Argument: fromWeb
a boolean indicating whether the source data will be downloaded from the web or read from a local file.

Argument: srcObjs
an object of class "pBase".

Argument: dataRow
a character vector, each element of which is to be split.

Argument: sep
a character string containing the regular expression(s) to split on.

Argument: asNumeric
a boolean indicating whether the elements will be converted to objects of type "numeric".

Argument: entrySep
a character string containing the regular expression(s) to use in the first split.

Argument: eleSep
a character string containing the regular expression(s) to use in the second split.

Argument: mergeMe
a vector or a matrix in which duplicated values for the same id will be merged.

Argument: keyCol
an integer giving the index of the column to be regarded as the key.

----------Details----------

These functions come from the Bioconductor "AnnBuilder" package, with many new operations added to meet the requirements of building proteomic annotation data packages.

getBaseParsers returns a character string naming the parser file for the given database. Each parser file is a fragment of a Perl script and is used to parse the relevant data.

fileMuncher produces a Perl file based on the given parser and additional input files, then runs this Perl program via R.

fileMuncher_DB produces a Perl file based on the given parser and additional input, then runs this Perl program via R. Result data are stored in the corresponding output files. It is designed for the SQLite-based annotation package.

writeInput writes additional information, including the input files, into the Perl script. writeOutput writes information about the output files into the Perl script. .callPerl runs a Perl program via R.
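The generate-then-run pattern described above can be sketched in base R. This is only an illustration under stated assumptions, not PAnnBuilder's code: `build_perl_script`, the toy parser line, and the temporary file names are all hypothetical, and the Perl script is executed only when a `perl` binary is found on the PATH.

```r
# Hypothetical sketch: wrap a parser fragment with input/output handling,
# write the result to a .pl file, then run it from R.
build_perl_script <- function(parser_lines, data_file, out_file) {
  c(
    "#!/usr/bin/perl",
    sprintf('open(IN, "<", "%s") or die "cannot open input";', data_file),
    sprintf('open(OUT, ">", "%s") or die "cannot open output";', out_file),
    parser_lines,
    "close(IN);",
    "close(OUT);"
  )
}

# Toy parser fragment (an assumption): upper-case every input line.
parser_lines <- 'while (my $line = <IN>) { print OUT uc($line); }'

# Forward slashes keep the paths safe inside Perl double-quoted strings.
data_file   <- normalizePath(tempfile(fileext = ".txt"), winslash = "/", mustWork = FALSE)
out_file    <- normalizePath(tempfile(fileext = ".txt"), winslash = "/", mustWork = FALSE)
script_file <- normalizePath(tempfile(fileext = ".pl"),  winslash = "/", mustWork = FALSE)

writeLines("p12345", data_file)
script <- build_perl_script(parser_lines, data_file, out_file)
writeLines(script, script_file)

# Run the script only if Perl is available on this machine.
perl <- Sys.which("perl")
if (nzchar(perl)) {
  status <- system2(perl, shQuote(script_file))
}
```

Keeping the parser fragment separate from the input and output boilerplate mirrors the split between parser files, writeInput and writeOutput described in this section.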

getSrcObjs, given the URL of a database and the organism of interest, defines objects of class "pBase". pBase is a subclass of "pubRepo" and is used for SwissProt, TrEMBL, IPI and NCBI RefSeq data.

getBaseData gets basic protein annotation data and sequence data from the protein databases SwissProt, TrEMBL, IPI and NCBI RefSeq.

splitEntry splits multiple entries for a given mapping. twoStepSplit splits multiple entries that use two separators (e.g. 12345@18;67891@18). mergeRowByKey merges duplicated values for the same key.
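To make the one-step and two-step splitting concrete, here is a minimal base R sketch built on `strsplit`. It is not the package's implementation: `split_entry_sketch` and `two_step_split_sketch` are hypothetical helpers, and returning the first element of each pair as the value with the second element as its name is an assumption made for illustration.

```r
# Hypothetical helper: split one string on a separator.
split_entry_sketch <- function(data_row, sep = ";", as_numeric = FALSE) {
  parts <- unlist(strsplit(data_row, sep), use.names = FALSE)
  if (as_numeric) as.numeric(parts) else parts
}

# Hypothetical helper: split into entries first, then split each entry
# into two elements. The returned vector's shape is an assumption.
two_step_split_sketch <- function(data_row, entry_sep = ";", ele_sep = "@") {
  entries <- unlist(strsplit(data_row, entry_sep), use.names = FALSE)
  pieces  <- strsplit(entries, ele_sep)
  values  <- vapply(pieces, function(p) p[1], character(1))
  tags    <- vapply(pieces, function(p) p[2], character(1))
  stats::setNames(values, tags)
}

ids   <- split_entry_sketch("12345;67891", as_numeric = TRUE)
pairs <- two_step_split_sketch("12345@18;67891@18")
```

Here `ids` holds the two numbers 12345 and 67891, and `pairs` holds the strings "12345" and "67891", each named "18".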

----------Value----------

getBaseParsers returns the path of the parser file.

getSrcObjs returns a list of the defined objects of class "pBase".

getBaseData returns a matrix of protein annotation data.

splitEntry returns a vector.

twoStepSplit returns a vector.

mergeRowByKey returns a data frame containing the merged values.
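As an illustration of merging duplicated values for the same key into a data frame, the following base R sketch collapses the values that share a key into one separator-joined string. `merge_by_key_sketch` is a hypothetical helper and the toy identifiers are made up; how the real mergeRowByKey orders rows or names columns is not stated here, so those details are assumptions.

```r
# Hypothetical helper: collapse rows that share a key into one row,
# joining the distinct values of every other column with `sep`.
merge_by_key_sketch <- function(merge_me, key_col = 1, sep = ";") {
  merge_me  <- as.data.frame(merge_me, stringsAsFactors = FALSE)
  keys      <- merge_me[[key_col]]
  uniq_keys <- unique(keys)
  collapse <- function(column) {
    vapply(
      uniq_keys,
      function(k) paste(unique(column[keys == k]), collapse = sep),
      character(1),
      USE.NAMES = FALSE
    )
  }
  merged <- lapply(merge_me[-key_col], collapse)
  data.frame(key = uniq_keys, merged, stringsAsFactors = FALSE)
}

# Toy mapping (made-up values): protein id in column 1, annotation in column 2.
mapping <- cbind(id = c("P1", "P2", "P1"), go = c("GO:1", "GO:2", "GO:3"))
merged  <- merge_by_key_sketch(mapping)
```

In this sketch the two rows for "P1" become a single row whose `go` value is "GO:1;GO:3", while "P2" keeps "GO:2".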

----------Author(s)----------

Hong Li

----------References----------

assembling annotation for genomic data. Bioinformatics 19(1), 155-156.