R GenomicFeatures package: help documentation for the makeTranscriptDb() function

loveR · posted on 2012-2-25 19:27:05

makeTranscriptDb(GenomicFeatures)
makeTranscriptDb() belongs to the R package GenomicFeatures

                                       Making a TranscriptDb object from user-supplied annotations

                                       Translator: LoveR, the robot of 生物统计家园网 (biostatistic.net)

----------Description----------

makeTranscriptDb is a low-level constructor for making a TranscriptDb object from user-supplied transcript annotations. See ?makeTranscriptDbFromUCSC and ?makeTranscriptDbFromBiomart for higher-level functions that feed data from the UCSC or BioMart sources to makeTranscriptDb.

----------Usage----------

  makeTranscriptDb(transcripts, splicings,
               genes=NULL, chrominfo=NULL, metadata=NULL, ...)

----------Arguments----------

Argument: transcripts
Data frame containing the genomic locations of a set of transcripts.

Argument: splicings
Data frame containing the exon and CDS locations of a set of transcripts.

Argument: genes
Data frame containing the genes associated with a set of transcripts.

Argument: chrominfo
Data frame containing information about the chromosomes hosting the set of transcripts.

Argument: metadata
2-column data frame containing meta information about this set of transcripts, such as species, organism, genome, UCSC table, etc. The columns must be named "name" and "value" and must both be of type character.
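
As a minimal sketch of the metadata argument described above, the following base R code builds a 2-column data frame and checks the two stated requirements (column names and character type). The entries "Organism" and "Genome" and their values are made up for illustration; whether makeTranscriptDb reserves or rejects particular names is not covered here.

```r
# Hypothetical metadata entries, for illustration only.
metadata <- data.frame(
    name  = c("Organism", "Genome"),
    value = c("Homo sapiens", "hg19"),
    stringsAsFactors = FALSE  # keep both columns as character, not factor
)

# The two documented requirements: exact column names and character type.
names_ok <- identical(colnames(metadata), c("name", "value"))
types_ok <- all(vapply(metadata, is.character, logical(1)))
```

Setting stringsAsFactors = FALSE explicitly matters on older R versions, where data.frame() converted character columns to factors by default.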

Argument: ...
Ignored for now.

----------Details----------

The transcripts (required), splicings (required) and genes (optional) arguments must be data frames that describe a set of transcripts and the genomic features related to them (currently exons, CDS and genes). The chrominfo (optional) argument must be a data frame containing chromosome information such as the length of each chromosome.

transcripts must have 1 row per transcript and the following columns:

tx_id: Transcript ID. Integer vector. No NAs. No duplicates.

tx_name: [optional] Transcript name. Character vector (or factor).

tx_chrom: Transcript chromosome. Character vector (or factor) with no NAs.

tx_strand: Transcript strand. Character vector (or factor) where each element is either "+" or "-".

tx_start, tx_end: Transcript start and end. Integer vectors with no NAs.

Other columns, if any, are ignored (with a warning).
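
The column rules above can be checked before calling makeTranscriptDb. The sketch below uses only base R; check_transcripts() is a hypothetical helper written for this illustration, not part of GenomicFeatures, and the transcript names are made up.

```r
# Example transcripts table; tx_name values are hypothetical.
transcripts <- data.frame(
    tx_id     = 1:3,
    tx_name   = c("txA", "txB", "txC"),
    tx_chrom  = "chr1",
    tx_strand = c("-", "+", "+"),
    tx_start  = c(1L, 2001L, 2001L),
    tx_end    = c(999L, 2199L, 2199L),
    stringsAsFactors = FALSE
)

# Hypothetical helper: returns a named logical vector, one entry per rule.
check_transcripts <- function(tx) {
    c(
        id_is_integer  = is.integer(tx$tx_id),
        id_no_na       = !anyNA(tx$tx_id),
        id_unique      = !anyDuplicated(tx$tx_id),
        chrom_no_na    = !anyNA(tx$tx_chrom),
        strand_valid   = all(tx$tx_strand %in% c("+", "-")),
        coords_integer = is.integer(tx$tx_start) && is.integer(tx$tx_end),
        coords_no_na   = !anyNA(tx$tx_start) && !anyNA(tx$tx_end)
    )
}

tx_checks <- check_transcripts(transcripts)
```

Returning a named vector rather than stopping at the first failure makes it easy to see which rule a hand-built table violates, for example with names(tx_checks)[!tx_checks].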

splicings must have N rows per transcript, where N is the number of exons in the transcript. Each row describes an exon plus, if there is one, the CDS contained in this exon. Its columns must be:

tx_id: Foreign key that links each row in the splicings data frame to a unique row in the transcripts data frame. Note that more than 1 row in splicings can be linked to the same row in transcripts (many-to-one relationship). Same type as transcripts$tx_id (integer vector). No NAs. All the values in this column must be present in transcripts$tx_id.

exon_rank: The rank of the exon in the transcript. Integer vector with no NAs. (tx_id, exon_rank) pairs must be unique.

exon_id: [optional] Exon ID. Integer vector with no NAs.

exon_name: [optional] Exon name. Character vector (or factor).

exon_chrom: [optional] Exon chromosome. Character vector (or factor) with no NAs. If missing then transcripts$tx_chrom is used. If present then exon_strand must be present too.

exon_strand: [optional] Exon strand. Character vector (or factor) with no NAs. If missing then transcripts$tx_strand is used and exon_chrom must be missing too.

exon_start, exon_end: Exon start and end. Integer vectors with no NAs.

cds_id: [optional] CDS ID. Integer vector. If present then cds_start and cds_end must be present too. NAs are allowed and must match NAs in cds_start and cds_end.

cds_name: [optional] CDS name. Character vector (or factor). If present then cds_start and cds_end must be present too. NAs are allowed and must match NAs in cds_start and cds_end.

cds_start, cds_end: [optional] CDS start and end. Integer vectors. If one of the 2 columns is missing then all cds_* columns must be missing. NAs are allowed and must occur at the same positions in cds_start and cds_end.

Other columns, if any, are ignored (with a warning).
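
The relational constraints on splicings (foreign key into transcripts, unique (tx_id, exon_rank) pairs, NAs at matching positions in the cds columns) can be sketched in base R as follows. The variable tx_ids stands in for transcripts$tx_id and is an assumption of this example; the checks are a pre-flight illustration, not the validation that makeTranscriptDb itself performs.

```r
# Stand-in for transcripts$tx_id (assumed for this illustration).
tx_ids <- 1:3

splicings <- data.frame(
    tx_id      = c(1L, 2L, 2L, 2L, 3L, 3L),
    exon_rank  = c(1L, 1L, 2L, 3L, 1L, 2L),
    exon_start = c(1L, 2001L, 2101L, 2131L, 2001L, 2131L),
    exon_end   = c(999L, 2085L, 2144L, 2199L, 2085L, 2199L),
    cds_start  = c(1L, 2022L, 2101L, 2131L, NA, NA),  # transcript 3 has no CDS
    cds_end    = c(999L, 2085L, 2144L, 2193L, NA, NA)
)

# Every splicings$tx_id must exist in the transcripts table.
fk_ok <- all(splicings$tx_id %in% tx_ids)

# (tx_id, exon_rank) pairs must be unique.
rank_unique <- !anyDuplicated(splicings[, c("tx_id", "exon_rank")])

# NAs must occur at the same positions in cds_start and cds_end.
cds_na_match <- identical(is.na(splicings$cds_start),
                          is.na(splicings$cds_end))

# Number of exons (rows) per transcript: the "N rows per transcript" above.
exons_per_tx <- table(splicings$tx_id)
```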

genes must have N rows per transcript, where N is the number of genes linked to the transcript (N will be 1 most of the time). Its columns must be:

tx_id: [optional] genes must have either a tx_id or a tx_name column but not both. Like splicings$tx_id, this is a foreign key that links each row in the genes data frame to a unique row in the transcripts data frame.

tx_name: [optional] Can be used as an alternative to the genes$tx_id foreign key.

gene_id: Gene ID. Character vector (or factor). No NAs.

Other columns, if any, are ignored (with a warning).
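
A minimal sketch of a genes table keyed by tx_id, with base R checks for the "either tx_id or tx_name, but not both" rule. The gene IDs are made up for illustration.

```r
# Hypothetical gene assignments: transcripts 2 and 3 share one gene.
genes <- data.frame(
    tx_id   = 1:3,
    gene_id = c("geneA", "geneB", "geneB"),
    stringsAsFactors = FALSE
)

# Exactly one of the two possible key columns may be present.
key_cols <- intersect(c("tx_id", "tx_name"), colnames(genes))
one_key  <- length(key_cols) == 1L

# gene_id must not contain NAs.
gene_id_ok <- !anyNA(genes$gene_id)

# Transcripts grouped by gene, to eyeball the mapping.
tx_per_gene <- split(genes$tx_id, genes$gene_id)
```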

chrominfo must have 1 row per chromosome and the following columns:

chrom: Chromosome name. Character vector (or factor) with no NAs.

length: Chromosome length. Either all NAs or an integer vector with no NAs.

is_circular: [optional] Chromosome circularity flag. Either all NAs or a logical vector with no NAs.

Other columns, if any, are ignored (with a warning).
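
The "either all NAs or no NAs" rule for length and is_circular can be expressed with a small helper. This is a sketch in base R: all_or_none_na() is a hypothetical function written for this example, and the chromosome lengths are made up.

```r
# Hypothetical chromosome table with made-up lengths.
chrominfo <- data.frame(
    chrom       = c("chr1", "chrM"),
    length      = c(5000L, 1000L),
    is_circular = c(FALSE, TRUE),
    stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if x is entirely NA or contains no NA at all.
all_or_none_na <- function(x) all(is.na(x)) || !anyNA(x)

chrom_ok    <- !anyNA(chrominfo$chrom) && !anyDuplicated(chrominfo$chrom)
length_ok   <- all_or_none_na(chrominfo$length) &&
               (all(is.na(chrominfo$length)) || is.integer(chrominfo$length))
circular_ok <- all_or_none_na(chrominfo$is_circular) &&
               is.logical(chrominfo$is_circular)
```

A column of plain NA values is of type logical in R, so an all-NA is_circular column also passes the is.logical() test.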

----------Value----------

A TranscriptDb object.

----------Author(s)----------

H. Pages

----------See Also----------

TranscriptDb, makeTranscriptDbFromUCSC, makeTranscriptDbFromBiomart

----------Examples----------

library(GenomicFeatures)

## Three transcripts on chr1, one row per transcript.
transcripts <- data.frame(
               tx_id=1:3,
               tx_chrom="chr1",
               tx_strand=c("-", "+", "+"),
               tx_start=c(1, 2001, 2001),
               tx_end=c(999, 2199, 2199))

## One row per exon; tx_id links each exon to its transcript.
## Transcript 3 has no CDS, hence the NAs in cds_start and cds_end.
splicings <-  data.frame(
               tx_id=c(1L, 2L, 2L, 2L, 3L, 3L),
               exon_rank=c(1, 1, 2, 3, 1, 2),
               exon_start=c(1, 2001, 2101, 2131, 2001, 2131),
               exon_end=c(999, 2085, 2144, 2199, 2085, 2199),
               cds_start=c(1, 2022, 2101, 2131, NA, NA),
               cds_end=c(999, 2085, 2144, 2193, NA, NA))

txdb <- makeTranscriptDb(transcripts, splicings)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

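Notes:
Note 1: To make learning easier, this document was translated by LoveR, the 生物统计家园网 robot. It is intended only as a personal reference for learning R; 生物统计家园 reserves the copyright.
Note 2: Because the translation was produced automatically by a robot, inaccuracies are unavoidable. Carefully comparing the Chinese and English text while reading can help with learning R.
Note 3: If you find an inaccuracy, please reply to this post and we will gradually revise the text.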
