RangedData-class(IRanges)
RangedData-class() belongs to the R package IRanges
Data on ranges
Translator: 生物统计家园网 robot LoveR
Description
RangedData supports storing data, i.e. a set of variables, on a set of ranges spanning multiple spaces (e.g. chromosomes). Although the data is split across spaces, it can still be treated as one cohesive dataset when desired, and the class extends DataTable. To handle large datasets, the data values are stored externally to avoid copying, and the rdapply function facilitates processing each space separately (divide and conquer).
Details
A RangedData object consists of two primary components: a RangesList holding the ranges over multiple spaces and a parallel SplitDataFrameList holding the split data. There is also a universe slot for denoting the source (e.g. the genome) of the ranges and/or data.
There are two different modes of interacting with a RangedData. The first mode treats the object as a contiguous "data frame" annotated with range information. The accessors start, end, and width return the corresponding fields of the ranges as atomic integer vectors, undoing the division over the spaces. The [[ and matrix-style [ extraction and subsetting functions unroll the data in the same way, and [[<- does the inverse. The number of rows is defined as the total number of ranges, and the number of columns is the number of variables in the data. It is often convenient and natural to treat the data this way, at least when the data is small and there is no need to distinguish the ranges by their space.
The other mode treats the RangedData as a list, with one element (a virtual Ranges/DataFrame pair) for each space. The length of the object is defined as the number of spaces, and the names accessor returns the names of the spaces. The list-style [ subset function behaves analogously. The rdapply function provides a convenient and formal means of applying an operation over the spaces separately. This mode is helpful when ranges from different spaces must be treated separately or when the data is too large to process over all spaces at once.
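The two modes described above can be sketched with a small toy object. This is only an illustration: it assumes an IRanges version that still provides RangedData, and the ranges, scores and space names are made up for the example.

```r
library(IRanges)

# Hypothetical toy data: three ranges spread over two spaces.
rng <- IRanges(start = c(1, 2, 3), end = c(4, 5, 6))
rd <- RangedData(rng, score = c(10L, 2L, 7L),
                 space = c("chr1", "chr1", "chr2"))

# Table mode: rows are ranges, columns are variables.
n_ranges <- nrow(rd)       # total number of ranges over all spaces
n_vars   <- ncol(rd)       # number of data variables
scores   <- rd[["score"]]  # one vector, unrolled over the spaces

# List mode: one element per space.
n_spaces    <- length(rd)  # number of spaces
space_names <- names(rd)   # names of the spaces
second      <- rd[2]       # RangedData holding only the second space
```

Here nrow counts ranges while length counts spaces, which is the main difference between the two views.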
Accessor methods
In the code snippets below, x is a RangedData object.
The following accessors treat the data as a contiguous dataset, ignoring the division into spaces:
Array accessors:
nrow(x): The number of ranges in x.
ncol(x): The number of data variables in x.
dim(x): An integer vector of length two, essentially c(nrow(x), ncol(x)).
rownames(x), rownames(x) <- value: Gets or sets the names of the ranges in x.
colnames(x), colnames(x) <- value: Gets or sets the names of the variables in x.
dimnames(x): A list with two elements, essentially list(rownames(x), colnames(x)).
dimnames(x) <- value: Sets the row and column names, where value is a list as described above.
columnMetadata(x): Gets the DataFrame of metadata along the value columns, i.e., where each column in x is represented by a row in the metadata. Note that calling elementMetadata(x) returns the metadata on each space in x.
columnMetadata(x) <- value: Sets the DataFrame of metadata for the columns.
within(data, expr, ...): Evaluates expr within data, a RangedData. Any values assigned in expr are stored as value columns in data, unless they match one of the reserved names: ranges, start, end, width and space. Behavior is undefined if any of the range symbols are modified inconsistently. Modifications to space are ignored.
Range accessors. The type of the return value depends on the type of Ranges; for IRanges, it is an integer vector. Regardless, the number of elements is always equal to nrow(x).
start(x), start(x) <- value: Gets or sets the starts of the ranges. When setting the starts, value can be an integer vector of length sum(elementLengths(ranges(x))) or an IntegerList object of length length(ranges(x)) with names names(ranges(x)).
end(x), end(x) <- value: Gets or sets the ends of the ranges. When setting the ends, value can be an integer vector of length sum(elementLengths(ranges(x))) or an IntegerList object of length length(ranges(x)) with names names(ranges(x)).
width(x), width(x) <- value: Gets or sets the widths of the ranges. When setting the widths, value can be an integer vector of length sum(elementLengths(ranges(x))) or an IntegerList object of length length(ranges(x)) with names names(ranges(x)).
These accessors make the object seem like a list along the spaces:
length(x): The number of spaces (e.g. chromosomes) in x.
names(x), names(x) <- value: Gets or sets the names of the spaces (e.g. "chr1"). The value is NULL or a character vector of the same length as x.
Other accessors:
universe(x), universe(x) <- value: Gets or sets the scalar string identifying the scope of the data in some way (e.g. genome, experimental platform, etc.). The universe may be NULL.
ranges(x), ranges(x) <- value: Gets or sets the ranges in x as a RangesList.
space(x): Gets the spaces from ranges(x).
values(x), values(x) <- value: Gets or sets the data values in x as a SplitDataFrameList.
score(x), score(x) <- value: Gets or sets the column representing a "score" in x, as a vector. This is the column named score or, if that does not exist, the first column, provided it is numeric. The get method returns NULL if no suitable score column is found. The set method takes a numeric vector as its value.
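A short sketch of the accessors listed above, assuming an IRanges version that still provides RangedData. The object and its values are invented for illustration, and the spaces are given in sorted order so that the unrolled order of the ranges is unambiguous.

```r
library(IRanges)

# Hypothetical toy object: two ranges on "chr1", one on "chr2".
rng <- IRanges(start = c(1, 2, 3), end = c(4, 5, 6))
rd <- RangedData(rng, score = c(10L, 2L, 7L),
                 space = c("chr1", "chr1", "chr2"))

# Array-style accessors ignore the division into spaces.
d  <- dim(rd)        # c(nrow(rd), ncol(rd))
cn <- colnames(rd)   # names of the data variables

# Range accessors return one element per range.
s <- start(rd)
e <- end(rd)
w <- width(rd)

# Other accessors.
sp <- space(rd)             # space of each range
u_before <- universe(rd)    # NULL unless a universe was supplied
universe(rd) <- "hg18"      # label the source of the data
sc <- score(rd)             # the column named "score"
```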
Constructor
RangedData(ranges = IRanges(), ..., space = NULL, universe = NULL): Creates a RangedData with the ranges in ranges and variables given by the arguments in .... See the constructor DataFrame for how the ... arguments are interpreted.
If ranges is a Ranges object, the space argument is used to split the data into spaces. If space is NULL, all of the ranges and values are placed into the same space, resulting in a single-space (length one) RangedData object. Otherwise, the ranges and values are split into spaces according to space, which is treated as a factor, like the f argument in split.
If ranges is a RangesList object, then the supplied space argument is ignored and its value is derived from ranges.
If ranges is not a Ranges or RangesList object, this function calls as(ranges, "RangedData") and returns the result if successful.
The universe may be specified as a scalar string by the universe argument.
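The three ways of supplying spaces to the constructor can be compared side by side. This is a minimal sketch with made-up data, assuming an IRanges version that still provides RangedData and RangesList.

```r
library(IRanges)

rng <- IRanges(start = c(1, 2, 3), end = c(4, 5, 6))
sc  <- c(10L, 2L, 7L)   # hypothetical scores, one per range

# space = NULL: everything lands in a single space.
one <- RangedData(rng, score = sc)

# space given: ranges and values are split like split(x, f).
two <- RangedData(rng, score = sc,
                  space = c("chr1", "chr1", "chr2"))

# RangesList input: the spaces come from the list itself.
rl <- RangesList(chrA = IRanges(start = 1, end = 4),
                 chrB = IRanges(start = c(2, 3), end = c(5, 6)))
three <- RangedData(rl, score = sc, universe = "hg18")
```

In the last call the score values are matched to the ranges in their unrolled order, i.e. chrA first and then chrB.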
Coercion
as.data.frame(x, row.names=NULL, optional=FALSE, ...): Copies the start, end, and width of the ranges and all of the variables as columns in a data.frame. This is a bridge to existing functionality in R, but care must be taken if the data is large. Note that optional and ... are ignored.
as(from, "DataFrame"): Like as.data.frame above, except that the result is a DataFrame and it probably involves less copying, especially if there is only a single space.
as(from, "RangedData"): Coerces from to a RangedData, according to the type of from:
A run-length encoded object (e.g. an Rle): Converts each run to a range and stores the run values in a column named "score".
An object holding only ranges (e.g. a RangesList): Creates a RangedData with only the ranges in from; no data columns.
data.frame or DataTable: Constructs a RangedData using the "start", "end", and, optionally, "space" columns in from. The other columns become data columns in the result. Any "width" column is ignored.
as.env(x, enclos = parent.frame()): Creates an environment with a symbol for each variable in the frame, as well as a ranges symbol for the ranges. This is efficient, as no copying is performed.
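As a rough illustration of the data.frame bridge described above, the sketch below converts a toy RangedData to a data.frame and back. It assumes an IRanges version that still provides RangedData; the data are invented, and the exact column order of the data.frame is not relied upon.

```r
library(IRanges)

rng <- IRanges(start = c(1, 2, 3), end = c(4, 5, 6))
rd <- RangedData(rng, score = c(10L, 2L, 7L),
                 space = c("chr1", "chr1", "chr2"))

# Copy ranges and variables into an ordinary data.frame.
df <- as.data.frame(rd)
df_cols <- colnames(df)

# Go back: "start", "end" (and "space" if present) define the ranges,
# any "width" column is ignored, the rest become data columns.
back <- as(df, "RangedData")
```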
Subsetting and Replacement
In the code snippets below, x is a RangedData object.
x[i]: Subsets x by indexing into its spaces, so the result is of the same class, with a different set of spaces. i can be numeric, logical, NULL or missing.
x[i,j]: Subsets x by indexing into its rows and columns. The result is of the same class, with a different set of rows and columns. The row index i can either treat x as a flat table, by being a character, integer, or logical vector, or treat x as a partitioned table, by being a RangesList, LogicalList, or IntegerList of the same length as x.
x[[i]]: Extracts a variable from x, where i can be a character, numeric, or logical scalar that indexes into the columns. The variable is unlisted over the spaces.
For convenience, i = "space" and i = "ranges" are equivalent to space(x) and unlist(ranges(x)), respectively.
x$name: Similar to the above, where name is taken literally as a column name in the data.
x[[i]] <- value: Sets value as column i in x, where i can be a character, numeric, or logical scalar that indexes into the columns. The length of value should equal nrow(x). x[[i]] should be identical to value after this operation.
For convenience, i = "ranges" is equivalent to ranges(x) <- value.
x$name <- value: Similar to the above, where name is taken literally as a column name in the data.
Splitting and Combining
In the code snippets below, x is a RangedData object.
split(x, f, drop = FALSE): Splits x according to f, which should be of length equal to nrow(x). Note that drop is ignored here. The result is a RangedDataList where every element has the same length (number of spaces) but different sets of ranges within each space.
rbind(...): Matches the spaces from the RangedData objects in ... by name and combines them row-wise. In a way, this is the reverse of the split operation described above.
c(x, ..., recursive = FALSE): Combines x with the arguments specified in ..., which must all be RangedData objects. This combination acts as if x is a list of spaces, meaning that the result will contain the spaces of the first concatenated with the spaces of the second, and so on. This function is useful when creating RangedData objects on a space-by-space basis and then needing to combine them.
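The difference between rbind (row-wise, spaces matched by name) and c (space-wise concatenation) can be sketched as follows. The objects are hypothetical toy data, and the sketch assumes an IRanges version that still provides RangedData.

```r
library(IRanges)

rng <- IRanges(start = c(1, 2, 3), end = c(4, 5, 6))
rd <- RangedData(rng, score = c(10L, 2L, 7L),
                 space = c("chr1", "chr1", "chr2"))

# split: one RangedData per level of f, collected in a RangedDataList.
parts <- split(rd, c(1, 2, 1))

# c: treat each object as a list of spaces and concatenate the spaces.
joined <- c(rd[1], rd[2])

# rbind: two objects sharing the space "chr1" are stacked row-wise.
a <- RangedData(IRanges(start = 1, end = 4), score = 1L, space = "chr1")
b <- RangedData(IRanges(start = 10, end = 14), score = 2L, space = "chr1")
ab <- rbind(a, b)
```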
Utilities
In the code snippets below, x is a RangedData object.
reduce(x, by = character(), drop.empty.ranges=FALSE, min.gapwidth=1L, with.inframe.attrib = FALSE): Merges the ranges in each of the spaces after grouping by the value columns named in by, and returns the result as a RangedData containing the reduced ranges and the by value columns.
Applying
There are two explicitly supported ways to apply a function over the spaces of a RangedData. The richest interface is rdapply, which is described in its own man page. The simpler interface is an lapply method:
lapply(X, FUN, ...): Applies FUN to each space in X, passing the extra parameters in ....
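A minimal sketch of the lapply interface, assuming an IRanges version that still provides RangedData and that FUN receives each space as a single-space RangedData. The data are made up for illustration.

```r
library(IRanges)

rng <- IRanges(start = c(1, 2, 3), end = c(4, 5, 6))
rd <- RangedData(rng, score = c(10L, 2L, 7L),
                 space = c("chr1", "chr1", "chr2"))

# Number of ranges in each space.
counts <- lapply(rd, nrow)

# Total width covered per space; the helper function is only illustrative.
total_width <- lapply(rd, function(sp) sum(width(sp)))
```

For per-space processing that needs more control, the rdapply man page is the place to look.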
Author(s)
Michael Lawrence
See Also
DataTable, the parent of this class, with more utilities. RangedData-utils for utilities and the rdapply function for applying a function to each space separately.
Examples
ranges <- IRanges(c(1,2,3),c(4,5,6))
filter <- c(1L, 0L, 1L)
score <- c(10L, 2L, NA)
## constructing RangedData instances
## no variables
rd <- RangedData()
rd <- RangedData(ranges)
ranges(rd)
## one variable
rd <- RangedData(ranges, score)
rd[["score"]]
## multiple variables
rd <- RangedData(ranges, filter, vals = score)
rd[["vals"]] # same as rd[["score"]] above
rd$vals
rd[["filter"]]
rd <- RangedData(ranges, score + score)
rd[["score...score"]] # names made valid
## use a universe
rd <- RangedData(ranges, universe = "hg18")
universe(rd)
rd <- RangedData(
  RangesList(
    chrA = IRanges(start = c(1, 4, 6), width=c(3, 2, 4)),
    chrB = IRanges(start = c(1, 3, 6), width=c(3, 3, 4))),
  score = c(2, 7, 3, 1, 1, 1))
rd
reduce(rd)
## split some data over chromosomes
range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5))
both <- c(ranges, range2)
score <- c(score, c(0L, 3L, NA, 22L))
filter <- c(filter, c(0L, 1L, NA, 0L))
chrom <- paste("chr", rep(c(1,2), c(length(ranges), length(range2))), sep="")
rd <- RangedData(both, score, filter, space = chrom, universe = "hg18")
rd[["score"]] # identical to score
rd[1][["score"]] # identical to score[1:3]
## subsetting
## list style: [i]
rd[numeric()] # these three are all empty
rd[logical()]
rd[NULL]
rd[] # missing, full instance returned
rd[FALSE] # logical, supports recycling
rd[c(FALSE, FALSE)] # same as above
rd[TRUE] # like rd[]
rd[c(TRUE, FALSE)]
rd[1] # numeric index
rd[c(1,2)]
rd[-2]
## matrix style: [i,j]
rd[,NULL] # no columns
rd[NULL,] # no rows
rd[,1]
rd[,1:2]
rd[,"filter"]
rd[1,] # now by the rows
rd[c(1,3),]
rd[1:2, 1] # row and column
rd[c(1:2,1,3),1] # repeating rows
## dimnames
colnames(rd)[2] <- "foo"
colnames(rd)
rownames(rd) <- head(letters, nrow(rd))
rownames(rd)
## space names
names(rd)
names(rd)[1] <- "chr1"
## variable replacement
count <- c(1L, 0L, 2L)
rd <- RangedData(ranges, count, space = c(1, 2, 1))
## adding a variable
score <- c(10L, 2L, NA)
rd[["score"]] <- score
rd[["score"]] # same as 'score'
## replacing a variable
count2 <- c(1L, 1L, 0L)
rd[["count"]] <- count2
## numeric index also supported
rd[[2]] <- score
rd[[2]] # gets 'score'
## removing a variable
rd[[2]] <- NULL
ncol(rd) # is only 1
rd$score2 <- score
## combining/splitting
rd <- RangedData(ranges, score, space = c(1, 2, 1))
c(rd[1], rd[2]) # equal to 'rd'
rd2 <- RangedData(ranges, score)
unlist(split(rd2, c(1, 2, 1))) # same as 'rd'
## applying
lapply(rd, `[[`, 1) # get first column in each space
When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).