A simulated data set for CNV detection from NGS data.
----------Description----------
This data set gives the read counts of 40 samples at 5000 genomic locations. The rows correspond to genomic segments of 25 kbp length and the columns to samples. Each entry is the number of reads from the given sample that map to the given segment. The row names contain the genomic location in the format refseqname_startposition_endposition. The simulated data contains the CNVs given in the CNVRanges object. It was generated using distributions of read counts as they appear in real sequencing experiments. CNVs were implanted under the assumption that the expected read count depends linearly on the copy number (e.g. if we expect λ reads for copy number 2 in a certain genomic segment, then we expect 2λ reads for copy number 4).
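The linearity assumption can be written compactly. The formula below is a sketch of one reading of that assumption, namely strict proportionality; the description itself only states the copy number 2 versus copy number 4 example. Here λ denotes the expected read count of a segment at copy number 2, x the observed read count, and c the copy number; the symbols x and c are introduced here for illustration.

```latex
\mathbb{E}[x \mid \mathrm{CN} = c] = \frac{c}{2}\,\lambda,
\qquad \text{e.g. } c = 4 \;\Rightarrow\; 2\lambda, \quad c = 1 \;\Rightarrow\; \tfrac{\lambda}{2}.
```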
----------Usage----------
X
----------Format----------
A data matrix of 5000 rows and 40 columns.
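As a minimal sketch of working with this layout, the snippet below builds a small toy matrix with the same shape conventions (segments in rows, samples in columns) and splits the row names into reference name, start and end. The toy counts, row names, sample names and the helper parse_segment_names are all made up for illustration and are not part of the package. The helper splits from the right, on the assumption that a reference sequence name may itself contain an underscore.

```r
# Toy stand-in for X: 3 segments x 2 samples (the real X is 5000 x 40).
# Row names, sample names and counts are invented for illustration.
counts <- matrix(
  c(101L, 95L, 203L, 110L, 99L, 104L),
  nrow = 3,
  dimnames = list(
    c("chrA_1_25000", "chrA_25001_50000", "NC_000001_50001_75000"),
    c("S_1", "S_2")
  )
)

# Hypothetical helper: parse "refseqname_start_end" row names.
# The greedy (.*) keeps underscores inside the reference name intact,
# so only the last two "_"-separated fields are read as positions.
parse_segment_names <- function(ids) {
  m <- regmatches(ids, regexec("^(.*)_([0-9]+)_([0-9]+)$", ids))
  data.frame(
    seqname = vapply(m, `[`, character(1), 2),
    start   = as.integer(vapply(m, `[`, character(1), 3)),
    end     = as.integer(vapply(m, `[`, character(1), 4)),
    stringsAsFactors = FALSE
  )
}

segments <- parse_segment_names(rownames(counts))

# Segment length in bp, using 1-based inclusive coordinates (an assumption).
segments$width <- segments$end - segments$start + 1L
```

With the package data loaded, the same helper could be applied to rownames(X) in place of the toy matrix.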