integrated.analysis(SIM)
integrated.analysis() belongs to R package: SIM
Integrated analysis of dependent and independent microarray data
Translator: 生物统计家园网, robot LoveR
----------Description----------
Runs the Integrated Analysis to test for associations between dependent and independent microarray data.
----------Usage----------
integrated.analysis(samples,
    input.regions = "all chrs",
    input.region.indep = NULL,
    zscores = FALSE,
    method = c("full", "smooth", "window", "overlap"),
    dep.end = 1e5,
    window = c(1e6, 1e6),
    smooth.lambda = 2,
    adjust = ~1,
    run.name = "analysis_results",
    ...)
----------Arguments----------
Argument: samples
vector with either the names of the columns in the dependent and independent data corresponding to the samples, or a numerical vector containing the column numbers to include in the analysis, e.g. 5:10 means columns 5 through 10. Make sure that both datasets have the same number of samples with the same column names!
Argument: input.regions
vector indicating the dependent regions to be analyzed. Can be defined in four ways:
1) predefined input region: insert a predefined input region; choices are “all chrs”, “all chrs auto”, “all arms” and “all arms auto”. In the predefined regions “all arms” and “all arms auto” the arms 13p, 14p, 15p, 21p and 22p are left out, because in most studies there are no or few probes in these regions. To include them, make your own vector of arms.
2) whole chromosome(s): insert a single chromosome or a list of chromosomes as a vector: c(1, 2, 3).
3) chromosome arms: insert a single chromosome arm or a list of chromosome arms like c("1q", "2p", "2q").
4) subregions of a chromosome: insert a chromosome number followed by the start and end position, like "chr1:1-1000000".
These regions can also be combined, e.g. c("chr1:1-1000000", "2q", 3). See Details for more information.
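The region formats described for input.regions can be illustrated with a small base-R sketch. The helper parse_subregion below is hypothetical and is not part of SIM; it only shows how a subregion string of the form "chr1:1-1000000" breaks down into chromosome, start and end, and that mixing numbers and strings in c() yields a character vector.

```r
# The three kinds of region can be mixed in one vector. c() coerces the
# numeric chromosome 3 to the string "3", so the result is a character vector.
regions <- c("chr1:1-1000000", "2q", 3)

# Hypothetical helper (NOT a SIM function): split a subregion string such as
# "chr1:1-1000000" into its chromosome, start and end position.
parse_subregion <- function(x) {
  m <- regmatches(x, regexec("^chr([0-9XY]+):([0-9]+)-([0-9]+)$", x))[[1]]
  if (length(m) == 0) return(NULL)  # not a subregion string
  list(chr = m[2], start = as.numeric(m[3]), end = as.numeric(m[4]))
}

sub <- parse_subregion(regions[1])  # chromosome "1", start 1, end 1000000
arm <- parse_subregion(regions[2])  # NULL: "2q" is an arm, not a subregion
```

How SIM itself parses these strings is not shown in this page, so the regular expression above is an assumption about the format, not a description of the package internals.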
Argument: input.region.indep
indicates the independent region that will be analysed in combination with the dependent region. Only one input region can be given, using the same format as the dependent input region.
Argument: zscores
logical; indicates whether the Z-scores are calculated (takes longer to run). If zscores = FALSE, only P-values are calculated.
Argument: method
one of “full”, “window”, “overlap” or “smooth”. This defines how the data is used for the integrated.analysis.
full: the whole dependent data region is taken.
window: takes the middle of the dependent probe and does the integration on the independent probes that fall within the window whose size is given by window.
overlap: does the integration on the independent probes that lie between the start and end of the dependent probes, with the end given by dep.end.
smooth: smooths the dependent probes with the smoothing factor given by smooth.lambda, finds the smoothed value for each independent probe and does the integration on those values. smooth.lambda is only needed when method = "smooth"; the default is smooth.lambda = 2.
Argument: dep.end
numeric or character; either the name of the column “end” in the dependent data or, when that column is not available, a numeric value giving the offset of the end from the start. When a numeric value is given, the function computes end = start + dep.end. Only needed when method = "window" or "overlap".
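The numeric form of dep.end amounts to a simple offset. A minimal sketch with made-up start positions (the variable names dep_start and dep_end_pos are illustrative only and do not come from SIM):

```r
# Made-up start positions for three dependent probes.
dep_start <- c(1000, 250000, 900000)

# The default offset from the Usage section.
dep.end <- 1e5

# end = start + dep.end, as described for a numeric dep.end.
dep_end_pos <- dep_start + dep.end
```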
Argument: window
numeric values. Window to search for overlapping independent features per dependent probe. The first value is the number of positions to the left of the middle of the probe, the second value is the number of positions to the right of the middle of the probe. Only needed when method = "window".
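The window search can be pictured with a few lines of base R. This is only a sketch under stated assumptions: the probe positions are made up, the variable names are illustrative, and whether SIM treats the window boundaries as inclusive is not stated in this page (inclusive bounds are assumed here).

```r
# Made-up coordinates for one dependent probe and five independent probes.
dep_start <- 5.0e6
dep_end   <- 5.1e6
indep_pos <- c(3.5e6, 4.2e6, 5.05e6, 6.0e6, 6.5e6)

# Default window: 1e6 positions to the left and to the right of the middle.
window <- c(1e6, 1e6)

# Middle of the dependent probe and the resulting search interval.
mid   <- (dep_start + dep_end) / 2
lower <- mid - window[1]
upper <- mid + window[2]

# Independent probes inside the window (inclusive bounds are an assumption).
in_window <- indep_pos >= lower & indep_pos <= upper
selected  <- indep_pos[in_window]
```

An asymmetric window such as c(5e5, 2e6) would shift the interval towards the right of the probe without any other change to the sketch.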
Argument: smooth.lambda
numeric; factor used for smoothing the dependent data. Only needed when method = "smooth". See quantsmooth for more information. By default the segment = min(nrow(dep.data), 100).
Argument: adjust
formula; a formula like ~gender, where gender is a vector of the same size as samples. The regression model is corrected for the gender effect, see gt.
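A one-sided formula such as ~gender only names the covariate; the covariate itself must exist with one entry per sample. The sketch below uses made-up sample names and gender labels, and model.matrix from base R merely illustrates the kind of design matrix a regression builds from such a formula. How gt handles the formula internally is not shown in this page.

```r
# Made-up sample names and covariate, one entry per sample in the same order.
sample_names <- c("s1", "s2", "s3", "s4")
gender <- factor(c("F", "M", "F", "M"))

# The covariate must be the same size as the samples vector.
stopifnot(length(gender) == length(sample_names))

# One-sided formula in the form expected by the adjust argument.
adjust <- ~gender

# Design matrix built from the formula: an intercept plus an indicator for "M".
X <- model.matrix(adjust)
```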
Argument: run.name
character; name of the analysis. The results will be stored in a folder with this name in the current working directory (use getwd() to print the current working directory). If missing, the default folder "analysis_results" will be generated.
Argument: ...
additional arguments for gt, e.g. model = "logistic", or permutations > 0 to estimate the null distribution using permutations, see gt. See Details.
----------Details----------
The Integrated Analysis is a regression of the independent data on the dependent features. The regression itself is done using gt, which means that the genes in a region (e.g. a chromosome arm) are tested as a gene set. The individual associations between each dependent and each independent feature are calculated as Z-scores (standardized influences, see ?gt).
This function splits the datasets into separate sets for each region (as specified by input.regions) and runs the analysis for each region separately.
When running the Integrated Analysis for a predefined input region, like “all arms” or “all chrs”, output can be obtained for all input regions, as well as for subsets of them. Note, however, that the genomic unit must be the same: if integrated.analysis was run using chromosomes as units, all functions and plots must also use chromosomes as units, and not chromosome arms. Similarly, if integrated.analysis was run using chromosome arms as units, these units must also be used to produce plots and outputs. For example, if input.regions = "all arms" was used, P-value plots (see sim.plot.pvals.on.region) can be produced with input.regions = "all arms", but also with, for instance, “1p” or “20q”. However, to produce a plot of a whole chromosome, for example chromosome 1, integrated.analysis should be re-run with input.regions = 1. The same goes for “all chrs”: P-value plots etc. can be produced for chromosome 1, 2 and so on, but to produce plots for an arm, integrated.analysis should be re-run for that region. This also applies to subregions of a chromosome like "chr1:1-1000000".
By default gt uses a “linear” model; a “logistic” model is selected only when the dependent data is a logical matrix containing TRUE and FALSE. All other models must be specified explicitly with the model argument; see gt for available models.
----------Value----------
No values are returned. Instead, the results of the analysis are stored in subdirectories of the directory specified in run.name. For example, the Z-score matrices are saved in the subfolder named by method.
The following functions can be used to visualize the data:
1) sim.plot.zscore.heatmap (only possible when zscores = TRUE)
2) sim.plot.pvals.on.region
3) sim.plot.pvals.on.genome
4) sim.plot.overlapping.indep.dep.features
Other functions can be used to tabulate the results:
1) tabulate.pvals
2) tabulate.top.dep.features
3) tabulate.top.indep.features (only possible when zscores = TRUE)
4) getoverlappingregions (only possible after tabulate.top.dep.features and tabulate.top.indep.features have been run)
----------Author(s)----------
Marten Boetzer, Melle Sieswerda, Renee X. de Menezes <R.X.Menezes@lumc.nl>
----------References----------
Integrated analysis of DNA copy number and gene expression microarray data using gene sets. BMC Bioinformatics, 10, 203.
A global test for groups of genes: testing association with a clinical outcome. Bioinformatics, 20, 93-109.
----------See Also----------
SIM, sim.plot.zscore.heatmap, sim.plot.pvals.on.region, sim.plot.pvals.on.genome, tabulate.pvals, tabulate.top.dep.features, tabulate.top.indep.features, getoverlappingregions, sim.plot.overlapping.indep.dep.features, gt
----------Examples----------
library(SIM)
# first run example(assemble.data)
data(samples)
# perform integrated analysis without Z-scores using method = "full"
integrated.analysis(samples = samples,
    input.regions = "8q",
    zscores = FALSE,
    method = "full",
    run.name = "chr8q")
Please credit the source when reposting: 生物统计家园网 (http://www.biostatistic.net).