R phenoTest package: findCopyNumber() function help documentation

Posted on 2012-2-26 11:04:02
findCopyNumber(phenoTest)
findCopyNumber() belongs to the R package phenoTest

Find copy number regions using expression data, in a way similar to ACE.

Translator: 生物统计家园网 robot LoveR

----------Description----------

Given enrichment scores between two groups of samples and the chromosomal positions of those scores, this function finds areas where the enrichment is higher or lower than would be expected if the positions were assigned at random. It provides plots of the regions and the positions of the enriched regions.


----------Usage----------


findCopyNumber(x, minGenes = 15, B = 100, p.adjust.method = "BH",
pvalcutoff = 0.05, exprScorecutoff = NA, mc.cores = 1, useAllPerm = F,
genome = "hg19", chrLengths, sampleGenome = TRUE, useOneChr = FALSE,
useIntegrate = TRUE,plot=TRUE)



----------Arguments----------

Argument: x
A data.frame with gene or probe identifiers as row names and the following columns: es (the enrichment score), chr (the chromosome the gene or probe belongs to) and pos (position in the chromosome, in megabases). It can be obtained from an epheno object with the function getEsPositions.
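
To make the expected layout of x concrete, here is a minimal sketch that builds such a data.frame by hand. The probe identifiers, scores and positions are invented for illustration; in practice x would normally come from getEsPositions.

```r
# Hypothetical toy input: identifiers, scores and positions are made up.
set.seed(1)
n <- 6
x <- data.frame(
  es  = rnorm(n),                         # enrichment score (e.g. a log fold change)
  chr = c("1", "1", "1", "2", "2", "2"),  # chromosome of each gene or probe
  pos = c(0.5, 1.2, 3.4, 0.8, 2.1, 5.0),  # position in the chromosome, in megabases
  row.names = paste0("probe_", seq_len(n)),
  stringsAsFactors = FALSE
)

# Basic sanity checks on the layout described above.
stopifnot(all(c("es", "chr", "pos") %in% names(x)))
stopifnot(is.numeric(x$es), is.numeric(x$pos))
head(x)
```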


Argument: minGenes
Minimum number of consecutive genes that have to be enriched for the region to be marked as enriched. Must be greater than 2.


Argument: B
Number of permutations computed to calculate p-values. If useAllPerm is FALSE this value has to be bigger than 100. If useAllPerm is TRUE the computations are much more expensive, so a B bigger than 100 is not recommended.


Argument: p.adjust.method
P-value adjustment method to be used. p.adjust.methods provides a list of the available methods.
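
As a small illustration of what the adjustment does, the sketch below applies base R's p.adjust with the "BH" method to a made-up vector of p-values and then applies a 0.05 cutoff. The p-values are invented; only p.adjust and p.adjust.methods are real base R objects.

```r
# Methods accepted by p.adjust (and therefore usable as p.adjust.method).
print(p.adjust.methods)

# Made-up raw p-values for five genes.
pvals <- c(0.001, 0.01, 0.02, 0.03, 0.5)

# Benjamini-Hochberg adjustment, the default in the usage shown above.
adj <- p.adjust(pvals, method = "BH")

# Genes whose adjusted p-value falls below the cutoff.
enriched <- adj < 0.05
print(data.frame(pvals, adj, enriched))
```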


Argument: pvalcutoff
All genes with an adjusted p-value lower than this parameter will be considered enriched.


Argument: exprScorecutoff
Genes whose smoothed score is not bigger than the specified value (lower, if the given number is negative) will not be considered significant.


Argument: mc.cores
Number of cores to be used in the computation. If mc.cores is bigger than 1 the multicore library has to be loaded.


Argument: useAllPerm
If FALSE, for each gene only the permutations of genes that are in an area with similar density (a similar number of genes close to them) are used to compute p-values. If TRUE, all permutations are used for each gene. We recommend FALSE, having observed that the enrichment can depend on the number of genes in the area. We recommend TRUE if the positions of the enrichment scores are equidistant. Take into account that this option is much slower and needs fewer permutations, so a smaller B is preferred. See Details for more information.


Argument: genome
Genome that will be used to draw cytobands.


Argument: chrLengths
A numeric vector whose names are chromosome names. These names have to be the same as the ones used in x$chr. If missing, the last position is used.
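
A minimal sketch of what such a vector could look like, together with the fallback described above (the last position on each chromosome). The lengths are invented and are assumed here to be in the same units as pos; the toy x is hypothetical.

```r
# Hypothetical toy input with two chromosomes.
x <- data.frame(
  es  = c(0.2, -0.1, 0.4, 0.3, -0.5),
  chr = c("1", "1", "1", "2", "2"),
  pos = c(0.5, 1.2, 3.4, 0.8, 2.1),
  stringsAsFactors = FALSE
)

# Explicit chromosome lengths (made-up values); names must match x$chr.
chrLengths <- c("1" = 4.0, "2" = 3.0)
stopifnot(all(unique(x$chr) %in% names(chrLengths)))

# What "the last position is used" would give if chrLengths were missing:
# the largest observed position on each chromosome.
lastPos <- sapply(split(x$pos, x$chr), max)
print(lastPos)
```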


Argument: sampleGenome
Whether positions are sampled over the whole genome (across chromosomes) or within each chromosome. This is TRUE by default.


Argument: useOneChr
Use only one chromosome to build the distribution under the null hypothesis that genes/probes are not enriched. By default this is FALSE. The chromosome is chosen as follows: after removing small chromosomes, we select the one closest to the median quadratic distance to 0. Setting this parameter to TRUE decreases processing time.


Argument: useIntegrate
Whether to use integrate or pnorm to compute p-values. The first (integrate) does not assume any particular form for the distribution under the null hypothesis; the second (pnorm) assumes it is normally distributed.
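
The difference between the two approaches can be sketched with base R alone. The code below is not the package's internal implementation: it assumes a kernel density estimate of the null scores as the distribution-free option, and the simulated null scores and the observed value are invented.

```r
set.seed(1)

# Made-up null distribution of smoothed scores and one observed score.
null_scores <- rnorm(5000)
observed <- 2

# Normal assumption: upper-tail probability from a fitted normal.
p_norm <- pnorm(observed, mean = mean(null_scores), sd = sd(null_scores),
                lower.tail = FALSE)

# No distributional assumption (assumed approach): integrate a kernel
# density estimate of the null scores from the observed value upwards.
dens <- density(null_scores)
dens_fun <- approxfun(dens$x, dens$y, yleft = 0, yright = 0)
p_int <- integrate(dens_fun, lower = observed, upper = max(dens$x),
                   subdivisions = 2000L, stop.on.error = FALSE)$value

print(c(pnorm = p_norm, integrate = p_int))
```

When the null scores really are close to normal, as in this simulation, both values should be similar; they can diverge when the null distribution is skewed or heavy-tailed.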


Argument: plot
If FALSE the function will make no plots.


----------Details----------

Enrichment scores can be log fold changes, log hazard ratios, log variability ratios or any other score.

Within each chromosome a smoothed score for each gene is obtained via generalized additive models, with the smoothing parameter for each chromosome chosen via cross-validation. The smoothing parameter obtained for each chromosome is then reused in the permutations.
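
The idea of choosing a smoothing parameter by cross-validation and then reusing it for permuted data can be sketched as follows. This uses base R's smooth.spline as a stand-in for the generalized additive model fit, so it illustrates the principle rather than the package's own code; the positions and scores are simulated.

```r
set.seed(1)

# Simulated chromosome: 200 positions with a smooth signal plus noise.
pos <- sort(runif(200, 0, 100))
es  <- sin(pos / 10) + rnorm(200, sd = 0.5)

# Fit a smoothing spline; cv = TRUE selects the smoothing parameter by
# leave-one-out cross-validation.
fit <- smooth.spline(pos, es, cv = TRUE)
smoothed <- predict(fit, pos)$y

# Reuse the selected smoothing parameter when smoothing permuted scores,
# so observed and permuted scores are smoothed to the same degree.
perm_fit <- smooth.spline(pos, sample(es), spar = fit$spar)
perm_smoothed <- predict(perm_fit, pos)$y

print(fit$spar)
```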

We assess statistical significance by permuting the positions through the whole genome. If useAllPerm is FALSE, for each gene only the permutations of genes that are in an area with similar density (distance to the tenth gene) are used to compute p-values. We observed that genes with similar densities tend to have similar smoothed scores. If we set 1000 permutations (B=1000), scores are permuted through the whole genome 10 times (1000/100). For each smoothed score, the permutations of the 100 smoothed scores with the most similar density (distance to the tenth gene) are used. Therefore each smoothed score is compared to 1000 smoothed scores obtained from permutations.
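
The density-matched scheme can be sketched on simulated data as below. Several choices are assumptions made for illustration and are not taken from the package: density is measured as the distance to the tenth nearest gene, smoothing uses smooth.spline with a fixed spar, and the p-value is a two-sided empirical proportion.

```r
set.seed(1)

# Simulated chromosome with a dense region (0-20) and a sparse one (20-100).
n   <- 300
pos <- sort(c(runif(200, 0, 20), runif(100, 20, 100)))
es  <- rnorm(n)

# Density measure: distance to the tenth nearest gene
# (element 1 of the sorted distances is the gene itself, at distance 0).
dens <- vapply(seq_len(n),
               function(i) sort(abs(pos - pos[i]))[11],
               numeric(1))

# Smooth scores along the chromosome with a fixed smoothing parameter.
smooth_scores <- function(s) predict(smooth.spline(pos, s, spar = 0.8), pos)$y
observed <- smooth_scores(es)

# B = 1000 corresponds to 1000 / 100 = 10 genome-wide permutations.
B <- 1000
n_rounds <- B / 100
perm <- replicate(n_rounds, smooth_scores(sample(es)))  # n x n_rounds matrix

# For gene i, use the 100 genes with the most similar density.
i <- 1
similar <- order(abs(dens - dens[i]))[1:100]
null_i  <- as.vector(perm[similar, ])                   # 100 * 10 = 1000 scores

# Two-sided empirical p-value for gene i.
p_i <- mean(abs(null_i) >= abs(observed[i]))
print(c(null_size = length(null_i), p_value = p_i))
```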

If scores are equidistant in the genome (for instance, when there is one score every fixed number of bases), the option useAllPerm=TRUE is recommended. In this case every smoothed score is compared to all smoothed scores obtained via permutations. For example, with 20,000 genes and the parameter B=10, the scores are permuted 10 times through the whole genome, giving 200,000 permuted smoothed scores. Each observed smoothed score is then tested against the distribution of the 200,000 permuted smoothed scores.
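
A scaled-down sketch of the pooled scheme: with n scores and B permutations, the null distribution contains n * B permuted smoothed scores (20,000 * 10 = 200,000 in the example above; 500 * 10 = 5,000 here). The smoother, the planted region and the upper-tail empirical p-value are assumptions made for illustration.

```r
set.seed(1)

n <- 500
B <- 10
pos <- seq_len(n)                 # equidistant positions
es  <- rnorm(n)
es[201:230] <- es[201:230] + 2    # planted enriched region (made up)

# Smooth scores along the chromosome with a fixed smoothing parameter.
smooth_scores <- function(s) predict(smooth.spline(pos, s, spar = 0.8), pos)$y
observed <- smooth_scores(es)

# Pool the smoothed scores from all B permutations into one null distribution.
pooled <- as.vector(replicate(B, smooth_scores(sample(es))))
print(length(pooled))             # n * B permuted smoothed scores

# Upper-tail empirical p-value of every observed score against the pooled null.
null_cdf <- ecdf(pooled)
p_upper  <- 1 - null_cdf(observed)
print(summary(p_upper[201:230]))
```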

Only regions with at least as many consecutive statistically significant genes as specified in minGenes are selected as enriched, where significant means an adjusted p-value lower than pvalcutoff after adjusting the p-values with the method specified in p.adjust.method. If exprScorecutoff is different from NA, a gene must additionally have a smoothed score bigger than the specified value (lower, if exprScorecutoff is negative) to be considered statistically significant.
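
The selection rule can be sketched with rle on made-up values. The adjusted p-values and smoothed scores below are invented, and reading "as many genes as told in minGenes" as "at least minGenes consecutive genes" is an assumption.

```r
# Made-up adjusted p-values and smoothed scores for ten consecutive genes.
adj_p <- c(0.2, 0.01, 0.02, 0.01, 0.03, 0.6, 0.01, 0.02, 0.7, 0.04)
score <- c(0.1, 0.9,  1.1,  0.8,  1.2,  0.0, 0.9,  0.2,  0.1, 1.0)

pvalcutoff      <- 0.05
exprScorecutoff <- 0.5
minGenes        <- 3

# A gene is significant if its adjusted p-value is below the cutoff ...
significant <- adj_p < pvalcutoff

# ... and, when a score cutoff is given, if its smoothed score lies beyond it
# (above a positive cutoff, below a negative one).
if (!is.na(exprScorecutoff)) {
  passes <- if (exprScorecutoff < 0) score < exprScorecutoff else score > exprScorecutoff
  significant <- significant & passes
}

# Keep only runs of at least minGenes consecutive significant genes.
runs   <- rle(significant)
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
keep   <- runs$values & runs$lengths >= minGenes
regions <- data.frame(start = starts[keep], end = ends[keep])
print(regions)
```

Here genes 2 to 5 form the only run long enough to be kept; gene 7 and gene 10 are significant but isolated.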


----------Value----------

Plots all chromosomes and marks the enriched regions. Also returns a data.frame containing the positions of the enriched regions. This output can be passed to the genesInArea function to obtain the names of the genes in each region.


----------Author(s)----------

Evarist Planet

----------See Also----------

getEsPositions, genesInArea


----------Examples----------


#data(epheno)
#phenoNames(epheno)
#mypos <- getEsPositions(epheno,'Relapse')
#mypos$chr <- '1' #we set all probes to chr one for illustration purposes
#                 #(we want a minimum number of probes per chromosome)
#head(mypos)
#set.seed(1)
#regions <- findCopyNumber(mypos,B=10,plot=FALSE)
#head(regions)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

