R language, siggenes package: help documentation for the chisq.ebam() function

Posted on 2012-2-26 14:10:48
chisq.ebam(siggenes)
R package containing chisq.ebam(): siggenes

                                        EBAM Analysis for Categorical Data

                                         Translator: 生物统计家园网 robot LoveR

----------Description----------

Generates the required statistics for an Empirical Bayes Analysis of Microarrays (EBAM) of categorical data such as SNP data.

Should not be called directly, but via ebam(..., method = chisq.ebam).

This function replaces cat.ebam.


----------Usage----------


chisq.ebam(data, cl, approx = NULL, B = 100, n.split = 1,
   check.for.NN = FALSE, lev = NULL, B.more = 0.1, B.max = 50000,
   n.subset = 10, fast = FALSE, n.interval = NULL, df.ratio = 3,
   df.dens = NULL, knots.mode = NULL, type.nclass = "wand",
   rand = NA)



----------Arguments----------

Argument: data
a matrix, data frame, or list. If a matrix or data frame, then each row must correspond to a variable (e.g., a SNP), and each column to a sample (i.e., an observation). If the number of observations is huge, it is better to specify data as a list of matrices, where each matrix represents one group and summarizes how many observations in this group show which level at which variable. These matrices can be generated using the function rowTables from the package scrime. For details on how to specify this list, see the examples section on this man page, and the help for rowChisqMultiClass in the package scrime.
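
As a rough illustration of the list form of data, the per-group count tables can be imitated in base R. This is only a sketch: count_levels is a hypothetical helper, not the scrime implementation, and the exact layout returned by rowTables should be checked in its own help page.

```r
# Toy data: 5 variables (rows), 8 samples (columns), levels coded 1..3.
set.seed(1)
mat <- matrix(sample(3, 40, replace = TRUE), nrow = 5)
cl  <- rep(1:2, each = 4)

# Hypothetical helper: one row per variable, one column per level,
# each cell counting how many samples show that level.
count_levels <- function(x, levels = 1:3) {
  tab <- t(apply(x, 1, function(v) table(factor(v, levels = levels))))
  colnames(tab) <- levels
  tab
}

cases    <- count_levels(mat[, cl == 1])
controls <- count_levels(mat[, cl == 2])

# One count matrix per group, collected in a list.
ltabs <- list(cases, controls)
```

Each matrix has one row per variable, and every row sums to the number of samples in that group.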


Argument: cl
a numeric vector of length ncol(data) indicating to which class a sample belongs. Must consist of the integers between 1 and c, where c is the number of different groups. Only needs to be specified if data is a matrix or a data frame.
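
If the group labels are stored in some other form, they can be recoded to the integers 1, ..., c before calling ebam. A minimal sketch in base R, where status is a made-up label vector:

```r
# Hypothetical raw labels for six samples.
status <- c("case", "control", "case", "control", "control", "case")

# factor() sorts the distinct labels; as.integer() maps them to 1..c.
cl <- as.integer(factor(status))

# Check the documented requirement: only the integers 1, ..., c occur.
n_groups <- length(unique(cl))
stopifnot(all(sort(unique(cl)) == seq_len(n_groups)))
```

Note that factor() orders the labels alphabetically, so "case" becomes 1 and "control" becomes 2 here.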


Argument: approx
should the null distribution be approximated by a chi-square distribution? Currently only available if data is a matrix or data frame. If not specified, approx = FALSE is used, and the null distribution is estimated by employing a permutation method.
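
To make the approximation concrete: for a table with r levels and c groups, Pearson's statistic is referred to a chi-square distribution with (r - 1)(c - 1) degrees of freedom. The sketch below uses made-up statistics and only shows this reference distribution; it is not what siggenes computes internally.

```r
n_levels <- 3
n_groups <- 2
df <- (n_levels - 1) * (n_groups - 1)

# Made-up test statistics for three variables.
z <- c(0.5, 4.2, 11.8)

# Upper tail probabilities under the chi-square approximation.
p_approx <- pchisq(z, df = df, lower.tail = FALSE)
```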


Argument: B
the number of permutations used in the estimation of the null distribution, and hence, in the computation of the expected z-values.
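
The idea of a permutation null distribution can be sketched for a single variable with base R. This is an illustration under simplifying assumptions (one variable, random data, chisq.test as the statistic); the package itself estimates one null distribution across all variables.

```r
set.seed(42)
x  <- sample(3, 40, replace = TRUE)   # one variable, 40 samples
cl <- rep(1:2, each = 20)             # group labels
B  <- 100                             # number of permutations

# Pearson's chi-square statistic for the level-by-group table.
pearson <- function(x, g) {
  unname(suppressWarnings(chisq.test(table(x, g), correct = FALSE)$statistic))
}

z_obs  <- pearson(x, cl)
# Shuffle the group labels B times and recompute the statistic.
z_perm <- replicate(B, pearson(x, sample(cl)))

# Share of permuted statistics at least as extreme as the observed one.
mean(z_perm >= z_obs)
```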


Argument: n.split
number of chunks into which the variables are split in the computation of the values of the test statistic. Currently only available if approx = TRUE and data is a matrix or data frame. By default, the test scores of all variables are calculated simultaneously. If the number of variables or observations is large, setting n.split to a value larger than 1 can help to avoid memory problems.
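
The chunking idea behind n.split can be sketched as follows. The statistic used here (row_stat) is a hypothetical stand-in, and how siggenes actually forms its chunks is not stated in this help page.

```r
set.seed(1)
mat <- matrix(sample(3, 600, replace = TRUE), nrow = 30)
n.split <- 4

# Assign each row index to one of n.split roughly equal chunks.
chunk_id <- cut(seq_len(nrow(mat)), breaks = n.split, labels = FALSE)
chunks   <- split(seq_len(nrow(mat)), chunk_id)

# Stand-in row-wise statistic: how often level 1 occurs in each row.
row_stat <- function(block) rowSums(block == 1)

# Process one chunk at a time, then stitch the results back together.
stat_chunked <- unlist(
  lapply(chunks, function(idx) row_stat(mat[idx, , drop = FALSE])),
  use.names = FALSE
)
```

Only one chunk of rows is held in the intermediate computation at a time, which is what keeps memory use down.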


Argument: check.for.NN
if TRUE, it will be checked if any of the genotypes is equal to "NN". Can be very time-consuming when the data set is high-dimensional.
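
A check of this kind can also be done by hand before the analysis. The sketch below uses a made-up genotype matrix; recoding "NN" to NA is an assumption about how one might want to treat such entries, not something this help page prescribes.

```r
# Made-up genotypes: 2 SNPs (rows), 4 samples (columns).
geno <- matrix(c("AA", "AB", "NN", "BB",
                 "AB", "AB", "BB", "AA"), nrow = 2, byrow = TRUE)

# Does any genotype equal "NN"?
has_NN <- any(geno == "NN")

# Assumption: treat "NN" as a missing genotype.
geno[geno == "NN"] <- NA
```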


Argument: lev
numeric or character vector specifying the codings of the levels of the variables/SNPs. Can only be specified if data is a matrix or a data frame. Only needs to be specified if the variables are not coded by the integers between 1 and the number of levels. Can also be a list. In this case, each element of this list must be a numeric or character vector specifying the codings, where all elements must have the same length.
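
What such a coding vector expresses can be seen by applying it by hand with match(). This is a sketch with made-up genotypes, showing the mapping from codings to the integers 1, 2, 3 rather than the package's internal handling of lev.

```r
# Made-up genotypes: 2 SNPs (rows), 4 samples (columns).
geno <- matrix(c("AA", "AB", "BB", "AB",
                 "BB", "BB", "AA", "AA"), nrow = 2, byrow = TRUE)

# Codings of the levels, in the order they should map to 1, 2, 3.
lev <- c("AA", "AB", "BB")

# match() returns the position of each genotype in lev.
coded <- matrix(match(geno, lev), nrow = nrow(geno))
```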


Argument: B.more
a numeric value. If the number of all possible permutations is smaller than or equal to (1+B.more)*B, full permutation will be done. Otherwise, B permutations are used.


Argument: B.max
a numeric value. If the number of all possible permutations is smaller than or equal to B.max, B randomly selected permutations will be used in the computation of the null distribution. Otherwise, B random draws of the group labels are used.
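
The two thresholds B.more and B.max can be read as a three-way decision. The sketch below assumes that "all possible permutations" means the number of distinct assignments of the group labels, n! / (n_1! ... n_c!); how the package counts them is not spelled out here.

```r
cl     <- rep(1:2, each = 20)
B      <- 100
B.more <- 0.1
B.max  <- 50000

# Assumed count of distinct label assignments, computed on the log scale
# to avoid overflow in the factorials.
log_n_perm <- lfactorial(length(cl)) - sum(lfactorial(table(cl)))
n_perm     <- exp(log_n_perm)

strategy <- if (n_perm <= (1 + B.more) * B) {
  "full permutation"
} else if (n_perm <= B.max) {
  "B permutations selected at random from all possible ones"
} else {
  "B random draws of the group labels"
}
```

With 20 cases and 20 controls there are choose(40, 20) possible assignments, far more than B.max, so the last branch applies.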


Argument: n.subset
a numeric value indicating into how many subsets the B permutations are divided when computing the permuted z-values. Please note that the meaning of n.subset differs between the SAM and the EBAM functions.


Argument: fast
if FALSE, the exact number of permuted test scores that are more extreme than a particular observed test score is computed for each of the variables/SNPs. If TRUE, a crude estimate of this number is used.


Argument: n.interval
the number of intervals used in the logistic regression with repeated observations for estimating the ratio f0/f (if approx = FALSE), or in the Poisson regression used to estimate the density of the observed z-values (if approx = TRUE). If NULL, n.interval is set to 139 if approx = FALSE, and estimated by the method specified by type.nclass if approx = TRUE.


Argument: df.ratio
integer specifying the degrees of freedom of the natural cubic spline used in the logistic regression with repeated observations. Ignored if approx = TRUE.


Argument: df.dens
integer specifying the degrees of freedom of the natural cubic spline used in the Poisson regression to estimate the density of the observed z-values. Ignored if approx = FALSE. If NULL, df.dens is set to 3 if the degrees of freedom of the approximated null distribution, i.e., the chi-square distribution, are less than or equal to 2, and otherwise df.dens is set to 5.


Argument: knots.mode
if TRUE, the df.dens - 1 knots are centered around the mode and not the median of the density when fitting the Poisson regression model. Ignored if approx = FALSE. If not specified, knots.mode is set to TRUE if the degrees of freedom of the approximated null distribution, i.e., the chi-square distribution, are larger than or equal to 3, and otherwise knots.mode is set to FALSE. For details on this density estimation, see denspr.


Argument: type.nclass
character string specifying the procedure used to compute the number of cells of the histogram. Ignored if approx = FALSE or n.interval is specified. Can be either "wand" (default), "scott", or "FD". For details, see denspr.
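
Two of these rules have counterparts in base R (package grDevices), which gives a feel for how the number of histogram cells is derived from the data. This is a sketch on simulated values; whether denspr calls exactly these functions is an assumption, and the "wand" rule has no base R equivalent and is left out.

```r
set.seed(7)
z <- rchisq(1000, df = 2)   # simulated stand-in for observed z-values

# Scott's rule and the Freedman-Diaconis rule for the number of cells.
n_scott <- nclass.scott(z)
n_fd    <- nclass.FD(z)

c(scott = n_scott, FD = n_fd)
```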


Argument: rand
numeric value. If specified, i.e. not NA, the random number generator will be set into a reproducible state.
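
The effect of such a seed argument can be sketched with set.seed. The function draw_labels below is hypothetical and only mimics the documented behavior (seed the generator when rand is not NA); it is not taken from the package.

```r
# Hypothetical helper: permute the group labels, seeding first if requested.
draw_labels <- function(cl, rand = NA) {
  if (!is.na(rand)) set.seed(rand)
  sample(cl)
}

cl <- rep(1:2, each = 5)

# Two calls with the same seed give the same permutation.
first  <- draw_labels(cl, rand = 123)
second <- draw_labels(cl, rand = 123)
```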


----------Details----------

For each variable, Pearson's Chi-Square statistic is computed to test if the distribution of the variable differs between several groups. Since only one null distribution is estimated for all variables, as proposed in the original EBAM application of Efron et al. (2001), all variables must have the same number of levels/categories.
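
The row-wise statistic described above can be written out in a few lines of base R. This is a minimal sketch on random data, with pearson_row as a hypothetical helper; it spells out the textbook formula sum((observed - expected)^2 / expected) and is not the package's own (faster) implementation.

```r
set.seed(3)
mat <- matrix(sample(3, 200, replace = TRUE), nrow = 5)  # 5 variables, 40 samples
cl  <- rep(1:2, each = 20)

# Hypothetical helper: Pearson's chi-square statistic for one variable.
pearson_row <- function(x, cl, levels = 1:3) {
  obs  <- table(factor(x, levels = levels), cl)           # level-by-group counts
  expd <- outer(rowSums(obs), colSums(obs)) / sum(obs)    # expected under independence
  sum((obs - expd)^2 / expd)
}

# One statistic per variable (row).
z <- apply(mat, 1, pearson_row, cl = cl)
```

Fixing levels = 1:3 for every row reflects the requirement that all variables share the same number of categories.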


----------Value----------

A list containing statistics required by ebam.


----------Warning----------

This procedure will only work correctly if all SNPs/variables have the same number of levels/categories.


----------Author(s)----------


Holger Schwender, holger.schw@gmx.de



----------References----------

Efron et al. (2001). Empirical Bayes Analysis of a Microarray Experiment, JASA, 96, 1151-1160.
of Single Nucleotide Polymorphisms. BMC Bioinformatics, 9, 144.
the Empirical Bayes and the Significance Analysis of Microarrays. Technical Report, SFB 475, University of Dortmund, Germany.

----------See Also----------

EBAM-class, ebam, chisq.stat


----------Examples----------


  # Load the package that provides ebam and chisq.ebam.

  library(siggenes)

  # Generate a random 1000 x 40 matrix consisting of the values
  # 1, 2, and 3, and representing 1000 variables and 40 observations.
  
  mat <- matrix(sample(3, 40000, TRUE), 1000)
  
  # Assume that the first 20 observations are cases, and the
  # remaining 20 are controls.
  
  cl <- rep(1:2, each = 20)
  
  # Then an EBAM analysis for categorical data can be done by
  
  out <- ebam(mat, cl, method = chisq.ebam, approx = TRUE)
  out
  
  # approx is set to TRUE to approximate the null distribution
  # by the ChiSquare-distribution (usually, for such a small
  # number of observations this might not be a good idea
  # as the assumptions behind this approximation might not
  # be fulfilled).
  
  # The same results can also be obtained by employing
  # contingency tables, i.e. by specifying data as a list.
  # For this, we need to generate the tables summarizing
  # groupwise how many observations show which level at
  # which variable. These tables can be obtained by
  
  library(scrime)
  cases <- rowTables(mat[, cl == 1])
  controls <- rowTables(mat[, cl == 2])
  ltabs <- list(cases, controls)
  
  # And the same EBAM analysis as above can then be
  # performed by
  
  out2 <- ebam(ltabs, method = chisq.ebam, approx = TRUE)
  out2


When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).


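Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the translation robot of 生物统计家园网. It is intended solely as a personal reference for learning R; 生物统计家园 retains the copyright.
Note 2: Because the translation was produced automatically, some inaccuracies are unavoidable. Carefully comparing the Chinese and English text while reading can help with learning R.
Note 3: If you find an inaccuracy, please reply to this post; corrections will be made over time.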