R iterativeBMA package: help documentation for the iterateBMAglm.train() function

loveR · posted 2012-2-25 22:46:38

iterateBMAglm.train(iterativeBMA)
iterateBMAglm.train() belongs to the R package iterativeBMA

Iterative Bayesian Model Averaging: training step

Translator: LoveR, the biostatistic.net translation robot

----------Description----------

Classification and variable selection on microarray data. This is a multivariate technique that selects a small number of relevant variables (typically genes) to classify microarray samples. This function performs the training phase. The data is assumed to consist of two classes.

----------Usage----------

iterateBMAglm.train (train.expr.set, train.class, p=100, nbest=10, maxNvar=30, maxIter=20000, thresProbne0=1)

----------Arguments----------

Argument: train.expr.set
An ExpressionSet object. The rows of the expression data are assumed to represent variables (genes), and the columns to represent samples or experiments. This training data is used to select relevant genes (variables) for classification.

Argument: train.class
Class vector for the observations (samples or experiments) in the training data. Class numbers are assumed to start from 0, and the length of this vector should equal the number of samples (columns) in train.expr.set. Because two-class data is assumed, the vector should consist of zeros and ones.

Argument: p
The maximum number of top univariate genes used in the iterative BMA algorithm. This number is assumed to be less than the total number of genes in the training data. A larger p usually requires more computation time, because more iterations of the BMA algorithm may be applied. The default is 100.

Argument: nbest
The number of models of each size returned to bic.glm in the BMA package. The default is 10.

Argument: maxNvar
The maximum number of variables used in each iteration of bic.glm from the BMA package. The default is 30.

Argument: maxIter
The maximum number of iterations of bic.glm. The default is 20000.

Argument: thresProbne0
The threshold for the posterior probability that each variable (gene) is non-zero (in percent). Variables (genes) whose posterior probability falls below this threshold are dropped in the iterative application of bic.glm. The default is 1 percent.
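As a rough illustration of the train.class requirements, the sketch below builds a 0/1 class vector from text labels and checks it against a toy expression matrix laid out as described for train.expr.set (genes in rows, samples in columns). The labels, the matrix, and every name other than train.class are hypothetical; the code does not call iterativeBMA.

```r
# Hypothetical labels for six samples (illustrative values only).
labels <- c("ALL", "AML", "ALL", "ALL", "AML", "AML")

# Map the two class labels to 0 and 1 (factor levels are sorted alphabetically).
train.class <- as.integer(factor(labels)) - 1L

# Toy expression matrix: 4 genes (rows) by 6 samples (columns).
expr <- matrix(seq_len(24), nrow = 4, ncol = 6)

# One class entry per sample, and only the values 0 and 1.
stopifnot(length(train.class) == ncol(expr))
stopifnot(all(train.class %in% c(0L, 1L)))
```

With real data the same two checks can be run against the expression matrix extracted from the ExpressionSet before training.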

----------Details----------

The training phase first orders all the variables (genes) by a univariate measure, the ratio of between-groups to within-groups sums of squares (BSS/WSS), and then iteratively applies the bic.glm algorithm from the BMA package. The first application of bic.glm uses the top maxNvar univariate-ranked genes. After each application, the genes with probne0 < thresProbne0 are dropped, and the next genes in the univariate ordering are added.
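The two steps described in the details can be sketched in plain R. This is a minimal illustration under stated assumptions, not the package's implementation: bss.wss.ratio uses the standard BSS/WSS definition, fake.probne0 is a made-up stand-in for a bic.glm call, and the stopping rule (stop when nothing is dropped or the ranked list is exhausted) is an assumption, since the text does not spell it out. All function and variable names are hypothetical.

```r
# BSS/WSS ratio for one gene: between-group over within-group sum of squares.
bss.wss.ratio <- function(x, cls) {
  overall <- mean(x)
  bss <- 0
  wss <- 0
  for (k in unique(cls)) {
    xk <- x[cls == k]
    bss <- bss + length(xk) * (mean(xk) - overall)^2
    wss <- wss + sum((xk - mean(xk))^2)
  }
  bss / wss
}

# Toy data: 3 genes (rows) by 10 samples (columns), two classes of 5 samples.
cls <- rep(c(0, 1), each = 5)
expr <- rbind(g1 = c(1:5, 1:5),      # no class difference
              g2 = c(1:5, 11:15),    # strong class difference
              g3 = c(1:5, 2:6))      # weak class difference
ratios <- apply(expr, 1, bss.wss.ratio, cls = cls)
ranked <- names(sort(ratios, decreasing = TRUE))

# Stand-in for one bic.glm call (ASSUMPTION): returns a fake probne0 in percent.
fake.probne0 <- function(genes, informative) {
  ifelse(genes %in% informative, 100, 0)
}

# Window loop: keep genes at or above the threshold, refill from the ranking.
iterate.window <- function(ranked, maxNvar, thresProbne0, informative) {
  window <- head(ranked, maxNvar)
  nextIdx <- length(window) + 1
  repeat {
    probne0 <- fake.probne0(window, informative)
    keep <- window[probne0 >= thresProbne0]
    nFree <- maxNvar - length(keep)
    # Assumed stopping rule: nothing dropped, or no ranked genes left to add.
    if (nextIdx > length(ranked) || nFree == 0) return(keep)
    refill <- ranked[seq(nextIdx, min(nextIdx + nFree - 1, length(ranked)))]
    window <- c(keep, refill)
    nextIdx <- nextIdx + length(refill)
  }
}

selected <- iterate.window(ranked, maxNvar = 2, thresProbne0 = 1,
                           informative = c("g2", "g1"))
```

In the real function each pass fits models with bic.glm, so probne0 comes from the posterior model probabilities rather than from a fixed list of informative genes.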

----------Value----------

An object of class bic.glm returned by the last iteration of bic.glm. The object is a list consisting of the following components:

Component: namesx
The names of the variables in the last iteration of bic.glm.

Component: postprob
The posterior probabilities of the models selected.

Component: deviance
The estimated model deviances.

Component: label
Labels identifying the models selected.

Component: bic
Values of BIC for the models.

Component: size
The number of independent variables in each of the models.

Component: which
A logical matrix with one row per model and one column per variable, indicating whether that variable is in the model.

Component: probne0
The posterior probability that each variable is non-zero (in percent).

Component: postmean
The posterior mean of each coefficient (from model averaging).

Component: postsd
The posterior standard deviation of each coefficient (from model averaging).

Component: condpostmean
The posterior mean of each coefficient, conditional on the variable being included in the model.

Component: condpostsd
The posterior standard deviation of each coefficient, conditional on the variable being included in the model.

Component: mle
A matrix with one row per model and one column per variable, giving the maximum likelihood estimate of each coefficient for each model.

Component: se
A matrix with one row per model and one column per variable, giving the standard error of each coefficient for each model.

Component: reduced
A logical indicating whether any variables were dropped before model averaging.

Component: dropped
A vector containing the names of the variables dropped before model averaging.

Component: call
The matched call that created the bic.glm object.
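To make the relationship between some of these components concrete, the sketch below rebuilds size and probne0 from a toy which matrix and toy postprob values. The numbers and gene names are made up, and the sketch relies on the usual definition of a posterior inclusion probability (the summed posterior probability of the models that contain the variable). It does not call the BMA package.

```r
# Toy output for 3 models over 2 variables (all numbers are made up).
postprob <- c(0.6, 0.3, 0.1)
which <- rbind(c(TRUE,  FALSE),
               c(TRUE,  TRUE),
               c(FALSE, TRUE))
colnames(which) <- c("geneA", "geneB")

# size: number of variables included in each model.
size <- rowSums(which)

# probne0 (percent): total posterior probability of the models that
# include each variable. Row i of `which` is weighted by postprob[i].
probne0 <- 100 * colSums(which * postprob)

# Variables that would survive a 50 percent threshold in this toy case.
kept <- names(probne0)[probne0 >= 50]
```

Here geneA appears in the first two models, so its value is 100 * (0.6 + 0.3), while geneB appears in the last two.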

----------Note----------

The BMA and Biobase packages are required.

----------References----------

Raftery, A.E. (1995). Bayesian model selection in social research (with Discussion). Sociological Methodology 1995 (Peter V. Marsden, ed.), pp. 111-196, Cambridge, Mass.: Blackwells.
Yeung, K.Y., Bumgarner, R.E. and Raftery, A.E. (2005). Bayesian Model Averaging: Development of an improved multi-class, gene selection and classification tool for microarray data. Bioinformatics 21: 2394-2402.

----------See Also----------

iterateBMAglm.train.predict, iterateBMAglm.train.predict.test, bma.predict, brier.score

----------Examples----------

library (Biobase)
library (BMA)
library (iterativeBMA)
data(trainData)
data(trainClass)

## training phase: select relevant genes
ret.bic.glm <- iterateBMAglm.train (train.expr.set=trainData, train.class=trainClass, p=100)

## get the selected genes with probne0 > 0
ret.gene.names <- ret.bic.glm$namesx[ret.bic.glm$probne0 > 0]

## show the posterior probabilities of the selected models
ret.bic.glm$postprob

data (testData)

## get the subset of test data with the genes from the last iteration of bic.glm
## (transpose so that rows are samples and columns are the selected genes)
curr.test.dat <- t(exprs(testData)[ret.gene.names,])

## compute the predicted probabilities for the test samples, one row (sample) at a time
y.pred.test <- apply (curr.test.dat, 1, bma.predict, postprobArr=ret.bic.glm$postprob, mleArr=ret.bic.glm$mle)

## compute the Brier score if the class labels of the test samples are known
data (testClass)
brier.score (y.pred.test, testClass)
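For readers without the packages installed, the last two steps of the example can be approximated in base R. This is only a sketch under stated assumptions: bma.prob.sketch assumes that the first column of the mle matrix is the intercept and that the prediction is a posterior-weighted average of logistic probabilities, and brier.sketch assumes the score is the sum of squared differences between predicted probabilities and 0/1 labels. The package's own bma.predict and brier.score may differ in detail, and both function names here are hypothetical.

```r
# Model-averaged probability for one sample (ASSUMED form, see lead-in).
# x: expression values of the selected genes for one sample.
# mle: one row per model; first column intercept, then one column per gene.
bma.prob.sketch <- function(x, postprob, mle) {
  linpred <- as.vector(mle %*% c(1, x))   # linear predictor of each model
  sum(postprob * plogis(linpred))          # weight each model by its posterior
}

# Brier score as a sum of squared errors (ASSUMED scaling).
brier.sketch <- function(pred, truth) {
  sum((truth - pred)^2)
}

# Toy inputs (made-up numbers): two equally weighted models, one gene.
toy.postprob <- c(0.5, 0.5)
toy.mle <- rbind(c(0, 1),    # model 1: intercept 0, slope 1
                 c(0, 0))    # model 2: intercept only
p0 <- bma.prob.sketch(0, toy.postprob, toy.mle)

# Toy predictions against known labels.
pred <- c(0.9, 0.2, 0.6, 0.1)
truth <- c(1, 0, 1, 0)
score <- brier.sketch(pred, truth)
```

A lower score means the predicted probabilities sit closer to the true class labels.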

When reposting, please credit the source: biostatistic.net (http://www.biostatistic.net).

