R MiPP package: mipp.seq() function help documentation

loveR · posted on 2012-2-26 01:01:54

mipp.seq(MiPP)
mipp.seq() belongs to the R package: MiPP

                                    MiPP-based Classification

                                    Translator: LoveR, the 生物统计家园网 robot

----------Description----------

Sequentially finds optimal sets of genes for classification.

----------Usage----------

mipp.seq(x, y, x.test = NULL, y.test = NULL, probe.ID = NULL,
rule = "lda", method.cut = "t.test", percent.cut = 0.01,
model.sMiPP.margin = 0.01, min.sMiPP = 0.85, n.drops = 2,
n.fold = 5, p.test = 1/3, n.split = 20, n.split.eval = 100,
n.seq=3, cutoff.sMiPP=0.7, remove.gene.each.model="all")

----------Arguments----------

Argument: x
data matrix

Argument: y
class vector
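As a rough illustration of the expected input shapes: the Examples on this page select samples by column (for instance `leukemia[,1:38]`) and `probe.ID` defaults to row numbers, which suggests genes in rows and samples in columns. The sketch below builds toy data under that assumption; the values and class labels are made up and it does not call MiPP.

```r
# Toy data only: 6 genes (rows) x 8 samples (columns), random values.
set.seed(1)
x <- matrix(rnorm(6 * 8), nrow = 6, ncol = 8)

# One class label per sample column (two hypothetical classes).
y <- factor(rep(c("A", "B"), each = 4))

# Sanity check: the class vector must match the number of sample columns.
stopifnot(ncol(x) == length(y))
dim(x)
table(y)
```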

Argument: x.test
test data matrix, if available

Argument: y.test
test class vector, if available

Argument: probe.ID
probe set IDs; if NULL, row numbers are assigned.

Argument: rule
classification rule: "lda", "qda", "logistic", "svmlin", "svmrbf"; the default is "lda".

Argument: method.cut
method for pre-selection; the t-test ("t.test") is available.

Argument: percent.cut
proportion of pre-selected genes; the default is 0.01.
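To make `method.cut` and `percent.cut` concrete, here is a minimal base-R sketch of t-test pre-selection. It assumes genes are ranked by the absolute two-sample t statistic and that the top `percent.cut` proportion is kept; how MiPP ranks and rounds internally is not stated on this page, so treat this as an illustration only. The data are simulated.

```r
set.seed(42)
n.genes <- 200
x <- matrix(rnorm(n.genes * 20), nrow = n.genes)   # genes x samples, simulated
y <- factor(rep(c("A", "B"), each = 10))

# Make genes 1 and 2 strongly informative by shifting class "B".
x[1:2, y == "B"] <- x[1:2, y == "B"] + 5

percent.cut <- 0.01

# Two-sample t statistic for every gene (row).
t.stat <- apply(x, 1, function(g) t.test(g[y == "A"], g[y == "B"])$statistic)

# Assumed rule: keep the top percent.cut proportion by |t|, at least one gene.
n.keep <- max(1, ceiling(percent.cut * n.genes))
keep <- order(abs(t.stat), decreasing = TRUE)[seq_len(n.keep)]
keep
```

With 200 genes and `percent.cut = 0.01`, two genes survive pre-selection, so a larger `percent.cut` (as in Example 2, which uses 0.05) widens the candidate pool.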

Argument: model.sMiPP.margin
smallest set of genes such that sMiPP <= (max sMiPP - model.sMiPP.margin); the default is 0.01.

Argument: min.sMiPP
adding genes stops if the max sMiPP is at least min.sMiPP; the default is 0.85.

Argument: n.drops
adding genes stops if sMiPP decreases n.drops times, in addition to the min.sMiPP criterion; the default is 2.
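One possible reading of how `min.sMiPP` and `n.drops` interact is sketched below: gene addition stops at the first step where the best sMiPP so far has reached `min.sMiPP` and sMiPP has decreased `n.drops` times. This reading is an assumption, `stop.step` is a hypothetical helper that is not part of MiPP, and the sMiPP values are invented.

```r
# Hypothetical helper (not a MiPP function): returns the step at which
# adding genes would stop, assuming BOTH criteria must be satisfied.
stop.step <- function(sMiPP, min.sMiPP = 0.85, n.drops = 2) {
  drops <- 0
  for (i in seq_along(sMiPP)) {
    # Count a drop whenever sMiPP falls relative to the previous step.
    if (i > 1 && sMiPP[i] < sMiPP[i - 1]) drops <- drops + 1
    if (max(sMiPP[1:i]) >= min.sMiPP && drops >= n.drops) return(i)
  }
  length(sMiPP)   # criteria never met: every candidate step was used
}

# Invented sMiPP values, one per gene added.
trace <- c(0.60, 0.80, 0.90, 0.88, 0.91, 0.89, 0.92)
stop.step(trace)
stop.step(trace, n.drops = 1)
```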

Argument: n.fold
number of folds; the default is 5.

Argument: p.test
proportion of samples assigned to the test set when the data are partitioned into train and test samples because no test samples are available; the default is 1/3.
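The partition controlled by `p.test` can be pictured with a simple random split in base R. This is only a sketch: whether MiPP stratifies by class or how it rounds the test-set size is not stated here, so plain random sampling and `round()` are assumptions. The sample count 57 matches Example 2 (62 chips minus 5 removed).

```r
set.seed(7)
n <- 57            # number of samples, i.e. columns of x
p.test <- 1/3

# Assumed rounding of the test-set size.
n.test <- round(p.test * n)

test.idx  <- sample(seq_len(n), n.test)       # columns held out for testing
train.idx <- setdiff(seq_len(n), test.idx)    # remaining columns for training

length(test.idx)
length(train.idx)
```

Repeating such a split with different seeds corresponds to the idea behind `n.split` and `n.split.eval`, which set how many random splits are used.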

Argument: n.split
number of splits; the default is 20.

Argument: n.split.eval
number of splits for evaluation; the default is 100.

Argument: n.seq
number of sequential gene model selections; the default is 3.

Argument: cutoff.sMiPP
cutoff point of 5 percent sMiPP to select gene models; the default is 0.7.

Argument: remove.gene.each.model
the selection is re-run after removing all genes in the selected models if "all", or only the first gene of each selected model if "first"; the default is "all".
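The difference between the "all" and "first" settings of `remove.gene.each.model` can be illustrated with plain set operations. In this sketch the selected models are invented vectors of gene row indices, `genes.to.remove` is a hypothetical helper rather than a MiPP function, and "first gene" is assumed to mean the first element of each model.

```r
# Invented selected models: each is a vector of gene (row) indices.
selected.models <- list(c(12, 40, 7), c(12, 88))
all.genes <- 1:100

# Hypothetical helper (not part of MiPP).
genes.to.remove <- function(models, remove.gene.each.model = c("all", "first")) {
  mode <- match.arg(remove.gene.each.model)
  if (mode == "all") {
    unique(unlist(models))                               # every gene in any model
  } else {
    unique(vapply(models, function(m) m[1], numeric(1))) # first gene of each model
  }
}

# Genes still available for the next sequential run.
remaining.all   <- setdiff(all.genes, genes.to.remove(selected.models, "all"))
remaining.first <- setdiff(all.genes, genes.to.remove(selected.models, "first"))
length(remaining.all)
length(remaining.first)
```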

----------Value----------

Value: model
candidate genes (for each split if no independent test set is available)

Value: model.eval
optimal sets of genes for each split when no independent test set is available

Value: genes.selected
a list of genes selected by sequential selection

----------Author(s)----------

Soukup M, Cho H, and Lee JK

----------Examples----------

##########
# Example 1: When an independent test set is available

library(MiPP)
data(leukemia)

# Normalize the combined data
leukemia <- cbind(leuk1, leuk2)
leukemia <- mipp.preproc(leukemia, data.type="MAS4")

# Training set
x.train <- leukemia[,1:38]
y.train <- factor(c(rep("ALL",27),rep("AML",11)))

# Test set
x.test <- leukemia[,39:72]
y.test <- factor(c(rep("ALL",20),rep("AML",14)))

# Compute MiPP
out <- mipp.seq(x=x.train, y=y.train, x.test=x.test, y.test=y.test, n.fold=5, percent.cut=0.01, rule="lda", n.seq=3)

# Print candidate models
out$model

# Print the genes selected
out$genes.selected

##########
# Example 2: When an independent test set is not available

data(colon)

# Normalize data
x <- mipp.preproc(colon)
y <- factor(c("T", "N", "T", "N", "T", "N", "T", "N", "T", "N",
   "T", "N", "T", "N", "T", "N", "T", "N", "T", "N",
   "T", "N", "T", "N", "T", "T", "T", "T", "T", "T",
   "T", "T", "T", "T", "T", "T", "T", "T", "N", "T",
   "T", "N", "N", "T", "T", "T", "T", "N", "T", "N",
   "N", "T", "T", "N", "N", "T", "T", "T", "T", "N",
   "T", "N"))

# Delete contaminated chips
x <- x[,-c(51,55,45,49,56)]
y <- y[ -c(51,55,45,49,56)]

# Compute MiPP
out <- mipp.seq(x=x, y=y, n.fold=5, p.test=1/3, n.split=5, n.split.eval=100,
percent.cut= 0.05, rule="lda", n.seq=2)

# Print candidate models for each split
out$model

# Print optimal models and independent evaluation for each split
out$model.eval

# Print the genes selected
out$genes.selected

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).
