mipp.seq(MiPP)
mipp.seq() belongs to R package: MiPP
MiPP-based Classification
----------Description----------
Sequentially finds optimal sets of genes for classification.
----------Usage----------
mipp.seq(x, y, x.test = NULL, y.test = NULL, probe.ID = NULL,
         rule = "lda", method.cut = "t.test", percent.cut = 0.01,
         model.sMiPP.margin = 0.01, min.sMiPP = 0.85, n.drops = 2,
         n.fold = 5, p.test = 1/3, n.split = 20, n.split.eval = 100,
         n.seq = 3, cutoff.sMiPP = 0.7, remove.gene.each.model = "all")
----------Arguments----------
Argument: x
data matrix (rows are genes or probe sets, columns are samples)
Argument: y
class vector (one label per sample)
Argument: x.test
test data matrix, if available
Argument: y.test
test class vector, if available
Argument: probe.ID
probe set IDs; if NULL, row numbers are assigned.
Argument: rule
classification rule: "lda", "qda", "logistic", "svmlin", or "svmrbf"; the default is "lda".
Argument: method.cut
method for pre-selection; "t.test" is available.
Argument: percent.cut
proportion of pre-selected genes; the default is 0.01.
Argument: model.sMiPP.margin
smallest set of genes such that sMiPP <= (max sMiPP - model.sMiPP.margin); the default is 0.01.
Argument: min.sMiPP
adding genes stops if the maximum sMiPP is at least min.sMiPP; the default is 0.85.
Argument: n.drops
adding genes stops if sMiPP decreases n.drops times, in addition to the min.sMiPP criterion; the default is 2.
Argument: n.fold
number of cross-validation folds; the default is 5.
Argument: p.test
proportion of samples assigned to the test set when the data are partitioned into train and test samples because no test samples are available; the default is 1/3.
Argument: n.split
number of train/test splits; the default is 20.
Argument: n.split.eval
number of splits for evaluation; the default is 100.
Argument: n.seq
number of sequential gene model selections; the default is 3.
Argument: cutoff.sMiPP
cutoff point on the 5 percent sMiPP used to select gene models; the default shown in Usage is 0.7.
Argument: remove.gene.each.model
if "all", re-run after removing all genes in the selected models; if "first", re-run after removing only the first gene of each selected model.
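Several of the arguments above (min.sMiPP, model.sMiPP.margin, cutoff.sMiPP) are thresholds on sMiPP, which this page does not define. As we understand it from the MiPP papers listed under References, MiPP is the sum of the posterior probabilities assigned to each sample's true class minus the number of misclassified samples, and sMiPP is MiPP divided by the number of samples. The sketch below illustrates that reading in base R. The helper smipp_sketch and the toy posterior matrix are hypothetical and are not part of the MiPP package.

```r
# Hypothetical helper, NOT a MiPP function: computes MiPP and sMiPP from a
# matrix of posterior class probabilities (samples in rows, classes in
# columns, column names = class labels) and the vector of true classes.
smipp_sketch <- function(post, truth) {
  truth <- as.character(truth)
  n <- length(truth)
  # Posterior probability assigned to each sample's true class
  p.true <- post[cbind(seq_len(n), match(truth, colnames(post)))]
  # Predicted class is the column with the largest posterior
  pred <- colnames(post)[max.col(post, ties.method = "first")]
  n.wrong <- sum(pred != truth)
  # Assumed definition: sum of true-class posteriors minus misclassifications
  mipp <- sum(p.true) - n.wrong
  list(MiPP = mipp, sMiPP = mipp / n, n.misclassified = n.wrong)
}

# Toy posteriors for four samples (made-up numbers)
post <- matrix(c(0.9, 0.1,
                 0.8, 0.2,
                 0.3, 0.7,
                 0.6, 0.4),
               ncol = 2, byrow = TRUE,
               dimnames = list(NULL, c("ALL", "AML")))
truth <- factor(c("ALL", "ALL", "AML", "AML"))
res <- smipp_sketch(post, truth)
print(res)
```

Under this reading sMiPP lies between -1 and 1, so the default min.sMiPP = 0.85 asks for confident and almost entirely correct classification before gene addition stops.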
----------Value----------
Value: model
candidate genes (for each split if no independent test set is available)
Value: model.eval
optimal sets of genes for each split when no independent test set is available
Value: genes.selected
a list of genes selected by sequential selection
----------Author(s)----------
Soukup M, Cho H, and Lee JK
----------References----------
using misclassification penalized posterior, Bioinformatics, 21 (Suppl): i423-i430.
using gene expression data, Journal of Bioinformatics and Computational Biology, 1(4) 681-694
----------Examples----------
##########
#Example 1: When an independent test set is available
data(leukemia)
#Normalize combined data
leukemia <- cbind(leuk1, leuk2)
leukemia <- mipp.preproc(leukemia, data.type="MAS4")
#Train set
x.train <- leukemia[,1:38]
y.train <- factor(c(rep("ALL",27),rep("AML",11)))
#Test set
x.test <- leukemia[,39:72]
y.test <- factor(c(rep("ALL",20),rep("AML",14)))
#Compute MiPP
out <- mipp.seq(x=x.train, y=y.train, x.test=x.test, y.test=y.test,
                n.fold=5, percent.cut=0.01, rule="lda", n.seq=3)
#Print candidate models
out$model
#Print the genes selected
out$genes.selected
##########
#Example 2: When an independent test set is not available
data(colon)
#Normalize data
x <- mipp.preproc(colon)
y <- factor(c("T", "N", "T", "N", "T", "N", "T", "N", "T", "N",
              "T", "N", "T", "N", "T", "N", "T", "N", "T", "N",
              "T", "N", "T", "N", "T", "T", "T", "T", "T", "T",
              "T", "T", "T", "T", "T", "T", "T", "T", "N", "T",
              "T", "N", "N", "T", "T", "T", "T", "N", "T", "N",
              "N", "T", "T", "N", "N", "T", "T", "T", "T", "N",
              "T", "N"))
#Delete contaminated chips
x <- x[,-c(51,55,45,49,56)]
y <- y[ -c(51,55,45,49,56)]
#Compute MiPP
out <- mipp.seq(x=x, y=y, n.fold=5, p.test=1/3, n.split=5, n.split.eval=100,
                percent.cut=0.05, rule="lda", n.seq=2)
#Print candidate models for each split
out$model
#Print optimal models and independent evaluation for each split
out$model.eval
#Print the genes selected
out$genes.selected
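To make the roles of p.test and percent.cut in Example 2 more concrete, here is a minimal base-R sketch of a random hold-out split followed by t-test pre-selection. It does not call MiPP and is not the package's implementation. The simulated data, the unstratified split, and the choice to rank genes on the training samples only are assumptions made for illustration.

```r
set.seed(1)
# Toy expression matrix: 200 genes (rows) x 30 samples (columns); made-up data
x.toy <- matrix(rnorm(200 * 30), nrow = 200)
y.toy <- factor(rep(c("T", "N"), each = 15))
# Plant three informative genes by shifting their tumor values
x.toy[1:3, y.toy == "T"] <- x.toy[1:3, y.toy == "T"] + 3

percent.cut <- 0.05
p.test <- 1/3

# Random hold-out split: about p.test of the samples form the test set
n <- ncol(x.toy)
test.idx <- sample(n, size = round(n * p.test))
train.idx <- setdiff(seq_len(n), test.idx)

# Two-sample t-statistic for every gene, using the training samples only
y.train.toy <- y.toy[train.idx]
t.stat <- apply(x.toy[, train.idx], 1, function(g) {
  t.test(g[y.train.toy == "T"], g[y.train.toy == "N"])$statistic
})

# Keep the top percent.cut fraction of genes by absolute t-statistic
n.keep <- ceiling(percent.cut * nrow(x.toy))
keep <- order(abs(t.stat), decreasing = TRUE)[seq_len(n.keep)]
print(keep)
```

With 200 genes and percent.cut = 0.05 the sketch keeps 10 candidate genes, and with 30 samples and p.test = 1/3 it holds out 10 samples. The same arithmetic applies to the 57 colon samples that remain in Example 2.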