R MiPP package: mipp() function help documentation

Posted on 2012-2-26 01:02:00
mipp(MiPP)
R package containing mipp(): MiPP

                                        MiPP-based Classification

                                         Translator: 生物统计家园网 robot LoveR

----------Description----------

Finds optimal sets of genes for classification.


----------Usage----------


mipp(x, y, x.test = NULL, y.test = NULL, probe.ID = NULL,
    rule = "lda", method.cut = "t.test", percent.cut = 0.01,
    model.sMiPP.margin = 0.01, min.sMiPP = 0.85, n.drops = 2,
    n.fold = 5, p.test = 1/3, n.split = 20,
    n.split.eval = 100)



----------Arguments----------

Argument: x
data matrix


Argument: y
class vector


Argument: x.test
test data matrix, if available


Argument: y.test
test class vector, if available


Argument: probe.ID
probe set IDs; if NULL, row numbers are assigned.


Argument: rule
classification rule: "lda", "qda", "logistic", "svmlin", "svmrbf"; the default is "lda".


Argument: method.cut
method for gene pre-selection; "t.test" (t-test) is available.


Argument: percent.cut
proportion of pre-selected genes; the default is 0.01.


Argument: model.sMiPP.margin
smallest set of genes such that sMiPP <= (max sMiPP - model.sMiPP.margin); the default is 0.01.


Argument: min.sMiPP
adding genes stops if max sMiPP is at least min.sMiPP; the default is 0.85.


Argument: n.drops
adding genes stops if sMiPP decreases n.drops times, in addition to the min.sMiPP criterion; the default is 2.


Argument: n.fold
number of cross-validation folds; the default is 5.


Argument: p.test
proportion of samples assigned to the test set when independent test samples are not available; the default is 1/3.


Argument: n.split
number of splits; the default is 20.


Argument: n.split.eval
number of splits for evaluation; the default is 100.
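
The sMiPP-related arguments (min.sMiPP, n.drops) may be easier to follow with a small sketch. It assumes the definition from the MiPP paper: MiPP is the sum of the posterior probabilities of the true class minus the number of misclassified samples, and sMiPP is MiPP divided by the number of samples, so it lies between -1 and 1. The helpers smipp() and stop.index() are hypothetical and not part of the MiPP package, and the stopping logic is only one reading of the argument descriptions, not the package's actual code. The iris data and MASS::lda() stand in for gene expression data and the "lda" rule.

```r
library(MASS)  # provides lda(); MASS ships with standard R distributions

# Hypothetical helper: sMiPP from the posterior probability of each sample's
# true class and a logical vector marking correct predictions (assumed definition).
smipp <- function(post.true, correct) {
  (sum(post.true) - sum(!correct)) / length(post.true)
}

# Hypothetical helper: index at which forward gene selection would stop, reading
# the rule as "sMiPP has dropped n.drops times AND max sMiPP >= min.sMiPP".
stop.index <- function(smipp.seq, min.sMiPP = 0.85, n.drops = 2) {
  drops <- 0
  for (k in seq_along(smipp.seq)) {
    if (k > 1 && smipp.seq[k] < smipp.seq[k - 1]) drops <- drops + 1
    if (drops >= n.drops && max(smipp.seq[1:k]) >= min.sMiPP) return(k)
  }
  length(smipp.seq)  # criteria never met: every candidate size was tried
}

# Two-class toy problem standing in for expression data
set.seed(1)
d <- droplevels(subset(iris, Species != "setosa"))
test.idx <- sample(nrow(d), size = round(nrow(d) / 3))
fit <- lda(Species ~ Sepal.Length + Sepal.Width, data = d[-test.idx, ])
pred <- predict(fit, d[test.idx, ])
truth <- d$Species[test.idx]

# Posterior probability assigned to the true class of each test sample
col.idx <- match(as.character(truth), colnames(pred$posterior))
post.true <- pred$posterior[cbind(seq_along(truth), col.idx)]
s <- smipp(post.true, pred$class == truth)
print(s)

# sMiPP recorded after adding the 1st, 2nd, ... gene (made-up values)
k.stop <- stop.index(c(0.60, 0.80, 0.90, 0.88, 0.91, 0.89, 0.92))
print(k.stop)
```

In this reading, a single dip in sMiPP does not end the search; selection continues until the second drop, which gives a later gene the chance to recover the score.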


----------Value----------


Component: model
candidate genes (for each split if no independent test set is available)


Component: model.eval
optimal sets of genes for each split when no independent test set is available


----------Author(s)----------



Soukup M, Cho H, and Lee JK




----------References----------

using misclassification penalized posterior, Bioinformatics, 21 (Suppl): i423-i430.
using gene expression data, Journal of Bioinformatics and Computational Biology, 1(4) 681-694

----------Examples----------



##########
#Example 1: When an independent test set is available

#Loads the objects leuk1 and leuk2
data(leukemia)

#Normalize combined data
leukemia <- cbind(leuk1, leuk2)
leukemia <- mipp.preproc(leukemia, data.type="MAS4")

#Train set
x.train <- leukemia[,1:38]
y.train <- factor(c(rep("ALL",27),rep("AML",11)))

#Test set
x.test <- leukemia[,39:72]
y.test <- factor(c(rep("ALL",20),rep("AML",14)))


#Compute MiPP
out <- mipp(x=x.train, y=y.train, x.test=x.test, y.test=y.test,
    probe.ID = 1:nrow(x.train), n.fold=5, percent.cut=0.05, rule="lda")

#Print candidate models
out$model
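
Example 1 passes percent.cut=0.05 with the default method.cut="t.test", meaning genes are pre-selected by a t-test before model building. The following is a minimal sketch of that idea in base R on simulated data. The helper preselect() is hypothetical, and the ranking by absolute t-statistic and the rounding of the gene count are assumptions; the MiPP package may implement this step differently.

```r
set.seed(42)
n.genes <- 200
n.samples <- 20

# Simulated expression matrix: genes in rows, samples in columns
x <- matrix(rnorm(n.genes * n.samples), nrow = n.genes)
y <- factor(rep(c("A", "B"), each = n.samples / 2))

# Shift the first five genes in class "B" so they carry class information
x[1:5, y == "B"] <- x[1:5, y == "B"] + 3

# Hypothetical helper: keep the top percent.cut fraction of genes ranked by
# the absolute two-sample t-statistic (not the MiPP implementation).
preselect <- function(x, y, percent.cut = 0.05) {
  lev <- levels(y)
  t.abs <- apply(x, 1, function(g) {
    abs(t.test(g[y == lev[1]], g[y == lev[2]])$statistic)
  })
  n.keep <- max(1, round(percent.cut * nrow(x)))
  order(t.abs, decreasing = TRUE)[seq_len(n.keep)]
}

keep <- preselect(x, y, percent.cut = 0.05)
print(keep)          # row indices of the pre-selected genes
x.small <- x[keep, ] # reduced matrix that model building would then use
```

A larger percent.cut keeps more genes for the model search at the cost of longer run time, which is presumably why the examples use small values such as 0.05 and 0.1.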



##########
#Example 2: When an independent test set is not available

data(colon)

#Normalize data
x <- mipp.preproc(colon)
y <- factor(c("T", "N", "T", "N", "T", "N", "T", "N", "T", "N",
       "T", "N", "T", "N", "T", "N", "T", "N", "T", "N",
       "T", "N", "T", "N", "T", "T", "T", "T", "T", "T",
       "T", "T", "T", "T", "T", "T", "T", "T", "N", "T",
       "T", "N", "N", "T", "T", "T", "T", "N", "T", "N",
       "N", "T", "T", "N", "N", "T", "T", "T", "T", "N",
       "T", "N"))


#Delete contaminated chips
x <- x[,-c(51,55,45,49,56)]
y <- y[ -c(51,55,45,49,56)]

#Compute MiPP
out <- mipp(x=x, y=y, probe.ID = 1:nrow(x), n.fold=5, p.test=1/3, n.split=5,
    n.split.eval=100, percent.cut= 0.1, rule="lda")

#Print candidate models for each split
out$model

#Print optimal models and independent evaluation for each split
out$model.eval
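
When no independent test set is supplied, as in Example 2, p.test and n.split control repeated random partitions of the samples into train and test sets. The sketch below shows one way such partitions could be generated in base R. The helper make.splits() is hypothetical, and it uses plain (unstratified) random sampling; whether MiPP stratifies by class is not stated in this help page.

```r
# Hypothetical helper: n.split random train/test partitions of n samples,
# with a fraction p.test of the samples held out for testing in each one.
make.splits <- function(n, p.test = 1/3, n.split = 5) {
  n.test <- round(n * p.test)
  lapply(seq_len(n.split), function(i) {
    test <- sort(sample(n, n.test))
    list(train = setdiff(seq_len(n), test), test = test)
  })
}

set.seed(7)
# 57 samples remain in Example 2 after the five chips are removed
splits <- make.splits(57, p.test = 1/3, n.split = 5)

# Sizes of the train and test sets in each split
sizes <- sapply(splits, function(s) c(train = length(s$train), test = length(s$test)))
print(sizes)
```

Each column index set in a split could then be used as x[, s$train] and x[, s$test], mirroring the explicit train and test matrices of Example 1.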


When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

