fabiap(fabia)
fabiap() belongs to the R package: fabia
Factor Analysis for Bicluster Acquisition: Post-Projection (FABIAP)
----------Description----------
fabiap: C implementation of fabiap.
----------Usage----------
fabiap(X,p=5,alpha=0.1,cyc=500,spl=0,spz=0.5,sL=0.6,sZ=0.6,non_negative=0,random=1.0,center=2,norm=1,scale=0.0,lap=1.0,nL=0,lL=0,bL=0)
----------Arguments----------
Argument: X
the data matrix.
Argument: p
number of hidden factors = number of biclusters; default = 5.
Argument: alpha
sparseness of the loadings (0 - 1.0); default = 0.1.
Argument: cyc
number of iterations; default = 500.
Argument: spl
sparseness prior of the loadings (0 - 2.0); default = 0 (Laplace).
Argument: spz
sparseness of the factors (0.5 - 2.0); default = 0.5 (Laplace).
Argument: sL
final sparseness of the loadings; default = 0.6.
Argument: sZ
final sparseness of the factors; default = 0.6.
Argument: non_negative
non-negative factors and loadings if non_negative > 0; default = 0.
Argument: random
<=0: initialization by SVD, >0: random initialization of the loadings in [-random,random]; default = 1.0.
Argument: center
data centering: 1 (mean), 2 (median), > 2 (mode), 0 (no centering); default = 2.
Argument: norm
data normalization: 1 (0.75-0.25 quantile), >1 (var=1), 0 (no normalization); default = 1.
Argument: scale
loading vectors are scaled in each iteration to the given variance. 0.0 indicates no scaling; default = 0.0.
Argument: lap
minimal value of the variational parameter; default = 1.0.
Argument: nL
maximal number of biclusters in which a row element can participate; default = 0 (no limit).
Argument: lL
maximal number of row elements per bicluster; default = 0 (no limit).
Argument: bL
cycle at which the nL or lL maximum starts; default = 0 (start at the beginning).
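The default preprocessing (center = 2, norm = 1) can be pictured with a few lines of base R. This is only a sketch under stated assumptions: it assumes that centering and scaling are applied to each row of X, and the helper name `preprocess_sketch` is hypothetical and not part of fabia, whose internals may differ.

```r
# Sketch of center = 2 (median) and norm = 1 (0.75-0.25 quantile range).
# Assumption: both steps operate on each row of X.
preprocess_sketch <- function(X) {
  row_median <- apply(X, 1, median)
  Xc <- X - row_median                       # subtract each row's median
  row_range <- apply(Xc, 1, function(x) {
    unname(quantile(x, 0.75) - quantile(x, 0.25))
  })
  row_range[row_range == 0] <- 1             # guard against constant rows
  list(X = Xc / row_range, center = row_median, scale_vec = row_range)
}

set.seed(1)
X <- matrix(rnorm(20 * 8, mean = 5, sd = 2), nrow = 20)
pre <- preprocess_sketch(X)
```

After this step every row of `pre$X` has median 0 and a 0.75-0.25 quantile range of 1, which is the kind of input the sparse factor model is then fitted to.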
----------Details----------
Biclusters are found by sparse factor analysis where both the factors and the loadings are sparse. Post-processing projects the final results onto a given sparseness criterion.
Essentially the model is the sum of outer products of vectors:
X = \sum_{i=1}^{p} λ_i z_i^T + U
where the number of summands p is the number of biclusters. The matrix factorization is
X = L Z + U
Here λ_i are from R^n, z_i from R^l, L from R^{n \times p}, Z from R^{p \times l}, and X, U from R^{n \times l}.
If the nonzero components of the sparse vectors are grouped together then the outer product results in a matrix with a nonzero block and zeros elsewhere.
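A small base R sketch can make the outer-product view concrete. The dimensions and values below are made up for illustration; nothing here calls fabia.

```r
# Toy dimensions: n rows, l columns, p biclusters (illustrative values only).
n <- 6; l <- 5; p <- 2

# Sparse loadings (columns of L) and sparse factors (rows of Z).
L <- matrix(0, n, p)
Z <- matrix(0, p, l)
L[1:3, 1] <- c(2, 1, 3);  Z[1, 1:2] <- c(1, 2)
L[5:6, 2] <- c(1, 4);     Z[2, 4:5] <- c(3, 1)

# Sum of the p outer products of a loading vector with a factor vector ...
outer_sum <- matrix(0, n, l)
for (i in seq_len(p)) {
  outer_sum <- outer_sum + outer(L[, i], Z[i, ])
}

# ... is the same matrix as the product L Z.
same <- isTRUE(all.equal(outer_sum, L %*% Z))

# First bicluster: nonzero only in rows 1-3 and columns 1-2.
block1 <- outer(L[, 1], Z[1, ])
```

Printing `block1` shows a 3 x 2 nonzero block in the upper left corner and zeros elsewhere, which is the block structure a bicluster corresponds to.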
The model selection is performed by a variational approach according to Girolami 2001 and Palmer et al. 2006.
We include a prior on the parameters and maximize a lower bound on the posterior of the parameters given the data. The update of the loadings includes an additive term which pushes the loadings toward zero (a Gaussian prior leads to a multiplicative factor).
Post-processing: The final results of the loadings and the factors are projected to a sparse vector according to Hoyer, 2004: given an l_1-norm and an l_2-norm, minimize the Euclidean distance to the original vector (currently the l_2-norm is fixed to 1). The projection is a convex quadratic problem which is solved iteratively, where at each iteration at least one component is set to zero. Instead of the l_1-norm a sparseness measurement is used which relates the l_1-norm to the l_2-norm; for a vector x with l components:
sp = (\sqrt{l} - \sum_{i=1}^{l} |x_i| / \sqrt{\sum_{i=1}^{l} x_i^2}) / (\sqrt{l} - 1)
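The sparseness measurement of Hoyer (2004) compares the l_1-norm of a vector with its l_2-norm and rescales the ratio to the interval [0, 1]. The function below is a minimal sketch of that measure only; the name `hoyer_sparseness` is hypothetical, and the projection step itself is not reproduced here.

```r
# Sparseness relating the l_1-norm to the l_2-norm (Hoyer, 2004):
# 0 when all components have equal magnitude,
# 1 when exactly one component is nonzero.
hoyer_sparseness <- function(x) {
  m <- length(x)
  stopifnot(m > 1, any(x != 0))
  l1 <- sum(abs(x))
  l2 <- sqrt(sum(x^2))
  (sqrt(m) - l1 / l2) / (sqrt(m) - 1)
}

sp_single <- hoyer_sparseness(c(0, 0, 5, 0))    # one nonzero entry
sp_dense  <- hoyer_sparseness(c(1, -1, 1, -1))  # equal magnitudes
```

Values such as sL = 0.6 and sZ = 0.6 lie between these two extremes, so the projected vectors keep some, but not all, of their components.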
The code is implemented in C and the projection in R.
----------Value----------
object of the class Factorization, containing LZ (estimated noise free data L Z), L (loadings L), Z (factors Z), U (noise X-LZ), center (centering vector), scaleData (scaling vector), X (centered and scaled data X), Psi (noise variance σ), lapla (variational parameter), avini (the information which the factor z_{ij} contains about x_j averaged over j), xavini (the information which the factor z_{j} contains about x_j), ini (for each j the information which the factor z_{ij} contains about x_j).
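The relation between the returned components (U is the noise X - L Z) suggests a simple consistency check. The helper below is a hypothetical sketch, not a fabia function; it assumes that the result is an S4 object whose slots X, L, Z and U can be read with `@`, as the examples do with `@X`. Whether the value is exactly zero for a real fabiap result has not been verified here.

```r
# Hypothetical helper: largest deviation from X = L Z + U.
# Assumes `res` exposes the slots X, L, Z and U listed in the Value section.
reconstruction_gap <- function(res) {
  LZ <- res@L %*% res@Z          # noise-free part L Z
  max(abs(res@X - LZ - res@U))   # should be close to zero if U = X - L Z
}
```

With a fitted object such as `resEx` from the examples, the call would be `reconstruction_gap(resEx)`.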
----------Author(s)----------
Sepp Hochreiter
----------References----------
S. Hochreiter et al., ‘FABIA: Factor Analysis for Bicluster Acquisition’, Bioinformatics 26(12):1520-1527, 2010. http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btq227
Mark Girolami, ‘A Variational Method for Learning Sparse and Overcomplete Representations’, Neural Computation 13(11):2517-2532, 2001.
J. Palmer, D. Wipf, K. Kreutz-Delgado, B. Rao, ‘Variational EM algorithms for non-Gaussian latent variable models’, Advances in Neural Information Processing Systems 18, pp. 1059-1066, 2006.
Patrik O. Hoyer, ‘Non-negative Matrix Factorization with Sparseness Constraints’, Journal of Machine Learning Research 5:1457-1469, 2004.
----------See Also----------
fabia, fabias, fabiap, spfabia, fabi, fabiasp, mfsc, nmfdiv, nmfeu, nmfsc, plot, extractPlot, extractBic, plotBicluster, Factorization, projFuncPos, projFunc, estimateMode, makeFabiaData, makeFabiaDataBlocks, makeFabiaDataPos, makeFabiaDataBlocksPos, matrixImagePlot, summary, show, showSelected, fabiaDemo, fabiaVersion
----------Examples----------
#---------------
# TEST
#---------------
dat <- makeFabiaDataBlocks(n = 100,l= 50,p = 3,f1 = 5,f2 = 5,
of1 = 5,of2 = 10,sd_noise = 3.0,sd_z_noise = 0.2,mean_z = 2.0,
sd_z = 1.0,sd_l_noise = 0.2,mean_l = 3.0,sd_l = 1.0)
X <- dat[[1]]
Y <- dat[[2]]
resEx <- fabiap(X,3,0.1,50)
## Not run:
#-----------------
# DEMO1: Toy Data
#-----------------
n = 1000
l= 100
p = 10
dat <- makeFabiaDataBlocks(n = n,l= l,p = p,f1 = 5,f2 = 5,
of1 = 5,of2 = 10,sd_noise = 3.0,sd_z_noise = 0.2,mean_z = 2.0,
sd_z = 1.0,sd_l_noise = 0.2,mean_l = 3.0,sd_l = 1.0)
X <- dat[[1]]
Y <- dat[[2]]
ZC <- dat[[3]]
LC <- dat[[4]]
gclab <- rep.int(0,l)
gllab <- rep.int(0,n)
clab <- as.character(1:l)
llab <- as.character(1:n)
for (i in 1:p){
  for (j in ZC[i]){
    clab[j] <- paste(as.character(i),"_",clab[j],sep="")
  }
  for (j in LC[i]){
    llab[j] <- paste(as.character(i),"_",llab[j],sep="")
  }
  gclab[unlist(ZC[i])] <- gclab[unlist(ZC[i])] + p^i
  gllab[unlist(LC[i])] <- gllab[unlist(LC[i])] + p^i
}
groups <- gclab
#### FABIAP
resToy3 <- fabiap(X,13,0.1,400)
extractPlot(resToy3,ti="FABIAP",Y=Y)
raToy3 <- extractBic(resToy3)
if ((raToy3$bic[[1]][1]>1) && (raToy3$bic[[1]][2])>1) {
plotBicluster(raToy3,1)
}
if ((raToy3$bic[[2]][1]>1) && (raToy3$bic[[2]][2])>1) {
plotBicluster(raToy3,2)
}
if ((raToy3$bic[[3]][1]>1) && (raToy3$bic[[3]][2])>1) {
plotBicluster(raToy3,3)
}
if ((raToy3$bic[[4]][1]>1) && (raToy3$bic[[4]][2])>1) {
plotBicluster(raToy3,4)
}
colnames(resToy3@X) <- clab
rownames(resToy3@X) <- llab
plot(resToy3,dim=c(1,2),label.tol=0.1,col.group = groups,lab.size=0.6)
plot(resToy3,dim=c(1,3),label.tol=0.1,col.group = groups,lab.size=0.6)
plot(resToy3,dim=c(2,3),label.tol=0.1,col.group = groups,lab.size=0.6)
#------------------------------------------
# DEMO2: Laura van't Veer's gene expression
# data set for breast cancer
#------------------------------------------
avail <- require(fabiaData)
if (!avail) {
message("")
message("")
message("#####################################################")
message("Package 'fabiaData' is not available: please install.")
message("#####################################################")
} else {
data(Breast_A)
X <- as.matrix(XBreast)
resBreast3 <- fabiap(X,5,0.1,400)
extractPlot(resBreast3,ti="FABIAP Breast cancer(Veer)")
raBreast3 <- extractBic(resBreast3)
if ((raBreast3$bic[[1]][1]>1) && (raBreast3$bic[[1]][2])>1) {
plotBicluster(raBreast3,1)
}
if ((raBreast3$bic[[2]][1]>1) && (raBreast3$bic[[2]][2])>1) {
plotBicluster(raBreast3,2)
}
if ((raBreast3$bic[[3]][1]>1) && (raBreast3$bic[[3]][2])>1) {
plotBicluster(raBreast3,3)
}
if ((raBreast3$bic[[4]][1]>1) && (raBreast3$bic[[4]][2])>1) {
plotBicluster(raBreast3,4)
}
plot(resBreast3,dim=c(1,2),label.tol=0.03,col.group=CBreast,lab.size=0.6)
plot(resBreast3,dim=c(1,3),label.tol=0.03,col.group=CBreast,lab.size=0.6)
plot(resBreast3,dim=c(1,4),label.tol=0.03,col.group=CBreast,lab.size=0.6)
plot(resBreast3,dim=c(1,5),label.tol=0.03,col.group=CBreast,lab.size=0.6)
plot(resBreast3,dim=c(2,3),label.tol=0.03,col.group=CBreast,lab.size=0.6)
plot(resBreast3,dim=c(2,4),label.tol=0.03,col.group=CBreast,lab.size=0.6)
plot(resBreast3,dim=c(2,5),label.tol=0.03,col.group=CBreast,lab.size=0.6)
plot(resBreast3,dim=c(3,4),label.tol=0.03,col.group=CBreast,lab.size=0.6)
plot(resBreast3,dim=c(3,5),label.tol=0.03,col.group=CBreast,lab.size=0.6)
plot(resBreast3,dim=c(4,5),label.tol=0.03,col.group=CBreast,lab.size=0.6)
}
#-----------------------------------
# DEMO3: Su's multiple tissue types
# gene expression data set
#-----------------------------------
avail <- require(fabiaData)
if (!avail) {
message("")
message("")
message("#####################################################")
message("Package 'fabiaData' is not available: please install.")
message("#####################################################")
} else {
data(Multi_A)
X <- as.matrix(XMulti)
resMulti3 <- fabiap(X,5,0.1,300)
extractPlot(resMulti3,ti="FABIAP Multiple tissues(Su)")
raMulti3 <- extractBic(resMulti3)
if ((raMulti3$bic[[1]][1]>1) && (raMulti3$bic[[1]][2])>1) {
plotBicluster(raMulti3,1)
}
if ((raMulti3$bic[[2]][1]>1) && (raMulti3$bic[[2]][2])>1) {
plotBicluster(raMulti3,2)
}
if ((raMulti3$bic[[3]][1]>1) && (raMulti3$bic[[3]][2])>1) {
plotBicluster(raMulti3,3)
}
if ((raMulti3$bic[[4]][1]>1) && (raMulti3$bic[[4]][2])>1) {
plotBicluster(raMulti3,4)
}
plot(resMulti3,dim=c(1,2),label.tol=0.01,col.group=CMulti,lab.size=0.6)
plot(resMulti3,dim=c(1,3),label.tol=0.01,col.group=CMulti,lab.size=0.6)
plot(resMulti3,dim=c(1,4),label.tol=0.01,col.group=CMulti,lab.size=0.6)
plot(resMulti3,dim=c(1,5),label.tol=0.01,col.group=CMulti,lab.size=0.6)
plot(resMulti3,dim=c(2,3),label.tol=0.01,col.group=CMulti,lab.size=0.6)
plot(resMulti3,dim=c(2,4),label.tol=0.01,col.group=CMulti,lab.size=0.6)
plot(resMulti3,dim=c(2,5),label.tol=0.01,col.group=CMulti,lab.size=0.6)
plot(resMulti3,dim=c(3,4),label.tol=0.01,col.group=CMulti,lab.size=0.6)
plot(resMulti3,dim=c(3,5),label.tol=0.01,col.group=CMulti,lab.size=0.6)
plot(resMulti3,dim=c(4,5),label.tol=0.01,col.group=CMulti,lab.size=0.6)
}
#-----------------------------------------
# DEMO4: Rosenwald's diffuse large-B-cell
# lymphoma gene expression data set
#-----------------------------------------
avail <- require(fabiaData)
if (!avail) {
message("")
message("")
message("#####################################################")
message("Package 'fabiaData' is not available: please install.")
message("#####################################################")
} else {
data(DLBCL_B)
X <- as.matrix(XDLBCL)
resDLBCL3 <- fabiap(X,5,0.1,400)
extractPlot(resDLBCL3,ti="FABIAP Lymphoma(Rosenwald)")
raDLBCL3 <- extractBic(resDLBCL3)
if ((raDLBCL3$bic[[1]][1]>1) && (raDLBCL3$bic[[1]][2])>1) {
plotBicluster(raDLBCL3,1)
}
if ((raDLBCL3$bic[[2]][1]>1) && (raDLBCL3$bic[[2]][2])>1) {
plotBicluster(raDLBCL3,2)
}
if ((raDLBCL3$bic[[3]][1]>1) && (raDLBCL3$bic[[3]][2])>1) {
plotBicluster(raDLBCL3,3)
}
if ((raDLBCL3$bic[[4]][1]>1) && (raDLBCL3$bic[[4]][2])>1) {
plotBicluster(raDLBCL3,4)
}
plot(resDLBCL3,dim=c(1,2),label.tol=0.03,col.group=CDLBCL,lab.size=0.6)
plot(resDLBCL3,dim=c(1,3),label.tol=0.03,col.group=CDLBCL,lab.size=0.6)
plot(resDLBCL3,dim=c(1,4),label.tol=0.03,col.group=CDLBCL,lab.size=0.6)
plot(resDLBCL3,dim=c(1,5),label.tol=0.03,col.group=CDLBCL,lab.size=0.6)
plot(resDLBCL3,dim=c(2,3),label.tol=0.03,col.group=CDLBCL,lab.size=0.6)
plot(resDLBCL3,dim=c(2,4),label.tol=0.03,col.group=CDLBCL,lab.size=0.6)
plot(resDLBCL3,dim=c(2,5),label.tol=0.03,col.group=CDLBCL,lab.size=0.6)
plot(resDLBCL3,dim=c(3,4),label.tol=0.03,col.group=CDLBCL,lab.size=0.6)
plot(resDLBCL3,dim=c(3,5),label.tol=0.03,col.group=CDLBCL,lab.size=0.6)
plot(resDLBCL3,dim=c(4,5),label.tol=0.03,col.group=CDLBCL,lab.size=0.6)
}
## End(Not run)