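The input is two matrices, X and Z: X is 888 x 635 and Z is 888 x 1385.
First I preprocess X and Z, centering each column and scaling it to unit standard deviation: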
xmean <- apply(X,2,mean); zmean <- apply(Z,2,mean)
xsd <- apply(X,2,sd); zsd <- apply(Z,2,sd)
xsd[xsd==0] <- 1; zsd[zsd==0] <- 1
Xs <- X - matrix(xmean, nrow(X), ncol(X), byrow=TRUE)
Zs <- Z - matrix(zmean, nrow(Z), ncol(Z), byrow=TRUE)
Xs <- Xs * matrix(1/xsd, nrow(X), ncol(X), byrow=TRUE)
Zs <- Zs * matrix(1/zsd, nrow(Z), ncol(Z), byrow=TRUE)
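As a sanity check on this preprocessing, here is a minimal sketch showing that the manual centering and scaling should agree with base R's `scale()` once zero standard deviations are replaced by 1. The small matrix `X_demo` is made-up stand-in data, not my real X.

```r
# Hypothetical stand-in data: 6 x 3 matrix whose last column is constant (sd = 0).
set.seed(1)
X_demo <- cbind(rnorm(6), rnorm(6), rep(2, 6))

# Manual standardization, same steps as in the post.
xmean <- apply(X_demo, 2, mean)
xsd <- apply(X_demo, 2, sd)
xsd[xsd == 0] <- 1   # avoid dividing by zero for constant columns
Xs_manual <- (X_demo - matrix(xmean, nrow(X_demo), ncol(X_demo), byrow = TRUE)) *
  matrix(1 / xsd, nrow(X_demo), ncol(X_demo), byrow = TRUE)

# Same computation via scale(), passing the guarded standard deviations explicitly.
Xs_scale <- scale(X_demo, center = xmean, scale = xsd)

same <- isTRUE(all.equal(Xs_manual, Xs_scale, check.attributes = FALSE))
all_finite <- all(is.finite(Xs_manual))
print(c(same = same, all_finite = all_finite))
```

If `all_finite` came out FALSE on the real data, that would point at NA or Inf values entering the matrices before CCA is ever called.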
Then I call the CCA function from the PMA package:
res <- CCA(x=Xs, z=Zs, typex="standard", typez="standard", penaltyx=0.04, penaltyz=0.04, niter=15, K=80, trace=TRUE, standardize=FALSE)
The program then stops with an error: Error in CheckVs(v, x, z, K) : Problem computing SVD.
Here CheckVs(v, x, z, K) is defined as follows:
CheckVs <- function(v,x,z,K){ # If v is NULL, then get v as appropriate.
  if(!is.null(v) && !is.matrix(v)) v <- matrix(v,nrow=ncol(z))
  if(!is.null(v) && ncol(v)<K) v <- NULL
  if(!is.null(v) && ncol(v)>K) v <- matrix(v[,1:K],ncol=K)
  if(is.null(v) && ncol(z)>nrow(z) && ncol(x)>nrow(x)){
    v <- matrix(fastsvd(x,z)$v[,1:K],ncol=K)
  } else if (is.null(v) && (ncol(z)<=nrow(z) || ncol(x)<=nrow(x))){
    v <- matrix(svd(t(x)%*%z)$v[,1:K],ncol=K)
  }
  return(v)
}
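To see which branch of `CheckVs` applies to the full data, the dimension test can be evaluated directly. The sketch below is only my reading of the code above, not part of PMA: `which_branch` is a hypothetical helper, and the finiteness check at the end is one thing that could be run on the real `Xs` and `Zs`, since base R's `svd()` stops on NA, NaN or Inf input. Small made-up matrices stand in for the real data.

```r
# Hypothetical helper mirroring the dimension test in CheckVs (v assumed NULL).
which_branch <- function(n, px, pz) {
  if (pz > n && px > n) "fastsvd(x, z)" else "svd(t(x) %*% z)"
}

# Full data: 888 rows, 635 columns in X, 1385 columns in Z.
full_branch <- which_branch(888, 635, 1385)
print(full_branch)

# Made-up stand-ins for Xs and Zs, used only to show the checks.
set.seed(1)
Xs_demo <- matrix(rnorm(20), 5, 4)
Zs_demo <- matrix(rnorm(30), 5, 6)

# The matrix that the second branch hands to svd().
cross <- t(Xs_demo) %*% Zs_demo
cross_ok <- all(is.finite(cross))   # svd() errors if this is FALSE
print(cross_ok)

# Same extraction as in CheckVs, with K = 2 for this toy case.
v <- matrix(svd(cross)$v[, 1:2], ncol = 2)
print(dim(v))
```

With 888 rows, X is taller than it is wide, so the full-data call goes through the `svd(t(x) %*% z)` branch and not through `fastsvd`.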
However, if I call CCA on random subsets of the rows of X and Z, no error occurs:
n <- nrow(X); px <- ncol(X); pz <- ncol(Z); nabs <- n
rsize <- trunc(n / 5)
index <- 1:n
# Create 5 training and test sets based on the entire dataset
for (r in 1:5){
#Delete ids from the index list
rindex <- sample(index, size=trunc(n/(6 - r)))
pindex.r <- rindex
tindex.r <- (1:nabs)[-pindex.r]
# Build training and test sets
Xt <- X[tindex.r, ]
Zt <- Z[tindex.r, ]
xmean <- apply(Xt,2,mean); zmean <- apply(Zt,2,mean)
xsd <- apply(Xt,2,sd); zsd <- apply(Zt,2,sd)
xsd[xsd==0] <- 1; zsd[zsd==0] <- 1
Xts <- Xt - matrix(xmean, nrow(Xt), ncol(Xt), byrow=TRUE)
Zts <- Zt - matrix(zmean, nrow(Zt), ncol(Zt), byrow=TRUE)
Xts <- Xts * matrix(1/xsd, nrow(Xt), ncol(Xt), byrow=TRUE)
Zts <- Zts * matrix(1/zsd, nrow(Zt), ncol(Zt), byrow=TRUE)
result_s.sca <- CCA(x=Xts, z=Zts, typex="standard", typez="standard", penaltyx=0.01, penaltyz=0.01,
niter=15, K=80, trace=TRUE, standardize=FALSE)
}
Called this way, CCA runs without error and computes normally. So why does the first case fail?
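For comparison, the row counts of the subsets can be worked out from the loop. This sketch only does the arithmetic for n = 888 and applies the same dimension test that appears in `CheckVs`. It assumes, as the code is written here, that `index` is not shortened between iterations.

```r
n <- 888; px <- 635; pz <- 1385
r <- 1:5

held_out <- trunc(n / (6 - r))   # size of rindex in each iteration
train_rows <- n - held_out       # rows left in Xt and Zt

# Dimension test copied from CheckVs: both matrices wider than tall -> fastsvd.
branch <- ifelse(pz > train_rows & px > train_rows,
                 "fastsvd(x, z)", "svd(t(x) %*% z)")

sizes <- data.frame(r, held_out, train_rows, branch)
print(sizes)
```

With these numbers the held-out sizes are 177, 222, 296, 444 and 888, so the training sets have 711, 666, 592, 444 and 0 rows if `index` really stays unchanged. Under that reading, the subsets for r = 1 and r = 2 go through the same `svd(t(x) %*% z)` branch as the full data, while the smaller ones go through `fastsvd`. Note also that the subset calls use a penalty of 0.01 where the full-data call uses 0.04.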