Calling the CCA function from the R package PMA fails on the full dataset, but works on a random subset of the same data?

舒颜 · posted 2014-10-27 15:48:05

The input is two matrices, X and Z. X is 888 × 635 and Z is 888 × 1385.
First I center and scale the columns of X and Z:
xmean <- apply(X,2,mean); zmean <- apply(Z,2,mean)
xsd <- apply(X,2,sd); zsd <- apply(Z,2,sd)
xsd[xsd==0] <- 1; zsd[zsd==0] <- 1

Xs <- X - matrix(xmean, nrow(X), ncol(X), byrow=TRUE)
Zs <- Z - matrix(zmean, nrow(Z), ncol(Z), byrow=TRUE)

Xs <- Xs * matrix(1/xsd, nrow(X), ncol(X), byrow=TRUE)
Zs <- Zs * matrix(1/zsd, nrow(Z), ncol(Z), byrow=TRUE)
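
For reference, the centering and scaling steps above can also be written with base R's scale(), which accepts explicit center and scale vectors. The following is only a minimal sketch: the small random matrix is a hypothetical stand-in for the real X, and one column is made constant to exercise the sd == 0 case.

```r
# Sketch: manual column standardization compared with scale().
# X here is a small made-up matrix, not the 888 x 635 data from the post.
set.seed(1)
X <- matrix(rnorm(20 * 4), nrow = 20, ncol = 4)
X[, 3] <- 5                      # constant column, so its sd is 0

xmean <- colMeans(X)
xsd <- apply(X, 2, sd)
xsd[xsd == 0] <- 1               # avoid dividing by zero for constant columns

# Manual version, following the steps in the post
Xs_manual <- (X - matrix(xmean, nrow(X), ncol(X), byrow = TRUE)) *
  matrix(1 / xsd, nrow(X), ncol(X), byrow = TRUE)

# Same computation through scale() with explicit center and scale vectors
Xs_scale <- scale(X, center = xmean, scale = xsd)

max_diff <- max(abs(Xs_manual - Xs_scale))   # should be at rounding-error level
all_finite <- all(is.finite(Xs_scale))       # no NaN/Inf from the constant column
```

If the real data contain missing values, mean() and sd() return NA for the affected columns, so checking all(is.finite(Xs)) after this step is a cheap sanity test.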

Then I call the CCA function from the PMA package:
res <- CCA(x=Xs, z=Zs, typex="standard", typez="standard", penaltyx=0.04, penaltyz=0.04, niter=15, K=80, trace=TRUE, standardize=FALSE)
The call fails with: Error in CheckVs(v, x, z, K) : Problem computing SVD.

CheckVs(v, x, z, K) looks like this:
CheckVs <- function(v,x,z,K){ # If v is NULL, then get v as appropriate.
  if(!is.null(v) && !is.matrix(v)) v <- matrix(v,nrow=ncol(z))
  if(!is.null(v) && ncol(v)<K) v <- NULL
  if(!is.null(v) && ncol(v)>K) v <- matrix(v[,1:K],ncol=K)
  if(is.null(v) && ncol(z)>nrow(z) && ncol(x)>nrow(x)){
    v <- matrix(fastsvd(x,z)$v[,1:K],ncol=K)
  } else if (is.null(v) && (ncol(z)<=nrow(z) || ncol(x)<=nrow(x))){
    v <- matrix(svd(t(x)%*%z)$v[,1:K],ncol=K)
  }
  return(v)
}
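
Going by the CheckVs code quoted above, with ncol(x) = 635 <= nrow(x) = 888 the second branch applies, so the SVD is taken of t(x) %*% z. One way to narrow the error down is to test that matrix directly before calling CCA. The sketch below is an assumption about what is worth checking, not a confirmed diagnosis of this case; check_svd_input is a hypothetical helper name, and the small matrices are made up for the demonstration.

```r
# Sketch of pre-flight checks on the inputs to the SVD step.
# check_svd_input() is a hypothetical helper, not part of PMA.
check_svd_input <- function(x, z, K) {
  m <- crossprod(x, z)           # same matrix as t(x) %*% z
  list(
    nonfinite_x   = sum(!is.finite(x)),      # NA / NaN / Inf entries in x
    nonfinite_z   = sum(!is.finite(z)),
    nonfinite_xtz = sum(!is.finite(m)),
    K_too_large   = K > min(dim(m)),         # more components than the SVD can give
    svd_ok        = !inherits(try(svd(m), silent = TRUE), "try-error")
  )
}

# Demonstration on small made-up matrices
set.seed(42)
x_ok <- matrix(rnorm(30 * 5), 30, 5)
z_ok <- matrix(rnorm(30 * 8), 30, 8)
good <- check_svd_input(x_ok, z_ok, K = 3)

x_bad <- x_ok
x_bad[2, 1] <- NA                # a single missing value
bad <- check_svd_input(x_bad, z_ok, K = 3)
```

On the real data this would be called as check_svd_input(Xs, Zs, K = 80). Base R's svd() stops on NA, NaN or Inf input, so a single bad cell in X or Z is enough to make it fail. If svd() succeeds outside CCA, the quoted CheckVs code suggests that a precomputed v with K columns skips the internal SVD; passing it through CCA's v argument should be checked against the PMA documentation first.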

However, if I call CCA on a random subset of the rows of X and Z, there is no error:
n <- nrow(X);  px <- ncol(X);  pz <- ncol(Z); nabs <- n
rsize <- trunc(n / 5)
index <- 1:n
# Create 5 training and test sets based on the entire dataset
for (r in 1:5){
      # Delete ids from the index list
      rindex <- sample(index, size=trunc(n/(6 - r)))
      pindex.r <- rindex
      tindex.r <- (1:nabs)[-pindex.r]
      # Build training and test sets
      Xt <- X[tindex.r, ]
      Zt <- Z[tindex.r, ]

      # Standardize the training subset using its own column means and sds
      xmean <- apply(Xt,2,mean); zmean <- apply(Zt,2,mean)
      xsd <- apply(Xt,2,sd); zsd <- apply(Zt,2,sd)
      xsd[xsd==0] <- 1; zsd[zsd==0] <- 1

      Xts <- Xt - matrix(xmean, nrow(Xt), ncol(Xt), byrow=TRUE)
      Zts <- Zt - matrix(zmean, nrow(Zt), ncol(Zt), byrow=TRUE)

      Xts <- Xts * matrix(1/xsd, nrow(Xt), ncol(Xt), byrow=TRUE)
      Zts <- Zts * matrix(1/zsd, nrow(Zt), ncol(Zt), byrow=TRUE)

      result_s.sca <- CCA(x=Xts, z=Zts, typex="standard", typez="standard", penaltyx=0.01, penaltyz=0.01,
            niter=15, K=80, trace=TRUE, standardize=FALSE)
}

Called this way, there is no error and the computation runs normally. So why does the first case fail?

舒颜 · posted 2014-10-27 20:50:00

Is CheckVs() an internal function of CCA? I deleted this function from my script, and running it again still gives the same error.
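
A function defined inside a package looks up its helpers in the package's own namespace first, so deleting a same-named copy from the workspace does not change what the package function calls. The sketch below shows one way to check where a function lives. where_is is a hypothetical helper, and t.test.default from the stats package is used only as a stand-in for a function that is defined in a namespace but not exported.

```r
# Sketch: is a function defined in a package namespace, and is it exported?
# where_is() is a hypothetical helper written for this illustration.
where_is <- function(name, pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) return(NULL)  # package not installed
  ns <- asNamespace(pkg)
  list(
    in_namespace = exists(name, envir = ns, inherits = FALSE),
    exported     = name %in% getNamespaceExports(pkg)
  )
}

# Demonstration with a package that ships with R:
# t.test.default is defined in the stats namespace but is not exported.
demo_internal <- where_is("t.test.default", "stats")
demo_exported <- where_is("sd", "stats")

# The case from the post; returns NULL if PMA is not installed.
pma_checkvs <- where_is("CheckVs", "PMA")
```

If pma_checkvs$in_namespace is TRUE, CCA is using PMA's own copy of CheckVs, and the source actually in use can be printed with getAnywhere("CheckVs").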
