TransSphere(Tsphere)
TransSphere() belongs to the R package Tsphere.
Transposable Sphering Algorithm for Large-Scale Inference.
----------Description----------
Applies the Transposable Sphering Algorithm to adjust for correlations among the rows and columns when conducting large-scale inference on the rows of a data matrix.
----------Usage----------
TransSphere(dat, y, fdr, minlam, maxlam = NULL)
----------Arguments----------
Argument: dat
Data matrix. Inference is conducted on the rows, so the matrix must be oriented accordingly. For example, gene expression data should be arranged as genes (rows) by samples (columns).
Argument: y
A vector of group labels. Labels should be denoted as a numeric 1 or 2.
Argument: fdr
Desired False Discovery Rate to be controlled. Default is 0.1.
Argument: minlam
Minimum regularization parameter to test via cross-validation for sparse inverse covariance estimation. Default is 0.15. Note that small values of this parameter may result in numerical instabilities. It is recommended to keep this parameter at the default.
Argument: maxlam
Maximum regularization parameter to test via cross-validation for sparse inverse covariance estimation. Default is 0.25.
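A minimal sketch of preparing inputs in the form the arguments above describe. The objects `expr` and `group` are made-up illustrations and not part of the package. Only base R is used, and the final call is left commented out because it needs the Tsphere package installed.

```r
# Hypothetical raw inputs: a samples-by-genes matrix and character group labels.
set.seed(1)
expr <- matrix(rnorm(6 * 4), nrow = 6, ncol = 4,
               dimnames = list(paste0("sample", 1:6), paste0("gene", 1:4)))
group <- c("control", "control", "control", "treated", "treated", "treated")

# Inference is run on the rows, so transpose to genes (rows) by samples (columns).
dat <- t(expr)

# Recode the labels as numeric 1 or 2 (factor levels are sorted alphabetically).
y <- as.numeric(factor(group))

# Basic sanity checks before calling the function.
stopifnot(ncol(dat) == length(y), all(y %in% c(1, 2)))

# With Tsphere loaded, the call would follow the Usage line:
# res <- TransSphere(dat, y, fdr = 0.1, minlam = 0.15, maxlam = 0.25)
```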
----------Details----------
The Transposable Sphering Algorithm adjusts for correlations among the rows and columns of a data matrix before conducting large-scale inference. Currently, this method is only written for two-sample problems. The data matrix is row and column centered and two-sample T-statistics are computed for each row. The Transposable Sphering method is applied to the top 500 rows corresponding to the largest absolute T-statistics. The matrix is decomposed into a signal matrix, corresponding to the two classes of interest, and a noise matrix. This noise matrix is sphered so that both the rows and columns are approximately independent. Specifically, sparse inverse covariances of the rows and columns are estimated via Transposable Regularized Covariance Models and used to whiten the noise matrix. Cross-validation is used to estimate the regularization parameters controlling the amount of sparsity. The estimated signal matrix and sphered noise matrix are then added to form the sphered data matrix that is used to conduct large-scale inference. Test statistics are adjusted using central-matching, and the Benjamini-Hochberg step-up procedure is used to control the False Discovery Rate.
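The steps described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the helper names (`two_sample_t`, `inv_sqrt`) and the value `lam` are hypothetical, the t-statistic is assumed to use a pooled variance, and a simple ridge-regularized covariance stands in for the sparse inverse covariances that the package estimates via Transposable Regularized Covariance Models with cross-validation. Central matching is omitted.

```r
# Illustrative sketch only; not the Tsphere source code.
set.seed(1)
n <- 600; p <- 20                        # rows (features) x columns (samples)
y <- rep(1:2, each = p / 2)              # group labels coded 1 or 2
x <- matrix(rnorm(n * p), n, p)
x[1:10, y == 2] <- x[1:10, y == 2] + 2   # plant signal in the first 10 rows

# 1. Row and column centering.
xc <- x - rowMeans(x)
xc <- sweep(xc, 2, colMeans(xc))

# 2. Two-sample t-statistic for each row (pooled variance is an assumption).
two_sample_t <- function(m, y) {
  g1 <- m[, y == 1, drop = FALSE]; g2 <- m[, y == 2, drop = FALSE]
  n1 <- ncol(g1); n2 <- ncol(g2)
  v1 <- apply(g1, 1, var); v2 <- apply(g2, 1, var)
  sp <- sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
  (rowMeans(g1) - rowMeans(g2)) / (sp * sqrt(1 / n1 + 1 / n2))
}
tstat <- two_sample_t(xc, y)

# 3. Keep the (at most) 500 rows with the largest absolute t-statistics.
top <- order(abs(tstat), decreasing = TRUE)[seq_len(min(500, n))]
xt <- xc[top, , drop = FALSE]

# 4. Signal = per-row class means; noise = residual.
signal <- matrix(0, nrow(xt), p)
for (k in 1:2) signal[, y == k] <- rowMeans(xt[, y == k, drop = FALSE])
noise <- xt - signal

# 5. Whiten the noise on both sides. Stand-in: ridge-regularized covariances
#    (lam is a hypothetical value), NOT the package's sparse estimates.
inv_sqrt <- function(S) {
  e <- eigen(S, symmetric = TRUE)
  e$vectors %*% diag(1 / sqrt(e$values)) %*% t(e$vectors)
}
lam <- 0.2
Srow <- tcrossprod(noise) / p + lam * diag(nrow(noise))
Scol <- crossprod(noise) / nrow(noise) + lam * diag(p)
sphered <- signal + inv_sqrt(Srow) %*% noise %*% inv_sqrt(Scol)

# 6. Recompute t-statistics and unadjusted p-values on the sphered matrix.
t.sphered <- two_sample_t(sphered, y)
p.sphered <- 2 * pt(-abs(t.sphered), df = p - 2)
```

The ridge term keeps both covariance estimates invertible even though there are far more rows than columns, which is the role the regularization parameters play in the actual method.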
----------Value----------
Value: sig.rows
The indices of the statistically significant rows after controlling the False Discovery Rate at the value fdr.
Value: t.stats
Sphered two-sample T-statistics.
Value: p.vals
Sphered (unadjusted) p-values.
Value: x.sphered
The sphered data matrix. Note that only the top 500 rows are used in the algorithm, so this data matrix has at most 500 rows.
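To illustrate how unadjusted p-values relate to a set of significant rows under False Discovery Rate control, here is a small base R sketch of the Benjamini-Hochberg step-up rule. The p-values are made up, and whether the package derives `sig.rows` in exactly this way (in particular after central matching) is not confirmed here.

```r
# Hypothetical unadjusted p-values standing in for the p.vals component.
p.vals <- c(0.001, 0.002, 0.5, 0.9)
fdr <- 0.1

# Manual Benjamini-Hochberg step-up: find the largest k with p_(k) <= k * fdr / m
# and reject the k smallest p-values.
m <- length(p.vals)
ord <- order(p.vals)
passed <- which(p.vals[ord] <= seq_len(m) * fdr / m)
k <- if (length(passed) > 0) max(passed) else 0
sig.manual <- sort(ord[seq_len(k)])

# The same decision using base R's p.adjust().
sig.padj <- which(p.adjust(p.vals, method = "BH") <= fdr)
```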
----------Author(s)----------
Genevera I. Allen
----------References----------
Modeling the Effects of Row and Column Correlations", To Appear in Journal of the Royal Statistical Society, Series B (Theory & Methods), 2011.
models with an application to missing data imputation", Annals of Applied Statistics, 4:2, 764-790, 2010.
----------Examples----------
library(Tsphere)

# Batch-effect simulation: n rows (features) by p columns (samples)
n = 250
p = 50
# Group labels: 25 samples in each class
y = c(rep(1,25),rep(2,25))
# True class means: only the first 50 rows differ between the classes
mu1true = c(rep(.5,25),rep(-.5,25),rep(0,n-50))
mu2true = c(rep(-.5,25),rep(.5,25),rep(0,n-50))
Smat = cbind(matrix(mu1true,n,p/2),matrix(mu2true,n,p/2))
# Batch effects: five batches of 10 samples with shifted means, plus noise
mus = c(-.5,-.25,0,.25,.5)
Bmatsig = matrix(1,n,1) %*% t(rep(mus,each=10))
Bmat = Bmatsig + matrix(rnorm(n*p)*.75,n,p)
# Row-correlated noise with covariance Sig
xxt = matrix(rnorm(2*n^2),n,2*n)
Sig = xxt %*% t(xxt)/(2*n); eSig = eigen(Sig);
xx = matrix(rnorm(n*p),n,p)
x.b = Smat + eSig$vectors %*% diag(sqrt(eSig$values)) %*%
  eSig$vectors %*% xx + Bmat
# Transposable Sphering Algorithm
ans = TransSphere(x.b,y,fdr=.1,.15,.25)
# Significant rows
ans$sig.rows
# True positive rate (rows 1 to 50 carry signal)
sum(ans$sig.rows<=50)/50
# False positive rate (rows 51 to 250 are null)
sum(ans$sig.rows>50)/200