normalizeAffine.matrix(aroma.light)
normalizeAffine.matrix() belongs to the R package aroma.light
Weighted affine normalization between channels and arrays
----------Description----------
Weighted affine normalization between channels and arrays.
This method removes curvature in the M vs A plots that is due to an affine transformation of the data. In other words, if there are (small or large) biases in the different (red or green) channels, biases that can also be equal, you will get curvature in the M vs A plots, and this type of curvature is removed by this normalization method.
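To illustrate why an affine bias shows up as curvature, here is a minimal base-R simulation. The signal range and the channel offsets (20 and 200) are made-up values chosen for illustration; they are not taken from the package or its example data.

```r
# Simulated "true" signal shared by both channels (hypothetical values)
set.seed(1)
x <- 2^runif(5000, min=4, max=16)

# Affine distortion per channel, y = a + b*x, with assumed offsets a
# and equal scale factors b = 1
R <- 20 + 1.0 * x
G <- 200 + 1.0 * x

# Log-ratios (M) and average log-intensities (A)
M <- log2(R / G)
A <- (log2(R) + log2(G)) / 2

# With unequal offsets, M depends strongly on A at low intensities
# and approaches zero at high intensities, i.e. the M vs A plot is curved
lowM  <- median(M[A < quantile(A, 0.1)])
highM <- median(M[A > quantile(A, 0.9)])

# Subtracting the offsets (the affine part) flattens the relationship
M0 <- log2((R - 20) / (G - 200))
```

Plotting `M` against `A` (for instance with `plot(A, M)`) shows the bend at low intensities, whereas `M0` is flat.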
Moreover, if you normalize all slides at once, this method will also bring the signals onto the same scale such that the log-ratios for different slides are comparable. Thus, do not normalize the scale of the log-ratios between slides afterward.
It is recommended to normalize as many slides as possible in one run. As a result, log-ratios created between any channels and any slides will contain as little curvature as possible.
Furthermore, if one normalizes all slides (and channels) at once, the relative scale between any two channels on any two slides will be one. It is therefore possible to add the same constant to, or multiply by the same constant, all channels/arrays without introducing curvature. Thus, it is easy to rescale the data afterwards, as demonstrated in the example.
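The rescaling claim can be checked with a small base-R sketch. The matrix `Xn` below is simulated to mimic already normalized channels (same scale, no offset); the helper names `getM` and `getA` are hypothetical and not part of aroma.light.

```r
set.seed(2)
n <- 2000
x <- 2^runif(n, min=6, max=16)

# Hypothetical normalized channels: same scale, no offset, small noise
Xn <- cbind(R = x * 2^rnorm(n, sd=0.05), G = x * 2^rnorm(n, sd=0.05))

# Helpers (illustrative only) for log-ratios and average log-intensities
getM <- function(X) log2(X[, "R"] / X[, "G"])
getA <- function(X) (log2(X[, "R"]) + log2(X[, "G"])) / 2

# Multiplying all channels by the same constant leaves M unchanged
# and shifts A by log2 of that constant
Xs <- Xn * 2^1.4

# Adding the same constant to all channels should not introduce a
# trend in M as a function of A
Xp <- Xn + 100
slopeP <- unname(coef(lm(getM(Xp) ~ getA(Xp)))[2])
```

Note that the rescaling is applied to the (R,G) signals, not to (A,M).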
----------Usage----------
----------Arguments----------
Argument: X
An NxK matrix (K>=2) where the columns represent the channels to be normalized.
Argument: weights
If NULL, non-weighted normalization is done. If data-point weights are used, this should be a vector of length N of data-point weights used when estimating the normalization function.
Argument: typeOfWeights
A character string specifying the type of weights given in argument weights.
Argument: method
A character string specifying how the estimates are robustified. See *iwpca() for all accepted values.
Argument: constraint
Constraint making the bias parameters identifiable. See *fitIWPCA() for more details.
Argument: satSignal
Signals equal to or above this threshold will not be used in the fitting.
Argument: ...
Other arguments passed to *fitIWPCA() and in turn to *iwpca(), for example the weight argument of *iwpca(). See also below.
Argument: .fitOnly
If TRUE, the data will not be back-transformed.
----------Details----------
A line is fitted robustly through the (y_R,y_G) observations using an iterated re-weighted principal component analysis (IWPCA), which minimizes the residuals that are orthogonal to the fitted line. Each observation is down-weighted by the inverse of its absolute residual, i.e. the fit is done in L_1.
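The idea of the iterated re-weighted PCA can be sketched in base R as follows. This is an illustrative simplification, not the implementation used by *iwpca(): the function name `fitLineIWPCA`, its arguments, and the stopping rule are assumptions made for this example.

```r
# Hypothetical helper (not part of aroma.light): robust line fit via
# iterated re-weighted PCA with weights 1/|orthogonal residual|
fitLineIWPCA <- function(X, maxIter=50, eps=1e-4, tol=1e-8) {
  w <- rep(1, nrow(X))
  for (iter in seq_len(maxIter)) {
    w <- w / sum(w)
    center <- colSums(w * X)                   # weighted mean of the rows
    Xc <- sweep(X, 2, center)                  # centered observations
    S <- crossprod(sqrt(w) * Xc)               # weighted scatter matrix
    v <- eigen(S, symmetric=TRUE)$vectors[, 1] # first principal direction
    proj <- Xc %*% v                           # coordinates along the line
    res <- sqrt(rowSums((Xc - proj %*% t(v))^2)) # orthogonal distances
    wNew <- 1 / pmax(res, eps)                 # avoid division by zero
    wNew <- wNew / sum(wNew)
    converged <- max(abs(wNew - w)) < tol
    w <- wNew
    if (converged) break
  }
  list(center=center, direction=v, weights=w)
}

# Simulated (y_R, y_G) data on a line with slope 2, plus a few outliers
set.seed(3)
yR <- runif(1000, min=100, max=10000)
X <- cbind(R = yR + rnorm(1000, sd=20), G = 100 + 2 * yR + rnorm(1000, sd=20))
X[1:20, "G"] <- 50000
fit <- fitLineIWPCA(X)
slope <- fit$direction[2] / fit$direction[1]
```

Because the weights are the inverse absolute residuals, each weighted least-squares step approximates a minimization of the sum of absolute orthogonal residuals, which is why the outlying rows receive very little weight.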
----------Value----------
An NxK matrix of the normalized channels. The fitted model is returned as attribute modelFit.
----------Negative, non-positive, and saturated values----------
Affine normalization applies equally well to negative values. Thus, contrary to normalization methods applied to log-ratios, such as curve-fit normalization methods, affine normalization will not set these to NA.
Data points that are saturated in one or more channels are not used to estimate the normalization function, but they are normalized.
----------Missing values----------
The affine normalization function is estimated only from complete, non-saturated observations, i.e. observations that contain neither NA values nor saturated values as defined by satSignal.
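The selection rule described above can be written out in a few lines of base R. The toy matrix and the threshold value 2^16-1 (a 16-bit scanner maximum) are assumptions for illustration; this mimics the rule and is not the package's internal code.

```r
# Toy signal matrix with one missing and two saturated values (made-up data)
X <- cbind(R = c(120, NA, 65535, 800, 4000),
           G = c(150, 300, 2000, 70000, 3500))
satSignal <- 2^16 - 1   # assumed saturation threshold

# A row is usable for fitting only if no channel is NA or
# equal to or above satSignal
isBad <- is.na(X) | X >= satSignal
useForFit <- rowSums(isBad) == 0

# Rows that would contribute to the fit
which(useForFit)
```

Only the first and the last row pass the filter here; the other rows would still be normalized, just not used for estimation.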
----------Weighted normalization----------
Each data point/observation, that is, each row in X, which is a vector of length K, can be assigned a weight in [0,1] specifying how much it should affect the fitting of the affine normalization function. Weights are given by argument weights, which should be a numeric vector of length N. Regardless of weights, all data points are normalized based on the fitted normalization function.
----------Robustness----------
By default, the model fit of affine normalization is done in L_1 (method="L1"). This way, outliers affect the parameter estimates less than with ordinary least-squares methods.
For further robustness, downweight outliers such as saturated signals, if possible.
We do not use Tukey's biweight function for reasons similar to those outlined in *calibrateMultiscan().
----------Using known/previously estimated channel offsets----------
If the channel offsets can be assumed to be known, then it is possible to fit the affine model with no (zero) offset, which formally is a linear (proportional) model, by specifying argument center=FALSE. To do this, the channel offsets have to be subtracted from the signals manually before normalizing, e.g. Xa <- t(t(X)-a), where a is a vector of length ncol(X). Then normalize by Xn <- normalizeAffine(Xa, center=FALSE). You can assert that the model is fitted without offset by stopifnot(all(attr(Xn, "modelFit")$adiag == 0)).
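The offset subtraction step can be tried on a toy matrix in base R. The signal values and the offsets in `a` are made up for illustration; the final call to normalizeAffine() is left as a comment because it requires aroma.light to be installed.

```r
# Toy data: three spots, two channels (hypothetical values)
X <- cbind(R = c(120, 800, 4000), G = c(150, 900, 3500))
a <- c(R = 20, G = 50)   # assumed known channel offsets, one per column

# t(t(X) - a) subtracts a[k] from every value in column k
Xa <- t(t(X) - a)

# Equivalent formulation using sweep()
Xa2 <- sweep(X, MARGIN=2, STATS=a, FUN="-")

# With aroma.light loaded, the proportional model is then fitted as in the text:
# Xn <- normalizeAffine(Xa, center=FALSE)
```

The double transpose works because R recycles the vector `a` down the rows of `t(X)`, which correspond to the channels.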
----------Author(s)----------
Henrik Bengtsson (http://www.braju.com/R/)
----------See Also----------
*calibrateMultiscan().
----------Examples----------
# Load the package that provides the functions used below
library(aroma.light)
pathname <- system.file("data-ex", "PMT-RGData.dat", package="aroma.light")
rg <- read.table(pathname, header=TRUE, sep="\t")
nbrOfScans <- max(rg$slide)
rg <- as.list(rg)
for (field in c("R", "G"))
  rg[[field]] <- matrix(as.double(rg[[field]]), ncol=nbrOfScans)
rg$slide <- rg$spot <- NULL
rg <- as.matrix(as.data.frame(rg))
colnames(rg) <- rep(c("R", "G"), each=nbrOfScans)
layout(matrix(c(1,2,0,3,4,0,5,6,7), ncol=3, byrow=TRUE))
rgC <- rg
for (channel in c("R", "G")) {
  sidx <- which(colnames(rg) == channel)
  channelColor <- switch(channel, R="red", G="green")
  # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  # The raw data
  # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  plotMvsAPairs(rg[,sidx])
  title(main=paste("Observed", channel))
  box(col=channelColor)
  # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  # The calibrated data
  # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  rgC[,sidx] <- calibrateMultiscan(rg[,sidx], average=NULL)
  plotMvsAPairs(rgC[,sidx])
  title(main=paste("Calibrated", channel))
  box(col=channelColor)
} # for (channel ...)
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# The average calibrated data
#
# Note how the red signals are weaker than the green. The reason
# for this can be that the scale factor in the green channel is
# greater than in the red channel, but it can also be that there
# is a remaining relative difference in bias between the green
# and the red channel, a bias that precedes the scanning.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
rgCA <- rg
for (channel in c("R", "G")) {
  sidx <- which(colnames(rg) == channel)
  rgCA[,sidx] <- calibrateMultiscan(rg[,sidx])
}
rgCAavg <- matrix(NA, nrow=nrow(rgCA), ncol=2)
colnames(rgCAavg) <- c("R", "G")
for (channel in c("R", "G")) {
  sidx <- which(colnames(rg) == channel)
  rgCAavg[,channel] <- apply(rgCA[,sidx], MARGIN=1, FUN=median, na.rm=TRUE)
}
# Add some "fake" outliers
outliers <- 1:600
rgCAavg[outliers,"G"] <- 50000
plotMvsA(rgCAavg)
title(main="Average calibrated (AC)")
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Normalize data
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Downweight outliers when normalizing
weights <- rep(1, nrow(rgCAavg))
weights[outliers] <- 0.001
# Affine normalization of channels
rgCANa <- normalizeAffine(rgCAavg, weights=weights)
# It is always ok to rescale the affine normalized data if it's
# done on (R,G), not on (A,M)! However, this is only needed for
# esthetic purposes.
rgCANa <- rgCANa * 2^1.4
plotMvsA(rgCANa)
title(main="Normalized AC")
# Curve-fit (lowess) normalization
rgCANlw <- normalizeLowess(rgCAavg, weights=weights)
plotMvsA(rgCANlw, col="orange", add=TRUE)
# Curve-fit (loess) normalization
rgCANl <- normalizeLoess(rgCAavg, weights=weights)
plotMvsA(rgCANl, col="red", add=TRUE)
# Curve-fit (robust spline) normalization
rgCANrs <- normalizeRobustSpline(rgCAavg, weights=weights)
plotMvsA(rgCANrs, col="blue", add=TRUE)
legend(x=0, y=16, legend=c("affine", "lowess", "loess", "r. spline"), pch=19,
       col=c("black", "orange", "red", "blue"), ncol=2, x.intersp=0.3, bty="n")
plotMvsMPairs(cbind(rgCANa, rgCANlw), col="orange", xlab=expression(M[affine]))
title(main="Normalized AC")
plotMvsMPairs(cbind(rgCANa, rgCANl), col="red", add=TRUE)
plotMvsMPairs(cbind(rgCANa, rgCANrs), col="blue", add=TRUE)
abline(a=0, b=1, lty=2)
legend(x=-6, y=6, legend=c("lowess", "loess", "r. spline"), pch=19,
       col=c("orange", "red", "blue"), ncol=2, x.intersp=0.3, bty="n")