R aroma.light package: help documentation for the normalizeCurveFit.matrix() function

loveR · posted 2012-2-25 12:05:43

normalizeCurveFit.matrix(aroma.light)
normalizeCurveFit.matrix() belongs to the R package aroma.light

                                    Weighted curve-fit normalization between a pair of channels

                                       Translator: LoveR, the robot of 生物统计家园网 (biostatistic.net)

----------Description----------

Weighted curve-fit normalization between a pair of channels.

This method estimates a smooth function of the dependency between the log-ratios and the log-intensity of the two channels and then corrects the log-ratios (only) in order to remove the dependency. This method is also known as intensity-dependent or lowess normalization.

The curve-fit methods are by nature limited to paired-channel data. There exists at least one method that tries to overcome this limitation, namely cyclic-lowess [1], which applies the paired curve-fit method iteratively over all pairs of channels/arrays. Cyclic-lowess is not implemented here.

We recommend that affine normalization [2] is used instead of curve-fit normalization.

----------Usage----------

----------Arguments----------

Argument: X
An Nx2 matrix where the columns represent the two channels to be normalized.

Argument: weights
If NULL, non-weighted normalization is done. Otherwise, this should be a vector of length N of data-point weights used when estimating the normalization function.

Argument: typeOfWeights
A character string specifying the type of weights given in argument weights.

Argument: method
A character string specifying which method to use when fitting the intensity-dependent function. Supported methods: "loess" (better than lowess), "lowess" (classic; supports only zero-one weights), "spline" (more robust than lowess at lower and upper intensities; supports only zero-one weights), "robustSpline" (better than spline).

Argument: bandwidth
A double value specifying the bandwidth of the estimator used.

Argument: satSignal
Signals equal to or above this threshold will not be used in the fitting.

Argument: ...
Not used.

----------Details----------

A smooth function c(A) is fitted through the data in (A,M), where M=log_2(y_2/y_1) and A=1/2*log_2(y_2*y_1). Data is normalized by M <- M - c(A).
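
The (A,M) transform and the correction M <- M - c(A) can be sketched in base R as follows. This is a minimal illustration on simulated data, not the package's internal code: the simulated intensities, the use of stats::lowess() with f = 0.3, and the variable names are all assumptions made for the example.

```r
# Simulated two-channel intensities (hypothetical data, not from the package).
set.seed(1)
n <- 500
y1 <- 2^runif(n, min = 6, max = 14)
# Channel 2 carries an intensity-dependent bias relative to channel 1.
y2 <- y1 * 2^(0.05 * (log2(y1) - 10)^2 + rnorm(n, sd = 0.1))

# Transform to (A, M).
M <- log2(y2 / y1)
A <- 0.5 * log2(y2 * y1)

# Fit a smooth curve c(A) with stats::lowess() and evaluate it at each A.
fit <- lowess(A, M, f = 0.3)
cA <- approx(fit$x, fit$y, xout = A, ties = mean)$y

# Correct the log-ratios only; A is left untouched.
Mn <- M - cA

# Back-transform to the two channels: y1 = 2^(A - M/2), y2 = 2^(A + M/2).
Xn <- cbind(y1 = 2^(A - Mn / 2), y2 = 2^(A + Mn / 2))
```

Because only M is changed, the average log-intensity A of each data point is the same before and after the correction.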

Loess is by far the slowest of the four methods, followed by lowess and then robust spline, which iteratively calls the spline method.

----------Value----------

An Nx2 matrix of the normalized two channels. The fitted model is returned as attribute modelFit.

----------Negative, non-positive, and saturated values----------

Non-positive values are set to not-a-number (NaN). Data points that are saturated in one or more channels are not used to estimate the normalization function, but they are normalized.

----------Missing values----------

The normalization function is estimated only from complete non-saturated observations, i.e. observations that contain neither NA values nor saturated values as defined by satSignal.
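
The selection rules above can be illustrated with a small base R sketch. The example matrix, the threshold value 2^16 - 1, and the helper variable names are assumptions for illustration only; they are not taken from the package source.

```r
# Hypothetical N x 2 matrix with a negative, a zero, a missing, and a
# saturated value.
X <- cbind(R = c(120, -5, 800, 65535, 300, NA),
           G = c(100, 40, 900, 2000, 0, 500))
satSignal <- 2^16 - 1  # assumed threshold for this sketch

# Non-positive signals become NaN.
X[!is.na(X) & X <= 0] <- NaN

# Rows usable for estimating the curve: no missing and no saturated values.
isComplete <- rowSums(is.na(X)) == 0
isSaturated <- rowSums(!is.na(X) & X >= satSignal) > 0
useForFit <- isComplete & !isSaturated

# Indices of the rows that would enter the fit.
which(useForFit)
```

Rows excluded by useForFit are left out of the fit only; a saturated row still has finite values and can be normalized with the fitted curve.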

----------Weighted normalization----------

Each data point, that is, each row in X, which is a vector of length 2, can be assigned a weight in [0,1] specifying how much it should affect the fitting of the normalization function. Weights are given by argument weights, which should be a numeric vector of length N. Regardless of weights, all data points are normalized based on the fitted normalization function.

Note that the lowess and the spline methods only support zero-one {0,1} weights. For such methods, all weights that are less than a half are set to zero.
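
A minimal base R sketch of zero-one weighting is shown below. The data are simulated, and two details are assumptions of this sketch rather than statements about the package: weights of one half or more are mapped to one, and points with weight zero are simply dropped from the lowess fit.

```r
# Simulated (A, M) data with artificial outliers (hypothetical).
set.seed(2)
n <- 200
A <- runif(n, 8, 14)
M <- 0.3 * (A - 11) + rnorm(n, sd = 0.1)
M[1:20] <- M[1:20] + 5          # artificial outliers
weights <- rep(1, n)
weights[1:20] <- 0.001          # down-weight the outliers

# Weights below one half become 0; the rest become 1 (assumption of this sketch).
w01 <- as.numeric(weights >= 0.5)
keep <- w01 == 1

# Fit on the kept points only, then evaluate the curve at every A.
fit <- lowess(A[keep], M[keep], f = 0.5)
cA <- approx(fit$x, fit$y, xout = A, rule = 2, ties = mean)$y

# All data points are normalized, including those with weight zero.
Mn <- M - cA
```

The outliers do not influence the fitted curve, but they are still corrected by it, so they remain visible as outliers after normalization.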

----------Details on loess----------

For loess, the arguments family="symmetric", degree=1, span=3/4, control=loess.control(trace.hat="approximate", iterations=5, surface="direct") are used.
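
The settings listed above can be passed directly to stats::loess(). The sketch below does so on simulated (A, M) data; the data, the unit weights, and the data frame d are assumptions for illustration, and the snippet is not the package's own fitting code.

```r
# Simulated (A, M) data with a smooth intensity-dependent trend (hypothetical).
set.seed(3)
n <- 300
d <- data.frame(A = runif(n, 7, 15))
d$M <- 0.4 * sin(d$A / 2) + rnorm(n, sd = 0.15)
d$w <- rep(1, n)  # data-point weights; all equal here

# Robust local-linear loess fit with the arguments given in the text.
fit <- loess(M ~ A, data = d, weights = w,
             family = "symmetric", degree = 1, span = 3/4,
             control = loess.control(trace.hat = "approximate",
                                     iterations = 5, surface = "direct"))

# Subtract the fitted curve from the log-ratios.
Mn <- d$M - predict(fit, newdata = d)
```

With surface="direct" the curve is evaluated exactly at each requested point instead of being interpolated from a grid, which is slower but avoids interpolation error.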

----------Author(s)----------

Henrik Bengtsson (http://www.braju.com/R/)

----------References----------

[1] Contrast Normalization of Oligonucleotide Arrays, Journal Computational Biology, 2003, 10, 95-102.
[2] Henrik Bengtsson and Ola Hössjer, Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method, BMC Bioinformatics, 2006, 7:100.

----------See Also----------

normalizeAffine().

----------Examples----------

pathname <- system.file("data-ex", "PMT-RGData.dat", package="aroma.light")
rg <- read.table(pathname, header=TRUE, sep="\t")
nbrOfScans <- max(rg$slide)

rg <- as.list(rg)
for (field in c("R", "G"))
  rg[[field]] <- matrix(as.double(rg[[field]]), ncol=nbrOfScans)
rg$slide <- rg$spot <- NULL
rg <- as.matrix(as.data.frame(rg))
colnames(rg) <- rep(c("R", "G"), each=nbrOfScans)

layout(matrix(c(1,2,0,3,4,0,5,6,7), ncol=3, byrow=TRUE))

rgC <- rg
for (channel in c("R", "G")) {
  sidx <- which(colnames(rg) == channel)
  channelColor <- switch(channel, R="red", G="green");

  # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  # The raw data
  # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  plotMvsAPairs(rg[,sidx])
  title(main=paste("Observed", channel))
  box(col=channelColor)

  # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  # The calibrated data
  # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  rgC[,sidx] <- calibrateMultiscan(rg[,sidx], average=NULL)

  plotMvsAPairs(rgC[,sidx])
  title(main=paste("Calibrated", channel))
  box(col=channelColor)
} # for (channel ...)

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# The average calibrated data
#
# Note how the red signals are weaker than the green. The reason
# for this can be that the scale factor in the green channel is
# greater than in the red channel, but it can also be that there
# is a remaining relative difference in bias between the green
# and the red channel, a bias that precedes the scanning.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
rgCA <- rg
for (channel in c("R", "G")) {
  sidx <- which(colnames(rg) == channel)
  rgCA[,sidx] <- calibrateMultiscan(rg[,sidx])
}

rgCAavg <- matrix(NA, nrow=nrow(rgCA), ncol=2)
colnames(rgCAavg) <- c("R", "G");
for (channel in c("R", "G")) {
  sidx <- which(colnames(rg) == channel)
  rgCAavg[,channel] <- apply(rgCA[,sidx], MARGIN=1, FUN=median, na.rm=TRUE);
}

# Add some "fake" outliers
outliers <- 1:600
rgCAavg[outliers,"G"] <- 50000;

plotMvsA(rgCAavg)
title(main="Average calibrated (AC)")

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Normalize data
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Down-weight outliers when normalizing
weights <- rep(1, nrow(rgCAavg));
weights[outliers] <- 0.001;

# Affine normalization of channels
rgCANa <- normalizeAffine(rgCAavg, weights=weights)
# It is always ok to rescale the affine normalized data if it's
# done on (R,G); not on (A,M)! However, this is only needed for
# esthetic purposes.
rgCANa <- rgCANa * 2^1.4
plotMvsA(rgCANa)
title(main="Normalized AC")

# Curve-fit (lowess) normalization
rgCANlw <- normalizeLowess(rgCAavg, weights=weights)
plotMvsA(rgCANlw, col="orange", add=TRUE)

# Curve-fit (loess) normalization
rgCANl <- normalizeLoess(rgCAavg, weights=weights)
plotMvsA(rgCANl, col="red", add=TRUE)

# Curve-fit (robust spline) normalization
rgCANrs <- normalizeRobustSpline(rgCAavg, weights=weights)
plotMvsA(rgCANrs, col="blue", add=TRUE)

legend(x=0,y=16, legend=c("affine", "lowess", "loess", "r. spline"), pch=19,
   col=c("black", "orange", "red", "blue"), ncol=2, x.intersp=0.3, bty="n")

plotMvsMPairs(cbind(rgCANa, rgCANlw), col="orange", xlab=expression(M[affine]))
title(main="Normalized AC")
plotMvsMPairs(cbind(rgCANa, rgCANl), col="red", add=TRUE)
plotMvsMPairs(cbind(rgCANa, rgCANrs), col="blue", add=TRUE)
abline(a=0, b=1, lty=2)
legend(x=-6,y=6, legend=c("lowess", "loess", "r. spline"), pch=19,
   col=c("orange", "red", "blue"), ncol=2, x.intersp=0.3, bty="n")

