R package twslm: help documentation for the twslm() function

Posted on 2012-10-1 13:16:26
twslm(twslm)
twslm() belongs to the R package: twslm

Normalization of cDNA microarray data using the two-way semi-linear model

Translator: LoveR (machine translation bot), 生物统计家园网

----------Description----------

Normalize cDNA microarray data using the two-way semi-linear model. Two estimation methods are available: a robust estimation method and the least squares method. B-splines are used to estimate the nonparametric curves in the model.


----------Usage----------


twslm(sld, blk, geneid, rt, intn, df=12, degree=3, norm.only=TRUE,
    block.norm=FALSE,robust=TRUE, robust.name="Tukey",scale.constant=2.5,
    weight.constant=4.685,ibeta=NULL,iscale=NULL,tol=1e-5)

non.robust.twslm(sld, geneid, rt, intn, df=12, degree=3, norm.only=TRUE,
               tol=1e-5)

robust.twslm(sld, geneid, rt, intn, df=12, degree=3, norm.only=TRUE,
           robust=TRUE,robust.name="Tukey", scale.constant=2.5,
           weight.constant=4.685,ibeta=NULL,iscale=NULL,tol=1e-5)

BlockByBlock(sld, blk, geneid, rt, intn, df=12, degree=3,norm.only=TRUE,
             robust=TRUE, robust.name="Tukey", scale.constant=2.5,
             weight.constant=4.685, ibeta=NULL,iscale=NULL,tol=1e-5)



----------Arguments----------

Argument: sld
a vector of array or slide numbers. This argument is required.


Argument: blk
a vector of block numbers. This argument is required only for blockwise normalization.


Argument: geneid
a vector of gene identifiers, which can be numeric IDs or gene names. This argument is required.


Argument: rt
a vector of log2 intensity ratios, i.e. log2(Cy5/Cy3). This argument is required.


Argument: intn
a vector of average log2 total intensities, i.e. 0.5*log2(Cy5*Cy3). This argument is required.
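
As a minimal sketch of how the rt and intn vectors relate to the raw channel intensities, the following computes both from made-up Cy5 and Cy3 values. The intensities and the variable names cy5 and cy3 are assumptions for illustration only; they are not part of the package.

```r
# Hypothetical background-corrected channel intensities for five spots
cy5 <- c(1200, 850, 4300, 600, 15000)   # red channel (made-up values)
cy3 <- c(1000, 900, 2100, 640, 15500)   # green channel (made-up values)

rt   <- log2(cy5 / cy3)         # log2 intensity ratio, as expected by 'rt'
intn <- 0.5 * log2(cy5 * cy3)   # average log2 total intensity, as expected by 'intn'

# intn is simply the mean of the two log2 channel intensities
mean_log <- (log2(cy5) + log2(cy3)) / 2
print(all.equal(intn, mean_log))
```

A spot with equal intensity in both channels has rt equal to 0, so the normalization curve describes how far the ratios drift from 0 as intn changes.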


Argument: df
the degrees of freedom for the B-spline smoother. The default is 12.


Argument: degree
the degree of the piecewise polynomials in the B-splines. The default is 3, i.e. cubic splines.
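
To give a feel for what df and degree control, here is a small sketch that builds a cubic B-spline basis with splines::bs() from base R. Whether twslm constructs its basis with bs() internally is an assumption; the simulated intensities are also made up.

```r
library(splines)  # ships with base R

set.seed(1)
x <- sort(runif(200, min = 6, max = 15))  # simulated average log2 intensities

# Cubic B-spline basis with 12 degrees of freedom (no intercept column)
basis <- bs(x, df = 12, degree = 3)

n_basis <- ncol(basis)                    # one column per basis function
n_knots <- length(attr(basis, "knots"))   # interior knots: df - degree
print(c(basis_functions = n_basis, interior_knots = n_knots))
```

A larger df gives a more flexible normalization curve at the cost of more parameters per slide.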


Argument: norm.only
a logical value indicating whether only normalization is carried out. The default is TRUE. If this option is FALSE, the variance of the estimated parameters of interest is calculated in addition to the normalization, which takes more computing time.


Argument: block.norm
a logical value indicating whether blockwise normalization is performed. The default is FALSE, which means that normalization is done slide by slide.


Argument: robust
a logical value indicating whether the robust procedure is incorporated in the normalization. The default is TRUE, which means that normalization uses a robust method in the two-way semilinear model. The least squares method is used if this argument is FALSE.


Argument: robust.name
the name of the robust procedure. The default is "Tukey", which means the location and scale parameters are estimated iteratively with Tukey's bisquare weight function. The other option is "Huber", which estimates the location and scale parameters iteratively with Huber's weight function. This option has an effect only if the robust argument is TRUE.


Argument: scale.constant
a constant chosen for scale estimation in the robust two-way semilinear model. The default is 2.5.


Argument: weight.constant
a constant chosen for robust location estimation. The default is 1.345 for Huber's weight function and 4.685 for Tukey's weight function.
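
The two weight functions named above can be sketched in their standard textbook forms as follows. The function names huber_weight and tukey_weight are hypothetical helpers written for this illustration, not functions exported by twslm, and the package's internal implementation may differ.

```r
# u is a standardized residual (residual divided by a scale estimate);
# k plays the role of weight.constant.
huber_weight <- function(u, k = 1.345) {
  # full weight inside [-k, k], then weights decay like k / |u|
  ifelse(abs(u) <= k, 1, k / abs(u))
}

tukey_weight <- function(u, k = 4.685) {
  # bisquare: smooth decay to exactly zero beyond |u| = k
  ifelse(abs(u) <= k, (1 - (u / k)^2)^2, 0)
}

u <- c(0, 1, 3, 6)
print(round(huber_weight(u), 3))
print(round(tukey_weight(u), 3))
```

The practical difference is that Huber's function only down-weights large residuals, while Tukey's bisquare removes them completely once they exceed the constant.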


Argument: ibeta
a vector of initial values for \beta. The default is NULL. The ordinary least squares estimator of \beta is a good choice. Supplying this value speeds up convergence.


Argument: iscale
an initial value for the scale parameter in the robust model. The default is NULL. Supplying this value speeds up convergence of the algorithm.


Argument: tol
the convergence criterion (tolerance) for the iterative estimation procedure. The default is 1e-5.
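
The roles of an initial value and a tolerance in an iterative robust fit can be illustrated with a much simpler problem: an iteratively reweighted estimate of a single location parameter. This is only a sketch under simplifying assumptions (the scale is held fixed at the MAD instead of being re-estimated, and robust_location is a hypothetical helper); it is not the algorithm implemented in twslm.

```r
tukey_weight <- function(u, k = 4.685) ifelse(abs(u) <= k, (1 - (u / k)^2)^2, 0)

# Iteratively reweighted mean: stop when the estimate moves by less than tol
robust_location <- function(y, init = median(y), scale = mad(y),
                            tol = 1e-5, max_iter = 100) {
  mu <- init
  for (iter in seq_len(max_iter)) {
    w <- tukey_weight((y - mu) / scale)   # weights from standardized residuals
    mu_new <- sum(w * y) / sum(w)         # weighted mean with current weights
    done <- abs(mu_new - mu) < tol        # convergence check against tol
    mu <- mu_new
    if (done) break
  }
  list(estimate = mu, iterations = iter)
}

set.seed(42)
y <- c(rnorm(50), 25, 30)   # 50 clean values plus two gross outliers
fit <- robust_location(y)
print(c(robust = fit$estimate, plain_mean = mean(y)))
```

A good starting value (the analogue of ibeta and iscale) mainly reduces the number of iterations needed before the change falls below tol.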


----------Details----------

Normalization is a basic step in the analysis of cDNA microarray data. A widely used normalization method for cDNA microarray data is the lowess normalization method proposed by Yang et al. (2001). This method requires that at least one of two biological assumptions holds: either (i) only a small fraction of the genes in the experiment are differentially expressed, or (ii) the up-regulated and down-regulated genes are distributed symmetrically. The proposed two-way semilinear model is a generalization of the semiparametric regression model. It requires neither of these assumptions for normalization of cDNA microarray data.

The proposed two-way semilinear model has the form
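
A sketch of the model as it is usually written, following Huang, Wang & Zhang (2005); the notation chosen here (y, x, f, \beta, \varepsilon, n, J) is an assumption and should be checked against that paper:

```latex
y_{ij} = f_i(x_{ij}) + \beta_j + \varepsilon_{ij},
\qquad i = 1, \dots, n, \quad j = 1, \dots, J
```

Here y_{ij} is the log2 intensity ratio (rt) of gene j on slide i, x_{ij} is the corresponding average log2 total intensity (intn), f_i is the intensity-dependent normalization curve for slide i (estimated with B-splines), \beta_j is the relative expression level of gene j, and \varepsilon_{ij} is the error term.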

The twslm package implements the two-way semilinear model for normalization of cDNA microarray data. Two robust estimation procedures are implemented in the current version of twslm: Huber's method (1981) and Tukey's method (1986). twslm can also calculate the variance of the estimated parameters of interest, \beta, under the assumption of constant variance for the error terms in the model. Inference for differentially expressed genes can be based on the estimated relative gene expression levels \hat\beta and their variance estimators.
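
To make the structure of the model concrete, the following sketch fits a least squares version on simulated data by alternating between the slide curves and the gene effects. It is an illustration under assumptions: the simulated data, all variable names, the use of lm() with splines::bs(), and the backfitting loop are choices made here and are not taken from the package source.

```r
library(splines)

set.seed(123)
n_slides <- 3; n_genes <- 200
slide <- rep(seq_len(n_slides), each = n_genes)
gene  <- rep(seq_len(n_genes), times = n_slides)
x     <- runif(n_slides * n_genes, 6, 14)       # average log2 intensity
beta_true <- rnorm(n_genes, sd = 0.5)
beta_true <- beta_true - mean(beta_true)        # centered for identifiability
y <- 0.3 * slide * sin(x / 2) + beta_true[gene] + rnorm(length(x), sd = 0.1)

beta <- rep(0, n_genes)
bfit <- numeric(length(y))
for (iter in 1:50) {
  # Step 1: given beta, fit each slide's curve on a cubic B-spline basis
  for (i in seq_len(n_slides)) {
    idx  <- slide == i
    resp <- y[idx] - beta[gene[idx]]
    xi   <- x[idx]
    bfit[idx] <- fitted(lm(resp ~ bs(xi, df = 6)))
  }
  # Step 2: given the curves, update each gene effect as a mean residual
  beta_new <- as.numeric(tapply(y - bfit, gene, mean))
  beta_new <- beta_new - mean(beta_new)
  converged <- max(abs(beta_new - beta)) < 1e-5
  beta <- beta_new
  if (converged) break
}

normalized <- y - bfit    # same idea as p$ratio - p$bfit in the Examples
print(cor(beta, beta_true))
```

A small df is used here because the simulated curves are smooth; the package default is 12. A robust variant would replace the plain means and least squares fits with weighted ones.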


----------Value----------

A list is returned with the following components:

name: a vector of the names of the unique genes.
beta: a vector of estimated relative gene expression levels, one for each gene.
ymean: a vector of per-gene means after normalization. It is the arithmetic mean if least squares is used in the model, and a weighted mean if a robust method is used.
bvar: a vector of variance estimates for \hat\beta.
fittedvalue: a vector of fitted values in the two-way semilinear model.
bfit: a vector of fitted values of the normalization curves.
slide: a vector of slide numbers taken from the input of the "twslm" function. Its order differs from that of the input vector "sld".
id: a vector of gene IDs taken from the input vector "geneid", in a different order.
ratio: a vector of log2 intensity ratios taken from the input vector "rt", in a different order.
intensity: a vector of average log2 total intensities taken from the input vector "intn", in a different order.
scale: a scale estimate in the two-way semilinear model.
rscale: a robust scale estimate if a robust method is used in the model. It is NULL for the two-way semilinear model fitted by ordinary least squares.
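
One possible way to use the beta and bvar components for inference is sketched below. The list fit is a hand-made stand-in for the object returned with norm.only=FALSE, its numbers are invented, and the normal approximation used for the p-values is an assumption of this sketch, not something the documentation specifies.

```r
# Hypothetical stand-in for a fitted object (made-up numbers)
fit <- list(name = c("g1", "g2", "g3"),
            beta = c(0.05, 1.20, -0.80),
            bvar = c(0.04, 0.09, 0.04))

z <- fit$beta / sqrt(fit$bvar)   # standardized relative expression level
p <- 2 * pnorm(-abs(z))          # two-sided p-value, normal approximation

result <- data.frame(gene = fit$name, beta = fit$beta, z = z, p.value = p)
print(result)
```

With many genes tested at once, the p-values would normally be adjusted for multiple testing, for example with p.adjust().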


----------Note----------

twslm is the main function and controls which normalization method is used. non.robust.twslm fits the two-way semilinear model by ordinary least squares, robust.twslm performs robust estimation of the two-way semilinear model, and BlockByBlock performs blockwise normalization.


----------Author(s)----------

Deli Wang <deli.wang@ccc.uab.edu>, Jian Huang <jian@stat.uiowa.edu>




----------References----------

Huang, J., Wang, D. & Zhang, C.H. (2005), A Two-way Semi-Linear Model for Normalization and Analysis of Microarray Data. Journal of the American Statistical Association, 100(471):814-829.
Wang, D., Huang, J., Xie, H., Manzella, L. & Soares, M. B. (2005), A robust two-way semi-linear model for normalization of cDNA microarray data. BMC Bioinformatics, 6:14.
Yang, Y. H., Dudoit, S., Luu, P. & Speed, T. P. (2001), Normalization for cDNA microarray. In Bittner, M.L., Chen, Y., Dorsel, A.N. and Dougherty, E.R. (eds), Microarrays: Optical Technologies and Informatics. SPIE, Society for Optical Engineering, San Jose, CA, 4266.
Huber, P.J. (1981), Robust Statistics, John Wiley & Sons.
Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J. & Stahel, W. A. (1986), Robust Statistics: The Approach Based on Influence Functions, John Wiley & Sons.

----------Examples----------

## Using part of a publicly available dataset from Terry Speed's group.
## Block one in the treatment group is chosen as an example.

data(terrycallow)
attach(terrycallow)

p<-twslm(sld=slide,geneid=id,rt=ratio,intn=intensity)

## get the normalized data
normalized.data <- p$ratio - p$bfit

## plot the normalization curves
par(mfrow=c(3,3),cex.main=0.8,cex.lab=0.7,cex.axis=0.7,mgp=c(1,0.2,0),
         mar=c(3,3,3,1),tcl=-0.3)

for(i in 1:length(unique(p$slide))){
  plot(p$intensity[p$slide==i],p$ratio[p$slide==i],xlab="1/2log2(RG)",ylab="log2(R/G)",
     main=paste("Slide",i,sep=" "))
  ii<-order(p$intensity[p$slide==i])
  lines(p$intensity[p$slide==i][ii],p$bfit[p$slide==i][ii],col="red")
}

detach("terrycallow")

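Please credit the source when reposting: 生物统计家园网 (http://www.biostatistic.net).


Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the translation bot of 生物统计家园网. It is intended only as a personal reference for learning R; 生物统计家园 reserves the copyright.
Note 2: Because the translation was produced automatically, inaccuracies are unavoidable. Carefully comparing the Chinese and English text while reading can help with learning R.
Note 3: If you find an inaccuracy, please reply to this post and we will revise it over time.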