R package IPPD: help documentation for the fitModelParameters() function

Posted on 2012-2-25 22:24:32
fitModelParameters(IPPD)
fitModelParameters() belongs to the R package: IPPD

Peak parameter estimation

Translator: 生物统计家园网, robot LoveR

----------Description----------

In the template-based approach of this package, each template/peak pattern is composed of several single basic peaks. The shape of such a basic peak may be modeled either as Gaussian or as Exponentially Modified Gaussian (EMG). The second model assumes that the shape of each peak equals the shape of the function one obtains when convolving the probability density function of a Gaussian distribution with the probability density function of the exponential distribution. This is a more complex as well as more flexible model, since it allows one to account for skewness. Both peak models depend on a set of parameters which are usually unknown a priori. Moreover, these parameters tend to vary over the spectrum.

The documented method provides the following functionality: given a raw spectrum and a linear model that describes how the set of parameters varies as a function of m/z, we detect well-resolved peaks within the spectrum. For each detected peak, we determine the parameters of our model in such a way that the resulting peak shape matches that of the detected peak as well as possible in a least-squares sense. In this manner, we obtain a set of estimated parameters for different m/z positions. These estimates are used as the response, and the m/z positions as the explanatory variable, in a regression model that is linear in its coefficients (not necessarily linear in m/z). To be resistant to outliers (e.g. those caused by overlapping peaks), we use least absolute deviation regression to infer the model parameters of these linear models. The result can directly be used for the argument model.parameters in
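The robust regression step described above can be illustrated with a small base-R sketch. It is not IPPD's implementation: the source does not say how the least absolute deviation (LAD) fit is computed, so iteratively reweighted least squares is used here as a stand-in. The helper name lad_fit and all numbers are made up for illustration.

```r
# Made-up per-peak estimates of sigma at several m/z positions.
mz <- seq(1000, 2000, by = 100)
sigma_hat <- 0.05 + 1e-4 * mz          # assumed true relationship
sigma_hat[9] <- sigma_hat[9] + 1       # one outlier, e.g. from overlapping peaks

# Design matrix for sigma(mz) = beta_0 + beta_1 * mz.
X <- cbind(intercept = 1, mz = mz)

# Hypothetical helper: approximate LAD regression by iteratively
# reweighted least squares (weights 1 / |residual|, floored at eps).
lad_fit <- function(X, y, iterations = 50, eps = 1e-8) {
  beta <- lm.fit(X, y)$coefficients    # start from ordinary least squares
  for (i in seq_len(iterations)) {
    r <- as.vector(y - X %*% beta)
    w <- 1 / pmax(abs(r), eps)
    beta <- lm.wfit(X, y, w)$coefficients
  }
  beta
}

ols <- lm.fit(X, sigma_hat)$coefficients
lad <- lad_fit(X, sigma_hat)

# The least-squares slope is pulled by the outlier; the LAD slope is not.
print(rbind(ols = ols, lad = lad))
```

With a single outlier among points that otherwise lie on a line, the LAD fit recovers that line, whereas ordinary least squares is shifted towards the outlier.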


----------Usage----------


fitModelParameters(mz, intensities, model = c("Gaussian", "EMG"),
                   fitting = c("most_intense", "model"),
                   formula.alpha = formula(~1),
                   formula.sigma = formula(~1),
                   formula.mu = formula(~1),
                   control = list(window = 6, threshold = NULL))



----------Arguments----------

Argument: mz
A numeric vector of m/z (mass/charge) values (in Thomson).


Argument: intensities
A numeric vector of intensities corresponding to mz.


Argument: model
Basic model for the shape of a single peak. Must be "Gaussian" or "EMG" (exponentially modified Gaussian). See details below.


Argument: fitting
A character string specifying the mode of peak extraction and fitting. If fitting = "most_intense", then the most intense peak is detected and the parameter(s) is (are) fitted using only this peak. The resulting functions of the form parameter(mz) are then all constant functions. If fitting = "model", as many peaks as possible satisfying the criteria of control are used to estimate linear models of the form parameter(mz) = beta_0 + beta_1 g_1(mz) + ... + beta_p g_p(mz), where the g's are fixed functions. They are specified through one-sided formulae in the Wilkinson-Rogers notation for linear models, as in the function lm. The model formulae have to satisfy the following criteria:

The formula is one-sided, i.e. no term appears on the left-hand side of ~.

The right-hand side consists only of functions in mz, and mz is the only variable that may be used. Product terms involving * are not admitted.

Important: Note that, for example, ~ 1 + mz + sqrt(mz) is a valid formula in the sense that no error will occur, but it does not correspond to the linear model parameter(mz) = beta_0 + beta_1 mz + beta_2 sqrt(mz). The correct model formula instead reads ~ 1 + mz + I(sqrt(mz)), i.e. each function has to be bracketed by I().
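To make the form parameter(mz) = beta_0 + beta_1 g_1(mz) + ... + beta_p g_p(mz) concrete, the following sketch expands a one-sided formula into its basis functions with base R's model.matrix. How IPPD processes the formula internally is not shown in this page, so this only illustrates the notation; the coefficients are made up.

```r
# A one-sided formula in mz, with the transformed term wrapped in I().
f <- ~ 1 + mz + I(sqrt(mz))

# A one-sided formula has length 2 (the ~ operator and the right-hand side),
# and mz is the only variable it uses.
print(length(f))
print(all.vars(f))

# Expand the formula into the columns 1, mz and sqrt(mz) at a few m/z values.
mz <- c(1000, 1500, 2000)
X <- model.matrix(f, data = data.frame(mz = mz))
print(X)

# With made-up coefficients beta, the modeled parameter is X %*% beta,
# i.e. beta_0 + beta_1 * mz + beta_2 * sqrt(mz).
beta <- c(0.02, 1e-5, 1e-3)
parameter_of_mz <- as.vector(X %*% beta)
print(parameter_of_mz)
```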


Argument: formula.alpha
A one-sided formula describing the dependence of the EMG parameter alpha on m/z; see fitting. The default assumes that the parameter is independent of m/z, in which case one obtains a function returning the median of the estimates obtained for the different detected peaks. formula.alpha, formula.sigma and formula.mu are needed if and only if fitting = "model".


Argument: formula.sigma
Parameter used for both peak models. For further information, see formula.alpha.


Argument: formula.mu
See formula.alpha. Used only if model = "EMG".


Argument: control
A list controlling peak detection. The parameter window refers to the minimal resolution of a peak. According to window, a sequence of intensities at adjacent m/z positions is considered a peak if and only if there are at least window m/z positions with increasing intensity followed by a second sequence of window m/z positions with decreasing intensity. If threshold is specified in addition, only peaks whose maximum intensity is equal to or greater than threshold are considered. Note: Usually, threshold is specified, since otherwise the maximum intensity of the complete spectrum minus some epsilon is taken as threshold.
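The window and threshold criteria can be sketched in base R as follows. This is one possible reading of the rule, not IPPD's code: the helper find_peaks is hypothetical, and it assumes that the apex must be preceded by window strictly increasing steps and followed by window strictly decreasing steps.

```r
# Hypothetical helper (not part of IPPD): indices of peak apexes that satisfy
# the window criterion and, if given, the threshold criterion.
find_peaks <- function(intensities, window = 6, threshold = NULL) {
  n <- length(intensities)
  apex <- integer(0)
  if (n < 2 * window + 1) return(apex)
  d <- diff(intensities)               # d[i] = intensities[i + 1] - intensities[i]
  for (i in (window + 1):(n - window)) {
    rising  <- all(d[(i - window):(i - 1)] > 0)   # window increasing steps up to i
    falling <- all(d[i:(i + window - 1)] < 0)     # window decreasing steps after i
    if (rising && falling) apex <- c(apex, i)
  }
  if (!is.null(threshold)) {
    apex <- apex[intensities[apex] >= threshold]
  }
  apex
}

# A single synthetic, well-resolved peak centred at m/z 1001 (made-up data).
mz <- seq(1000, 1002, by = 0.05)
intensities <- 100 * exp(-(mz - 1001)^2 / 0.1)

peaks <- find_peaks(intensities, window = 6)
print(mz[peaks])

# A threshold above the maximum intensity removes the peak.
print(find_peaks(intensities, window = 6, threshold = 200))
```

A larger window demands more sampled points on each flank of a peak, which is why poorly resolved peaks are excluded from the fit.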


----------Details----------

Let the variable x represent m/z. Then model = "Gaussian" assumes that a single peak can be described as

gaussfun(x;sigma,mu) = exp(-(x - mu)^2/sigma)

The parameter mu is not considered a model parameter: in the computation of the resulting basis function matrix, mu is always set to a known m/z position where the leading peak of a peak pattern might be present.

Setting model = "EMG" assumes that a single peak can be described as

EMG(x;alpha,sigma,mu) = exp(sigma^2/(2 * alpha^2) + (mu - x)/alpha) (1 - Phi(sigma/alpha + (mu - x)/(sigma)))/alpha,

where Phi represents the cumulative distribution function of the standard Gaussian distribution. Alternatively, EMG(.;alpha,sigma,mu) can be expressed as

EMG(x;alpha,sigma,mu) = (phi ** gamma)(x),

where ** denotes convolution, phi is the density function of the Gaussian distribution with mean mu and standard deviation sigma, and gamma is the density function of an exponential distribution with expectation alpha.

The parameters of EMG can be interpreted as follows.




alpha: The lower alpha, the more the shape of the peak resembles that of a Gaussian. Conversely, large values of

sigma: Controls the width of the peak (together with

mu: A location parameter. Note that in general mu does not coincide with the mode of EMG. Therefore, if model = "EMG", all three parameters are estimated from

Moreover, the skewness of EMG is characterized by the ratio alpha/sigma.
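As a numerical check of the EMG definition, the sketch below codes the closed-form expression from the Details section in base R and compares it with the convolution of a Gaussian density (mean mu, standard deviation sigma) and an exponential density (expectation alpha). The function names and parameter values are chosen for illustration only and are not part of IPPD.

```r
# Closed-form EMG as written in the Details section.
emg <- function(x, alpha, sigma, mu) {
  exp(sigma^2 / (2 * alpha^2) + (mu - x) / alpha) *
    (1 - pnorm(sigma / alpha + (mu - x) / sigma)) / alpha
}

# The same quantity as a convolution, evaluated by numerical integration.
# The integration range covers +/- 8 sigma of the Gaussian factor, so this
# helper is meant for x > mu - 8 * sigma.
emg_by_convolution <- function(x, alpha, sigma, mu) {
  integrand <- function(t) {
    dnorm(x - t, mean = mu, sd = sigma) * dexp(t, rate = 1 / alpha)
  }
  lower <- max(0, x - mu - 8 * sigma)
  upper <- x - mu + 8 * sigma
  integrate(integrand, lower, upper)$value
}

alpha <- 0.5; sigma <- 0.2; mu <- 0    # made-up parameter values

x_check <- c(-0.2, 0.3, 1.0)
direct <- emg(x_check, alpha, sigma, mu)
convolved <- sapply(x_check, emg_by_convolution,
                    alpha = alpha, sigma = sigma, mu = mu)
print(cbind(x = x_check, direct = direct, convolved = convolved))

# As a convolution of two densities, EMG integrates to (approximately) 1.
total_area <- integrate(emg, mu - 8 * sigma, mu + 8 * sigma + 30 * alpha,
                        alpha = alpha, sigma = sigma, mu = mu)$value
print(total_area)

# The mode lies to the right of mu, so mu is not the peak position.
grid <- seq(mu - 1, mu + 3, by = 0.001)
mode_position <- grid[which.max(emg(grid, alpha, sigma, mu))]
print(mode_position)
```

The last check illustrates why mu has to be estimated together with alpha and sigma for the EMG model: the observed peak maximum does not give mu directly.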


----------Value----------

An object of class modelfit.


----------Warning----------

Parameter estimation by fitting detected peaks is possible only if single peaks are sufficiently well-resolved. A peak composed of, say, five (m/z, intensity) pairs is not sufficient for inferring three parameters.


----------Warning----------

The choice model = "EMG" in conjunction with fitting = "model" can be extremely slow (taking up to several minutes of computing time) if many peaks are detected and fitted. This is caused by a grid search over a grid of 10^6 different combinations of alpha, sigma and mu, performed prior to nonlinear least squares estimation in order to find suitable starting values.
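The two-stage scheme mentioned in this warning (a coarse grid search for starting values, then a nonlinear least-squares refinement) can be sketched on a single synthetic peak. This is an illustration under stated assumptions, not IPPD's procedure: the grid here has only 11^3 points with made-up ranges, the peak is noise-free with fixed amplitude, and optim with its default Nelder-Mead method stands in for the nonlinear least-squares step.

```r
# Closed-form EMG from the Details section.
emg <- function(x, alpha, sigma, mu) {
  exp(sigma^2 / (2 * alpha^2) + (mu - x) / alpha) *
    (1 - pnorm(sigma / alpha + (mu - x) / sigma)) / alpha
}

# Synthetic, noise-free peak with made-up true parameters.
truth <- c(alpha = 0.32, sigma = 0.11, mu = 10.03)
x <- seq(9.5, 12, by = 0.02)
y <- emg(x, truth[["alpha"]], truth[["sigma"]], truth[["mu"]])

# Sum of squared errors for a parameter vector p = (alpha, sigma, mu).
sse <- function(p) sum((y - emg(x, p[1], p[2], p[3]))^2)

# Stage 1: coarse grid search for starting values (11^3 = 1331 combinations).
grid <- expand.grid(alpha = seq(0.1, 0.6, length.out = 11),
                    sigma = seq(0.05, 0.3, length.out = 11),
                    mu    = seq(9.8, 10.3, length.out = 11))
grid_sse <- apply(grid, 1, sse)
start <- unlist(grid[which.min(grid_sse), ])

# Stage 2: refine the best grid point by minimizing the same criterion.
fit <- optim(start, sse, control = list(maxit = 2000))

print(rbind(truth = truth, start = start, fitted = fit$par))
```

The cost of stage 1 grows with the number of grid points times the number of detected peaks, which is consistent with the long run times the warning describes for a 10^6-point grid.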


----------See Also----------

getPeaklist, modelfit


----------Examples----------


### load the package and the example data
library(IPPD)
data(toyspectrum)

### estimate parameter sigma of a Gaussian model,
### assumed to be independent of m/z

simplegauss <- fitModelParameters(toyspectrum[, 1],
             toyspectrum[, 2],
             model = "Gaussian",
             fitting = "model",
             formula.sigma = formula(~1),
             control = list(window = 6, threshold = 1))

show(simplegauss)
visualize(simplegauss, type = "peak", xlab = "m/z", ylab = "intensity",
          main = "Gaussian fit")

### fit the model sigma(m/z) = beta_0 + beta_1 m/z + beta_2 m/z^2

gaussquadratic <- fitModelParameters(toyspectrum[, 1],
             toyspectrum[, 2],
             model = "Gaussian",
             fitting = "model",
             formula.sigma = formula(~mz + I(mz^2)),
             control = list(window = 6, threshold = 1))

show(gaussquadratic)
visualize(gaussquadratic, type = "model", modelfit = TRUE)

### estimate parameters for EMG-shaped peaks

EMGlinear <- fitModelParameters(toyspectrum[, 1],
             toyspectrum[, 2],
             model = "EMG",
             fitting = "model",
             formula.alpha = formula(~mz),
             formula.sigma = formula(~mz),
             formula.mu = formula(~1),
             control = list(window = 6, threshold = 1))

show(EMGlinear)

visualize(EMGlinear, type = "peak", xlab = "m/z", ylab = "intensities",
          main = "EMG fit")

visualize(EMGlinear, type = "model", parameters = c("alpha", "sigma"), modelfit = TRUE)

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

