VarEff-package(VarEff)
VarEff-package() belongs to the R package VarEff
Overview: Estimation of effective sizes from present to ancestral time
Description
This package implements a model called VarEff, which estimates the evolution of effective population size using a coalescent approach.
The estimation is done on simulated demographies, modelled as successive steps of constant size, for which the posterior probabilities are derived using an approximation of the likelihood.
Details
Overview
This package depends on the package mcmc (C.J. Geyer, 2009, version 0.7-3), so you have to load it first: library(mcmc).
To use this package and explore the results, go through four steps:
1. Data preparation
2. Variable input
3. Output files
4. Explore the results
1. Data preparation
The data file describes the genotypes of a population at microsatellite markers.
The model assumes that alleles are defined by their lengths (number of microsatellite repeats).
The format of the file is close to that of the MSVAR infile (Beaumont 1999). To convert an MSVAR file into a VarEff file, go to the Web site: https://qgp.jouy.inra.fr
Infile:
Each marker is described by 2 lines.
The first line gives the number of alleles (allelic classes) at the locus.
The second line gives the number of alleles observed at each corresponding length at the locus.
Caution:
You have to specify all potential alleles between those of minimum and maximum lengths.
This means that if a locus has 2 types of alleles, with lengths 10 and 12 (number of repeat motifs), you also have to include the unobserved allele with 11 motifs.
So if alleles 10 and 12 have been observed 24 and 6 times respectively, you have to describe the locus by:
3
24 0 6
In this package, the test infile is called InputTest.
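The zero-filling rule described in the caution can be sketched in R. This is only an illustration: `locus_lines` is a hypothetical helper, not a function of the VarEff package, and it assumes allele lengths are given as whole numbers of repeats.

```r
# Build the two infile lines for one locus from observed allele lengths and counts.
# `locus_lines` is a hypothetical helper, not part of the VarEff package.
locus_lines <- function(lengths, counts) {
  stopifnot(length(lengths) == length(counts), length(lengths) > 0)
  # Every length between the shortest and the longest allele must appear,
  # so unobserved intermediate lengths receive a count of 0.
  all_lengths <- seq(min(lengths), max(lengths))
  all_counts <- integer(length(all_lengths))
  all_counts[match(lengths, all_lengths)] <- as.integer(counts)
  # Line 1: number of allelic classes; line 2: counts per class.
  c(as.character(length(all_lengths)), paste(all_counts, collapse = " "))
}

# Alleles of lengths 10 and 12 observed 24 and 6 times
locus_lines(c(10, 12), c(24, 6))
```

Calling it once per locus and concatenating the results with `writeLines()` would give a file in the layout described above.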
2. Variable input
Because this model follows a Bayesian approach, you have to give priors on the effective sizes (current and ancestral) and on the age of the population, specifying means and variances on the logarithmic scale.
Estimates concern reduced population sizes (on the Theta = 4 * N * u scale) and reduced times (product of the number of generations (T) and the mutation rate (u)).
The mutation rate is not estimated in this package. The package uses the mutation rate as a scale parameter to recover actual population sizes and actual times (generation numbers) from the results.
If you want estimates not in Theta (4*Ne*u) but in effective size (Ne), you first have to estimate the mutation rate (u) with an existing method (e.g. MSVAR (Beaumont 1999)).
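The scale conversions implied by Theta = 4 * Ne * u and by the reduced time T * u can be written as a short sketch. The helper names and the mutation rate used here are assumptions made for illustration only.

```r
# Hypothetical helpers (not part of VarEff) for the scale conversions above.
theta_to_ne <- function(theta, u) theta / (4 * u)          # Theta = 4 * Ne * u
reduced_time_to_generations <- function(tu, u) tu / u      # reduced time = T * u

u <- 0.001                             # assumed mutation rate per generation
theta_to_ne(4, u)                      # Theta = 4 corresponds to Ne = 1000
reduced_time_to_generations(0.5, u)    # reduced time 0.5 corresponds to 500 generations
```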
The other parameters used in the package to visualize the results concern how far back in time (in generations) you want to go and the times you want to examine.
Call VarEff, then answer the questions:
- parafile (Name that you give to the job and to the output files created by the model).
- infile (Name of the data file).
- NBLOC (Number of loci).
- JMAX (Number of times the effective size has changed, used to generate the step functions simulating the past demography. E.g. JMAX=2 if you think that the population had 3 different effective sizes in the past).
- MODEL (Choose one mutation model among S = Single Step Model, T = Two Phase Model, G = Geometric Model, and provide an additional coefficient (C) for the T and G models).
- MUTAT (Mutation rate, assumed to be the same for all loci).
- NBAR (Global prior mean of the effective size).
- VARP1 (Variance of the prior log-distribution of effective sizes. E.g. VARP1=3 allows for searches with 20- to 40-fold relative variations of effective size).
- RHOCORN (Coefficient of correlation between effective sizes in successive intervals).
- GBAR (Number of generations since the assumed origin of the population).
- VARP2 (Variance of the prior log-distribution of the time intervals during which the population is assumed to be of constant size).
- DMAXPLUS = DMAX+1 (DMAX is the maximal distance between alleles (number of microsatellite motifs) that is used in the estimation algorithm).
- Diagonale (A smoothing parameter that balances the observed covariance structure against a theoretical diagonal variance matrix and avoids numerical instability. Diagonale = 0.5 is a robust choice).
- NumberBatch (Number of batches (nbatch) for metrop in the mcmc package).
- LengthBatch (Length of batches (blen) for metrop in the mcmc package).
- SpaceBatch (Spacing of batches (nspac) for metrop in the mcmc package).
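One possible reading of the 20- to 40-fold figure quoted for VARP1 is sketched below. It assumes a log-normal prior and compares sizes one standard deviation below and above the prior mean on the log scale; this interpretation is an assumption and is not taken from the VarEff code.

```r
# Illustrative reading of VARP1 (an assumption, not taken from the VarEff code):
# if log(N) has prior variance VARP1, sizes one standard deviation below and
# above the prior mean differ by a factor exp(2 * sqrt(VARP1)).
fold_range <- function(varp1) exp(2 * sqrt(varp1))

fold_range(3)   # roughly 32-fold for VARP1 = 3
```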
You can also pass the parameters directly in the R console.
Example with the InputTest data:
VarEff(infile=system.file("data/InputTest.txt", package = "VarEff"), parafile = 'job', NBLOC=20, JMAX=3, MODEL = 'S', MUTAT=0.01, NBAR=1000, VARP1=3, RHOCORN=0, GBAR=5000, VARP2=3, DMAXPLUS=12, Diagonale=0.5, NumberBatch = 2, LengthBatch = 1, SpaceBatch = 1)
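The three batch settings in the call above are tiny and only suited to a quick test. As a rough guide, and assuming VarEff passes them unchanged to metrop as nbatch, blen and nspac, the chain runs nbatch * blen * nspac Metropolis iterations. The helper below is hypothetical and only does this arithmetic.

```r
# Total number of Metropolis iterations for given batch settings, following
# the mcmc::metrop convention (nbatch * blen * nspac).
# `total_iterations` is a hypothetical helper, not part of VarEff.
total_iterations <- function(NumberBatch, LengthBatch, SpaceBatch) {
  NumberBatch * LengthBatch * SpaceBatch
}

total_iterations(2, 1, 1)        # settings of the example call: 2 iterations
total_iterations(1000, 10, 10)   # an arbitrary larger setting: 1e+05 iterations
```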
3. Output files
At the end of the calculations, VarEff() returns the global Theta values, summaries of the criteria measuring the fit of the data to the model, and the distribution of posterior probabilities, which are added to the .Theta file.
The 4 lines of the .Theta file contain:
Line 1: global Theta0, Theta1 and Theta2 estimates.
Line 2: imbalance indices ln(Theta1/Theta0) and ln(Theta2/Theta0).
Line 3: expected range of Ne values, from the minimum and maximum global Theta estimates.
Line 4: means and standard deviations, over simulations, of the quadratic deviations of the data from the simulated states and of the natural logarithm of the prior probabilities of the simulated states.
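The quantities on lines 2 and 3 can be recomputed from the three global Theta estimates. The sketch below uses made-up Theta values and an assumed mutation rate; the helper names are hypothetical and the code does not read the .Theta file itself.

```r
# Imbalance indices (line 2) and Ne range (line 3) from three global Theta
# estimates. Both helpers are hypothetical, not part of VarEff.
imbalance <- function(theta0, theta1, theta2) {
  c(ln_theta1_theta0 = log(theta1 / theta0),
    ln_theta2_theta0 = log(theta2 / theta0))
}
ne_range <- function(thetas, u) range(thetas) / (4 * u)   # Theta = 4 * Ne * u

thetas <- c(theta0 = 2, theta1 = 2, theta2 = 4)   # made-up values
imbalance(thetas[["theta0"]], thetas[["theta1"]], thetas[["theta2"]])
ne_range(thetas, u = 0.001)                       # assumed mutation rate
```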
The main result of VarEff() is the .Batch file, which reports a list of demographic histories described by step functions. Each line includes:
Column 1: the number i of the simulated state (from 1 to NumberBatch).
Column 2: quadratic deviation of the data from the i-th simulated state.
Column 3: natural logarithm of the prior probability of the i-th state.
Columns 4 to JMAX+4: the JMAX+1 population sizes in the i-th state.
Columns JMAX+5 to 2*JMAX+4: times of size changes in the i-th state.
Column 2*JMAX+5: value of the C parameter of the mutation model.
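Since the column layout depends on JMAX, it can help to generate column labels programmatically. The sketch below follows the layout listed above; the helper and the label names are hypothetical.

```r
# Column labels for one line of a .Batch file, given JMAX.
# `batch_columns` and the label names are hypothetical, not part of VarEff.
batch_columns <- function(JMAX) {
  stopifnot(JMAX >= 1)
  c("state", "quad_dev", "log_prior",        # columns 1 to 3
    paste0("theta_", seq_len(JMAX + 1)),     # columns 4 to JMAX+4
    paste0("time_", seq_len(JMAX)),          # columns JMAX+5 to 2*JMAX+4
    "C")                                     # column 2*JMAX+5
}

cols <- batch_columns(3)
length(cols)   # 11, i.e. 2 * 3 + 5
```

If the .Batch file is whitespace-separated without a header line (an assumption), these labels could be passed as `col.names` to `read.table()`.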
Results are kept in the .Batch files in reduced scales: Thetas for population sizes, and products of generation numbers and mutation rate for times of size changes.
The additional C parameter is set to 0 for the Single Step Model, is positive for the Geometric Model and negative for the Two Phase Model.
NatSizeDist
To obtain the distributions of effective size at a number of generations in the past, from the time of sampling back to an ancestral time, use the function NatSizeDist().
This function provides 2 files with the results in the Ne scale:
- job.Nstat
- job.Ndist
Format of the Nstat or Lstat file:
Column 1: Time in generations (if MUTAT is not 0) or the corresponding reduced time.
Column 2: Arithmetic mean of Ne or Log(Ne).
Column 3: Harmonic mean of Ne (not provided for Log(Ne); set to 0 in the .Lstat file).
Column 4: Mode of Ne or Log(Ne).
Column 5: Median of Ne or Log(Ne).
Column 6: 5 percent quantile of Ne or Log(Ne).
Column 7: 95 percent quantile of Ne or Log(Ne).
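A minimal sketch for loading such a file into a data frame follows. It assumes whitespace-separated numeric columns with no header line, which is not stated in the documentation; the column labels and the helper are hypothetical, and the demonstration uses made-up numbers rather than real VarEff output.

```r
# Labels for the 7 columns described above (hypothetical names).
nstat_cols <- c("time", "mean", "harmonic_mean", "mode", "median", "q05", "q95")

# `read_stat` is a hypothetical helper, not part of VarEff.
read_stat <- function(path) {
  # Assumes whitespace-separated numeric columns and no header line.
  read.table(path, header = FALSE, col.names = nstat_cols)
}

# Self-contained demonstration on a made-up two-line file.
tmp <- tempfile(fileext = ".Nstat")
writeLines(c("0 950 900 880 920 400 1800",
             "100 1200 1100 1000 1150 500 2500"), tmp)
stat <- read_stat(tmp)
stat[, c("time", "median", "q05", "q95")]
```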
LogSizeDist
To obtain the distributions of the logarithm of effective size at a number of generations in the past, from the time of sampling back to an ancestral time, use the function LogSizeDist().
This function provides 2 files with the results in the Log(Ne) scale:
- job.Lstat
- job.Ldist
Format of the Ndist or Ldist file:
Posterior densities of Ne or Log(Ne) at past times (distribution fitted using the R function density).
File with (Nbinstants+1) lines and 514 (Ndist) or 1025 (Ldist) columns.
Lines: instants at which the distribution of N(T(i-1)) was calculated (1 <= i <= Nbinstants+1; 0 <= T(i-1) <= Tempsmax).
File Ndist:
Columns in line i:
Column 1: Value of T(i-1).
Column 2: Size of each of the intervals (= TMAX/511) on the abscissa (Ne scale).
Columns 3 to 514: Ordinates (densities of Ne at 512 points).
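One line of the Ndist file can be split into its three parts as sketched below. The helper is hypothetical, and the reconstruction of the abscissa assumes that the 512 grid points start at 0 and are spaced by the interval size given in column 2; the documentation does not state the starting point.

```r
# Split one line of a .Ndist file (514 values) into its parts.
# `parse_ndist_line` is a hypothetical helper, not part of VarEff.
parse_ndist_line <- function(values) {
  stopifnot(length(values) == 514)
  step <- values[2]
  list(time = values[1],
       ne = step * (0:511),        # 512 grid points, ASSUMED to start at 0
       density = values[3:514])    # densities of Ne at those points
}

# Made-up line: time 0, interval size 10, flat density
fake <- c(0, 10, rep(1 / 5110, 512))
p <- parse_ndist_line(fake)
range(p$ne)
```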
File Ldist:
Columns in line i:
Column 1: Value of T(i-1).
Columns 2 to 513: Abscissa (Log(Ne) values).
Columns 514 to 1025: Ordinates (densities of these Log(Ne) values).
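A line of the Ldist file carries both the abscissa and the ordinates, so a summary such as the mean of Log(Ne) can be approximated directly. The sketch below uses a trapezoidal rule on a made-up density; both helpers are hypothetical and are not part of VarEff.

```r
# Split one line of a .Ldist file (1025 values) into its parts.
parse_ldist_line <- function(values) {
  stopifnot(length(values) == 1025)
  list(time = values[1],
       log_ne = values[2:513],       # abscissa
       density = values[514:1025])   # ordinates
}

# Trapezoidal approximation of the mean of x under a tabulated density d.
density_mean <- function(x, d) {
  w <- diff(x)
  trapz <- function(y) sum(w * (y[-length(y)] + y[-1]) / 2)
  trapz(x * d) / trapz(d)
}

# Made-up line: a narrow bell-shaped density centred on 3
x <- seq(2, 4, length.out = 512)
fake <- c(0, x, dnorm(x, mean = 3, sd = 0.1))
p <- parse_ldist_line(fake)
density_mean(p$log_ne, p$density)   # close to 3
```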
Author(s)
Natacha Nikolic <documents_57@hotmail.com> and Claude Chevalet <claude.chevalet@toulouse.inra.fr>
Maintainer: Natacha Nikolic <documents_57@hotmail.com>
See Also
Summary: VarEff
Example: InputTest
HelpData: HelpData
Functions to build output files: NatSizeDist or LogSizeDist
Functions to visualize and plot the results: plotNdistrib and NTdist
Web site: https://qgp.jouy.inra.fr