R package trio: help documentation for the trio.sim() function

Posted on 2012-10-1 12:07:43
trio.sim(trio)
R package containing trio.sim(): trio

                                        Simulate Case-Parent Trios

                                         Translator: 生物统计家园网 robot LoveR

----------Description----------

This function generates case-parent trios when the disease risk of children is specified by (possibly higher-order) SNP-SNP interactions. The SNP minor allele frequencies and/or haplotypes are specified by the user, as are the parameters in the logistic model.


----------Usage----------


trio.sim(freq, interaction="1R and 2D", prev=1e-3, OR=1, n=100, rep=1,
   step.save=NULL, step.load=NULL, verbose=FALSE)



----------Arguments----------

Argument: freq
A data frame specifying haplotype blocks and frequencies. For an example, see the data frame simuBkMap contained in this package. If provided, the following argument blocks will be ignored. The object must have three columns in the following order: block identifiers (key), haplotypes (hap), and haplotype frequencies (freq). The block identifiers must be unique for each block. For each block, the haplotypes must be encoded as a string of the integers 1 and 2, where 1 refers to the major allele and 2 refers to the minor allele. The respective haplotype frequencies will be normalized to sum to one.
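
To make the three-column layout concrete, here is a minimal sketch of a table in that shape, written in base R. The block names, haplotypes, and frequencies are invented for illustration and are not taken from simuBkMap; the rescaling step only mimics the normalization described above and is not the package's own code.

```r
# Hypothetical two-block haplotype table; all values are made up.
freq_sketch <- data.frame(
  key  = c("blk1", "blk1", "blk1", "blk2", "blk2"),  # block identifiers
  hap  = c("11", "12", "22", "1", "2"),              # 1 = major, 2 = minor allele
  freq = c(6, 3, 1, 0.8, 0.2),                       # unnormalized frequencies
  stringsAsFactors = FALSE
)

# Every haplotype string may contain only the characters 1 and 2.
stopifnot(all(grepl("^[12]+$", freq_sketch$hap)))

# Within a block, all haplotype strings should have the same length.
n_lengths <- tapply(nchar(freq_sketch$hap), freq_sketch$key,
                    function(x) length(unique(x)))
stopifnot(all(n_lengths == 1))

# Rescale the frequencies so that they sum to one within each block.
freq_sketch$freq <- ave(freq_sketch$freq, freq_sketch$key,
                        FUN = function(x) x / sum(x))
print(freq_sketch)
```

Checking the table before a long simulation run is cheap, which is the only reason the two stopifnot() calls are included here.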


Argument: interaction
A string that specifies the risk-altering genotype interaction as a Boolean term, such as "7D or 19R", or "(not 10D) or 45D". Each locus can appear at most once in the string, and the Boolean term not can appear at most once before each locus, and must be enclosed in parentheses, e.g., "(not 3D)". Therefore, strings such as "not (not 3D)" and "not 3D or 5R" are prohibited. Parentheses are also used to unambiguously define the Boolean expression as a binary tree, i.e., every parent node has exactly two children. Thus, a long string such as "1R or 3D or 5R" must be written as "(1R or 3D) or 5R" or as "1R or (3D or 5R)", even though the parentheses are technically redundant. There is also a limit on the size of the interactions; see Details below.
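
Two of the rules above (each locus at most once, and at most six loci, as stated in Details) can be checked before calling the function. The helper below is a hypothetical base R sketch, not the parser used by trio.sim, and it does not check the parenthesization rules.

```r
# Illustrative pre-check only; check_interaction() is a made-up helper name.
check_interaction <- function(interaction, max_loci = 6) {
  # Extract the locus numbers that precede a D or R suffix, e.g. "7D" -> "7".
  loci <- regmatches(interaction,
                     gregexpr("[0-9]+(?=[DR])", interaction, perl = TRUE))[[1]]
  length(loci) > 0 &&            # at least one locus was found
    anyDuplicated(loci) == 0 &&  # no locus is used twice
    length(loci) <= max_loci     # size limit taken from the Details section
}

check_interaction("7D or 19R")         # distinct loci
check_interaction("(1R or 3D) or 5R")  # distinct loci, explicit binary tree
check_interaction("1R and 1D")         # locus 1 appears twice
```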


Argument: prev
The prevalence of the disease in the simulated population among non-carriers (the "un-exposed" group).


Argument: OR
The odds ratio of disease in the simulated population, comparing carriers to non-carriers.


Argument: n
The number of case-parent trios simulated. The default is 100.


Argument: rep
The number of data set replicates generated. The default is 1.


Argument: step.save
The name of the binary file (without ".RData" extension) in which the object specifying the simulation mating tables and probabilities will be saved. The default value is NULL. In that case, the object will not be saved for re-use in later runs. See Details.


Argument: step.load
The name of an existing binary file (without ".RData" extension) in which the object specifying the simulation mating tables and probabilities has been saved (see above). The default value is NULL. In that case, a new object will be generated.


Argument: verbose
A logical value indicating whether or not to print information about memory and time usage.


----------Details----------

The function trio.sim simulates case-parent trio data when the disease risk of children is specified by (possibly higher-order) SNP-SNP interactions. The mating tables and the respective sampling probabilities depend on the haplotype frequencies (or SNP minor allele frequencies when the SNP does not belong to a block). This information is specified in the freq argument of the function. The probability of disease is assumed to be described by the logistic term logit(p) = a + b I[Interaction], where a = logit(prev) and b = log(OR), with prev and OR specified by the user. Note that at this point only data for two risk groups (carriers versus non-carriers) can be simulated. Since the computational demands for generating the mating tables depend on the number of loci involved in the interactions and the lengths of the LD blocks that contain these disease loci, the interaction term can consist of at most six loci, with no more than one of those loci per block, and haplotype (block) lengths of at most 5 loci.
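
The logistic term can be evaluated directly in base R to see what a given choice of prev and OR implies. This is a worked illustration of the formula only, using the values from the example call; it does not reproduce anything trio.sim does internally.

```r
prev <- 0.001  # prevalence among non-carriers
OR   <- 2      # odds ratio, carriers versus non-carriers

a <- qlogis(prev)  # a = logit(prev)
b <- log(OR)       # b = log(OR)

# Disease probability for non-carriers (I = 0) and carriers (I = 1).
p <- plogis(a + b * c(non_carrier = 0, carrier = 1))
print(p)

# The odds ratio implied by the two probabilities recovers OR.
odds <- p / (1 - p)
implied_or <- unname(odds["carrier"] / odds["non_carrier"])
print(implied_or)
```

For a rare disease the carrier probability is close to prev times OR, but not equal to it, because OR acts on the odds scale.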

Generating the mating tables and the respective sampling probabilities necessary to simulate case-parent trios can be very time-consuming for interaction models involving three or more SNPs. In simulation studies, many replicates of similar data are usually required, and generating these sampling probabilities in each instance would be a large and avoidable computational burden (CPU and memory). The sampling probabilities depend foremost on the interaction term and the underlying haplotype frequencies, and as long as these remain constant in the simulation study, the mating table information and the sampling probabilities can be "recycled". This is done by storing the relevant information (denoted as "step-stone") as a binary R file in the working directory (using the argument step.save), and loading the binary file again in future simulations (using the argument step.load), speeding up the simulation process dramatically. It is even possible to change the parameters prev and OR (corresponding to a and b in the logistic model) in these additional simulations, as the sampling probabilities can be adjusted accordingly.
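
The save-and-reload workflow described above might look like the following sketch. It assumes the trio package is installed and that data(trio.data) provides simuBkMap, as the package example suggests; the file name trio_step is hypothetical, and passing freq again on the second call is an assumption made because the usage line gives freq no default.

```r
step_file <- "trio_step"  # hypothetical file name, given without ".RData"

if (requireNamespace("trio", quietly = TRUE)) {
  library(trio)
  data(trio.data)  # assumed to provide simuBkMap

  # First run: build the mating tables and store them as a step-stone file.
  sim1 <- trio.sim(freq = simuBkMap, interaction = "1R and 5R",
                   prev = 0.001, OR = 2, n = 20, rep = 1,
                   step.save = step_file)

  # Later run: reuse the stored tables. The interaction term and haplotype
  # frequencies stay the same; only OR is changed here.
  sim2 <- trio.sim(freq = simuBkMap, interaction = "1R and 5R",
                   prev = 0.001, OR = 3, n = 20, rep = 1,
                   step.load = step_file)
}
```

The saving is only worthwhile when the interaction term and the frequencies are held fixed across runs, since those are what the stored tables depend on.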


----------Value----------

A list of matrices, containing the simulated data sets, in genotype format (indicating the number of variant alleles), including family and subject identifiers.


----------Author(s)----------


Qing Li, mail2qing@yahoo.com



----------References----------

and Ruczinski, I. (2010). Detection of SNP-SNP Interactions in Trios of Parents with Schizophrenic Children. Genetic Epidemiology, 34, 396-406.

----------See Also----------

trio.prepare


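----------Examples----------


library(trio)
data(trio.data)  # example data, including simuBkMap
# Simulate one data set of 20 case-parent trios
sim <- trio.sim(freq=simuBkMap, interaction="1R and 5R", prev=.001, OR=2, n=20, rep=1)
sim[[1]][1:6, 1:12]  # first rows and columns of the simulated data set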


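When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).


Notes:
Note 1: This document was machine-translated by LoveR, the robot of 生物统计家园网, for the convenience of learners. It is intended only as a personal reference for learning R; 生物统计家园 reserves the copyright.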