R package CGEN: help documentation for the snp.scan.logistic() function

loveR · posted 2012-2-25 14:29:26

snp.scan.logistic(CGEN)
snp.scan.logistic() belongs to the R package CGEN

Logistic regression analysis for an array of SNPs

Translator: LoveR, the robot of the Biostatistic Home site (生物统计家园网)

----------Description----------

Performs a logistic regression analysis of case-control data with three alternative analysis options:
(i) Unconstrained maximum-likelihood: This method is equivalent to prospective logistic regression analysis and corresponds to maximum-likelihood analysis of case-control data that allows the joint distribution of the covariates in the model to be completely unrestricted (non-parametric).
(ii) Constrained maximum-likelihood: This method performs maximum-likelihood analysis of case-control data under the assumption of gene-environment (and/or gene-gene) independence and Hardy-Weinberg equilibrium for the underlying population. The analysis allows the assumptions to be valid conditional on a stratification variable.
(iii) Empirical-Bayes: This method uses an empirical-Bayes type "shrinkage estimation" technique to trade off bias and variance between the constrained and unconstrained maximum-likelihood estimators.
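Since the unconstrained option is described as equivalent to prospective logistic regression, that baseline can be sketched with base R's glm(). This is only an illustration on simulated data: it does not call CGEN, the variable names (snp, env, y) are made up for the example, and the constrained and empirical-Bayes estimators are not reproduced here.

```r
# Simulated data; every name below is hypothetical.
set.seed(1)
n   <- 500
snp <- rbinom(n, size = 2, prob = 0.3)   # genotype as minor-allele count (0, 1, 2)
env <- rnorm(n)                          # a continuous exposure
eta <- -0.5 + 0.4 * snp + 0.3 * env + 0.2 * snp * env
y   <- rbinom(n, size = 1, prob = plogis(eta))

# Ordinary prospective logistic regression with a SNP main effect,
# an exposure main effect, and their interaction.
fit <- glm(y ~ snp * env, family = binomial())

est <- coef(fit)   # estimated log-odds ratios
vc  <- vcov(fit)   # covariance matrix of the estimates
print(round(est, 3))
```

The estimates and covariance matrix extracted here play the same role as the "estimated parameters" and "covariance matrices" that the help page lists in the returned value.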

----------Usage----------

snp.scan.logistic(snp.list, pheno.list, op=NULL)

----------Arguments----------

Argument: snp.list
See snp.list. No default.

Argument: pheno.list
See pheno.list. No default.

Argument: op
See details for this list of options. The default is NULL.

----------Details----------

To use this function, the data must be stored in files as defined in snp.list and pheno.list. See the examples on how to create these lists. The genotype data are read in from the file(s) snp.list$file, and the variables for the main effects and interactions are read in from the file pheno.list$file. The subjects to be included in the model are defined in pheno.list. For an included subject with id sub.id, there must be the same id in the genotype data file(s). The genotype data file(s) can contain more subject ids than pheno.list$file, and the ids do not have to be in any particular order. Once the data are read in, all missing values are removed and the function snp.logistic is called for each SNP in the genotype data file(s). By default, output files are not created and only the analysis from the last SNP is returned from this function; to save the results for all the SNPs, the user must specify op$out.file or op$out.dir.
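The id matching and missing-value removal described above can be pictured with a small base R sketch. The two data frames and their contents are hypothetical stand-ins for the phenotype and genotype files, and this is not how CGEN implements the step internally.

```r
# Hypothetical phenotype table: the subjects to include in the model.
pheno <- data.frame(
  id         = c("s3", "s1", "s4"),
  n.children = c(2, NA, 1),
  stringsAsFactors = FALSE
)

# Hypothetical genotype table: extra subjects, in a different order.
geno <- data.frame(
  id  = c("s1", "s2", "s3", "s4", "s5"),
  snp = c(0, 1, 2, NA, 1),
  stringsAsFactors = FALSE
)

# Every included subject must have a matching id in the genotype data.
stopifnot(all(pheno$id %in% geno$id))

# Align genotypes to the phenotype rows by id, whatever the file order.
pheno$snp <- geno$snp[match(pheno$id, geno$id)]

# Drop subjects with any missing value before fitting.
analysis <- pheno[complete.cases(pheno), ]
print(analysis)
```

Only subject s3 survives in this toy data, because s1 has a missing covariate and s4 has a missing genotype.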

Options list op: Below are the names for the options list op. Any name that is not specified takes its default value.

genetic.model 0-3: The genetic model for the SNP. 0=trend, 1=dominant, 2=recessive, 3=general.

tests List of character vectors that will be used in Wald tests. For example, tests=list(c("x1", "x2"), c("x1", "x4", "x9")) will compute a 2 df Wald test involving the variables x1 and x2, and a 3 df Wald test for the variables x1, x4, and x9. The variable name for the main effect of each SNP is "SNP_", and the names of the variables that interact with each SNP are of the form "SNP_x1", "SNP_gender", etc. In the output, these tests will be labeled "test1", "test2", etc. The default is NULL.

tests.1df Character vector of variable names for which to compute 1 degree of freedom Wald tests. The default is NULL.

effects List for joint/stratified effects. The default is NULL. Names in the list must be:

var Variable name to compute the effects with the SNP variable. This variable must be a main effect. No default.

type 1, 2 or c(1, 2), 1 = joint, 2 = stratified. The default is 1.

var.levels (Only for continuous var). Numeric vector of the levels to be used in the calculation. The default is 0.

var.base (Only for continuous var). Baseline level. The default is 0.

snp.levels A vector containing any of the values 0, 1, 2 to use as the levels of each SNP. The default is 1.

method Character vector containing any of the following: "UML", "CML", "EB". The default is c("UML", "CML", "EB").

out.file NULL or file name to save summary information for each SNP. The output will at least contain the columns "SNP" and "MAF". MAF is the minor allele frequency from the controls. Additional columns in this file are based on the values of tests and tests.1df. The default is NULL.

out.dir NULL or the output directory to store the output lists for each SNP. A separate file will be created for each SNP in the SNP data set, so this option should only be used for analyzing a small number of SNPs. The file names will be out_<SNP>.rda. The load() function must be used to read these files into R. The object in each file is called "ret". The default is NULL.

reltol Stopping tolerance. The default is 1e-6.

maxiter Maximum number of iterations. The default is 100.

optimizer One of "BFGS", "CG", "L-BFGS-B", "Nelder-Mead", "SANN". The default is "BFGS".
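To make the genetic.model codes from the options list concrete, here is a sketch of the conventional encodings for a genotype stored as a minor-allele count. The helper encode_snp() is hypothetical and not part of CGEN, and the exact design columns that CGEN builds internally may differ from this illustration.

```r
# Conventional encodings of a genotype given as a minor-allele count (0, 1, 2).
# encode_snp() is a hypothetical helper, not a CGEN function.
encode_snp <- function(g, genetic.model = 0) {
  switch(as.character(genetic.model),
    "0" = cbind(SNP = g),                       # trend: allele count
    "1" = cbind(SNP = as.numeric(g >= 1)),      # dominant: at least one copy
    "2" = cbind(SNP = as.numeric(g == 2)),      # recessive: two copies
    "3" = cbind(SNP1 = as.numeric(g == 1),      # general: one indicator per
                SNP2 = as.numeric(g == 2)),     # non-reference genotype
    stop("genetic.model must be 0, 1, 2 or 3")
  )
}

g <- c(0, 1, 2, 1)
print(encode_snp(g, 1))   # dominant coding
print(encode_snp(g, 3))   # general coding, two columns
```

Under the general model the SNP contributes two parameters instead of one, which is worth keeping in mind when counting degrees of freedom for the Wald tests.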

----------Value----------

A list from the LAST analysis performed. This list will contain the estimated parameters, covariance matrices, SNP name, and possibly the results of any Wald tests.
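A multi-parameter Wald test of the kind requested through the tests option can be computed from a vector of estimates and its covariance matrix. The sketch below shows the standard textbook statistic; wald_test() is a hypothetical helper, the numbers are invented, and the layout of the list actually returned by CGEN is not assumed here.

```r
# wald_test() is a hypothetical helper, not part of CGEN.
# It tests H0: all coefficients named in `vars` are zero.
wald_test <- function(est, vc, vars) {
  b    <- est[vars]
  V    <- vc[vars, vars, drop = FALSE]
  stat <- as.numeric(t(b) %*% solve(V) %*% b)   # b' V^{-1} b
  df   <- length(vars)
  list(test = stat, df = df,
       pvalue = pchisq(stat, df = df, lower.tail = FALSE))
}

# Invented estimates and a diagonal covariance matrix, for illustration only.
est <- c(x1 = 0.5, x2 = -0.2, x4 = 0.1)
vc  <- diag(c(0.04, 0.01, 0.25))
dimnames(vc) <- list(names(est), names(est))

res <- wald_test(est, vc, c("x1", "x2"))   # a 2 df test, as in the tests example
print(res)
```

With a diagonal covariance matrix the statistic reduces to the sum of squared z-scores, here 0.5^2/0.04 + 0.2^2/0.01 = 10.25 on 2 degrees of freedom.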

----------See Also----------

snp.logistic

----------Examples----------

# Define the list for the genotype data.
snp.list <- list()
snp.list$file <- system.file("sampleData", "SNPdata.rda", package="CGEN")
snp.list$file.type <- 1
snp.list$delimiter <- "|"
snp.list$in.miss <- "NA"

# Only process the first 5 SNPs in the file
snp.list$start.vec <- 1
snp.list$stop.vec <- 6

# Define pheno.list
pheno.list <- list()
pheno.list$file <- system.file("sampleData", "Xdata.txt", package="CGEN")
pheno.list$file.type <- 3
pheno.list$delimiter <- "\t"
pheno.list$id.var <- "id"

# Define the variables in the model
pheno.list$response.var <- "case.control"
pheno.list$strata.var <- "ethnic.group"
pheno.list$main.vars <- c("age.group", "oral.years", "n.children")
pheno.list$int.vars <- "n.children"

# Define the list of options
op <- list()

# Omnibus Wald test for the main effect of the SNP and the interaction variables, and
# a separate Wald test for "age.group" and "oral.years".
op$tests <- list(c("SNP_", "SNP_:n.children"), c("age.group", "oral.years"))

# Specifying out.dir will create a separate .rda file for each SNP
#op$out.dir <- "./"
# Specifying out.file will create one output file
#op$out.file <- "out.txt"

# For this model, all variables are continuous
# temp <- snp.scan.logistic(snp.list, pheno.list, op=op)
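If op$out.dir is used, the per-SNP files have to be read back with load(), and each file restores an object named "ret". The following sketch shows one way to collect such files into a single list. It writes a placeholder file first so that it runs on its own; the directory name, the SNP name rs0001, and the contents of the placeholder are all hypothetical and do not reflect the real structure of a CGEN result.

```r
# Use a scratch directory so the sketch does not touch real output.
out.dir <- file.path(tempdir(), "cgen_demo")
dir.create(out.dir, showWarnings = FALSE)

# Stand-in for one per-SNP file; a real scan would write these itself.
ret <- list(snp = "rs0001", note = "placeholder result")   # hypothetical content
save(ret, file = file.path(out.dir, "out_rs0001.rda"))
rm(ret)

# Read every out_<SNP>.rda file. Loading into a fresh environment keeps
# each "ret" from overwriting the previous one.
files <- list.files(out.dir, pattern = "^out_.*\\.rda$", full.names = TRUE)
results <- lapply(files, function(f) {
  env <- new.env()
  load(f, envir = env)
  get("ret", envir = env)
})
names(results) <- sub("^out_(.*)\\.rda$", "\\1", basename(files))
print(names(results))
```

Loading into a separate environment per file is a design choice for this sketch: calling load() directly in the workspace would leave only the last SNP's "ret" available.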

When reposting, please credit the source: Biostatistic Home (http://www.biostatistic.net).
