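With the rapid development of gene sequencing technology, genomic data are growing quickly, and analyzing these data statistically in a correct and timely way has become a research bottleneck. The "2010 Summer Training Course in Statistical Genetics" will invite some of the instructors from the "Summer Institute in Statistical Genetics" (http://www.biostat.washington.edu/suminst/sisg/general) of the Department of Biostatistics at the University of Washington to Beijing to teach the related courses and to give hands-on guidance on analyzing the relevant data.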



Kent Holsinger (Professor of Ecology and Evolutionary Biology, University of Connecticut)

Bruce S. Weir (Professor and Chair of Biostatistics, University of Washington, Director, Summer Institute in Statistical Genetics)




Module 1: Population Genetic Data Analysis

Instructors: K. Holsinger and B. Weir

Estimates and sample variances of allele frequencies; Hardy-Weinberg and linkage disequilibrium; characterization of population structure with F-statistics; relationship estimation. The module uses public domain software, including GDA and Hickory.

Background reading: Weir, B.S. (1996). Genetic Data Analysis II. Sinauer Associates.

Holsinger, K., Weir, B. (2009). Genetics in geographically structured populations: defining, estimating, and interpreting Fst. Nature Reviews Genetics 10:639-50.
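
As an illustration of two of the topics listed for Module 1 (allele frequency estimates with their sample variances, and Hardy-Weinberg testing), here is a minimal Python sketch for a single biallelic locus. It is not taken from the course materials, GDA, or Hickory; the function name `allele_freq_and_hwe` and the example counts are hypothetical, and the variance formula assumes Hardy-Weinberg proportions.

```python
import math


def allele_freq_and_hwe(n_AA, n_Aa, n_aa):
    """Estimate the frequency of allele A and test Hardy-Weinberg proportions.

    Hypothetical helper for illustration; takes genotype counts at one
    biallelic locus and returns (p, var_p, chi2, p_value).
    """
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("no genotypes observed")
    # Allele-counting estimate: each AA carries two copies of A, each Aa one.
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or p == 1.0:
        raise ValueError("locus is monomorphic; the test is undefined")
    # Binomial sampling variance of p, valid when genotypes are in HWE.
    var_p = p * q / (2 * n)
    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 degree of freedom: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return p, var_p, chi2, p_value


if __name__ == "__main__":
    # Made-up counts with a deficit of heterozygotes.
    print(allele_freq_and_hwe(30, 40, 30))
```

The chi-square goodness-of-fit test is only one option; for small samples or rare alleles an exact test is usually preferred.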



Gregory C. Gibson (Professor, School of Biology, Georgia Institute of Technology)

John Storey (Associate Professor, Lewis-Sigler Institute and Department of Molecular Biology, Princeton University)


Module 2: Gene Expression Profiling

Instructors: G. Gibson and J. Storey

This course covers all aspects of the statistical analysis of gene expression profiling; the methods are also relevant to the analysis of proteomic and metabolomic data. Theory will be integrated with case studies demonstrating the principles of quality control, normalization, analysis of variance and hypothesis testing, time series, surrogate variable analysis, and optimal discovery procedures. Discussion will include microarray and next-generation sequencing applications, and relevant statistical software will be demonstrated.



J. Bruce Walsh (Professor of Ecology and Evolutionary Biology, University of Arizona)

Dahlia M. Nielsen (Research Assistant Professor of Genetics, North Carolina State University)



Module 3: Quantitative Genetics

Instructors: B. Walsh and D. Nielsen

Quantitative genetics is the analysis of complex characters where both genetic and environmental factors contribute to trait variation. Since this includes most traits of interest, such as disease susceptibility, crop yield, and all microarray data, a working knowledge of quantitative genetics is critical in diverse fields ranging from plant and animal breeding, human genetics, and genomics to ecology and evolutionary biology. The course will cover the basics of quantitative genetics, including Fisher’s variance decomposition, covariance between relatives, heritability, inbreeding and crossbreeding, and response to selection. It will also introduce advanced topics such as mixed models, BLUP, QTL mapping, correlated characters, and the multivariate response to selection.

Background reading: Lynch, M., Walsh, B. (1998). Genetics and Analysis of Quantitative Traits. Sinauer.
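
Two of the basic quantities named in the Module 3 description, heritability and response to selection, can be sketched in a few lines of Python. This is a textbook-style illustration rather than course material: it assumes narrow-sense heritability h² = V_A / V_P and the single-trait breeder's equation R = h²S, and the function names and numbers are hypothetical.

```python
def narrow_sense_heritability(additive_var, phenotypic_var):
    """Return h^2 = V_A / V_P, the additive share of phenotypic variance."""
    if phenotypic_var <= 0:
        raise ValueError("phenotypic variance must be positive")
    if not 0 <= additive_var <= phenotypic_var:
        raise ValueError("additive variance must lie between 0 and V_P")
    return additive_var / phenotypic_var


def response_to_selection(h2, selection_differential):
    """Breeder's equation: expected change in the trait mean, R = h^2 * S."""
    return h2 * selection_differential


if __name__ == "__main__":
    # Made-up variance components and selection differential.
    h2 = narrow_sense_heritability(additive_var=2.0, phenotypic_var=5.0)
    print(h2, response_to_selection(h2, selection_differential=10.0))
```

The multivariate response to selection mentioned among the advanced topics generalizes this scalar relation to several correlated traits; that case is not shown here.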



Alison Motsinger-Reif (Assistant Professor of Statistics, North Carolina State University)

Dahlia M. Nielsen (Research Assistant Professor of Genetics, North Carolina State University)


Module 4: Human Association Mapping

Instructors: A. Motsinger-Reif and D. Nielsen

This module is an introduction to association mapping, focusing on human populations. Topics include: theory of linkage disequilibrium and mapping; basics of gene mapping study design; population and family-based association techniques for discrete and continuous traits; methods for detecting and accounting for population structure; and multiple testing issues. New data-mining methods for detecting complex genetic models are also discussed.
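
The multiple testing issue mentioned above arises because an association scan tests many markers at once. The module description does not say which corrections are taught, so the Python sketch below shows two widely used ones as an assumption: the Bonferroni correction and the Benjamini-Hochberg step-up procedure. The function names and p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a test only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject the k_max tests with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Made-up p-values from ten marker tests.
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(sum(bonferroni(pvals)), sum(benjamini_hochberg(pvals)))
```

Bonferroni controls the family-wise error rate and is the more conservative of the two; Benjamini-Hochberg controls the expected proportion of false discoveries and typically rejects more tests.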



Bruce S. Weir (Professor and Chair of Biostatistics, University of Washington, Director, Summer Institute in Statistical Genetics)

Michel Georges (Professor of Molecular Genetics, Université de Liège)


Module 5: Plant and Animal Association Mapping

Instructors: B. Weir and M. Georges

This module is an introduction to association mapping, focusing on plant and animal populations. Topics include theory of linkage disequilibrium and mapping, population and family-based association techniques for discrete and continuous traits, methods for detecting and accounting for population structure, issues in polyploid organisms, multiple testing issues, and genotyping strategies. Instruction uses real data and hands-on experience with publicly available software packages.



Rebecca W. Doerge (Professor of Statistics, Purdue University)

Zhao-Bang Zeng (Reynolds Distinguished Professor of Statistics and Genetics, North Carolina State University)


Module 6: QTL Mapping

Instructors: R. Doerge and Z-B. Zeng

This module will systematically introduce statistical methods for mapping quantitative trait loci (QTL) in experimental cross populations. Topics include experimental designs, linkage map construction, single-marker analyses, interval mapping, composite interval mapping, and multiple interval mapping. Significance thresholds for genome scans and model selection will also be discussed. Computer lab exercises use the public domain software Windows QTL Cartographer.



Thomas Lumley (Professor of Biostatistics, University of Washington)

Kenneth M. Rice (Associate Professor of Biostatistics, University of Washington)



Module 7: Genetic Data Analysis with R

Instructors: T. Lumley and K. Rice

This module covers object-oriented programming, SQL database use, some of the Bioconductor data infrastructure, and calling C code from R. The module is aimed at people who have either substantial R experience or programming experience in other languages.

Background reading: Gentleman, R. (2008). R Programming for Bioinformatics. Taylor and Francis.


