References

Aitchison, John, and C. H. Ho. 1989. “The Multivariate Poisson-Log Normal Distribution.” Biometrika 76 (4): 643–53.

Aitchison, John. 1982. “The Statistical Analysis of Compositional Data.” Journal of the Royal Statistical Society. Series B (Methodological) 44 (2): 139–77.

Aitken, AC. 1935. “On Least Squares and Linear Combination of Observations.” Proceedings of the Royal Society of Edinburgh 55: 42–48.

Anderson, Stephen. 1981. “Shotgun DNA Sequencing Using Cloned DNase I-Generated Fragments.” Nucleic Acids Research 9 (13): 3015–27.

Ardlie, K. G., L. Kruglyak, and M. Seielstad. 2002. “Patterns of Linkage Disequilibrium in the Human Genome.” Nature Reviews Genetics 3 (4): 299–309.

Armitage, Peter. 1955. “Tests for Linear Trends in Proportions and Frequencies.” Biometrics 11 (3): 375–86.

Asimit, J. L., A. G. Day-Williams, A. P. Morris, and E. Zeggini. 2012. “ARIEL and AMELIA: Testing for an Accumulation of Rare Variants Using Next-Generation Sequencing Data.” Human Heredity 73 (2): 84–94.

Bach, Francis R. 2008. “Bolasso: Model Consistent Lasso Estimation Through the Bootstrap.” In Proceedings of the 25th International Conference on Machine Learning, 33–40. ACM.

Balding, David J, Martin Bishop, and Chris Cannings. 2008. Handbook of Statistical Genetics. John Wiley & Sons.

Barrett, Jeffrey C, and Lon R Cardon. 2006. “Evaluating Coverage of Genome-Wide Association Studies.” Nature Genetics 38 (6): 659.

Barrett, Jeffrey C, B Fry, J Maller, and Mark J Daly. 2004. “Haploview: Analysis and Visualization of LD and Haplotype Maps.” Bioinformatics 21 (2): 263–65.

Benjamini, Yoav, and Yosef Hochberg. 1995. “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.” Journal of the Royal Statistical Society: Series B 57 (1): 289–300.

Bien, Jacob, Jonathan Taylor, and Robert Tibshirani. 2013. “A Lasso for Hierarchical Interactions.” Annals of Statistics 41 (3): 1111.

Bonferroni, C. 1936. “Teoria Statistica Delle Classi E Calcolo Delle Probabilità.” Pubblicazioni Del R Istituto Superiore Di Scienze Economiche E Commerciali Di Firenze 8: 3–62.

Boor, Carl de. 1975. A Practical Guide to Splines. Springer. http://www.springer.com/gb/book/9780387953663.

Bousquet, O., and A. Elisseeff. 2002. “Stability and Generalization.” Journal of Machine Learning Research 2: 499–526.

Brown, T. A. 2007. Genomes 3. Garland Science. https://books.google.fr/books?id=Cjl98tqp6rsC.

Brzyski, Damian, Christine B. Peterson, Piotr Sobczyk, Emmanuel J. Candès, Malgorzata Bogdan, and Chiara Sabatti. 2017. “Controlling the Rate of GWAS False Discoveries.” Genetics 205 (January). https://doi.org/10.1534/genetics.116.193987.

Buja, Andreas, Trevor Hastie, and Robert Tibshirani. 1989. “Linear Smoothers and Additive Models.” The Annals of Statistics 17 (2): 453–510. http://www.jstor.org/stable/2241560.

Bush, William S, and Jason H Moore. 2012. “Genome-Wide Association Studies.” PLoS Computational Biology 8 (12): e1002822.

Caliński, T., and J. Harabasz. 1974. “A Dendrite Method for Cluster Analysis.” Communications in Statistics 3 (1): 1–27.

Chapman, Juliet M, Jason D Cooper, John A Todd, and David G Clayton. 2003. “Detecting Disease Associations Due to Linkage Disequilibrium Using Haplotype Tags: A Class of Tests and the Determinants of Statistical Power.” Human Heredity 56 (1-3): 18–31.

Charlesworth, Brian, MT Morgan, and D Charlesworth. 1993. “The Effect of Deleterious Mutations on Neutral Molecular Variation.” Genetics 134 (4): 1289–1303.

Chen, Wei, Clarence K. Zhang, Yongmei Cheng, Shaowu Zhang, and Hongyu Zhao. 2013. “A Comparison of Methods for Clustering 16S rRNA Sequences into OTUs.” PLoS ONE 8 (8): e70837.

Clarke, Geraldine M, Carl A Anderson, Fredrik H Pettersson, Lon R Cardon, Andrew P Morris, and Krina T Zondervan. 2011. “Basic Statistical Analysis in Genetic Case-Control Studies.” Nature Protocols 6 (2): 121.

Clayton, David G, Neil M Walker, Deborah J Smyth, Rebecca Pask, Jason D Cooper, Lisa M Maier, Luc J Smink, et al. 2005. “Population Structure, Differential Bias and Genomic Control in a Large-Scale, Case-Control Association Study.” Nature Genetics 37 (11): 1243.

Cochran, William G. 1954. “Some Methods for Strengthening the Common \(\chi^2\) Tests.” Biometrics 10 (4): 417–51.

Cox, David R. 1958. “The Regression Analysis of Binary Sequences.” Journal of the Royal Statistical Society. Series B (Methodological), 215–42.

Crick, F.H.C., and J.D. Watson. 1954. “The Complementary Structure of Deoxyribonucleic Acid.” Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences 223 (1152): 80–96.

Daly, Mark J, John D Rioux, Stephen F Schaffner, Thomas J Hudson, and Eric S Lander. 2001. “High-Resolution Haplotype Structure in the Human Genome.” Nature Genetics 29 (2): 229.

Dandine-Roulland, Claire, and Herve Perdry. 2015. “The Use of the Linear Mixed Model in Human Genetics.” Human Heredity 80 (4): 196–206.

Davey Smith, George, and Shah Ebrahim. 2003. “‘Mendelian Randomization’: Can Genetic Epidemiology Contribute to Understanding Environmental Determinants of Disease?” International Journal of Epidemiology 32 (1): 1–22.

Davies, Robert B. 1980. “Algorithm AS 155: The Distribution of a Linear Combination of \(\chi^2\) Random Variables.” Journal of the Royal Statistical Society. Series C (Applied Statistics) 29 (3): 323–33.

Dehman, A., C. Ambroise, and P. Neuvial. 2015. “Performance of a Blockwise Approach in Variable Selection Using Linkage Disequilibrium Information.” BMC Bioinformatics 16: 148.

Dehman, Alia. 2015a. “Spatial Clustering of Linkage Disequilibrium Blocks for Genome-Wide Association Studies.” PhD thesis, Université d’Evry Val d’Essonne; Université Paris-Saclay; Laboratoire de Mathématiques et Modélisation d’Evry.

———. 2015b. “Spatial Clustering of Linkage Disequilibrium Blocks for Genome-Wide Association Studies.” PhD thesis, Université d’Evry Val d’Essonne; Université Paris-Saclay; Laboratoire de Mathématiques et Modélisation d’Evry. https://tel.archives-ouvertes.fr/tel-01288568.

Devlin, Bernie, and Kathryn Roeder. 1999. “Genomic Control for Association Studies.” Biometrics 55 (4): 997–1004.

Devlin, B, and Neil Risch. 1995. “A Comparison of Linkage Disequilibrium Measures for Fine-Scale Mapping.” Genomics 29 (2): 311–22.

Diaz-Quijano, Fredi A. 2012. “A Simple Method for Estimating Relative Risk Using Logistic Regression.” BMC Medical Research Methodology 12 (1): 14.

Donoho, David L, and Yaakov Tsaig. 2008. “Fast Solution of \(\ell_1\)-Norm Minimization Problems When the Solution May Be Sparse.” IEEE Transactions on Information Theory 54 (11): 4789–4812.

Dudbridge, Frank, and Arief Gusnanto. 2008. “Estimation of Significance Thresholds for Genomewide Association Scans.” Genetic Epidemiology 32 (3): 227–34.

Efron, Bradley, Trevor Hastie, Iain Johnstone, Robert Tibshirani, and others. 2004. “Least Angle Regression.” The Annals of Statistics 32 (2): 407–99.

Eubank, Randall L. 1999. Nonparametric Regression and Spline Smoothing, Second Edition. CRC Press.

Excoffier, Laurent, and Montgomery Slatkin. 1995. “Maximum-Likelihood Estimation of Molecular Haplotype Frequencies in a Diploid Population.” Molecular Biology and Evolution 12 (5): 921–27.

Fergus, Paul, Casimiro Curbelo Montanez, Basma Abdulaimma, Paulo Lisboa, and Carl Chalmers. 2018. “Utilising Deep Learning and Genome Wide Association Studies for Epistatic-Driven Preterm Birth Classification in African-American Women.” arXiv Preprint arXiv:1801.02977.

Finucane, Hilary K, Brendan Bulik-Sullivan, Alexander Gusev, Gosia Trynka, Yakir Reshef, Po-Ru Loh, Verneri Anttila, et al. 2015. “Partitioning Heritability by Functional Annotation Using Genome-Wide Association Summary Statistics.” Nature Genetics 47 (11): 1228.

Fisher, Ronald A. 1919. “XV.—The Correlation Between Relatives on the Supposition of Mendelian Inheritance.” Earth and Environmental Science Transactions of the Royal Society of Edinburgh 52 (2): 399–433.

———. 1935. “Statistical Methods for Research Workers.” Edinburgh: Oliver and Boyd, 1934 and the Logic of Inductive Inference; Royal Statistical Society 98: S–39.

Fletcher, Roger. 1987. Practical Methods of Optimization. New York: John Wiley & Sons.

Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. 2001. The Elements of Statistical Learning. Vol. 1. Springer Series in Statistics. New York: Springer.

Frisse, L, RR Hudson, A Bartoszewicz, JD Wall, J Donfack, and A Di Rienzo. 2001. “Gene Conversion and Different Population Histories May Explain the Contrast Between Polymorphism and Linkage Disequilibrium Levels.” The American Journal of Human Genetics 69 (4): 831–43.

Gabriel, S. B., S. F. Schaffner, H. Nguyen, J. M. Moore, J. Roy, B. Blumenstiel, J. Higgins, et al. 2002. “The Structure of Haplotype Blocks in the Human Genome.” Science 296 (5576): 2225–9.

Garibyan, Lilit, and Nidhi Avashia. 2013. “Research Techniques Made Simple: Polymerase Chain Reaction (PCR).” The Journal of Investigative Dermatology 133 (3): e6.

Gauss, Carl Friedrich. 1809. Theoria Motus Corporum Coelestium in Sectionibus Conicis Solem Ambientium Auctore Carolo Friderico Gauss. sumtibus Frid. Perthes et IH Besser.

Geurts, P., V. Botta, and G. Louppe. 2014. “Exploiting SNP Correlations Within Random Forest for Genome-Wide Association Studies.” PLoS ONE 9 (4).

Goeman, Jelle J, Aldo Solari, and others. 2011. “Multiple Testing for Exploratory Research.” Statistical Science 26 (4): 584–97.

Golub, Gene H, Michael Heath, and Grace Wahba. 1979. “Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter.” Technometrics 21 (2): 215–23.

Gordon, Aharon D. 1999. Classification. Monographs on Statistics and Applied Probability. Chapman & Hall.

Green, Peter J., and Brian S. Yandell. 1985. Semi-Parametric Generalized Linear Models. Lecture Notes in Statistics. Springer, New York, NY. https://link.springer.com/chapter/10.1007/978-1-4615-7070-7_6.

Grimm, Eric C. 1987. “CONISS: A Fortran 77 Program for Stratigraphically Constrained Cluster Analysis by the Method of Incremental Sum of Squares.” Computers & Geosciences 13 (1): 13–35.

Grimonprez, Quentin. 2016. “Sélection de Groupes de Variables Corrélées en Grande Dimension.” PhD thesis, Université de Lille; Lille 1.

Gu, Chong. 2002. Smoothing Spline ANOVA Models. Springer Series in Statistics. New York, NY: Springer New York. http://link.springer.com/10.1007/978-1-4757-3683-0.

Hastie, T. J., and R. J. Tibshirani. 1990. Generalized Additive Models. CRC Press.

Hedrick, Philip W. 1987. “Gametic Disequilibrium Measures: Proceed with Caution.” Genetics 117 (2): 331–41.

Hill, WG, and Alan Robertson. 1968. “Linkage Disequilibrium in Finite Populations.” Theoretical and Applied Genetics 38 (6): 226–31.

Hill, William G. 1974. “Estimation of Linkage Disequilibrium in Randomly Mating Populations.” Heredity 33 (2): 229.

Hoerl, Arthur E, and Robert W Kennard. 1970. “Ridge Regression: Biased Estimation for Nonorthogonal Problems.” Technometrics 12 (1): 55–67.

Holm, Sture. 1979a. “A Simple Sequentially Rejective Multiple Test Procedure.” Scandinavian Journal of Statistics 6 (2): 65–70.

———. 1979b. “A Simple Sequentially Rejective Multiple Test Procedure.” Scandinavian Journal of Statistics 6 (2): 65–70.

Huber, Wolfgang, Anja von Heydebreck, Holger Sültmann, Annemarie Poustka, and Martin Vingron. 2003. “Parameter Estimation for the Calibration and Variance Stabilization of Microarray Data.” Statistical Applications in Genetics and Molecular Biology 2 (1).

Hunter, David J. 2005. “Gene–Environment Interactions in Human Diseases.” Nature Reviews Genetics 6 (4): 287.

International Genetics of Ankylosing Spondylitis Consortium (IGAS), Adrian Cortes, Johanna Hadler, Jenny P. Pointon, Philip C. Robinson, and others. 2013. “Identification of Multiple Risk Variants for Ankylosing Spondylitis Through High-Density Genotyping of Immune-Related Loci.” Nature Genetics 45 (7): 730–38.

Jacob, Laurent, Guillaume Obozinski, and Jean-Philippe Vert. 2009. “Group Lasso with Overlap and Graph Lasso.” In Proceedings of the 26th Annual International Conference on Machine Learning, 433–40. Montreal, Quebec, Canada.

Jenatton, Rodolphe, Jean-Yves Audibert, and Francis Bach. 2011. “Structured Variable Selection with Sparsity-Inducing Norms.” Journal of Machine Learning Research 12 (Oct): 2777–2824.

Jonsson, Viktor, Tobias Österlund, Olle Nerman, and Erik Kristiansson. 2017. “Variability in Metagenomic Count Data and Its Influence on the Identification of Differentially Abundant Genes.” Journal of Computational Biology 24 (4): 311–26.

Kaessmann, Henrik, Sebastian Zöllner, Anna C Gustafsson, Victor Wiebe, Maris Laan, Joakim Lundeberg, Mathias Uhlén, and Svante Pääbo. 2002. “Extensive Linkage Disequilibrium in Small Human Populations in Eurasia.” The American Journal of Human Genetics 70 (3): 673–85.

Kang, Hyun Min, Jae Hoon Sul, Susan K Service, Noah A Zaitlen, Sit-yee Kong, Nelson B Freimer, Chiara Sabatti, Eleazar Eskin, and others. 2010. “Variance Component Model to Account for Sample Structure in Genome-Wide Association Studies.” Nature Genetics 42 (4): 348.

Klein, Robert J, Caroline Zeiss, Emily Y Chew, Jen-Yue Tsai, Richard S Sackler, Chad Haynes, Alice K Henning, et al. 2005. “Complement Factor H Polymorphism in Age-Related Macular Degeneration.” Science 308 (5720): 385–89.

Krzanowski, W. J., and Y. T. Lai. 1988. “A Criterion for Determining the Number of Groups in a Data Set Using Sum-of-Squares Clustering.” Biometrics 44 (1): 23–34.

Kwak, Il-Youp, and Wei Pan. 2016. “Adaptive Gene- and Pathway-Trait Association Testing with GWAS Summary Statistics.” Bioinformatics 32: 1178–84. https://doi.org/10.1093/bioinformatics/btv719.

Laan, Maris, and Svante Pääbo. 1997. “Demographic History and Linkage Disequilibrium in Human Populations.” Nature Genetics 17 (4): 435.

Lander, Eric S. 1996. “The New Genomics: Global Views of Biology.” Science 274 (5287): 536–39.

Law, Charity W, Yunshun Chen, Wei Shi, and Gordon K Smyth. 2014. “Voom: Precision Weights Unlock Linear Model Analysis Tools for RNA-Seq Read Counts.” Genome Biology 15 (2): R29.

Lee, S., G. R. Abecasis, M. Boehnke, and X. Lin. 2014. “Rare-Variant Association Analysis: Study Designs and Statistical Tests.” American Journal of Human Genetics 95 (1): 5–23.

Lee, Seunghak, Nico Görnitz, Eric P. Xing, David Heckerman, and Christoph Lippert. 2017. “Ensembles of Lasso Screening Rules.” IEEE Transactions on Pattern Analysis and Machine Intelligence PP (99): 1–1.

Lee, S., M. C. Wu, and X. Lin. 2012. “Optimal Tests for Rare Variant Effects in Sequencing Association Studies.” Biostatistics 13 (4): 762–75.

Lewontin, R. C. 1964. “The Interaction of Selection and Linkage. I. General Considerations; Heterotic Models.” Genetics 49 (1): 49–67.

Li, Bingshan, and Suzanne M Leal. 2008. “Methods for Detecting Associations with Rare Variants for Common Diseases: Application to Analysis of Sequence Data.” The American Journal of Human Genetics 83 (3): 311–21.

Lim, Michael, and Trevor Hastie. 2015. “Learning Interactions via Hierarchical Group-Lasso Regularization.” Journal of Computational and Graphical Statistics 24 (3): 627–54.

Lin, Xihong. 1997. “Variance Component Testing in Generalised Linear Models with Random Effects.” Biometrika 84 (2): 309–26.

Lin, Xinyi, Seunggeun Lee, David C. Christiani, and Xihong Lin. 2013. “Test for Interactions Between a Genetic Marker Set and Environment in Generalized Linear Models.” Biostatistics 14 (4): 667–81.

Listgarten, Jennifer, Christoph Lippert, Carl M. Kadie, Robert I. Davidson, Eleazar Eskin, and David Heckerman. 2012. “Improved Linear Mixed Models for Genome-Wide Association Studies.” Nature Methods 9 (6): 525–26. https://doi.org/10.1038/nmeth.2037.

Listgarten, J., C. Lippert, E. Y. Kang, J. Xiang, C. M. Kadie, and D. Heckerman. 2013. “A Powerful and Efficient Set Test for Genetic Markers That Handles Confounders.” Bioinformatics 29 (12): 1526–33.

Longford, Nicholas T. 1987. “A Fast Scoring Algorithm for Maximum Likelihood Estimation in Unbalanced Mixed Models with Nested Random Effects.” Biometrika 74 (4): 817–27.

Love, Michael I, Wolfgang Huber, and Simon Anders. 2014. “Moderated Estimation of Fold Change and Dispersion for RNA-Seq Data with DESeq2.” Genome Biology 15 (12): 550.

Maciukiewicz, Malgorzata, Victoria S Marshe, Anne-Christin Hauschild, Jane A Foster, Susan Rotzinger, James L Kennedy, Sidney H Kennedy, Daniel J Müller, and Joseph Geraci. 2018. “GWAS-Based Machine Learning Approach to Predict Duloxetine Response in Major Depressive Disorder.” Journal of Psychiatric Research 99: 62–68.

Madsen, Bo Eskerod, and Sharon R Browning. 2009. “A Groupwise Association Test for Rare Mutations Using a Weighted Sum Statistic.” PLoS Genetics 5 (2): e1000384.

Maher, B. 2008. “Personal Genomes: The Case of the Missing Heritability.” Nature News 456 (7218): 18–21.

Manolio, T. A., and P. M. Visscher. 2009. “Finding the Missing Heritability of Complex Diseases.” Nature 461 (7265): 747–53.

Manolio, Teri A, Lisa D Brooks, and Francis S Collins. 2008. “A HapMap Harvest of Insights into the Genetics of Common Disease.” The Journal of Clinical Investigation 118 (5): 1590–1605.

Mary-Huard, Tristan, and Stephane Robin. 2009. “Tailored Aggregation for Classification.” IEEE Transactions on Pattern Analysis and Machine Intelligence 31 (11): 2098–2105.

Maxam, Allan M, and Walter Gilbert. 1977. “A New Method for Sequencing DNA.” Proceedings of the National Academy of Sciences 74 (2): 560–64.

McCullagh, Peter, and J. A. Nelder. 1989. Generalized Linear Models, Second Edition. CRC Press. https://www.crcpress.com/Generalized-Linear-Models-Second-Edition/McCullagh-Nelder/p/book/9780412317606.

Meier, Lukas, Sara van de Geer, and Peter Bühlmann. 2009. “High-Dimensional Additive Modeling.” The Annals of Statistics 37: 3779–3821. https://doi.org/10.1214/09-AOS692.

Meier, Lukas, Sara van de Geer, and Peter Bühlmann. 2008. “The Group Lasso for Logistic Regression.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 70 (1): 53–71. http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9868.2007.00627.x/abstract.

Meinshausen, N. 2008. “Hierarchical Testing of Variable Importance.” Biometrika 95 (2): 265–78.

Meinshausen, Nicolai, and Peter Bühlmann. 2010. “Stability Selection.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72 (4): 417–73.

Mendel, Gregor Johann. 1865. “Versuche über Pflanzen-Hybriden [Experiments Concerning Plant Hybrids].” Proceedings of the Natural History Society of Brünn IV: 3–47.

Mieth, Bettina, Marius Kloft, Juan Antonio Rodríguez, Sören Sonnenburg, Robin Vobruba, Carlos Morcillo-Suárez, Xavier Farré, et al. 2016. “Combining Multiple Hypothesis Testing with Machine Learning Increases the Statistical Power of Genome-Wide Association Studies.” Scientific Reports 6: 36671.

Milligan, Glenn W., and Martha C. Cooper. 1985. “An Examination of Procedures for Determining the Number of Clusters in a Data Set.” Psychometrika 50 (2): 159–79.

Morgenthaler, Stephan, and William G Thilly. 2007. “A Strategy to Discover Genes That Carry Multi-Allelic or Mono-Allelic Risk for Common Diseases: A Cohort Allelic Sums Test (CAST).” Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis 615 (1): 28–56.

Mukerji, Krisha Gopal, C Manoharachary, and BP Chamola. 2002. Techniques in Mycorrhizal Studies. Springer Science & Business Media.

Nelder, J. A., and R. W. M. Wedderburn. 1972. “Generalized Linear Models.” Journal of the Royal Statistical Society: Series A 135 (3): 370–84.

Neyman, Jerzy, and Egon S Pearson. 1933. “The Testing of Statistical Hypotheses in Relation to Probabilities a Priori.” In Mathematical Proceedings of the Cambridge Philosophical Society, 29:492–510. Cambridge University Press.

Nyrén, Pål, Bertil Pettersson, and Mathias Uhlén. 1993. “Solid Phase DNA Minisequencing by an Enzymatic Luminometric Inorganic Pyrophosphate Detection Assay.” Analytical Biochemistry 208 (1): 171–75.

Osborne, Michael R, Brett Presnell, and Berwin A Turlach. 2000. “A New Approach to Variable Selection in Least Squares Problems.” IMA Journal of Numerical Analysis 20 (3): 389–403.

Paré, Guillaume, Senay Asma, and Wei Q. Deng. 2015. “Contribution of Large Region Joint Associations to Complex Traits Genetics.” PLOS Genetics 11. https://doi.org/10.1371/journal.pgen.1005103.

Parikh, Neal, Stephen Boyd, and others. 2014. “Proximal Algorithms.” Foundations and Trends in Optimization 1 (3): 127–239.

Park, Mee Young, Trevor Hastie, and Robert Tibshirani. 2007. “Averaged Gene Expressions for Regression.” Biostatistics 8 (2): 212–27.

Pearson, Karl. 1896. “Mathematical Contributions to the Theory of Evolution. On a Form of Spurious Correlation Which May Arise When Indices Are Used in the Measurement of Organs.” Proceedings of the Royal Society of London 60: 489–98.

———. 1900. “On the Criterion That a Given System of Deviations from the Probable in the Case of a Correlated System of Variables Is Such That It Can Be Reasonably Supposed to Have Arisen from Random Sampling.” The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 50 (302): 157–75.

Pereira, Mariana Buongermino, Mikael Wallroth, Viktor Jonsson, and Erik Kristiansson. 2018. “Comparison of Normalization Methods for the Analysis of Metagenomic Gene Abundance Data.” BMC Genomics 19 (1): 274.

Petersen, Ashley, Carolina Alvarez, Scott DeClaire, and Nathan L. Tintle. 2013. “Assessing Methods for Assigning SNPs to Genes in Gene-Based Tests of Association Using Common Variants.” PLOS ONE 8. https://doi.org/10.1371/journal.pone.0062161.

Pinton, Roberto, Zeno Varanini, and Paolo Nannipieri. 2007. The Rhizosphere: Biochemistry and Organic Substances at the Soil-Plant Interface. CRC Press.

Price, A. L., N. J. Patterson, R. M. Plenge, M. E. Weinblatt, N. A. Shadick, and D. Reich. 2006. “Principal Components Analysis Corrects for Stratification in Genome-Wide Association Studies.” Nature Genetics 38: 904–9.

Pritchard, Jonathan K, and Molly Przeworski. 2001. “Linkage Disequilibrium in Humans: Models and Data.” The American Journal of Human Genetics 69 (1): 1–14.

Pritchard, Jonathan K, Matthew Stephens, and Peter Donnelly. 2000. “Inference of Population Structure Using Multilocus Genotype Data.” Genetics 155 (2): 945–59.

Purcell, S., B. Neale, K. Todd-Brown, L. Thomas, M. A. R. Ferreira, D. Bender, J. Maller, et al. 2007. “PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses.” The American Journal of Human Genetics 81 (3): 559–75.

Qin, Junjie, Yingrui Li, Zhiming Cai, Shenghui Li, Jianfeng Zhu, Fan Zhang, Suisha Liang, et al. 2012. “A Metagenome-Wide Association Study of Gut Microbiota in Type 2 Diabetes.” Nature 490 (7418): 55–60.

Fisher, R. A. 1922. “On the Mathematical Foundations of Theoretical Statistics.” Phil. Trans. R. Soc. Lond. A 222 (594-604): 309–68.

Rau, Andrea, Florence Jaffrézic, and Grégory Nuel. 2013. “Joint Estimation of Causal Effects from Observational and Intervention Gene Expression Data.” BMC Systems Biology 7 (1): 111.

Rau, Andrea, and Cathy Maugis-Rabusseau. 2017. “Transformation and Model Choice for RNA-Seq Co-Expression Analysis.” Briefings in Bioinformatics 19 (3): 425–36.

Reich, David E, Michele Cargill, Stacey Bolk, James Ireland, Pardis C Sabeti, Daniel J Richter, Thomas Lavery, et al. 2001. “Linkage Disequilibrium in the Human Genome.” Nature 411 (6834): 199.

Reich, David, Alkes L Price, and Nick Patterson. 2008. “Principal Component Analysis of Genetic Data.” Nature Genetics 40 (5): 491.

Reinsch, Christian H. 1967. “Smoothing by Spline Functions.” Numer. Math. 10 (3): 177–83. http://dx.doi.org/10.1007/BF02162161.

Rice, John A. 2006. Mathematical Statistics and Data Analysis. Cengage Learning.

Risch, Neil, and Kathleen Merikangas. 1996. “The Future of Genetic Studies of Complex Human Diseases.” Science 273 (5281): 1516–7.

Robbins, RB. 1918. “Some Applications of Mathematics to Breeding Problems III.” Genetics 3 (4). http://europepmc.org/articles/PMC1200443.

Robinson, Mark D, and Alicia Oshlack. 2010. “A Scaling Normalization Method for Differential Expression Analysis of RNA-Seq Data.” Genome Biology 11 (3): R25.

Rokach, Lior, and Oded Maimon. 2005. “Clustering Methods.” In Data Mining and Knowledge Discovery Handbook, 321–52. Springer.

Roquain, Etienne. 2010. “Type I Error Rate Control for Testing Many Hypotheses: A Survey with Proofs.” arXiv Preprint arXiv:1012.4078.

Sanger, Frederick, Steven Nicklen, and Alan R Coulson. 1977. “DNA Sequencing with Chain-Terminating Inhibitors.” Proceedings of the National Academy of Sciences 74 (12): 5463–7.

Segata, Nicola, Jacques Izard, Levi Waldron, Dirk Gevers, Larisa Miropolsky, Wendy S Garrett, and Curtis Huttenhower. 2011. “Metagenomic Biomarker Discovery and Explanation.” Genome Biology 12 (6): R60.

Segura, Vincent, Bjarni J Vilhjálmsson, Alexander Platt, Arthur Korte, Ümit Seren, Quan Long, and Magnus Nordborg. 2012. “An Efficient Multi-Locus Mixed-Model Approach for Genome-Wide Association Studies in Structured Populations.” Nature Genetics 44 (7): 825.

Sharpton, Thomas J. 2014. “An Introduction to the Analysis of Shotgun Metagenomic Data.” Frontiers in Plant Science 5: 209.

She, Yiyuan, Zhifeng Wang, and He Jiang. 2016. “Group Regularized Estimation Under Structural Hierarchy.” Journal of the American Statistical Association 113 (521): 445–54.

Simpson, Edward H. 1951. “The Interpretation of Interaction in Contingency Tables.” Journal of the Royal Statistical Society. Series B (Methodological), 238–41.

Smith, John Maynard, and John Haigh. 1974. “The Hitch-Hiking Effect of a Favourable Gene.” Genetics Research 23 (1): 23–35.

Srinivas, Girish, Steffen Möller, Jun Wang, Sven Künzel, Detlef Zillikens, John F Baines, and Saleh M Ibrahim. 2013. “Genome-Wide Mapping of Gene–Microbiota Interactions in Susceptibility to Autoimmune Skin Blistering.” Nature Communications 4.

Stanislas, Virginie, Cyril Dalmasso, and Christophe Ambroise. 2017. “Eigen-Epistasis for Detecting Gene-Gene Interactions.” BMC Bioinformatics 18 (1): 54.

Stephens, Matthew, Nicholas J Smith, and Peter Donnelly. 2001. “A New Statistical Method for Haplotype Reconstruction from Population Data.” The American Journal of Human Genetics 68 (4): 978–89.

Stroup, Walter W. 2012. Generalized Linear Mixed Models: Modern Concepts, Methods and Applications. CRC Press.

Sturtevant, A.H. 2001. A History of Genetics. Cold Spring Harbor Laboratory Press. https://books.google.fr/books?id=wDIisw1ZqAMC.

Su, Z., J. Marchini, and P. Donnelly. 2011. “HAPGEN2: Simulation of Multiple Disease SNPs.” Bioinformatics 27 (16): 2304.

Šidák, Zbyněk. 1967. “Rectangular Confidence Regions for the Means of Multivariate Normal Distributions.” Journal of the American Statistical Association 62 (318): 626–33.

Thomas, Duncan C. 2004. Statistical Methods in Genetic Epidemiology. Oxford University Press.

Tibshirani, Robert. 1988. “Estimating Transformations for Regression via Additivity and Variance Stabilization.” Journal of the American Statistical Association 83 (402): 394–405.

———. 1996. “Regression Shrinkage and Selection via the Lasso.” Journal of the Royal Statistical Society. Series B (Methodological) 58 (1): 267–88.

Tibshirani, Robert, Jacob Bien, Jerome Friedman, Trevor Hastie, Noah Simon, Jonathan Taylor, and Ryan J Tibshirani. 2012. “Strong Rules for Discarding Predictors in Lasso-Type Problems.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 74 (2): 245–66.

Tibshirani, R., G. Walther, and T. Hastie. 2001. “Estimating the Number of Clusters in a Data Set via the Gap Statistic.” Journal of the Royal Statistical Society: Series B 63 (2): 411–23.

Visscher, Peter M., William G. Hill, and Naomi R. Wray. 2008. “Heritability in the Genomics Era-Concepts and Misconceptions.” Nature Reviews Genetics 9 (4): 255–66.

Wall, Jeffrey D, and Jonathan K Pritchard. 2003. “Haplotype Blocks and Linkage Disequilibrium in the Human Genome.” Nature Reviews Genetics 4 (8): 587.

Wang, Baohong, Mingfei Yao, Longxian Lv, Zongxin Ling, and Lanjuan Li. 2017. “The Human Microbiota in Health and Disease.” Engineering 3 (1): 71–82. https://doi.org/10.1016/J.ENG.2017.01.008.

Wang, Jun, and Huijue Jia. 2016. “Metagenome-Wide Association Studies: Fine-Mining the Microbiome.” Nature Reviews Microbiology 14 (8): 508–22.

Wang, Jun, Louise B. Thingholm, Jurgita Skiecevičienė, Philipp Rausch, Martin Kummen, Johannes R Hov, Frauke Degenhardt, et al. 2016. “Genome-Wide Association Analysis Identifies Variation in Vitamin D Receptor and Other Host Factors Influencing the Gut Microbiota.” Nature Genetics.

Ward, J. H. 1963. “Hierarchical Grouping to Optimize an Objective Function.” Journal of the American Statistical Association 58 (301): 236–44.

Weinberg, Wilhelm. 1908. “Über den Nachweis der Vererbung beim Menschen.” Jahresh. Württ. Ver. Vaterl. Natkd. 64: 369–82.

Weir, Bruce S, Lon R Cardon, Amy D Anderson, Dahlia M Nielsen, and William G Hill. 2005. “Measures of Human Population Structure Show Heterogeneity Among Genomic Regions.” Genome Research 15 (11): 1468–76.

Weir, Bruce S, and C Cockerham. 1996. Genetic Data Analysis II: Methods for Discrete Population Genetic Data. Sunderland, MA, USA: Sinauer Associates, Inc.

Weir, Bruce S, and others. 1990. Genetic Data Analysis. Methods for Discrete Population Genetic Data. Sinauer Associates, Inc. Publishers.

Wetterstrand, KA. 2016. “DNA Sequencing Costs: Data from the NHGRI Genome Sequencing Program (GSP).” www.genome.gov/sequencingcostsdata.

Williams, J. W. J. 1964. “Algorithm 232: Heapsort.” Communications of the ACM 7 (6): 347–48.

Wood, Simon N. 2006. Generalized Additive Models: An Introduction with R. CRC Press. https://www.crcpress.com/Generalized-Additive-Models-An-Introduction-with-R/Wood/p/book/9781584884743.

Woodrow, J. C., and C. J Eastmond. 1978. “HLA B27 and the Genetics of Ankylosing Spondylitis.” Annals of the Rheumatic Diseases 37 (6): 504–9.

Wooley, John C, Adam Godzik, and Iddo Friedberg. 2010. “A Primer on Metagenomics.” PLoS Computational Biology 6 (2): e1000667.

Wright, Alan F, and Nicholas D Hastie. 2001. “Complex Genetic Diseases: Controversy over the Croesus Code.” Genome Biology 2 (8): comment2007–1.

Wright, Sewall. 1921. “Correlation and Causation.” Journal of Agricultural Research 20 (7): 557–85.

———. 1929. “The Evolution of Dominance.” The American Naturalist 63 (689): 556–61.

WTCCC. 2007. “Genome-Wide Association Study of 14,000 Cases of Seven Common Diseases and 3,000 Shared Controls.” Nature 447 (7145): 661–78.

Wu, M. C., P. Kraft, M. P. Epstein, D. M. Taylor, S. J. Chanock, D. J. Hunter, and X. Lin. 2010. “Powerful SNP-Set Analysis for Case-Control Genome-Wide Association Studies.” American Journal of Human Genetics 86 (6): 929–42.

Wu, M. C., S. Lee, T. Cai, Y. Li, M. Boehnke, and X. Lin. 2011. “Rare-Variant Association Testing for Sequencing Data with the Sequence Kernel Association Test.” American Journal of Human Genetics 89 (1): 82–93.

Yang, Jian, Beben Benyamin, Brian P McEvoy, Scott Gordon, Anjali K Henders, Dale R Nyholt, Pamela A Madden, et al. 2010. “Common SNPs Explain a Large Proportion of the Heritability for Human Height.” Nature Genetics 42 (7): 565.

Yi, Hui, Patrick Breheny, Netsanet Imam, Yongmei Liu, and Ina Hoeschele. 2015. “Penalized Multimarker Vs. Single-Marker Regression Methods for Genome-Wide Association Studies of Quantitative Traits.” Genetics 199 (1): 205–22.

Yoo, Yun Joo, Lei Sun, Julia G. Poirier, Andrew D. Paterson, and Shelley B. Bull. 2016. “Multiple Linear Combination (MLC) Regression Tests for Common Variants Adapted to Linkage Disequilibrium Structure.” Genetic Epidemiology 41. https://doi.org/10.1002/gepi.22024.

Yuan, Ming, and Yi Lin. 2006. “Model Selection and Estimation in Regression with Grouped Variables.” Journal of the Royal Statistical Society: Series B 68: 49–67.

Zhu, Zhihong, Zhili Zheng, Futao Zhang, Yang Wu, Maciej Trzaskowski, Robert Maier, Matthew R Robinson, et al. 2018. “Causal Associations Between Risk Factors and Common Diseases Inferred from GWAS Summary Data.” Nature Communications 9 (1): 224.

Zondervan, Krina T, and Lon R Cardon. 2004. “The Complex Interplay Among Factors That Influence Allelic Association.” Nature Reviews Genetics 5 (2): 89.