Date(s): 06/06/2013
14:00 - 15:00
Large scale network-guided feature selection in genome-wide association mapping. By Chloé-Agathe Azencott, Max Planck Institute.
Genome-wide association studies (GWAS), in which hundreds of thousands or millions of single nucleotide polymorphisms (SNPs) are genotyped for up to tens of thousands of individuals, are a powerful tool to detect genetic loci likely to be associated with a complex trait. In recent years, they have made it possible to identify hundreds of new susceptibility loci for common diseases. However, they still often fail to explain much of the phenotypic variability that is known to be due to genetic causes. It is therefore imperative to establish methods that can better extract biological signal from large GWAS data sets.

Modeling the joint effects of multiple genomic loci is a major avenue toward this goal. While several methods for multi-locus mapping have been proposed, it is often unclear how to relate the detected loci to current biological knowledge. Integrating information about biological pathways and networks into GWAS can help detect meaningful and interpretable associations. However, the few multi-locus methods that attempt it are either restricted to investigating a limited number of predetermined sets of loci, or do not scale to genome-wide settings.

We present a new efficient method to discover sets of features that are maximally associated with an output variable while being connected in an underlying network. This method, based on a min-cut reformulation, outperforms all its comparison partners in terms of runtime and easily scales to millions of variables. In simulation studies, it exhibits higher precision than other methods in detecting true causal features. On real GWAS data from Arabidopsis thaliana, it detects loci that enable accurate phenotype prediction and are supported by the literature.
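
To give a feel for what a min-cut reformulation of network-guided feature selection can look like, here is a minimal sketch. It rests on assumptions that the abstract does not state: each feature has a precomputed association score, selecting a feature costs a fixed penalty `eta`, and every network link between a selected and an unselected feature costs `lam` times its weight. The function name, parameters, and toy data are hypothetical. The solver is a plain textbook Edmonds-Karp max-flow, not the implementation presented in the talk, and it would not scale to millions of variables.

```python
from collections import defaultdict, deque


def select_connected_features(scores, edges, eta, lam):
    """Return the feature set on the source side of an s-t minimum cut.

    scores: dict mapping feature -> association score (hypothetical input)
    edges:  iterable of (u, v, weight) undirected network links
    eta:    cost paid for every selected feature (sparsity)
    lam:    cost per unit weight for links between selected and unselected features
    """
    source, sink = object(), object()  # unique terminal nodes
    residual = defaultdict(lambda: defaultdict(float))

    def add_arc(u, v, cap):
        residual[u][v] += cap
        residual[v][u] += 0.0  # make sure the reverse arc exists

    for feature, score in scores.items():
        gain = score - eta
        if gain > 0:
            add_arc(source, feature, gain)  # this arc is cut if the feature is left out
        elif gain < 0:
            add_arc(feature, sink, -gain)   # this arc is cut if the feature is selected
    for u, v, weight in edges:
        add_arc(u, v, lam * weight)
        add_arc(v, u, lam * weight)

    def reachable_from_source():
        # Breadth-first search over arcs that still have residual capacity.
        parents = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 1e-12 and v not in parents:
                    parents[v] = u
                    queue.append(v)
        return parents

    # Edmonds-Karp: augment along shortest paths until the sink is cut off.
    while True:
        parents = reachable_from_source()
        if sink not in parents:
            return {node for node in parents if node is not source}
        path, v = [], sink
        while parents[v] is not None:
            path.append((parents[v], v))
            v = parents[v]
        bottleneck = min(residual[a][b] for a, b in path)
        for a, b in path:
            residual[a][b] -= bottleneck
            residual[b][a] += bottleneck


# Toy chain a - b - c: "b" has a weak score but links two strong features.
toy_scores = {"a": 3.0, "b": 0.5, "c": 3.0}
toy_edges = [("a", "b", 1.0), ("b", "c", 1.0)]
strong_network = select_connected_features(toy_scores, toy_edges, eta=1.0, lam=1.0)
weak_network = select_connected_features(toy_scores, toy_edges, eta=1.0, lam=0.125)
```

In this toy example, a large `lam` makes it cheaper to include the weak feature `b` than to cut both of its links, so the selection is the connected set `a`, `b`, `c`. With a small `lam` the network matters less and only `a` and `c` are kept.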