Laurent Jacob
GWAS
CALDERA
A follow-up to DBGWAS. We use the structure of the de Bruijn graph to define new genomic variants (e.g. gene presence) that accommodate polymorphisms. This amounts to defining one variant for each connected subgraph. We rely on a reverse search strategy to enumerate these variants efficiently, and on the Tarone strategy to control the family-wise error rate (FWER) when testing their association with a bacterial phenotype.
PDF
Code
Video
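As a rough illustration of the Tarone strategy mentioned in the CALDERA description, the sketch below computes the smallest p-value each binary variant could possibly reach and uses it to discard untestable variants before correcting for multiple testing. It assumes a one-sided Fisher exact test and a toy cohort; the function names `min_pvalue` and `tarone_threshold` are hypothetical, and this is not the CALDERA implementation, which may rely on a different test statistic.

```python
from math import comb

def min_pvalue(x, n1, n):
    """Smallest one-sided Fisher p-value reachable by a variant carried by
    x of the n samples, when n1 samples are cases (assumed test)."""
    a = min(x, n1)  # most extreme table: as many carriers as possible are cases
    return comb(n1, a) * comb(n - n1, x - a) / comb(n, x)

def tarone_threshold(min_pvals, alpha=0.05):
    """Return (k, testable): the smallest integer k such that at most k
    variants have a minimal p-value <= alpha / k, and their indices."""
    k = 1
    while True:
        testable = [i for i, p in enumerate(min_pvals) if p <= alpha / k]
        if len(testable) <= k:
            return k, testable
        k += 1

# Toy cohort (made-up numbers): 20 genomes, 10 of them resistant,
# and four variants carried by 1, 2, 5 and 10 genomes respectively.
n, n1 = 20, 10
carriers = [1, 2, 5, 10]
min_pvals = [min_pvalue(x, n1, n) for x in carriers]
k, testable = tarone_threshold(min_pvals)
print(k, testable)  # each testable variant is then tested at level alpha / k
```

In this toy example only two of the four variants can ever reach significance, so the correction factor is 2 rather than the Bonferroni factor of 4.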
DBGWAS
We describe the variation in bacterial genomes through their content in short sequences (k-mers). This avoids the bias usually caused by focusing on pre-defined SNPs or gene lists. We use this representation to pinpoint the genomic variation associated with antimicrobial resistance, and rely on a so-called de Bruijn graph to help interpret the identified k-mers in terms of SNPs or mobile genetic elements.
PDF
Code
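To make the k-mer representation used by DBGWAS concrete, here is a minimal sketch that turns a few genomes into a presence/absence matrix over their k-mers. The sequences and the names `kmers` and `presence_matrix` are made up for illustration, and the sketch ignores reverse complements and de Bruijn graph construction, so it should not be read as the DBGWAS code itself.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def presence_matrix(genomes, k):
    """Return (vocab, matrix): the sorted list of k-mers seen in any genome,
    and one 0/1 row per genome marking which k-mers it contains."""
    per_genome = [kmers(g, k) for g in genomes]
    vocab = sorted(set().union(*per_genome))
    matrix = [[int(km in present) for km in vocab] for present in per_genome]
    return vocab, matrix

# Two toy "genomes" that differ by a single substitution (A -> T at position 5).
genomes = ["ACGTAC", "ACGTTC"]
vocab, matrix = presence_matrix(genomes, 3)
for km, column in zip(vocab, zip(*matrix)):
    print(km, column)
```

The single substitution changes two k-mers in each genome, which is why the identified k-mers are easier to interpret once they are grouped along a graph rather than read one by one.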
SEISM
Convolutional neural networks on biological sequences are known to learn probabilistic motifs associated with the target phenotype. To go beyond this informal interpretation, we provide a statistical test quantifying the motif-phenotype association. This requires solving a post-selection inference problem, where the selection involves the infinite set of possible motifs.
Antoine Villié, Laurent Jacob
PDF
Code
Cite