Current students


LAMPIS ANDREA
Cycle: XL

Advisor: MATTEUCCI MATTEO
Tutor: IEVA FRANCESCA

Major Research topic:
Machine Learning for Polygenic Risk Prediction of Complex Non-communicable Diseases

Abstract:
In genetics, complex diseases result from the interactions of multiple genetic variants and environmental factors. Population genomics employs polygenic risk scores (PRSs) to estimate an individual's genomic risk for a disease by summing the effects of genetic variants. While PRSs hold promise for clinical applications, their effectiveness is hindered by several challenges: low accuracy due to simplistic additive modeling that ignores gene-gene and gene-environment interactions; biased data from predominantly European ancestries, which limits transferability to other ethnic groups; and restricted access to genotype databases, which limits the creation of diverse and accurate predictive models. Identifying high-risk individuals is essential for targeted prevention and therapy, so innovative approaches are needed to overcome these obstacles. Advanced machine learning (ML) methods offer a promising direction by capturing complex data patterns and integrating different data modalities (genetic, environmental, clinical) to enhance disease prediction and derive insight into disease mechanisms.

The research will focus on the development of novel representation learning and machine learning methods to enhance polygenic risk prediction. Specific methodologies will include:
- Representation learning of genotype (and potentially also environmental and clinical) data;
- Latent space arithmetic or explanation/attribution methods to identify the drivers of complex diseases;
- (Deep) Transfer Learning and Generative Modeling to overcome data-access and privacy barriers and enhance PRS portability across ethnicities;
- Advanced Deep Learning/Machine Learning risk prediction models and multi-modal integration to account for gene-gene and gene-environment interactions.
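
As an illustration of the additive PRS baseline that the abstract describes (summing the effects of genetic variants), a minimal sketch follows. The function name, the 0/1/2 allele-dosage encoding, and the effect-size values are assumptions made for this example only; they are not taken from the project.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Additive PRS: sum over variants of (allele dosage * effect size).

    dosages      -- per-variant risk-allele counts for one individual
                    (assumed encoding: 0, 1, or 2)
    effect_sizes -- per-variant effect sizes, e.g. from an association study
                    (illustrative values only)
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    # Each variant contributes independently; no interaction terms are modeled.
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Hypothetical individual genotyped at three variants.
example_dosages = [0, 1, 2]
example_effects = [0.10, -0.05, 0.20]
example_score = polygenic_risk_score(example_dosages, example_effects)
```

Because every variant enters the sum independently, this form cannot represent gene-gene or gene-environment interactions, which is the limitation the proposed ML methods aim to address.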