Current students - PhD DADS

CALDERA LUCA

Cycle: XL

Advisor: IEVA FRANCESCA
Tutor: MATTEUCCI MATTEO

Major Research topic:
Advancing Robust and Reliable Healthcare Data Analysis through Disentanglement, Harmonization, and Synthetic Data Generation

Abstract:

Healthcare data are inherently complex, with various sources of heterogeneity that must be carefully addressed. Privacy concerns, inconsistencies in data distribution across institutions, variations in imaging equipment, and diverse acquisition protocols further complicate the development of reliable machine learning models, which require large, high-quality datasets. Neglecting these issues can lead to models that are biased, unreliable, or limited in their ability to generalize, thereby reducing their effectiveness in real-world medical settings.

This thesis tackles these challenges by enhancing the robustness and reliability of healthcare data analysis through three key methodologies: disentanglement, harmonization, and synthetic data generation. Disentanglement techniques identify and separate different sources of heterogeneity, improving model fairness and interpretability. Harmonization methods standardize and integrate medical data from various sources to ensure consistency and comparability, enabling more effective cross-institutional studies. Meanwhile, synthetic data generation preserves patient privacy while augmenting datasets, enhancing model generalization and performance.

By integrating these approaches, this work advances more accurate and generalizable healthcare data analytics, ultimately supporting data-driven decision-making in medical research and clinical practice.