Propose a general interoperable data model for handling heterogeneous health data
Define a set of metadata for each dataset and align it with existing healthcare ontologies
Propose and implement an ETL pipeline to integrate and FAIRify such data
Expose the integrated data in a decentralized structure following the Personal Health Train (PHT) approach, so that researchers can analyze the data and look for possible causes of 3 rare conditions (intellectual disability, eye dystrophies, and suicidal behaviors in autistic patients)
clinical data
heterogeneity
modeling
FAIR
Python
User-oriented exploration of semi-structured datasets
Institut Polytechnique de Paris & Inria Saclay (FR)
Jan. 2021 - March 2024
Pr. Ioana Manolescu, SME WeDoData
My PhD work received the accessit (2nd place) for the prize awarded to PhD theses with significant contributions to the data management community.
Main achievements:
Entity-Relationship-like summaries of semi-structured datasets (Abstra system)
Enumeration of paths between entities of interest in heterogeneous data (PathWays system)