Postdoc: Better real-world health-data distributed analytics research

DEIB, Milano (IT) Apr. 2024 - now Pietro Pinoli, Anna Bernasconi
  • https://www.better-health-project.eu/
  • Main achievements:
    • Propose a general FAIR data model for handling heterogeneous health data
    • Define a set of metadata for each dataset and align it with existing healthcare ontologies
    • Propose and implement an ETL pipeline to integrate and FAIRify such data
    • Expose the integrated data in a decentralized structure following the PHT (Personal Health Train) approach, so that researchers can analyze the data and investigate possible causes of 3 rare diseases (intellectual disability, eye dystrophies, and suicidal behaviors in autistic patients)
  • clinical data
  • heterogeneity
  • modeling
  • FAIR
  • Python

PhD: User-oriented exploration of semi-structured datasets

Institut Polytechnique de Paris & Inria Saclay (FR) Jan. 2021 - Mar. 2024 Prof. Ioana Manolescu, SME WeDoData
  • My PhD work received the accessit (2nd place) for the prize awarded to PhD theses making significant contributions to the data management community.
  • Main achievements:
    • Compute Entity-Relationship-like summaries of semi-structured datasets (Abstra system)
    • Enumerate paths between entities of interest in heterogeneous data (PathWays system)
  • Summer schools: HiParis 2021, MDD 2022, HiParis 2023
  • Transversal training: "Teaching at university", "Public speaking" and English courses
  • See also Talks and Teaching sections
PhD manuscript
Defense slides
  • semi-structured data
  • data summarization
  • E-R model
  • interesting connections
  • Java

Internship: Predicting the environment of a neighborhood

LIRIS (FR) Feb. 2020 - July 2020 Fabien Duchateau, Franck Favetta, start-up Home in Love
  • Propose an algorithm to select the top-k neighborhood features that best support the prediction
  • Predict a set of environment variables, e.g., the landscape or the wealth of a neighborhood
  • Develop a cartographic visualization interface for neighborhoods
  • Develop an interface for generic tuning of prediction algorithms
Internship report (in French)
Defense slides (in French)
  • variable selection
  • prediction algorithms
  • experimental evaluation
  • Python

Research project: Matching and merging geographic entities

LIRIS (FR) Jan. 2019 - June 2019 Fabien Duchateau, Franck Favetta
  • Integrate heterogeneous cartographic data from Geonames, Bing, Here and OpenStreetMap
  • Create a global schema based on the individual data providers' schemas
  • Propose a tunable formula for detecting correspondences, based on attribute similarity, and automatically estimate their quality
  • Merge correspondences using different strategies
  • Develop an interface for matching and merging correspondences between points of interest (geographic entities), with an estimate of their quality
Poster (in French)
  • entity matching
  • entity merging
  • similarity measures
  • PHP
  • JavaScript

Internship: User-oriented real estate recommendations

LIRIS (FR) May 2018 - July 2018 Fabien Duchateau, Franck Favetta, start-up Home in Love
  • Integrate data from heterogeneous sources (Excel, JSON, GeoJSON, etc.)
  • Use prediction algorithms to recommend neighborhoods based on a few criteria
  • Use clustering algorithms to group neighborhoods based on their characteristics
  • Develop an interface to facilitate the comparison and recommendation of neighborhoods in France
Internship report (in French)
  • recommendation algorithms
  • user interface
  • similarity measures
  • Python
  • JavaScript