Heterogeneous datasets: a tale of integration and exploration

LIRIS, DB team Feb. 2025 1h

This talk presents the research I conducted during my PhD and my post-doc on heterogeneous data integration. See my research.


Integrating and exploring heterogeneous datasets

DEIB Apr. 2024 1h30

This talk presents the research I have conducted since my Bachelor's degree on heterogeneous data integration. See my research.


Semi-structured data user exploration

LIB, LISN Jan. 2024, Oct. 2023 1h

This talk presents my PhD work on how to approach semi-structured data, either as a novice or as a data scientist who has to work with new data. The two main axes are: (a) E-R diagrams built out of any structured or semi-structured dataset; (b) entity-to-entity path enumeration and ranking.


From data to journalism

Lycée international de Palaiseau Jan. 2024 2h30

This workshop was conducted for and with first-year (CPES) undergraduate students. The goal was to provide background in data integration, relational databases, and graphs, as well as NLP and LLMs. Students also worked hands-on with existing research tools for data integration, such as ConnectionLens and StatCheck.


Intelligence Artificielle: un outil au service du journalisme

CFI July 2023 1h

The goal of this forum at CFI (the French media development agency) was to present new research directions and tools that journalists could later use in their quest for better information sharing and acquisition. ConnectionStudio was presented in detail and drew strong interest during a long question-and-answer session.

Slides (in French)

Recherche en intégration de données: le cas de ConnectionLens

RJMI Feb. 2022 30min

This talk is aimed at female high school students attending the RJMI (a meeting of young female mathematicians and computer scientists). It covers both data integration challenges and the main aspects of being a PhD student and doing research. The former part presents ConnectionLens, a graph-based approach to integrating very heterogeneous data (structured, semi-structured and unstructured) in the context of data journalism. The latter part emphasises the different aspects of doing research.

Slides (in French)