ML Engineer - National Audiovisual Institute (INA)
Design and implement an image search engine unlocking
access to 250K+ hours of archival video content, surfacing
previously hard-to-discover media.
Orchestrate large-scale on-premise embedding pipelines with Airflow DAGs
and vLLM, producing more than 1 billion image embeddings.
Architect and operate large-scale data pipelines (ETL) processing terabytes of multimodal content, including semantic transcription chunking, embeddings,
and metadata, enabling high-performance, low-latency Elasticsearch indexing at
scale.
Deploy and optimize vLLM inference infrastructure on k3s/k8s using Helm, benchmarking performance against local pod-based inference to maximize
scalability and efficiency.
Conduct in-depth research on vector quantization techniques in vector databases (e.g., Qdrant, Weaviate), benchmarking their performance
in Elasticsearch and OpenSearch to inform system design decisions.
Drive full-stack development across frontend, backend, and search
infrastructure using Vue.js, FastAPI, and Elasticsearch.
Research Internship - Self-supervised Learning on Satellite Image Time Series
Conducted an extensive literature review of state-of-the-art methods
in self-supervised learning (SSL) applied to Satellite Image Time Series (SITS).
Reproduced and benchmarked baseline models to ensure robust
performance comparisons.
Developed and implemented a novel image retrieval pretext task
using a specialized hierarchical loss function to train SSL models on
SITS data.
Actively participated in research lab activities.
Research Project - Explainable Insect Classification
Developed a convolutional neural network for insect classification.
Applied explainable AI (XAI) methods: gradient-based approaches, LIME, and SHAP.
Took over, adapted, and documented the existing project.
Monitored experiments and evaluated different models with Weights & Biases (WandB).
Research Project - INRIA
Studied the trade-off between observation and action in reinforcement
learning.
Independently developed an understanding of a cutting-edge problem in
reinforcement learning.
Demonstrated perseverance and determination in the face of technical
challenges.
Implemented an experimental plan.
Education
Master's Degree in Computer Science - Specialisation Data, Machine Learning, and Knowledge (DAC)
Relevant courses:
Image and signal processing
Mathematics for ML (statistics/probability, Markov chains, ...)
Databases: SQL, XML, JSON, distributed databases
Machine Learning: classification, neural networks, decision trees
Natural language processing, information retrieval
Elective courses from the Master of Mathematics: statistical learning and convex
optimization.