Bridging the gap with MLOps
Insee — French National Institute of Statistics and Economic Studies
2026-06-04
Defining production
Going to production: making an application live in the space of its users
Source: ibm.com
Code
Version control (Git), code quality through linters and formatters (Ruff), community-standard project structure (cookiecutter templates)…
Configuration
Virtual environments and dependency management (uv), controlling external dependencies (Docker)…
Data
Standardised format (Parquet), cloud storage (S3), pipeline-oriented workflow (dbt)…
Containerisation (Docker) removes environment mismatches by packaging the full runtime (system libraries, interpreter, dependencies) alongside the code.
Kubernetes turns individual containers into an industrialised, scalable fleet
Note
A great improvement, but this is only the first phase of a project: continuous improvement requires observability.
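As an illustration of what observability can mean at the smallest scale, the sketch below emits one structured (JSON) log record per prediction, with inputs, output and latency. It uses only the standard library; the logger name `prediction_service`, the helper `predict_with_logging` and the `toy_model` are hypothetical names chosen for the example, not part of any specific stack.

```python
import json
import logging
import time
from typing import Callable

logger = logging.getLogger("prediction_service")  # hypothetical logger name


def predict_with_logging(model: Callable[[dict], float], features: dict) -> float:
    """Call the model and emit one structured log record per prediction."""
    start = time.perf_counter()
    prediction = model(features)
    latency_ms = (time.perf_counter() - start) * 1000
    record = {
        "event": "prediction",
        "features": features,
        "prediction": prediction,
        "latency_ms": round(latency_ms, 3),
    }
    # One JSON object per line is easy to ship to a log collector and query later.
    logger.info(json.dumps(record))
    return prediction


def toy_model(features: dict) -> float:
    # Stand-in for a real model: a fixed linear rule (assumption for the example).
    return 0.5 * features["x"] + 1.0


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    predict_with_logging(toy_model, {"x": 2.0})
```

Logging inputs together with outputs is what later makes drift analysis and error investigation possible; in a real service, sensitive fields would need to be filtered before being logged.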
During development, practitioners need to:
MLflow (and similar platforms!) provides a centralised tracking server, a model registry, and a serving API
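A minimal sketch of what "tracking" looks like in code: each training run records its hyperparameters and its evaluation metrics in one place. The function takes the tracking client as an argument so that the example runs without a server; `log_training_run` and `InMemoryTracker` are hypothetical names, and the parameter and metric values are purely illustrative.

```python
from contextlib import contextmanager


def log_training_run(tracker, params: dict, metrics: dict) -> None:
    """Record one training run: hyperparameters first, then evaluation metrics."""
    with tracker.start_run():
        tracker.log_params(params)
        tracker.log_metrics(metrics)


class InMemoryTracker:
    """Hypothetical stand-in so the sketch runs without a tracking server."""

    def __init__(self):
        self.runs = []

    @contextmanager
    def start_run(self):
        # Each run gets its own record, like one row in a tracking UI.
        self.runs.append({"params": {}, "metrics": {}})
        yield self.runs[-1]

    def log_params(self, params):
        self.runs[-1]["params"].update(params)

    def log_metrics(self, metrics):
        self.runs[-1]["metrics"].update(metrics)


tracker = InMemoryTracker()
# Illustrative values only, not results from a real model.
log_training_run(tracker, {"max_depth": 5}, {"f1": 0.81})
```

With MLflow installed, passing the `mlflow` module itself as `tracker` should work the same way, since `mlflow.start_run`, `mlflow.log_params` and `mlflow.log_metrics` are module-level functions; the server address would then come from `mlflow.set_tracking_uri` or the `MLFLOW_TRACKING_URI` environment variable.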
LLM-based systems need additional tracking. Langfuse adds:
| | Supervised ML | LLM-based systems |
|---|---|---|
| Training | Full retraining cycle | Fine-tuning or prompt engineering |
| Evaluation | Standard metrics (F1, RMSE…) | Requires LLM-as-judge or human review (see 4) |
| Drift | Feature / label drift | Prompt drift, outdated knowledge base |
| Availability | Batch or on the fly? | Continuous |
| Infrastructure | CPU often sufficient for inference | Increasingly large GPUs (💵💵) |
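To make the "feature drift" row concrete, here is a minimal sketch of one common drift indicator, the Population Stability Index (PSI), computed for a single numeric feature between a reference sample (for instance the training data) and a current sample (production data). PSI is one choice among several (a Kolmogorov-Smirnov test is another); the function name `psi`, the equal-width binning and the `eps` floor are assumptions made for the example.

```python
import math


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant feature

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values to edge bins
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm below stays defined.
        return [max(c / len(values), eps) for c in counts]

    ref, cur = bin_shares(reference), bin_shares(current)
    # Each term is non-negative: (c - r) and log(c / r) always share the same sign.
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


reference = list(range(100))
shifted = [v + 30 for v in reference]  # simulated drift: the whole distribution moves up
print(psi(reference, reference), psi(reference, shifted))
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but thresholds are a convention to be calibrated per feature rather than a guarantee.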
Supervised learning:
LLMs:
Warning
Is it really possible to leapfrog after having missed the ML era?
The gap between a notebook that works and a system that delivers value is not only a technical gap: it is an operational one.