How Paris's Public Archives Ended Up Drowning in Duplicate Images — And What's Being Done About It
A years-long accumulation of redundant visual records across city institutions has forced a reckoning with how Paris manages its digital heritage.

Paris's municipal digital archive holds somewhere in the region of 14 million images. A significant portion of them, according to internal reviews conducted by the Direction des Affaires Culturelles de la Ville de Paris, are duplicates — the same photograph stored twice, three times, sometimes a dozen times across separate departmental servers. The city is now midway through a structured deduplication programme that began formally in January 2025 and is expected to run through late 2026.
The problem did not appear overnight. It is the product of roughly two decades of institutional fragmentation, accelerated by the Paris 2024 Olympics and its legacy documentation requirements, and compounded by the Seine riverbank regeneration project, which generated its own enormous visual archive. Every directorate that touched those projects retained its own copies. Nobody coordinated storage. The result is a digital warehouse stuffed with near-identical aerial shots of the Trocadéro and redundant construction progress photos from the Bercy-Charenton development zone.
The root cause is structural. Paris, like most large European municipalities, digitised its photographic collections in waves rather than through a unified strategy. The Bibliothèque historique de la Ville de Paris on Rue Pavée in the 4th arrondissement began scanning its holdings in the early 2000s. The Atelier Parisien d'Urbanisme (APUR) built its own geospatial image library independently. The Paris Musées network, which covers 14 municipal museums including the Musée Carnavalet and the Petit Palais, digitised collections on a museum-by-museum basis with no shared deduplication layer.
The 2024 Olympics pushed the problem to a breaking point. Documentation requirements for the Games' legacy activation meant that city communications teams, the Comité d'Organisation Paris 2024, urban planning directorates, and the Préfecture de la Région Île-de-France were all simultaneously archiving images of the same venues — the Stade de France in Saint-Denis, the shooting range at Châteauroux, the temporary installations along the Champ-de-Mars. Storage costs climbed. Retrieval times slowed. Archivists began flagging the issue formally through the city's internal reporting structures in mid-2024.
A parallel pressure came from the Grand Paris Express construction programme. With 68 new stations under development across the metropolitan zone, the programme, run by the Société du Grand Paris, has been one of the most photographically documented infrastructure projects in French history. The Société's own media library, maintained separately from the city's systems, added another layer of duplication when images crossed over into municipal communications channels.
The current programme, coordinated through the Direction des Systèmes et Technologies de l'Information at the Hôtel de Ville, uses perceptual hashing — a technique that identifies visually similar images even when file names, resolutions, or metadata differ. The first phase, completed in March 2026, audited holdings across six major city directorates and identified approximately 2.3 million candidate duplicates, according to the programme's publicly released scope document from February 2025.
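To illustrate the general idea of perceptual hashing, here is a minimal sketch of one common variant, the average hash. The city has not said which algorithm its programme uses, so this is an assumption for illustration only. The function names, the 8x8 grid size, and the distance threshold of 5 are likewise hypothetical. The sketch works on plain grids of grayscale values rather than real image files.

```python
from typing import List

Grid = List[List[int]]  # rows of grayscale pixel values, 0-255


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Return a size*size-bit average hash of a grayscale image."""
    height, width = len(pixels), len(pixels[0])
    if height < size or width < size:
        raise ValueError("image must be at least size x size pixels")
    # Shrink to a size x size grid by averaging rectangular blocks,
    # which discards resolution and fine detail.
    cells = []
    for r in range(size):
        top, bottom = r * height // size, (r + 1) * height // size
        for c in range(size):
            left, right = c * width // size, (c + 1) * width // size
            block = [pixels[y][x]
                     for y in range(top, bottom)
                     for x in range(left, right)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def looks_like_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Flag two images as candidate duplicates (threshold is an assumption)."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance
```

Because the hash depends only on which regions are brighter than average, a rescaled or renamed copy of a photograph produces the same or a very close hash, while a genuinely different image usually differs in many bits.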
Not every duplicate is straightforwardly deletable. Archivists distinguish between true duplicates — bit-for-bit identical files — and near-duplicates, which may carry different provenance metadata that has its own historical value. A photograph of the Place de la République taken by a city communications officer in 2015 and separately acquired from a press agency may be visually identical but legally and institutionally distinct. Those decisions are being made case by case, which is why the full programme runs to the end of 2026.
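The distinction between true duplicates and near-duplicates can be sketched as a simple triage rule. This is an illustrative assumption, not the programme's actual workflow. The `Record` fields, the category labels, and the threshold are hypothetical, and each record is assumed to already carry a 64-bit perceptual hash computed elsewhere.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str    # hypothetical file identifier
    data: bytes  # raw file contents
    phash: int   # precomputed perceptual hash (assumed 64-bit)
    source: str  # provenance, e.g. "city communications" or "press agency"


def classify(a: Record, b: Record, max_distance: int = 5) -> str:
    """Sort a pair of records into one of four illustrative categories."""
    # Bit-for-bit identical files have identical cryptographic digests.
    if hashlib.sha256(a.data).digest() == hashlib.sha256(b.data).digest():
        return "true duplicate"
    # Visually similar files differ in only a few perceptual-hash bits.
    if bin(a.phash ^ b.phash).count("1") <= max_distance:
        if a.source != b.source:
            # Same picture, different origin: keep both until reviewed.
            return "near-duplicate, review provenance"
        return "near-duplicate"
    return "distinct"
```

Under this rule, a press agency copy and a city communications copy of the same photograph would be routed to manual review rather than deleted automatically.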
The practical stakes extend beyond storage economics. Paris's housing and urban planning directorates rely on image archives to document neighbourhood change over time — particularly in areas like the northern banlieues targeted by the Politique de la Ville programme. If archival records are unreliable or inconsistent, so too is the evidentiary basis for planning decisions affecting tens of thousands of residents.
For institutions and citizens who rely on the city's public image collections — researchers at the Bibliothèque historique, journalists, architects — the deduplication work should eventually mean faster search, cleaner metadata, and more accurate provenance records. The city has indicated it expects a public-facing improvement to its Archives de Paris online portal, accessible at archives.paris.fr, by the first quarter of 2027.
About this article
Published by The Daily Paris