Paris's municipal documentation services are sitting on what is estimated to be tens of thousands of duplicate digital images spread across at least a dozen separate archival systems, a legacy of rushed, siloed digitisation drives that began accelerating after 2015 and compounded dramatically in the run-up to the Paris 2024 Olympics. The problem has become pressing enough that the Direction des Affaires Culturelles de la Ville de Paris formally flagged it as a priority remediation task in its 2026 operational review cycle.
The timing matters because the city is now deep into its post-Olympics legacy activation phase. Venues from the Stade de France in Saint-Denis to the Paris La Défense Arena in Nanterre generated enormous volumes of photographic documentation between 2022 and 2024 — construction records, event imagery, infrastructure surveys — all captured by different contractors working to different briefs. Much of that material was ingested into pre-existing city databases without deduplication protocols in place. The result is redundancy layered on top of redundancy.
A Decade of Disconnected Digitisation
The roots of the problem go back further than the Olympics. From around 2012 onward, individual arrondissement administrations, the Bibliothèque historique de la Ville de Paris on Rue des Francs-Bourgeois in the 4th arrondissement, and the Atelier Parisien d'Urbanisme (APUR) each pursued their own image capture and storage strategies. When the Grand Paris Express construction programme launched its documentation efforts across the Île-de-France region, adding hundreds of site photographs weekly from stations along the Line 15 and Line 16 corridors, the institutional gap between those records and existing city archives widened further. Nobody owned the whole picture.
APUR, whose mandate covers urban research and planning documentation for the greater Paris metropolitan area, maintains its own image library that partially overlaps with holdings at the Médiathèque de l'Architecture et du Patrimoine, located at the Hôtel de Vigny in the Marais. Both institutions digitised significant portions of their physical collections between 2018 and 2022, but used different metadata schemas and file naming conventions. Cross-referencing them is, by any technical measure, a manual nightmare.
Duplication on this scale is not unique to Paris. European cities that undertook rapid digitisation, particularly those hosting major international events, frequently report post-event data hygiene challenges. But Paris's administrative structure, which distributes cultural and planning responsibilities across the mairie centrale, twenty arrondissement offices, and multiple regional bodies under the Métropole du Grand Paris umbrella, makes consolidation structurally harder than in more centralised systems.
What Remediation Looks Like — and What It Costs
Fixing this is neither quick nor cheap. Deduplication projects of this complexity typically require a combination of perceptual hashing software — tools that identify visually similar images even when file names differ — and human editorial review for ambiguous cases. For a collection of the scale Paris is dealing with, industry benchmarks suggest the process can run to several hundred thousand euros when staff time, software licensing, and data migration are factored in.
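For readers unfamiliar with the technique, perceptual hashing can be illustrated with a minimal Python sketch. It uses a simple "average hash", one of several possible schemes and not necessarily the one the city's tools rely on. Images are represented here as plain grids of grayscale values, and the function names and the two distance thresholds are illustrative assumptions, not figures from the planning documents.

```python
from typing import List

Pixels = List[List[int]]  # grayscale values 0-255, one inner list per row


def average_hash(pixels: Pixels, size: int = 8) -> int:
    """Return a size*size-bit average hash of a grayscale image.

    Assumes the image is at least `size` pixels wide and tall.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    # Shrink the image to a size x size grid by averaging each block.
    for gy in range(size):
        y0, y1 = gy * height // size, (gy + 1) * height // size
        for gx in range(size):
            x0, x1 = gx * width // size, (gx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the grid average.
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def classify(a: Pixels, b: Pixels, dup_max: int = 4, review_max: int = 12) -> str:
    """Sort a pair into 'duplicate', 'review' or 'distinct'.

    The thresholds are hypothetical; a real project would tune them
    against a hand-labelled sample of its own collection.
    """
    distance = hamming(average_hash(a), average_hash(b))
    if distance <= dup_max:
        return "duplicate"
    if distance <= review_max:
        return "review"  # ambiguous: route to a human editor
    return "distinct"
```

Because the hash depends on relative brightness rather than file names or exact bytes, a re-exported or slightly brightened copy of a photograph lands close to the original, while the middle "review" band corresponds to the ambiguous cases that still need human judgement.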
The city's current approach, outlined in planning documents circulated within the Direction des Affaires Culturelles this spring, involves a phased audit beginning with the most heavily duplicated collections: the Olympics construction archive and the Seine riverbank regeneration photographic record, the latter connected to the multi-year projet urbain along the Berges de Seine between the Pont de l'Alma and the Pont d'Iéna. Both collections are flagged as high priority because they feed directly into public-facing platforms used by urban planners, journalists, and researchers.
The practical advice for anyone trying to use Paris's public image archives right now is straightforward: treat any single search result with caution, cross-reference across at least two platforms, and where possible request provenance metadata directly from the holding institution. The Bibliothèque historique de la Ville de Paris accepts provenance queries by email and typically responds within ten working days. For researchers working on Seine regeneration specifically, APUR publishes its image collections through its own web portal, which is updated independently of the main city archive system — making it the more reliable single source until the deduplication work is complete, which administrators are not expected to finish before early 2027.