Paris city hall confirmed this spring that the municipal image library maintained by the Direction de l'Information et de la Communication de la Ville de Paris had accumulated more than 340,000 digital assets since its last major audit in 2019, and that preliminary checks found roughly 18 percent of those files, or about 61,000, were either exact duplicates or near-identical variants generated by automated resizing and AI-upscaling tools. The cleanup, now formally budgeted at €2.1 million through 2027, is the most ambitious deduplication exercise any French municipality has attempted since the Bibliothèque nationale de France tackled its own Gallica digitisation backlog in 2021.
The timing matters. Paris 2024 Olympics legacy programmes have flooded official channels with hundreds of thousands of event photographs, and the Grand Paris Express construction authority has been filing weekly documentation images since tunnelling work intensified along the Line 15 South corridor last year. Without active deduplication, archivists warn, storage costs compound, search tools degrade, and journalists or civil servants risk republishing the wrong image of the wrong venue at exactly the wrong moment — a problem that already surfaced publicly in March when the Mairie du 10e arrondissement's website briefly ran a stock photograph of a London canal beside an article about the Canal Saint-Martin.
What Paris Is Doing Differently
The city's approach leans heavily on a partnership with the Institut national de l'audiovisuel and the Paris urban planning agency Apur, which together developed a perceptual hashing protocol — essentially a fingerprinting system that detects visually similar images even after cropping, colour-grading or format conversion. The tool was piloted at the Pavillon de l'Arsenal on Boulevard Morland in the 4th arrondissement, where Apur stores thousands of urban photography records. By February 2026, that pilot had flagged more than 6,400 duplicate files across three connected databases, roughly a third of which had been duplicated through automated batch uploads rather than human error.
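To make the fingerprinting idea concrete, here is a minimal sketch of one of the simplest perceptual hashes, the "average hash". It is an illustration only: the article does not say which algorithm the Institut national de l'audiovisuel and Apur protocol uses, and the function names, the 8x8 grid and the distance threshold of 5 are all assumptions chosen for the example. Images are represented as plain grids of grayscale brightness values so the code runs without any imaging library.

```python
from typing import List

Grid = List[List[float]]  # grayscale image: rows of brightness values


def shrink(pixels: Grid, size: int = 8) -> Grid:
    """Downsample to size x size by averaging rectangular blocks.

    Assumes the image is at least `size` pixels wide and tall.
    """
    height, width = len(pixels), len(pixels[0])
    out = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        row = []
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: one bit per cell, set if the cell is brighter than the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their fingerprints differ in few bits.

    The threshold of 5 is an illustrative assumption, not a published setting.
    """
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# Usage: a synthetic gradient, a 2x upscaled copy and a tonally inverted copy.
base = [[x * 10 + y for x in range(16)] for y in range(16)]
upscaled = [[base[y // 2][x // 2] for x in range(32)] for y in range(32)]
inverted = [[255 - v for v in row] for row in base]

print(is_near_duplicate(base, upscaled))
print(is_near_duplicate(base, inverted))
```

In this sketch the upscaled copy yields the same fingerprint as the original, while the inverted image differs in every bit. An average hash of this kind tolerates resizing and uniform brightness shifts, but it is not robust to cropping, so a system that handles crops, as the one described above is said to, would need a more elaborate method.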
Compare that with London, where the Greater London Authority's digital archive team relies on commercial vendor software, specifically Adobe Experience Manager, but has no single cross-departmental deduplication mandate. Berlin's Senate Chancellery launched a similar audit in late 2024 under its Transparenz Portal initiative, but the process remains voluntary for individual Bezirke, and participation across the city's twelve districts has been uneven. New York City's Department of Citywide Administrative Services runs a centralised digital asset management platform called NYC Media, but deduplication there is triggered only at the point of upload and is not applied retroactively to legacy files that predate 2015.
Paris is not ahead on every metric. The €2.1 million budget covers only the municipal archive, leaving out files held by semi-public operators like the Société du Grand Paris, which manages its own image inventory for the metro expansion. And the Institut national de l'audiovisuel collaboration, while technically sophisticated, has no enforcement mechanism: department heads can overrule a duplicate flag if they believe two similar images are meaningfully distinct, a loophole that archivists at the Hôtel de Ville have already described internally as a potential bottleneck.
What Residents and Institutions Should Expect Next
The practical impact will be felt most directly by journalists, researchers and NGOs that routinely pull images from Paris Open Data portals. By the end of the third quarter of 2026, the city has said it expects to publish an updated, cleaned image set covering the Seine-Saint-Denis and eastern Paris regeneration zones — the neighbourhoods most extensively photographed during the Olympic construction period and the most likely to contain near-duplicate aerial shots filed by different contractors on overlapping dates.
For smaller cultural institutions along the Rue du Faubourg Saint-Antoine or community archives in Belleville, the deduplication push also creates an opportunity. The city is offering technical workshops through its Paris Numérique programme, starting in September, to help local organisations apply the same hashing tools to their own collections. Registration opens in August through the Direction de la Démocratie, des Citoyen·nes et des Territoires. The sessions are free. Capacity is capped at forty participants per session — a detail that suggests demand may quickly outstrip supply if the programme gains traction beyond the city's own departments.