Paris's city hall confirmed last month that more than 340,000 duplicate images have been identified across the municipal digital archive managed by the Délégation Générale à la Transformation Numérique, a volume that administrators say is straining storage infrastructure and distorting search results on the city's public data portals. The problem, long treated as a back-office headache, has taken on new urgency as the Grand Paris Express construction documentation and Seine riverbank regeneration records have flooded city servers since 2023.
The timing matters. Paris spent heavily on digital infrastructure during and after the 2024 Olympics, committing to open-data principles that would make venue planning, transport logistics and public space records permanently accessible to citizens. When those records contain hundreds of near-identical aerial photographs of the Stade de France or duplicate inspection images from the Quai de Bercy flood-resilience works, the promise of transparency starts to look more like a data landfill. Officials at the Hôtel de Ville are under pressure from National Assembly deputies to demonstrate that the post-Olympics digital legacy actually functions.
What Paris Is Doing — and Where It Is Falling Behind
The city launched a deduplication pilot in March 2026 through Paris En Commun's digital services arm, focusing first on the archives of the Apur (the Atelier parisien d'urbanisme), which holds decades of neighbourhood survey photographs covering areas from Belleville to the 13th arrondissement's Chinatown district. The pilot uses perceptual hashing, a technique that reduces each image to a compact fingerprint and compares fingerprints rather than raw pixel data, so that near-duplicate photos taken seconds apart during site surveys can be flagged automatically. In its first six weeks the pilot cleared roughly 18,000 redundant files from the Apur database, according to the programme documentation published on paris.fr in May.
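For readers curious how a perceptual hash works in practice, here is a minimal sketch of one common variant, the difference hash. It is an illustration only: the programme documentation does not say which algorithm the pilot uses, and the function names, the 8x8 fingerprint size and the threshold of 10 differing bits are all assumptions made for this example. Images are represented as plain grids of brightness values so the code needs no imaging library.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 brightness values (assumed input format)


def shrink(img: Gray, width: int, height: int) -> List[List[float]]:
    """Downscale by averaging the source pixels that fall into each target cell."""
    src_h, src_w = len(img), len(img[0])
    out = []
    for y in range(height):
        y0 = y * src_h // height
        y1 = max(y0 + 1, (y + 1) * src_h // height)
        row = []
        for x in range(width):
            x0 = x * src_w // width
            x1 = max(x0 + 1, (x + 1) * src_w // width)
            block = [img[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(img: Gray, size: int = 8) -> int:
    """Difference hash: one bit per pair of horizontally adjacent cells."""
    small = shrink(img, size + 1, size)  # one extra column gives `size` comparisons per row
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (left > right)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, threshold: int = 10) -> bool:
    """Hypothetical rule: flag a pair whose fingerprints differ in few bits."""
    return hamming(dhash(a), dhash(b)) <= threshold
```

Because the fingerprint records only whether each cell is brighter than its neighbour, a uniformly brightened copy of a photograph yields the same bits, while a genuinely different scene does not. Where to set the threshold is a judgment call that trades missed duplicates against false alarms.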
Compare that with Amsterdam, where the Gemeente Amsterdam began a similar deduplication drive in 2022 across its Beeldbank municipal image library. By early 2025, the Dutch capital reported removing over 200,000 duplicate or near-duplicate images and cutting annual cloud storage costs by an estimated 23 percent. Tokyo's Bureau of Urban Development has gone further still: since 2024 it has embedded automated deduplication directly into its upload workflow, so duplicates are flagged before they enter the archive rather than after.
Paris has not yet reached that upstream stage. The Apur pilot remains a retrospective cleanup, not a prevention system. The Grand Paris Express project authority, Société du Grand Paris, operates its own separate documentation platform and has not yet integrated with the city's deduplication tools, meaning images from tunnel surveys beneath the Plateau de Saclay or station construction at Fort d'Issy are accumulating independently, without coordinated quality control.
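The gap between a retrospective cleanup and an upstream check can be sketched in a few lines. The example below is hypothetical and does not describe any city's actual system: the `Archive` class, its method names and the threshold are invented for illustration, and fingerprints are assumed to have been computed already as integers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


@dataclass
class Archive:
    """Hypothetical archive that keeps one fingerprint per stored file."""
    threshold: int = 10  # assumed cut-off for "near-duplicate"
    stored: Dict[str, int] = field(default_factory=dict)  # filename -> fingerprint

    def find_match(self, fingerprint: int) -> Optional[str]:
        """Return the name of a stored file close to this fingerprint, if any."""
        for name, existing in self.stored.items():
            if hamming(existing, fingerprint) <= self.threshold:
                return name
        return None

    def upload(self, name: str, fingerprint: int) -> Optional[str]:
        """Upstream check: store the file only if nothing similar exists.

        Returns the matching file's name when the upload is flagged,
        or None when the file was accepted.
        """
        match = self.find_match(fingerprint)
        if match is None:
            self.stored[name] = fingerprint
        return match


def retrospective_cleanup(
    records: Dict[str, int], threshold: int = 10
) -> List[Tuple[str, str]]:
    """After-the-fact scan: return (duplicate, kept) pairs among stored records."""
    kept = Archive(threshold)
    flagged = []
    for name, fingerprint in records.items():
        match = kept.upload(name, fingerprint)
        if match is not None:
            flagged.append((name, match))
    return flagged
```

Both paths apply the same comparison; what differs is when it runs. The upstream version pays a small cost on every upload, while the retrospective version lets duplicates occupy storage until someone runs the scan. The linear search in `find_match` is kept for clarity and would need an index to cope with an archive of any real size.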
The Cost Question and What Comes Next
Storage is not cheap. Paris's 2025 municipal budget allocated roughly €4.2 million to digital infrastructure maintenance across all city departments, a figure that technical staff say is already under pressure from the volume of construction photography generated by major projects on the Rive Gauche and around Porte de la Chapelle, site of the Athletes' Village now being converted into social housing. Redundant images inflate those costs directly.
London offers a cautionary parallel. The Greater London Authority's Datastore project faced criticism in 2023 when an audit found that duplicated planning application images were returning misleading results in property searches, eroding public trust in what was meant to be a flagship open-data initiative. Paris administrators appear aware of that reputational risk.
The next phase of the Paris pilot, scheduled for rollout in October 2026, will extend the perceptual hashing system to the Direction de l'Urbanisme's planning application portal — the tool most used by architects, developers and residents checking permit histories in neighbourhoods like Montrouge and Saint-Denis. If the technology performs as well on that larger and messier dataset as it did on the Apur archive, the city plans to propose a shared deduplication standard for all Grand Paris municipalities by the end of 2026. Residents and professionals who regularly use the city's open portals should expect search results to become noticeably cleaner — and faster — before winter.