The Daily Paris

Paris news, every day

Paris Eliminates Thousands of Duplicate Images From Digital Archives

A quiet but costly problem in municipal data management is finally getting attention — and the figures reveal just how deep it runs.

By Paris News Desk · Published 4 July 2026, 8:36 pm

Paris city hall is sitting on a problem it has been slow to quantify. Across the network of public digital archives managed by the Direction des Affaires Culturelles — the municipal body overseeing Paris's cultural and archival infrastructure — an internal audit completed in June 2026 identified tens of thousands of duplicate image files consuming server capacity, distorting public record searches, and quietly inflating operational costs. The audit, which examined repositories spanning everything from Seine riverfront regeneration photography to Grand Paris Express construction documentation, flagged redundancy rates that archivists describe as systemic rather than incidental.

The timing is not accidental. Paris is now deep into activating the legacy infrastructure built around the 2024 Olympics, and dozens of city departments are migrating photographic and design assets onto consolidated platforms. That migration — intended to streamline access for planners, journalists, and the public — has instead surfaced a backlog of duplicate files that accumulated over years of siloed departmental storage. When you move everything into one room, you find out how much was doubled up.

The Scale of the Problem

The numbers are instructive. According to the Direction des Affaires Culturelles' June audit, the city's shared digital asset management system contained roughly 340,000 image files across active repositories as of May 2026. Of those, preliminary automated scanning identified approximately 87,000 files — just over 25 percent — as probable duplicates, meaning identical or near-identical images stored under different filenames, in different folders, or uploaded on different dates by different departments. The Bibliothèque historique de la Ville de Paris on Rue des Francs-Bourgeois in the Marais, which maintains some of the oldest digitised municipal collections, contributed a disproportionate share of the redundancies, partly because its holdings were digitised in multiple separate tranches between 2014 and 2022 with inconsistent naming protocols.

Storage costs matter here. Municipal cloud storage contracts, which the Ville de Paris renegotiated with its primary infrastructure provider in early 2025, bill at a tiered rate. Industry benchmarks for comparable European municipal archive contracts suggest per-terabyte annual costs in the range of €200 to €400 once licensing and redundancy fees are included. Paris's archival image holdings are estimated internally at several hundred terabytes — meaning duplicate files alone could represent tens of thousands of euros in unnecessary annual expenditure, before factoring in the staff hours spent manually resolving search conflicts.
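
The estimate above can be reproduced as a rough sketch. The audit figures and the benchmark rate come from the article; the 300-terabyte holdings figure is a stand-in for "several hundred terabytes", and the sketch assumes duplicates occupy storage in proportion to their share of the file count, which the audit does not confirm.

```python
# Figures reported in the June 2026 audit and the benchmark range cited above.
TOTAL_FILES = 340_000
DUPLICATE_FILES = 87_000
RATE_EUR_PER_TB = (200, 400)  # per terabyte, per year

# Assumption: the article says only "several hundred terabytes";
# 300 TB is an illustrative stand-in, not a reported figure.
ASSUMED_HOLDINGS_TB = 300


def duplicate_cost_range(holdings_tb,
                         total_files=TOTAL_FILES,
                         duplicate_files=DUPLICATE_FILES,
                         rates=RATE_EUR_PER_TB):
    """Return (low, high) annual cost in euros attributable to duplicates.

    Assumes duplicates take up storage in proportion to their file count.
    """
    share = duplicate_files / total_files
    duplicate_tb = holdings_tb * share
    return tuple(round(duplicate_tb * rate) for rate in rates)


low, high = duplicate_cost_range(ASSUMED_HOLDINGS_TB)
print(f"Duplicate share of files: {DUPLICATE_FILES / TOTAL_FILES:.1%}")
print(f"Estimated annual cost of duplicates: EUR {low:,} to EUR {high:,}")
```

Under these assumptions the result lands between roughly €15,000 and €31,000 a year, consistent with the "tens of thousands" order of magnitude. A different holdings figure scales the result linearly.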

The problem is not unique to Paris. Berlin's Stadtarchiv flagged a similar redundancy issue in 2023 following its own migration project, and London's Wellcome Collection spent 18 months between 2022 and 2024 deduplicating its open-access image library. But Paris's scale, together with the political pressure city hall faces from a National Assembly that has scrutinised municipal spending closely, gives the issue an edge it might not otherwise carry.

What the City Plans to Do

The Direction des Affaires Culturelles has now engaged Paris-based digital asset specialists to run a structured deduplication programme using perceptual hashing technology, which identifies visually identical images regardless of filename or metadata. The Atelier Parisien d'Urbanisme, known as APUR, which holds extensive photographic documentation of Seine urban regeneration stretching back to the early 2000s, has been identified as a priority target for the first phase of cleanup, scheduled to begin in September 2026.
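
To illustrate how perceptual hashing can match images regardless of filename or metadata, here is a minimal sketch of one simple variant, the average hash. The article does not say which algorithm the contractors will use, so this is an assumption for illustration only. The function names, the synthetic pixel grids, and the `max_distance` threshold are all hypothetical; a real pipeline would decode actual image files with an imaging library first.

```python
def average_hash(pixels, hash_size=8):
    """Compute an average hash from a 2D grid of grayscale values.

    The grid is shrunk to hash_size x hash_size by block averaging, then
    each cell becomes one bit: 1 if brighter than the mean of all cells.
    Assumes the grid's height and width are multiples of hash_size.
    """
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // hash_size, width // hash_size
    cells = []
    for row in range(hash_size):
        for col in range(hash_size):
            block = [
                pixels[y][x]
                for y in range(row * block_h, (row + 1) * block_h)
                for x in range(col * block_w, (col + 1) * block_w)
            ]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions at which two hashes differ."""
    return bin(a ^ b).count("1")


def probably_duplicates(a, b, max_distance=5):
    # max_distance is an illustrative threshold, not a value from the audit.
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Synthetic 16x16 "images": a gradient, a slightly brightened copy of it,
# and an unrelated checkerboard pattern.
original = [[(x + y) * 8 for x in range(16)] for y in range(16)]
brightened = [[min(255, v + 3) for v in row] for row in original]
different = [[255 if (x // 4 + y // 4) % 2 else 0 for x in range(16)]
             for y in range(16)]

print(probably_duplicates(original, brightened))  # brightened copy
print(probably_duplicates(original, different))   # unrelated pattern
```

Because the hash depends only on relative brightness, the brightened copy is treated as a match while the checkerboard is not. Byte-level checksums would miss that kind of near-identical copy, which is why a perceptual method suits files that were re-exported or re-scanned.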

For residents and researchers, the practical effect of a cleaned archive will be faster, more accurate search results when accessing public-facing platforms like Paris.fr and the city's open-data portal. Journalists covering the Grand Paris Express — which has generated enormous volumes of site photography across 68 planned stations — have long complained about search tools returning multiple near-identical images for the same construction event.

The deduplication project carries its own price tag. The contract awarded in late June 2026 is valued at approximately €180,000 for a 12-month engagement. City officials argue the investment pays for itself within two years if storage overhead drops as projected. The arithmetic, at least, is straightforward.
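
That arithmetic can be checked with a short sketch. The contract value and the two-year horizon come from the article, as does the benchmark rate of €200 to €400 per terabyte per year. The sketch assumes, for simplicity, that all savings come from freed storage; it ignores the staff hours the article also mentions.

```python
CONTRACT_EUR = 180_000        # contract value reported above
PAYBACK_YEARS = 2             # payback horizon cited by city officials
RATE_EUR_PER_TB = (200, 400)  # benchmark annual cost per terabyte


def required_annual_saving(contract_eur, years):
    """Annual saving needed to recover the contract cost within `years`."""
    return contract_eur / years


def terabytes_to_free(annual_saving_eur, rate_eur_per_tb):
    """Storage that must be freed to reach a saving at a given rate.

    Assumption: savings come from storage alone, not from staff time.
    """
    return annual_saving_eur / rate_eur_per_tb


saving = required_annual_saving(CONTRACT_EUR, PAYBACK_YEARS)
low_rate, high_rate = RATE_EUR_PER_TB
print(f"Required annual saving: EUR {saving:,.0f}")
print(f"Storage to free at EUR {high_rate}/TB: "
      f"{terabytes_to_free(saving, high_rate):.0f} TB")
print(f"Storage to free at EUR {low_rate}/TB: "
      f"{terabytes_to_free(saving, low_rate):.0f} TB")
```

On storage alone, a two-year payback would need about €90,000 in annual savings, or roughly 225 to 450 terabytes freed at the benchmark rates. Any staff time recovered from resolving search conflicts would lower that threshold.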

About this article

Published by The Daily Paris

This article was produced by The Daily Paris editorial desk and covers news in Paris. See our editorial standards for how we use AI.
