The Daily Paris

Paris Digital Archives Are Riddled With Duplicate Images — And the Numbers Tell a Damning Story

A quiet data crisis is undermining the city's ambitious push to digitise its urban heritage, with tens of thousands of redundant image files clogging municipal servers and inflating costs.

By Paris News Desk · Published 4 July 2026, 9:26 pm

Photo: Colin Piret on Pexels

Paris city hall is sitting on an estimated 40,000 duplicate image files across its publicly accessible digital heritage platforms — a figure that emerged from an internal audit completed in late June 2026 and shared with municipal councillors ahead of the July budget session. The problem is not merely aesthetic. Each redundant file costs real money to store, index and maintain, and the cumulative drag on the city's digital infrastructure has become impossible to ignore as Grand Paris Express construction crews generate thousands of new archival photographs every week.

The timing is awkward. Mayor Anne Hidalgo's administration staked significant political capital on the post-Olympics digital legacy programme, which promised to make Paris one of Europe's most accessible cities for open cultural data by 2027. That pledge sits uncomfortably alongside a digitisation backlog that, according to figures circulated within the Direction des Affaires Culturelles, now runs to roughly 1.2 million unprocessed items held across depots in the 13th and 19th arrondissements.

What the Numbers Actually Show

The Bibliothèque historique de la Ville de Paris, on Rue des Francs-Bourgeois in the Marais, manages one of the largest municipal photographic collections in France: more than 800,000 digitised images catalogued since 2018. Staff there estimate that between 8 and 12 percent of entries in the shared Musées de la Ville de Paris database carry at least one duplicate file attachment. The city renegotiated its cloud storage contract with a European provider in March 2025, at approximately €0.023 per gigabyte per month. At that rate, and once metadata processing and quality-assurance cycles are factored in, the redundant files represent a recurring cost that runs into six figures annually.

The Paris Musées consortium, which pools digital assets from 14 city-owned museums including the Petit Palais on Avenue Winston Churchill and the Musée Carnavalet near Place des Vosges, adopted an open-licence image policy in 2020 that made more than 150,000 works freely downloadable. That openness, while culturally significant, created an unexpected secondary problem: third-party platforms scraped and re-uploaded images, which then re-entered municipal databases through automated ingestion pipelines that failed to detect near-identical files. A software audit completed by a Montrouge-based contractor in May 2026 found that roughly 6,200 images had been ingested two or more times through this loophole alone.

The Grand Paris Express Factor

The problem is accelerating. Société des Grands Projets (known until 2023 as Société du Grand Paris), the public body overseeing the €36 billion metro extension, is contractually required to produce documentary photography at each of its 68 new stations. That obligation generates an estimated 3,500 new archival images per month across active worksites from Saint-Denis Pleyel in the north to Villejuif-Institut Gustave Roussy in the south. Without a unified deduplication protocol, those images flow into at least three separate archival systems that do not communicate with each other in real time.

The Direction du Numérique de la Ville de Paris has been piloting a perceptual hashing tool since February 2026 on a test batch of 50,000 records. The technology assigns each image a fingerprint based on its visual content rather than its filename. Early results flagged a 14 percent duplicate rate in that sample (about 7,000 records), higher than previous manual estimates. A full rollout across all municipal platforms would require an additional €280,000 in licensing and integration costs, according to figures presented to the Conseil de Paris in June.
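To make the idea of a visual fingerprint concrete, here is a minimal sketch of one common perceptual hashing technique, the difference hash (dHash). The article does not say which algorithm or product the city's pilot uses, so everything below is an illustrative assumption: the function names, the 8x8 hash size, the distance threshold of 5 bits, and the use of plain lists of grayscale values in place of real image files.

```python
from typing import List

# Assumption: an image is a list of rows of 0-255 grayscale values.
# A real pipeline would decode and grayscale actual image files first.
Gray = List[List[int]]


def shrink(pixels: Gray, width: int, height: int) -> Gray:
    """Downscale by nearest-neighbour sampling (a crude stand-in for proper resizing)."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def dhash(pixels: Gray, size: int = 8) -> int:
    """Difference hash: one bit per pair of horizontally adjacent pixels.

    The image is reduced to (size + 1) x size pixels, and each bit records
    whether a pixel is brighter than its right-hand neighbour.
    """
    small = shrink(pixels, size + 1, size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    """Treat two images as duplicates if their hashes differ in few bits.

    The threshold of 5 is a hypothetical choice, not a figure from the audit.
    """
    return hamming(dhash(a), dhash(b)) <= max_distance


# A synthetic 64x64 gradient, a slightly brightened copy, and a mirrored copy.
original = [[255 - 4 * x for x in range(64)] for _ in range(64)]
brighter = [[min(255, v + 3) for v in row] for row in original]
mirrored = [row[::-1] for row in original]

print(is_near_duplicate(original, brighter))  # True: same visual structure
print(is_near_duplicate(original, mirrored))  # False: the gradient runs the other way
```

Because the hash depends on relative brightness between neighbouring regions rather than on exact bytes, a re-saved or lightly adjusted copy of a file tends to land within a few bits of the original. A filename or checksum comparison would miss that kind of near-identical file.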

For cultural institutions and urban planners working with Paris's growing open data ecosystem, the practical advice is straightforward: cross-reference any downloaded asset against the data.paris.fr portal before re-uploading or embedding it. The portal now carries a last-verified timestamp that flags files older than 18 months as potentially superseded. The Direction du Numérique plans to publish a revised data-quality charter by September 2026, which will for the first time set binding standards for image deduplication across all bodies funded by the city. Whether that charter arrives before the next budget cycle will largely determine how much the redundancy problem costs Parisian taxpayers going into 2027.
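For teams that script their own checks, the 18-month rule can be approximated with a simple date comparison. This is a sketch only: the article does not describe how the portal exposes the last-verified timestamp, so the function names and the `last_verified` parameter are hypothetical, and the date is assumed to have been retrieved already.

```python
import calendar
from datetime import date


def subtract_months(d: date, months: int) -> date:
    """Return the date `months` calendar months before `d`.

    The day is clamped to the last day of the target month (31 Aug -> 28 Feb).
    """
    index = d.year * 12 + (d.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def is_potentially_superseded(
    last_verified: date, today: date, max_age_months: int = 18
) -> bool:
    """True if the asset was last verified more than `max_age_months` ago.

    The 18-month default mirrors the threshold reported in the article;
    how the portal itself computes the flag is not documented here.
    """
    return last_verified < subtract_months(today, max_age_months)


today = date(2026, 7, 4)
print(is_potentially_superseded(date(2024, 11, 30), today))  # True
print(is_potentially_superseded(date(2026, 1, 15), today))   # False
```

Working in calendar months rather than a fixed number of days avoids drift across months of different lengths, at the cost of a small clamping rule for end-of-month dates.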

About this article

Published by The Daily Paris

This article was produced by The Daily Paris editorial desk and covers news in Paris. See our editorial standards for how we use AI.
