The Daily Paris

Paris news, every day

Paris Digitisation Drive Buried Under a Mountain of Duplicate Images: The Numbers That Explain the Crisis

City archives, property listings and public heritage databases are clogged with redundant visual files, and the scale of the problem is now measurable.

By Paris News Desk · Published 4 July 2026, 9:06 pm

Photo: Sergey Guk on Pexels

Paris holds roughly 14 million digitised archival images across public institutions, and administrators say a growing share of that catalogue is made up of duplicates — identical or near-identical files stored multiple times across incompatible servers. The problem has moved from an irritant to a genuine budget concern, with data managers at the Bibliothèque nationale de France and the Paris Archives municipales on the Rue des Quatre-Fils each flagging the issue in internal planning cycles this year.

The timing matters because the city is midway through a post-Olympic legacy push that involves opening public datasets to developers and researchers. The Paris 2024 Games generated an estimated 2.3 petabytes of official imagery, event documentation and venue records, much of it ingested rapidly by multiple city agencies with no coordinated deduplication protocol. Redundant files drive up cloud storage costs, slow search retrieval and erode public trust in the accuracy of official databases. For a capital committed to turning Seine-side infrastructure data and Grand Paris Express construction records into accessible civic tools, duplicate image bloat is not a technical footnote. It is a structural problem.

Where the Bloat Is Worst — and What It Costs

Property records are the sharpest example. The Parisian rental market, already under severe pressure with median asking rents in the 11th arrondissement reaching approximately €28 per square metre in early 2026, depends partly on notarial photo documentation held by the Chambre des Notaires de Paris. Internal audits have found duplication rates in property image archives running as high as 34 percent in some municipal land registries, meaning that in the worst cases more than one in three stored images may be a redundant copy. Storage costs for those redundant files alone are estimated at several hundred thousand euros annually across the city's major public repositories, though precise figures have not been made public.

The Atelier parisien d'urbanisme, known as APUR, which coordinates spatial data for the Greater Paris region, began a structured deduplication review in the first quarter of 2026. The review covers visual assets tied to Seine waterfront regeneration projects between the Pont de Bercy and the Pont d'Austerlitz — a corridor that has generated dense photographic documentation across multiple planning phases since 2019. Preliminary results, according to APUR's published methodology documents, suggest that deduplication tools applying perceptual hash algorithms — which detect images that look identical even if file sizes differ slightly — can reduce active image libraries by between 18 and 40 percent without any loss of unique content.
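To make the idea of a perceptual hash concrete, here is a minimal sketch of one simple variant, the "average hash". It is an illustration only: the article does not say which algorithm APUR's tools use, and the function names, the 8x8 grid and the distance threshold of 5 bits are all assumptions chosen for the example. Images are represented as plain grids of greyscale values so the sketch needs no image library.

```python
from typing import List

Gray = List[List[int]]  # rows of greyscale pixel values, 0-255


def shrink(img: Gray, size: int = 8) -> Gray:
    """Downsample to size x size by averaging rectangular blocks of pixels."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(size):
        r0 = r * h // size
        r1 = max((r + 1) * h // size, r0 + 1)  # at least one source row
        row = []
        for c in range(size):
            c0 = c * w // size
            c1 = max((c + 1) * w // size, c0 + 1)  # at least one source column
            block = [img[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def average_hash(img: Gray, size: int = 8) -> int:
    """One bit per cell of the shrunken image: 1 if brighter than the mean."""
    pixels = [p for row in shrink(img, size) for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    """Assumed threshold: hashes within 5 bits count as the same picture."""
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# Hypothetical test images: a horizontal gradient, a slightly brightened
# copy (standing in for a re-encoded file), and a vertical gradient.
original = [[x * 16 for x in range(16)] for _ in range(16)]
recompressed = [[min(255, p + 1) for p in row] for row in original]
different = [[y * 16 for _ in range(16)] for y in range(16)]

print(is_near_duplicate(original, recompressed))
print(is_near_duplicate(original, different))
```

Because the hash depends on the overall pattern of light and dark rather than on exact bytes, two files of different sizes can still produce the same or nearly the same hash. The threshold is the trade-off point: set it too loosely and distinct images are flagged as copies, which is where manual review comes in.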

The Grand Paris Express project office, overseeing 200 kilometres of new metro lines across the Île-de-France region, faces a version of the same problem at greater scale. Engineering documentation, site photography and BIM-linked imagery from dozens of construction sites from Saint-Denis to Créteil have been logged through at least four separate contractor platforms, each with its own naming conventions. Cross-referencing those platforms has been complicated by the absence of a shared metadata standard, leaving project managers unable to determine automatically how many unique images exist versus how many are copies filed under different identifiers.
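One common way to count unique files when identifiers cannot be trusted is to ignore the names and fingerprint the file contents instead. The sketch below assumes each record is a (platform, identifier, file bytes) tuple; the platform names and identifiers are invented for illustration and do not describe the project office's actual systems. It only catches byte-for-byte copies, so resized or re-encoded versions would need a perceptual comparison on top.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str, bytes]  # (platform, identifier, file contents)


def group_by_content(records: Iterable[Record]) -> Dict[str, List[Tuple[str, str]]]:
    """Group records by the SHA-256 digest of their bytes, ignoring names."""
    groups: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for platform, identifier, data in records:
        digest = hashlib.sha256(data).hexdigest()
        groups[digest].append((platform, identifier))
    return dict(groups)


def summarise(groups: Dict[str, List[Tuple[str, str]]]) -> Dict[str, int]:
    """Count stored files, distinct contents, and the redundant remainder."""
    total = sum(len(entries) for entries in groups.values())
    unique = len(groups)
    return {"total": total, "unique": unique, "redundant": total - unique}


# Hypothetical records: the same photo filed under two different
# identifiers on two platforms, plus two genuinely distinct files.
records = [
    ("platform_a", "SITE-014_IMG_0001.jpg", b"photo-bytes-1"),
    ("platform_b", "creteil/2023/0001.jpeg", b"photo-bytes-1"),
    ("platform_b", "creteil/2023/0002.jpeg", b"photo-bytes-2"),
    ("platform_c", "GPE_00031", b"photo-bytes-3"),
]

print(summarise(group_by_content(records)))
```

Each group with more than one entry is a set of copies filed under different identifiers, which is the figure project managers currently cannot obtain automatically.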

What Comes Next for the City's Image Databases

A working group convened under the city's Délégation générale à la transformation numérique is expected to publish technical recommendations before the end of September 2026. The group is examining a phased approach: first, automated deduplication of static historical archives where no active editing takes place; second, real-time deduplication protocols for new ingest pipelines connected to ongoing construction and urban planning projects.

For citizens and developers using the opendata.paris.fr portal, the practical effect of a successful deduplication programme would be faster search results and smaller download packages when pulling image-linked datasets. Researchers at the École nationale des chartes on the Rue des Francs-Bourgeois, which trains archivists and digital heritage specialists, have been piloting deduplication workflows since January 2025 as part of a broader curriculum update. Their early findings suggest that manual review of flagged duplicate clusters — necessary when automated systems produce false positives — adds roughly one working day per ten thousand images audited.
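As a rough illustration of what that review rate implies, the sketch below scales it linearly. The rate of one working day per ten thousand images is the figure reported above; the assumption that it holds at any volume, the function name and the 250,000-image batch size are ours, not the researchers'.

```python
def review_days(images_audited: int, days_per_10k: float = 1.0) -> float:
    """Estimated working days of manual review, assuming linear scaling."""
    return images_audited / 10_000 * days_per_10k


# A hypothetical batch of 250,000 images works out to 25 working days.
print(review_days(250_000))

# Auditing all of the roughly 14 million digitised images cited earlier
# would imply about 1,400 working days at the same assumed rate.
print(review_days(14_000_000))
```

In practice the effort would depend on how many clusters the automated pass flags and on how many reviewers work in parallel, so these numbers are an order-of-magnitude guide rather than a forecast.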

The city has set no firm public deadline for resolving the problem across all its institutions, but the pressure is real. Every month of delay means more redundant files accumulating, higher storage invoices arriving, and a digital heritage infrastructure increasingly difficult to trust.

About this article

Published by The Daily Paris

This article was produced by The Daily Paris editorial desk and covers news in Paris. See our editorial standards for how we use AI.
