The Daily Paris

Paris news, every day

Paris Museums Are Drowning in Duplicate Digital Images — And the Numbers Tell a Damning Story

A growing audit of Paris's cultural institutions reveals thousands of redundant image files clogging digital archives, costing storage budgets and slowing public access to the city's collections.

By Paris News Desk · Published 4 July 2026, 8:45 pm

Photo: Lajos Kristóf Kántor on Pexels

At least 340,000 duplicate image files are sitting in the digital archive systems of Paris's major public museums, according to figures compiled during a 2025 audit of the city's cultural digitisation programme. The duplication costs institutions an estimated €1.2 million a year in unnecessary cloud storage and IT maintenance. The figures, drawn from the Direction des Affaires Culturelles de Paris, the municipal body overseeing 14 civic museums, land at a politically uncomfortable moment: the city continues to argue that its post-Olympics legacy includes world-class digital infrastructure.

Why does this matter now? The Grand Paris cultural digitisation push accelerated sharply after the Paris 2024 Olympics, when the city pledged to make its collections more accessible to global audiences. Millions of euros flowed into scanning programmes at institutions including the Musée Carnavalet in the Marais and the Petit Palais on Avenue Winston Churchill. The problem is that no unified deduplication standard was applied across those institutions, meaning the same photograph of a Haussmann-era building facade, for example, might exist in 11 slightly different file versions — different resolutions, different colour profiles, different filenames — none of them flagged as redundant.
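To illustrate why such variants slip past a filename or byte-level comparison, the sketch below shows one common way of spotting near-duplicates: an average hash, a simple form of perceptual hashing. It is a minimal illustration under stated assumptions, not the method used by any Paris institution. Images are represented as plain grids of grayscale values so the example stays self-contained (a real pipeline would first decode files with an imaging library), and the function names and sample gradients are hypothetical.

```python
def shrink(pixels, size=8):
    """Downscale a 2D grayscale grid to size x size by averaging blocks.

    Assumes the grid is rectangular and at least `size` pixels on each side.
    """
    h, w = len(pixels), len(pixels[0])
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        row = []
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(pixels, size=8):
    """Return an integer fingerprint: one bit per cell, set if above the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | int(v > mean)
    return bits


def hamming(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


# Hypothetical sample: a 16x16 horizontal gradient standing in for a scan.
original = [[x * 16 for x in range(16)] for _ in range(16)]
# The same picture at double resolution (each pixel repeated 2x2).
upscaled = [[original[y // 2][x // 2] for x in range(32)] for y in range(32)]
# The same picture, uniformly brighter (a crude stand-in for a profile change).
brighter = [[v + 10 for v in row] for row in original]
# A genuinely different picture: a vertical gradient.
different = [[y * 16 for _ in range(16)] for y in range(16)]

for name, img in [("upscaled", upscaled), ("brighter", brighter), ("different", different)]:
    print(name, hamming(average_hash(original), average_hash(img)))
```

A small distance marks two files as candidates for the same underlying image. The cut-off is a tuning choice, and in an archive setting matches would normally go to human review rather than automatic deletion.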

The Scale of the Problem, Institution by Institution

The Carnavalet alone, Paris's museum of city history, which reopened in May 2021 after a major renovation, holds an estimated 680,000 digitised objects. Internal documentation reviewed as part of the 2025 audit found that roughly one in six image records in its system had at least one near-duplicate counterpart. At the Bibliothèque Historique de la Ville de Paris on Rue Pavée in the 4th arrondissement, a team of six technicians has been deduplicating records by hand since January 2026. Experts describe the process as strikingly labour-intensive, given that automated tools capable of the same task at scale have been commercially available since at least 2018.

The problem extends to the Grand Paris Express infrastructure project. The Société du Grand Paris (renamed the Société des Grands Projets in 2023), which manages the metro expansion, keeps its own image archive of construction documentation covering tunnels, station designs and engineering schematics. Sources familiar with its data management acknowledge that cross-department file sharing has generated significant duplication since works began ramping up in 2019. No figure for the size of the Grand Paris Express archive has been independently confirmed, but the scale of the construction programme, with 200 kilometres of new lines and 68 new stations, suggests the documentation archive runs into the millions of files.

What the Numbers Mean for Budgets and Public Access

Cloud storage pricing in the French public sector typically runs through framework agreements with providers certified under the SecNumCloud standard overseen by ANSSI, France's national cybersecurity agency. Per-gigabyte annual costs under such agreements generally range from €0.02 to €0.06, which sounds trivial until you consider that a single high-resolution museum scan can exceed 500 megabytes. Multiply that by hundreds of thousands of duplicates and the storage waste becomes material. The Direction des Affaires Culturelles puts its total digital archive storage bill for 2025 at approximately €3.8 million — a figure that its own internal audit suggests could be reduced by close to a third through systematic deduplication.
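As a back-of-envelope check, the figures quoted above can be combined directly. The sketch below is illustrative only: it assumes every one of the 340,000 duplicates is a 500-megabyte scan (the text says a single scan "can exceed" that size, so real files may be smaller or larger), treats 500 megabytes as 0.5 gigabytes, and uses hypothetical variable names.

```python
# Figures taken from the article; the per-file size is an assumption.
DUPLICATE_FILES = 340_000
GB_PER_FILE = 0.5                    # assumed: every duplicate is a 500 MB scan
EUR_PER_GB_YEAR = (0.02, 0.06)       # quoted per-gigabyte annual price range
STORAGE_BILL_2025_EUR = 3_800_000    # quoted total archive storage bill


def annual_storage_cost(n_files, gb_per_file, eur_per_gb_year):
    """Yearly per-gigabyte storage fee for n_files of a given size."""
    return n_files * gb_per_file * eur_per_gb_year


wasted_gb = DUPLICATE_FILES * GB_PER_FILE
low = annual_storage_cost(DUPLICATE_FILES, GB_PER_FILE, EUR_PER_GB_YEAR[0])
high = annual_storage_cost(DUPLICATE_FILES, GB_PER_FILE, EUR_PER_GB_YEAR[1])
one_third_of_bill = STORAGE_BILL_2025_EUR / 3

print(f"Duplicate volume: {wasted_gb:,.0f} GB")
print(f"Per-gigabyte fees for duplicates: EUR {low:,.0f} to EUR {high:,.0f} a year")
print(f"One third of the 2025 bill: EUR {one_third_of_bill:,.0f}")
```

On these assumptions the per-gigabyte fees for the duplicates alone come to between €3,400 and €10,200 a year, while a third of the 2025 bill is roughly €1.27 million. The bulk of the estimated saving would therefore have to come from IT maintenance and other overheads, which the audit figures cited here do not itemise.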

Beyond money, there is a discoverability problem. Researchers and members of the public accessing Paris Musées's open-data portal — which logged more than 2.1 million asset downloads in 2024 — regularly encounter duplicate entries that fragment search results. An image of the same 19th-century street map might surface under four different catalogue entries, each with slightly different metadata, forcing users to manually cross-reference what should be a single record.

The Direction des Affaires Culturelles is expected to publish a revised digitisation charter before the end of 2026, with deduplication protocols forming a core section. Institutions that receive municipal digitisation grants will reportedly be required to run automated hash-checking before any new batch of images enters the central Paris Musées database. In its standard form, the technique computes a fingerprint of each file and flags byte-identical copies; catching near-identical files, such as the same scan saved at a different resolution, generally requires a perceptual variant. For Parisians who rely on those open collections for research, education, or simple curiosity, that change cannot arrive soon enough.
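For readers curious what such a pre-ingest check might look like, here is a minimal sketch using SHA-256 from Python's standard library. The charter has not been published, so this is not its specification: the function names, the `known_digests` set standing in for the central database, and the workflow are all assumptions. Exact hashing of this kind catches only byte-identical files.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large scans need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(paths):
    """Group files by digest and keep only groups with more than one file."""
    groups = defaultdict(list)
    for p in paths:
        groups[sha256_of(p)].append(Path(p))
    return {d: ps for d, ps in groups.items() if len(ps) > 1}


def filter_new_batch(batch_paths, known_digests):
    """Split a batch into files to ingest and files already held.

    `known_digests` is a hypothetical stand-in for the digests recorded
    in the central database; it is updated as files are accepted.
    """
    accepted, rejected = [], []
    for p in batch_paths:
        d = sha256_of(p)
        if d in known_digests:
            rejected.append(Path(p))
        else:
            known_digests.add(d)
            accepted.append(Path(p))
    return accepted, rejected
```

Calling `filter_new_batch(paths, known_digests)` on each incoming batch would let through only files whose bytes have not been seen before, whether in the database or earlier in the same batch. Renamed copies are caught because the filename plays no part in the digest.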

About this article

Published by The Daily Paris

This article was produced by The Daily Paris editorial desk and covers news in Paris. See our editorial standards for how we use AI.
