The Daily Paris

Paris news, every day


How Paris's Duplicate Image Problem Became a Heritage Crisis: The Story Behind the Cleanup

Years of fragmented digital archiving across city agencies left thousands of redundant photographs clogging the capital's cultural databases — and the bill for fixing it is finally arriving.

By Paris News Desk · Published 4 July 2026, 8:51 pm



Paris's cultural institutions are sitting on a data problem that predates the smartphone era. At least three major municipal archives, including the Bibliothèque historique de la Ville de Paris on Rue Pavée and the Musée Carnavalet, both in the Marais, have independently confirmed they are running deduplication programmes this year after internal audits revealed massive overlaps in their digitised photograph collections. The trigger: a 2025 directive from the Direction des Affaires Culturelles de Paris requiring all city-funded repositories to align their holdings with a single interoperable catalogue by January 2027.

The timing matters. Paris spent much of 2024 and early 2025 digitising at speed to serve the Olympics legacy agenda, pushing tens of thousands of images of Seine-side venues, Grand Paris Express construction sites, and banlieue regeneration projects into public-access servers. That sprint, undertaken across agencies that rarely communicated, produced exactly the kind of redundancy conservators had warned about for years. The same photograph of the Pont de Bercy, shot from the same angle on the same afternoon in September 2023, turned up in at least four separate institutional databases, each catalogued under a slightly different metadata tag.

A Decade of Siloed Archiving

The structural cause is straightforward. Between 2015 and 2024, Paris's cultural infrastructure expanded fast — the Philharmonie de Paris in the 19th arrondissement, the Fondation Cartier's expanded digital outreach, the city's Grand Paris Express documentation contracts — but each entity developed its own image management protocol. The Établissement public territorial Est Ensemble, which covers nine communes in the inner northeastern suburbs, ran a separate visual archive for its urban renewal programme that overlapped substantially with documentation produced by Plaine Commune to the north. Neither repository was networked to the other.

A 2024 internal report by the Atelier parisien d'urbanisme, known as APUR, identified the core issue: municipal photography contracts often failed to specify exclusive delivery, meaning a single photographic agency could legitimately deliver near-identical frames to multiple clients simultaneously. One contract reviewed by APUR covered documentation of the Porte de la Chapelle Arena — built for the 2024 Olympics — and resulted in more than 800 near-duplicate images distributed across four separate institutional archives, according to the report's findings.

The cost of storage is not trivial. Professional-grade archival servers of the kind used by the Musée Carnavalet typically run between €15,000 and €40,000 per petabyte annually once maintenance, migration, and access infrastructure are factored in. With duplicates estimated to account for between 20 and 35 percent of total holdings in some city collections, the redundancy represents a recurring and avoidable budget line at a moment when the Mairie de Paris is facing real fiscal pressure from housing subsidy commitments and the ongoing Grand Paris Express cost overruns.
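
To give a sense of scale, the figures above can be combined in a rough back-of-the-envelope calculation. The sketch below is illustrative only: the collection size of 0.5 petabytes is a hypothetical value (no holdings figure has been published), and the function name is invented for the example. Only the per-petabyte cost range and the duplicate share range come from the reporting.

```python
def duplicate_storage_cost(holdings_pb, cost_per_pb_eur, duplicate_share):
    """Annual storage cost in euros attributable to duplicate files."""
    return holdings_pb * cost_per_pb_eur * duplicate_share


# Hypothetical collection size in petabytes (assumption, not a reported figure).
HOLDINGS_PB = 0.5

# Reported ranges: EUR 15,000 to 40,000 per petabyte per year,
# with duplicates making up 20 to 35 percent of holdings.
low = duplicate_storage_cost(HOLDINGS_PB, 15_000, 0.20)
high = duplicate_storage_cost(HOLDINGS_PB, 40_000, 0.35)

print(f"Estimated annual cost of duplicates: EUR {low:,.0f} to EUR {high:,.0f}")
```

Because the cost scales linearly with collection size, the same function can be rerun with any institution's real holdings once those numbers are known.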

What the Cleanup Will Require

The deduplication process is not simply a matter of deleting files. Cultural archivists at institutions like the Médiathèque de l'Architecture et du Patrimoine, based in Charenton-le-Pont just southeast of the périphérique, argue that determining which version of a duplicate image is the authoritative one requires human curatorial judgment: checking original resolution, provenance documentation, and chain-of-custody metadata. Automated hash-matching software, which flags files that are pixel-identical or very nearly so, handles only the easiest cases. The messier problem is looser near-duplicates: images taken seconds apart, or scans of the same physical print made at different times with marginally different colour calibration.
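
The difference between the easy and the messy cases can be shown with a minimal sketch. It is not the software any of the institutions is using; it simply contrasts exact matching (a SHA-256 digest of the file bytes) with a very simple perceptual "average hash" that tolerates small calibration shifts. The file names and the tiny 2x2 grayscale grids are hypothetical, and a real pipeline would first downscale each image to a small grayscale grid with an imaging library.

```python
import hashlib
from collections import defaultdict


def exact_duplicate_groups(files):
    """Group file names whose bytes are identical, using SHA-256 digests."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def average_hash(pixels):
    """Simple perceptual hash: one bit per pixel, set if brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)


def hamming(a, b):
    """Number of differing bits between two equal-length hashes."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 2x2 grayscale grids standing in for downscaled images.
scan_a = [[10, 200], [220, 30]]   # first scan of a print
scan_b = [[14, 205], [225, 35]]   # same print, slightly brighter calibration
other = [[200, 10], [30, 220]]    # a different photograph

# Hypothetical file names; two repositories hold the same bytes.
files = {
    "bhvp/pont.tif": bytes([10, 200, 220, 30]),
    "carnavalet/pont.tif": bytes([10, 200, 220, 30]),
    "apur/pont_rescan.tif": bytes([14, 205, 225, 35]),
}

print(exact_duplicate_groups(files))                         # rescan is not caught
print(hamming(average_hash(scan_a), average_hash(scan_b)))   # small distance
print(hamming(average_hash(scan_a), average_hash(other)))    # large distance
```

The exact match misses the rescan because a single changed byte produces a different digest, while the perceptual hash treats the two scans as the same picture. Neither approach says which copy is authoritative, which is the part archivists say still needs a human.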

The January 2027 deadline set by the Direction des Affaires Culturelles gives institutions roughly 18 months to complete submissions to the unified catalogue. Archivists at several institutions have already begun the work, and the Bibliothèque historique is understood to be piloting AI-assisted metadata reconciliation software sourced through a European consortium. For anyone who uses Paris's public image archives for research, journalism, or urban planning, the practical advice is to treat current catalogue entries with caution, cross-reference across repositories, and expect significant metadata corrections to continue rolling through the system well into 2027.


About this article

Published by The Daily Paris

This article was produced by The Daily Paris editorial desk and covers news in Paris. See our editorial standards for how we use AI.
