The Daily Paris

Paris news, every day


How Paris's Digital Archives Ended Up Full of Doubles: The Story Behind the Duplicate Image Problem

A years-long accumulation of redundant photographs across city-run platforms has forced municipal archivists and cultural institutions to reckon with a sprawling, costly mess — and figure out how to clean it up.

By Paris News Desk · Published 4 July 2026, 9:23 pm

Photo: Darya Sannikova on Pexels

Paris's public digital image libraries contain tens of thousands of duplicate photographs. That is the working estimate held by archivists at the Bibliothèque historique de la Ville de Paris, where staff have been quietly auditing holdings since early 2025. The problem did not arrive overnight. It built up across more than a decade of overlapping digitisation drives, emergency uploads during the pandemic, and the chaotic content push that preceded the Paris 2024 Olympics.

The timing matters now because the city is deep into its post-Games legacy phase. Several major platforms — including the official Paris 2024 cultural heritage archive and the Seine-Saint-Denis urban memory project run out of the Département 93 — are being merged or migrated into unified municipal portals. Every duplicate in those systems costs storage, slows search functions, and, more practically, creates legal headaches around image rights when the same photograph appears under two different licensing records.

A Problem Built Layer by Layer

Trace it back and the origins are straightforward enough. The first mass digitisation push in Paris came through the Plan Numérique Patrimonial, which ran from roughly 2010 to 2016 and moved hundreds of thousands of archival images from institutions including the Musée Carnavalet on Rue des Francs-Bourgeois and the archives held at the Hôtel de Ville onto digital platforms. Those early uploads were done in batches, often by different contractors working to different file-naming conventions. Nobody was cross-checking in real time.

Then came the 2020 lockdowns. Cultural institutions across the city — from the Palais de Tokyo on Avenue du Président Wilson to local mairie branches in the 18th and 19th arrondissements — scrambled to get visual content online fast, sometimes uploading image sets that had already been partially digitised years before. The Grand Paris Express construction project added another layer: multiple communications agencies working simultaneously for different line consortia uploaded overlapping photographic documentation of tunnelling work, street-level impacts in communes like Saint-Denis and Villejuif, and public consultation events.

By the time the Paris 2024 organising committee began feeding images into the Agence France-Presse partnership archive and the city's own Plateforme Culturelle Numérique, nobody had a clean master inventory. Staff at the Archives de Paris on Rue des Quatre-Fils flagged the duplication issue internally as early as autumn 2023, but the pre-Games pressure meant the audit was deferred.

What a Cleanup Actually Involves

Deduplication at this scale is not a simple delete operation. Each image record carries metadata — date, photographer credit, rights status, acquisition cost — and duplicate records often carry conflicting information. A single photograph of the Pont d'Austerlitz renovation, for example, might sit in three separate collections under three different rights assignments. Resolving that requires human review, not just algorithmic matching.
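To illustrate why matching alone does not finish the job, here is a minimal Python sketch. It assumes a hypothetical `ImageRecord` structure and made-up sample data; none of the names or values come from the city's systems. It groups byte-identical files by SHA-256 hash, then routes any group whose rights assignments disagree to a human review queue instead of deleting anything.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical catalogue record; field names are illustrative only."""
    record_id: str
    collection: str
    rights_status: str
    content: bytes  # raw file bytes; a real system would read these from storage


def group_exact_duplicates(records):
    """Group records whose file bytes are identical, keyed by SHA-256 digest."""
    groups = defaultdict(list)
    for rec in records:
        digest = hashlib.sha256(rec.content).hexdigest()
        groups[digest].append(rec)
    # Only groups with more than one record are duplicates.
    return [group for group in groups.values() if len(group) > 1]


def needs_human_review(group):
    """A duplicate group needs review when its rights statuses conflict."""
    return len({rec.rights_status for rec in group}) > 1


# Made-up sample: one photograph held in three collections under three
# different rights assignments, plus one unrelated image.
records = [
    ImageRecord("A-1", "collection-a", "public-domain", b"bridge-photo"),
    ImageRecord("B-7", "collection-b", "agency-licence", b"bridge-photo"),
    ImageRecord("C-3", "collection-c", "rights-unknown", b"bridge-photo"),
    ImageRecord("D-9", "collection-a", "public-domain", b"another-photo"),
]

duplicate_groups = group_exact_duplicates(records)
review_queue = [g for g in duplicate_groups if needs_human_review(g)]

for group in review_queue:
    ids = ", ".join(rec.record_id for rec in group)
    print(f"Needs human review (conflicting rights): {ids}")
```

Exact hashing only catches byte-identical copies. Rescanned, resized, or re-encoded versions of the same photograph would need a different technique, such as perceptual hashing, which this sketch does not attempt.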

The European cultural sector has dealt with versions of this before. Europeana, the EU's aggregated digital heritage platform, publicly documented in 2022 that roughly 8 percent of its then-50-million-item collection contained some form of duplication. According to the organisation's published annual report for that period, beginning to address the problem cost an estimated €2.3 million in a single year. Paris's municipal holdings are smaller, but the rights complexity is comparable.

The city's Direction des Affaires Culturelles has confirmed a deduplication and metadata standardisation programme is budgeted for 2026-2027, though specific funding figures have not been made public. The Archives de Paris is understood to be the lead institution. Smaller bodies — neighbourhood cultural centres, school digitisation projects in banlieue communes connected through the Grand Paris Express corridor — will need to align their own systems with whatever standard emerges from that central process.

For institutions holding image collections right now, the practical advice from archivists is consistent: freeze new uploads to any platform scheduled for migration, document existing rights records for every image regardless of apparent duplication, and do not delete anything unilaterally before the city's standardisation framework is published. The cleanup is coming. The institutions that prepare their own inventories first will move through it fastest.


About this article

Published by The Daily Paris

This article was produced by The Daily Paris editorial desk and covers news in Paris. See our editorial standards for how we use AI.
