Paris's public digital image libraries contain tens of thousands of duplicate photographs. That is the working estimate held by archivists at the Bibliothèque historique de la Ville de Paris, where staff have been quietly auditing holdings since early 2025. The problem did not arrive overnight. It built up across more than a decade of overlapping digitisation drives, emergency uploads during the pandemic, and the chaotic content push that preceded the Paris 2024 Olympics.
The timing matters now because the city is deep into its post-Games legacy phase. Several major platforms — including the official Paris 2024 cultural heritage archive and the Seine-Saint-Denis urban memory project run out of the Département 93 — are being merged or migrated into unified municipal portals. Every duplicate in those systems costs storage, slows search functions, and, more practically, creates legal headaches around image rights when the same photograph appears under two different licensing records.
A Problem Built Layer by Layer
Trace it back and the origins are straightforward enough. The first mass digitisation push in Paris came through the Plan Numérique Patrimonial, which ran from roughly 2010 to 2016 and moved hundreds of thousands of archival images from institutions including the Musée Carnavalet on Rue des Francs-Bourgeois and the archives held at the Hôtel de Ville onto digital platforms. Those early uploads were done in batches, often by different contractors working to different file-naming conventions. Nobody was cross-checking in real time.
Then came the 2020 lockdowns. Cultural institutions across the city — from the Palais de Tokyo on Avenue du Président Wilson to local mairie branches in the 18th and 19th arrondissements — scrambled to get visual content online fast, sometimes uploading image sets that had already been partially digitised years before. The Grand Paris Express construction project added another layer: multiple communications agencies working simultaneously for different line consortia uploaded overlapping photographic documentation of tunnelling work, street-level impacts in communes like Saint-Denis and Villejuif, and public consultation events.
By the time the Paris 2024 organising committee began feeding images into the Agence France-Presse partnership archive and the city's own Plateforme Culturelle Numérique, nobody had a clean master inventory. Staff at the Archives de Paris on Rue des Quatre-Fils flagged the duplication issue internally as early as autumn 2023, but the pre-Games pressure meant the audit was deferred.
What a Cleanup Actually Involves
Deduplication at this scale is not a simple delete operation. Each image record carries metadata — date, photographer credit, rights status, acquisition cost — and duplicate records often carry conflicting information. A single photograph of the Pont d'Austerlitz renovation, for example, might sit in three separate collections under three different rights assignments. Resolving that requires human review, not just algorithmic matching.
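The split between algorithmic matching and human review can be illustrated with a minimal sketch. It assumes duplicates are byte-identical files (so a SHA-256 hash of the file contents is enough to group them) and that each record exposes a rights field. The `ImageRecord` class, its field names, and the sample values are hypothetical and are not drawn from any Paris system. Resized or re-encoded copies would need a perceptual matching step that is not shown here.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    # Hypothetical record layout, for illustration only.
    record_id: str
    collection: str
    rights_status: str
    image_bytes: bytes


def content_key(record: ImageRecord) -> str:
    """Hash the raw file contents; identical files share a key."""
    return hashlib.sha256(record.image_bytes).hexdigest()


def find_duplicate_groups(records):
    """Group records by content hash, keeping groups with 2+ members."""
    groups = defaultdict(list)
    for record in records:
        groups[content_key(record)].append(record)
    return [group for group in groups.values() if len(group) > 1]


def needs_human_review(group) -> bool:
    """A group is flagged when its members disagree on rights status."""
    return len({record.rights_status for record in group}) > 1


if __name__ == "__main__":
    # Invented sample: one photograph held in three collections.
    photo = b"pont-austerlitz-renovation"
    records = [
        ImageRecord("A-001", "collection-a", "public-domain", photo),
        ImageRecord("B-417", "collection-b", "agency-licence", photo),
        ImageRecord("C-092", "collection-c", "rights-unknown", photo),
        ImageRecord("A-002", "collection-a", "public-domain", b"other"),
    ]
    for group in find_duplicate_groups(records):
        ids = [record.record_id for record in group]
        action = "human review" if needs_human_review(group) else "auto-merge candidate"
        print(ids, action)
```

In this sketch nothing is deleted: the code only sorts duplicate groups into those whose rights records agree and those that a person has to resolve.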
The European cultural sector has dealt with versions of this before. Europeana, the EU's aggregated digital heritage platform, publicly documented in 2022 that roughly 8 percent of its then-50-million-item collection, or about 4 million items, contained some form of duplication. Beginning to address that duplication cost an estimated €2.3 million in a single year, according to the organisation's published annual report for that period. Paris's municipal holdings are smaller, but the rights complexity is comparable.
The city's Direction des Affaires Culturelles has confirmed a deduplication and metadata standardisation programme is budgeted for 2026-2027, though specific funding figures have not been made public. The Archives de Paris is understood to be the lead institution. Smaller bodies — neighbourhood cultural centres, school digitisation projects in banlieue communes connected through the Grand Paris Express corridor — will need to align their own systems with whatever standard emerges from that central process.
For institutions holding image collections right now, the practical advice from archivists is consistent: freeze new uploads to any platform scheduled for migration, document existing rights records for every image regardless of apparent duplication, and do not delete anything unilaterally before the city's standardisation framework is published. The cleanup is coming. The institutions that prepare their own inventories first will move through it fastest.