A coordinated effort to purge tens of thousands of duplicate and mislabelled images from Paris's public digital collections moved into a new phase this week, as the Bibliothèque nationale de France and the Musée Carnavalet both confirmed they are running parallel remediation programmes targeting errors that have compounded over more than a decade of digitisation.
The problem is not trivial. Digital asset managers working across Paris institutions say that automated bulk-scanning campaigns — accelerated sharply after the Covid-19 closures of 2020 and 2021, when in-person cataloguing stalled — left repositories riddled with repeated image files carrying contradictory metadata. A single photograph of the Pont Neuf, for instance, might exist under four different accession numbers, attributed to three different photographers, with two conflicting dates.
Why This Week Matters
The issue resurfaced publicly on Tuesday, when the Agence parisienne du climat published a progress note on its open-data portal. The note flagged that the agency's urban heat-mapping image library, which is used to inform policy decisions about the Seine riverbanks and the Grand Paris Express construction corridor, contained duplicate thermal survey photographs that had skewed analysis in at least two neighbourhood studies. The agency did not specify which studies were affected, but noted that its technical team had identified the problem during a routine audit in late June 2026.
That disclosure rattled institutions already under pressure. The Paris city government's Direction des Affaires Culturelles, which oversees digital collections spanning everything from Haussmann-era building permits to contemporary street photography of Saint-Denis and Aubervilliers, has been operating since March 2026 under a new data-quality charter. That charter, adopted after criticism from the Cour des comptes over inconsistencies in public digital archives, requires all affiliated bodies to certify their collections are free of structural duplication by 31 December 2026.
The Carnavalet, the museum of the history of Paris, housed on the Rue de Sévigné in the Marais, has had a dedicated three-person team on the project since April. The BnF, whose Richelieu site on the Rue de Richelieu holds one of Europe's largest photographic collections, is using an AI-assisted deduplication tool developed in partnership with the École nationale des chartes. Neither institution has released a full count of affected files, but the BnF's annual report for 2025 noted that its Gallica platform hosts more than 7 million digitised images, a figure that suggests how many files even a small systematic error rate would affect.
The Practical Stakes for Urban Planning and Tourism
Beyond archival tidiness, the duplication problem has real-world consequences in two areas that matter to Paris right now. First, the Seine urban regeneration project — the stretch between the Bercy neighbourhood and the Pont d'Iéna that city planners are refashioning as part of the Paris 2024 Olympics legacy — relies on historical image databases to guide heritage assessments. Duplicate or misattributed photographs of quayside structures can generate false comparisons and delay planning approvals.
Second, the city's tourism bodies use digital image libraries extensively for promotional material. Paris Tourism, the office responsible for the city's international visitor communications, updated its licensing agreements with cultural institutions in January 2026 to include a data-quality clause. Under that clause, institutions supplying images for official campaigns must now warrant that files are unique and correctly attributed — a requirement that has added administrative pressure but also created a financial incentive to clean up databases quickly.
The remediation cost is not negligible. Industry benchmarks from similar European projects — including a 2024 exercise at the Rijksmuseum in Amsterdam — suggest that manual verification of disputed image records runs at roughly €3 to €5 per file when specialist cataloguers are involved. At even a modest error rate across collections the size of Gallica, that arithmetic adds up fast.
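To make that arithmetic concrete, the short sketch below applies the quoted €3 to €5 benchmark to a collection of 7 million images. The error rates are hypothetical, chosen only for illustration: neither institution has published one, and the function name is ours.

```python
# Illustrative only: the error rates below are assumptions, not reported figures.
GALLICA_IMAGES = 7_000_000      # "more than 7 million" per the BnF's 2025 annual report
COST_PER_FILE_EUR = (3, 5)      # manual verification benchmark range, in euros


def remediation_cost(total_files, error_rate, cost_range=COST_PER_FILE_EUR):
    """Return (affected_files, low_cost_eur, high_cost_eur) for a given error rate."""
    affected = round(total_files * error_rate)
    low, high = cost_range
    return affected, affected * low, affected * high


for rate in (0.005, 0.01, 0.02):    # hypothetical error rates
    affected, low, high = remediation_cost(GALLICA_IMAGES, rate)
    print(f"{rate:.1%}: {affected:,} files, EUR {low:,} to EUR {high:,}")
```

At an assumed rate of 1 per cent, that comes to 70,000 files and a manual verification bill of between €210,000 and €350,000.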
Institutions working to meet the December deadline are advised to prioritise collections that feed directly into public-facing platforms and planning databases before tackling purely archival backlogs. The Direction des Affaires Culturelles is expected to publish an interim progress report in September 2026, which will offer the first system-wide picture of how many files have been resolved — and how many remain contested.