Paris's Duplicate Image Problem: The Numbers Hiding in Plain Sight
A growing wave of repeated and recycled visuals is distorting how the capital documents itself online — and the data tells a surprisingly precise story.

More than 340,000 digital images tagged with Paris location data are flagged as near-duplicate or exact-duplicate files across major French public archives and municipal databases, according to figures compiled by the Institut national de l'audiovisuel and cross-referenced with the Bibliothèque nationale de France's Gallica platform. The problem is not new, but it has accelerated sharply since the Paris 2024 Olympics, when a surge of official photography flooded civic repositories within a matter of weeks.
The stakes are higher than they look. When city planners, journalists, architects and residents reach for reference images — of the Seine riverbanks under regeneration, of the Grand Paris Express construction sites, of the banlieues east of the Périphérique — they are increasingly pulling the same dozen photographs recycled across hundreds of records. Bad data compounds bad decisions.
The problem clusters around three zones. The Trocadéro esplanade, the Pont de Bir-Hakeim and the Canal Saint-Martin account for a disproportionate share of repeated imagery in the Paris municipal photo library maintained by the Direction de l'Urbanisme. Internal audits of that library, whose holdings have grown by roughly 18 percent since January 2024, found that for every ten images tagged "Canal Saint-Martin, 10e arrondissement", between three and four were near-identical duplicates: same framing, same light, and often the same file with its timestamp stripped, re-uploaded under a different file name.
The Grand Paris Express project, managed by Société du Grand Paris, is a separate case study. Construction documentation for the new Line 15 South — running beneath Issy-les-Moulineaux, Bagneux and Villejuif — is required by procurement rules to include photographic progress records. Contractors submit those records monthly. An independent review commissioned earlier this year found that one civil engineering firm submitted 1,247 images across six monthly reports, of which 214 were flagged as duplicates, including 31 that were identical files renamed to suggest they had been taken on different dates.
That 17 percent duplication rate within a single contractor's submission is not considered exceptional by archivists working in the field. The BnF's Gallica team has noted that the platform's Paris-related holdings, which run to some 2.1 million digitised items, carry an estimated duplication rate of between 8 and 12 percent for photographs taken after 2010, when smartphone uploads began flowing through aggregator pipelines directly into national repositories.
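The renamed-but-identical files in the contractor's submission are the simplest kind of duplicate to catch, because their contents match byte for byte. The sketch below is a hypothetical illustration, not the method used in the independent review: it groups files by a SHA-256 digest of their contents and recomputes the duplication rate from the figures reported above. The file names and byte strings are invented placeholders.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Mapping


def group_exact_duplicates(files: Mapping[str, bytes]) -> List[List[str]]:
    """Group file names whose contents are byte-for-byte identical."""
    by_digest: Dict[str, List[str]] = defaultdict(list)
    for name, content in files.items():
        # Identical bytes always produce the same digest, whatever the file name.
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


def duplication_rate(flagged: int, total: int) -> float:
    """Share of a submission flagged as duplicate, as a fraction of 1."""
    if total <= 0:
        raise ValueError("total must be positive")
    return flagged / total


# Hypothetical submission: two differently named files with the same bytes.
submission = {
    "report_01_img.jpg": b"placeholder bytes for frame A",
    "report_02_img.jpg": b"placeholder bytes for frame A",
    "report_03_img.jpg": b"placeholder bytes for frame B",
}

print(group_exact_duplicates(submission))
print(f"{duplication_rate(214, 1247):.1%}")  # figures from the review: 214 of 1,247
```

A content digest only catches exact copies. A file whose metadata has been stripped or whose pixels have been re-encoded will hash differently, which is why near-duplicates need a different technique.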
Storage is the obvious line item. Paris city hall's Direction des Systèmes et Technologies de l'Information estimates that eliminating confirmed duplicates from its core urban documentation servers would free roughly 4.7 terabytes of managed storage, for a recurring annual saving in the low six figures in euros. That figure does not include the human cost: archivists and data officers at the Pavillon de l'Arsenal, the city's urban planning resource centre on the Boulevard Morland in the 4th arrondissement, spend an estimated 15 percent of their cataloguing time on deduplication tasks that automation could handle.
The political dimension matters too. As the Macron government faces pressure in the National Assembly over municipal spending efficiency, any audit of city-funded digital infrastructure becomes potential ammunition. Housing activists in Seine-Saint-Denis, the département numbered 93, have already argued in public meetings in Aubervilliers and Saint-Denis that documentation gaps, partly caused by unreliable imagery records, have slowed the processing of planning applications for urban regeneration along the Seine.
Three practical steps are already in motion. The Pavillon de l'Arsenal launched a perceptual hashing pilot in March 2026, using open-source tools to compare image fingerprints before accession into the main catalogue. Société du Grand Paris has updated its contractor documentation brief, effective from the June 2026 contract cycle, to require metadata integrity checks at source. And the BnF has set a target of reducing its post-2010 duplication rate to below 5 percent by the end of 2027. Whether the timetable holds will depend on budget allocations still being negotiated ahead of the autumn finance calendar — but the numbers, at least, are no longer being ignored.
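For readers unfamiliar with perceptual hashing, the idea is to reduce each image to a short fingerprint that changes little when the picture is resized, recompressed or slightly brightened. The sketch below shows one of the simplest variants, an average hash, written in plain Python over a grid of grayscale values. It is an illustration under stated assumptions, not the pilot's actual tooling: the 8x8 grid, the distance threshold of 5 and all function names are choices made for this example.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[int]]  # grayscale pixels (0-255), one inner sequence per row


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Return a size*size-bit fingerprint of a grayscale image."""
    height, width = len(pixels), len(pixels[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the hash grid")
    # Downscale: average each block of source pixels into one cell.
    cells: List[float] = []
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    # Each bit records whether a cell is brighter than the overall mean.
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their fingerprints nearly match.

    The threshold of 5 is an assumption for this sketch, not a standard value.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Synthetic 16x16 test images: a gradient, a slightly brighter copy, and its mirror.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[value + 3 for value in row] for row in original]
mirrored = [list(reversed(row)) for row in original]

print(is_near_duplicate(original, brighter))
print(is_near_duplicate(original, mirrored))
```

Comparing fingerprints by Hamming distance rather than by equality is what lets the check tolerate small edits; a production pipeline would also need to decode real image files and tune the threshold against a labelled sample.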
About this article
Published by The Daily Paris