Paris archivists and urban data managers have spent the better part of three years systematically purging duplicate images from the city's public-facing digital repositories, a quiet but consequential housekeeping effort that is now drawing attention from municipal governments across Europe and Asia. The scale of the problem was larger than most officials expected: internal audits conducted between 2023 and 2025 identified tens of thousands of redundant image files across portals managed by the Mairie de Paris and associated agencies, slowing database queries and inflating storage costs at a time when the city's IT budget is already stretched by Grand Paris Express commitments.
The urgency sharpened after Paris 2024. The Olympic and Paralympic Games generated an enormous surge in official photography (accreditation images, venue documentation, press-pool assets) that flowed into shared municipal and cultural databases. By late 2024, the Bibliothèque nationale de France's Gallica platform and the Atelier parisien d'urbanisme, known as APUR, were each dealing with redundancy rates that insiders described as administratively untenable, though neither organisation has published a precise figure. What is clear is that the problem prompted a formal inter-agency working group, convened in early 2025, to agree on deduplication standards for images held across Paris's fragmented network of public data portals.
What Paris Is Actually Doing — And Where
The practical work has been concentrated in two distinct areas. At the BnF's site on Quai François-Mauriac in the 13th arrondissement, technical staff have been running perceptual hashing across Gallica's digitised photographic collections since February 2025. The method derives a compact fingerprint from each image's visual content, so it catches near-identical images even when file names or metadata differ. The BnF has not released deduplication statistics, but the effort forms part of the institution's broader 2025–2027 digital strategy, a publicly available document that lists storage efficiency as a headline objective.
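The BnF has not said which perceptual hashing algorithm it uses. As an illustration only, the sketch below implements one of the simplest variants, an average hash, on grayscale pixel grids. The function names, the 8x8 grid and the distance threshold are assumptions made for this example, and the input image is assumed to be at least 8 pixels in each dimension.

```python
from typing import List

Gray = List[List[int]]  # rows of grayscale values, 0-255


def shrink(pixels: Gray, size: int = 8) -> List[float]:
    """Downsample to size x size cells by averaging rectangular blocks."""
    height, width = len(pixels), len(pixels[0])
    cells = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    return cells


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Set one bit per cell: 1 if the cell is brighter than the mean cell."""
    cells = shrink(pixels, size)
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their hashes are close.

    max_distance is an illustrative threshold, not a published setting.
    """
    return hamming(average_hash(a), average_hash(b)) <= max_distance
```

Because the hash depends only on which regions are brighter than average, a copy that has been renamed, re-tagged or slightly brightened produces the same or a very similar fingerprint, while a visually different image produces a distant one.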
Separately, APUR, whose offices sit near the Hôtel de Ville in the 4th arrondissement, has been rationalising the aerial and street-level imagery it uses for urban planning analysis, particularly datasets tied to Seine riverbank regeneration projects between the Pont de Bercy and the Pont d'Iéna. Planners working on the ZAC Austerlitz redevelopment zone told colleagues at a January 2026 professional forum that duplicate imagery had in some cases led to conflicting measurements being used in early-stage design briefs.
How Paris Compares With London, Amsterdam and Seoul
Other major cities have approached the same problem with varying degrees of urgency. Transport for London began a structured deduplication programme for its CCTV and asset-inspection image libraries in 2022, partly driven by UK government guidance on public-sector data efficiency issued that year. Amsterdam's municipal archive, the Stadsarchief, completed a major deduplication pass across its 750,000-item digital photographic collection by mid-2024, publishing its methodology openly — a level of transparency Paris has not yet matched. Seoul's Smart City data team, operating under the city's Digital Master Plan adopted in 2023, automated image deduplication at the point of upload across all district-level portals, effectively preventing the backlog from accumulating in the first place.
Paris, by contrast, is still working reactively rather than preventively. The inter-agency working group has proposed a common metadata schema that would allow real-time duplicate detection across participating portals, but as of July 2026 that schema remains in draft form. Adoption by agencies outside the BnF and APUR, including those managing images for the Paris Habitat social housing authority and the Île-de-France Mobilités transport network, has not yet been formally confirmed.
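The draft schema has not been published, so what follows is only a hypothetical sketch of what a check at the point of upload against a shared index could look like. The record fields, class name and portal labels are invented for illustration. It uses an exact SHA-256 match on file bytes, which catches byte-identical copies only, not the near-duplicates a perceptual hash would find.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ImageRecord:
    # Hypothetical fields; the working group's actual schema is still in draft.
    portal: str
    identifier: str
    sha256: str


class SharedIndex:
    """In-memory stand-in for an index shared by participating portals."""

    def __init__(self) -> None:
        self._by_hash: Dict[str, ImageRecord] = {}

    def register(self, portal: str, identifier: str, data: bytes) -> Optional[ImageRecord]:
        """Return the existing record if these bytes are already indexed.

        Otherwise store a new record and return None.
        """
        digest = hashlib.sha256(data).hexdigest()
        existing = self._by_hash.get(digest)
        if existing is not None:
            return existing
        self._by_hash[digest] = ImageRecord(portal, identifier, digest)
        return None
```

A portal would call `register` before accepting a file. A non-`None` result identifies the portal and identifier that already hold the same bytes, so the upload can be rejected or linked to the existing item instead of being stored again.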
For residents and researchers who use public data, the immediate practical upshot is limited but real. Searches on the Opendata Paris portal at opendata.paris.fr return cleaner results than they did two years ago, and the platform's image-related datasets have been flagged for review under a rolling six-month audit cycle that began in October 2025. Anyone relying on those datasets for research, journalism or urban planning work should cross-reference against APUR's own data portal, which publishes its update logs, to verify they are working from the most current, non-duplicated source. The city's next formal progress report on its data governance framework is expected before the end of 2026.