The Daily Paris

Paris Takes a Harder Line on Duplicate Images in Public Databases Than London or Berlin

As cities race to clean up digitised heritage and housing records, Paris is betting on stricter automated auditing — but the backlog is still enormous.

By Paris News Desk · Published 4 July 2026, 9:16 pm

Photo: Fisher, Dorothy Canfield, 1879-1958 / Public domain (Wikimedia Commons)

French archivists and municipal data managers are confronting a problem that has quietly ballooned alongside the city's post-Olympics digital acceleration: thousands of duplicate images embedded across public-facing databases, from housing permit portals to the digitised collections of the Bibliothèque nationale de France. Paris city hall acknowledged the scope of the issue in a working document circulated to the Direction de l'Urbanisme in May 2026, which flagged that duplicate or near-duplicate image files were inflating storage costs and distorting search results in at least three major municipal platforms.

The timing matters. Paris spent heavily on digital infrastructure ahead of the 2024 Olympics, migrating records, launching new resident-facing portals and digitising decades of planning files to support the ongoing Grand Paris Express construction corridor. That sprint created ideal conditions for data bloat. Files were uploaded multiple times by different agencies, resized versions were stored alongside originals without consistent naming conventions, and no unified deduplication protocol existed across arrondissement-level services.

What Paris Is Doing — and Where It Falls Short

The city's current response centres on two programmes. The Atelier Parisien d'Urbanisme, known as APUR, began a structured image audit of its cartographic and photographic holdings in January 2026, targeting roughly 400,000 georeferenced images accumulated since 2018. Separately, the BnF's Gallica platform, which hosts more than eight million digitised documents accessible to the public, launched an internal deduplication initiative in late 2025 using perceptual hashing, a technique that identifies visually identical or near-identical images even when file names, dimensions or compression settings differ. BnF technical staff presented early results at a data governance seminar in Toulouse in March 2026, describing a first-pass duplicate rate of around 4.2 percent across specific periodical collections.
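For readers curious how perceptual hashing works in principle, the following is a minimal sketch of one common variant, the average hash. It is an illustration only and not the BnF's implementation, which has not been described in detail. The function names, the 8x8 grid and the distance threshold of 5 are assumptions chosen for the example, and images are represented as plain grids of grayscale values rather than real files.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values (assumed input format)


def shrink(img: Gray, size: int = 8) -> List[List[float]]:
    """Downscale to size x size by averaging the pixels in each cell.

    Assumes the image is at least `size` pixels in each dimension.
    """
    h, w = len(img), len(img[0])
    out = []
    for r in range(size):
        r0, r1 = r * h // size, (r + 1) * h // size
        row = []
        for c in range(size):
            c0, c1 = c * w // size, (c + 1) * w // size
            cell = [img[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(cell) / len(cell))
        out.append(row)
    return out


def average_hash(img: Gray, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: 1 where a cell is brighter than the mean."""
    flat = [v for row in shrink(img, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    """Treat two images as duplicates if their fingerprints differ in few bits.

    The threshold is a hypothetical value for illustration.
    """
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# Toy data: a horizontal gradient, the same picture at twice the size,
# and an unrelated vertical gradient.
original = [[x * 16 for x in range(16)] for _ in range(16)]
upscaled = [[original[y // 2][x // 2] for x in range(32)] for y in range(32)]
other = [[(15 - y) * 16 for _ in range(16)] for y in range(16)]

print(is_near_duplicate(original, upscaled))  # resized copy
print(is_near_duplicate(original, other))     # different picture
```

Because the fingerprint is computed from a shrunken version of the image, a resized copy lands on the same bits as its original, while a byte-level checksum of the two files would differ.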

That figure, while not catastrophic, represents tens of thousands of files. For a platform that serves researchers across Europe, the consequences are real: duplicated images inflate catalogue entries, confuse metadata tagging and in some cases cause images to surface multiple times in a single search query, burying distinct results. The BnF's deduplication work is expected to continue through the end of 2026, with full integration into Gallica's indexing system targeted for the first quarter of 2027.

The Marais neighbourhood's Centre Pompidou, which manages its own digital collection separately from the BnF, has not publicly disclosed a deduplication timeline. Its IRCAM research arm, based on the Place Igor-Stravinsky, is experimenting with AI-assisted image clustering tools developed in partnership with a French tech consortium, though those trials remain at the prototype stage.

How Paris Compares to London and Berlin

London and Berlin have moved earlier and with more standardised frameworks. The British Library completed a deduplication sweep of its digitised newspaper archive — roughly 900 million page images — in 2023 using an open-source toolkit developed with the Alan Turing Institute. The project reduced redundant storage by an estimated 11 percent and is now a reference case in European digital heritage circles. Berlin's Staatsbibliothek adopted a mandatory deduplication protocol for all new digitisation contracts from January 2024, meaning every vendor delivering scanned content must provide a hash-verified unique-image manifest before files are accepted into the central repository.
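What a hash-verified manifest might involve can be sketched in a few lines. The example below is an assumption about the general approach, not the Staatsbibliothek's actual specification, whose format is not described here. It uses SHA-256 from Python's standard library; the function name `build_manifest` and the sample file names are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def build_manifest(
    files: Dict[str, bytes],
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Map each unique SHA-256 digest to the first file that produced it.

    Returns (manifest, duplicates), where duplicates lists
    (duplicate_name, name_already_in_manifest) pairs.
    """
    manifest: Dict[str, str] = {}
    duplicates: List[Tuple[str, str]] = []
    for name in sorted(files):  # sorted so the result is deterministic
        digest = hashlib.sha256(files[name]).hexdigest()
        if digest in manifest:
            duplicates.append((name, manifest[digest]))
        else:
            manifest[digest] = name
    return manifest, duplicates


# Hypothetical delivery: two byte-identical scans and one distinct scan.
delivery = {
    "page_001.tif": b"scan-bytes-A",
    "page_001_copy.tif": b"scan-bytes-A",
    "page_002.tif": b"scan-bytes-B",
}
manifest, duplicates = build_manifest(delivery)
print(len(manifest), duplicates)
```

A cryptographic hash of this kind only catches byte-identical files. Resized or recompressed copies would pass through, which is the gap perceptual hashing is meant to close.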

Paris has no equivalent vendor-side requirement yet. The Direction des Affaires Culturelles has been consulting on a draft standard since autumn 2025, but as of July 2026 it has not been formalised. Housing data presents an even patchier picture: the Agence Nationale de l'Habitat, which administers renovation grant applications, uses image uploads as proof of works completed, and duplicate submissions have been identified as one vector for inflated claims — a concern raised in a February 2026 parliamentary finance committee review of ANAH's Monlogement portal, though the committee did not quantify the financial exposure.

For residents and institutions navigating these systems now, the practical advice from data governance specialists is straightforward: when submitting documents to any Paris municipal portal, use a consistent file-naming convention that includes date and version number, avoid re-uploading resized copies, and check whether a portal offers a duplicate-detection warning before final submission. Several arrondissement planning desks, including those serving the 10th and 13th, have added informal guidance to their online submission pages. The broader fix, however, waits on city hall to move a draft technical standard off the shelf and into enforcement.
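As an illustration of the naming advice, here is a small sketch that builds a file name from a label, a date and a version number. The pattern itself (`label_YYYY-MM-DD_vNN.ext`) is an assumption for the example; no Paris portal is known to mandate it, and the function name is hypothetical.

```python
import re
from datetime import date


def submission_filename(label: str, day: date, version: int, ext: str) -> str:
    """Build a name like 'facade-permit_2026-07-04_v02.jpg'.

    The label is lowercased and any run of characters outside a-z and 0-9
    is collapsed to a single hyphen.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")
    clean_ext = ext.lower().lstrip(".")
    return f"{slug}_{day.isoformat()}_v{version:02d}.{clean_ext}"


print(submission_filename("Facade permit", date(2026, 7, 4), 2, ".JPG"))
```

Putting the date in ISO order means files sort chronologically in any file browser, and the zero-padded version number keeps `v02` ahead of `v10`.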


About this article

Published by The Daily Paris

This article was produced by The Daily Paris editorial desk and covers news in Paris. See our editorial standards for how we use AI.
