Paris's municipal archives directorate began enforcing a new deduplication protocol this spring, requiring all digitised image collections submitted to the Bibliothèque historique de la Ville de Paris to pass automated duplicate-detection screening before ingestion. The change, rolled out quietly in April 2026, affects tens of thousands of photographs, postcards and urban survey images gathered since the city's post-Olympics digitisation push accelerated in late 2024.
The timing matters. The Grand Paris Express construction programme has generated an enormous volume of site documentation photography — archaeological surveys, infrastructure progress shots, neighbourhood impact assessments — much of it redundant by design. Without systematic deduplication, archivists say a single tunnel excavation site on the future Line 15 corridor near Saint-Denis can produce hundreds of near-identical frames that bloat storage, slow search tools and ultimately degrade the usefulness of the public record.
How Paris Compares to London and Berlin
London's Wellcome Collection, one of Europe's largest health and social-history image libraries, adopted a perceptual-hash deduplication system in 2022. By early 2025 the Collection reported removing more than 14,000 redundant image files from its publicly accessible digital repository, freeing roughly 2.3 terabytes of server space. Paris's approach differs in one key respect: rather than purging duplicates after the fact, the city is blocking duplicate submissions at the point of upload, a preventive model that places more burden on the submitting institution but keeps the master archive cleaner from day one.
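For readers unfamiliar with perceptual hashing, the idea can be sketched in a few lines of Python. This is an illustration only, not the Wellcome Collection's or Paris's actual implementation: it uses the simple "average hash" variant, assumes each image has already been reduced to an 8x8 grid of grayscale values, and the function names and the `max_distance` threshold of 5 are hypothetical.

```python
from typing import Sequence

Grid = Sequence[Sequence[int]]  # small grayscale image, values 0-255


def average_hash(pixels: Grid) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the grid mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions at which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two images as duplicates if their hashes differ in few bits.

    max_distance is an assumed threshold, not a published value.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Synthetic 8x8 gradient standing in for a survey photograph.
frame = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
# The same scene with slightly different exposure.
brighter = [[value + 3 for value in row] for row in frame]
# A visually unrelated image (the gradient inverted).
inverted = [[252 - value for value in row] for row in frame]

print(is_near_duplicate(frame, brighter))
print(is_near_duplicate(frame, inverted))
```

Because the hash records only whether each region is brighter or darker than the image's average, a small exposure shift leaves it unchanged, while a genuinely different image flips many bits. That tolerance is what lets such systems catch near-identical frames rather than only byte-for-byte copies.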
Berlin's Landesarchiv has taken a third path. Since 2023 it has retained duplicate images as separate versioned records, arguing that minor variations — different timestamps, different metadata tags — carry independent evidential value for historians. That philosophy has won backing from Germany's Federal Commissioner for Culture, but it has also left the Landesarchiv managing a collection that archivists there have described publicly as increasingly unwieldy. Paris's Hôtel de Ville directorate studied the Berlin model and explicitly rejected it, according to documentation published on the city's open-data portal in March 2026.
Amsterdam's Stadsarchief sits somewhere between the London and Berlin approaches. It uses semi-automated flagging to identify probable duplicates but leaves the final deletion decision to a human curator. The system has worked well for the Stadsarchief's canal-district photography collections, which date to the 1860s, but city officials there acknowledge the manual review step creates a backlog that now runs to several months. Paris, by contrast, has invested in GPU-accelerated image comparison infrastructure hosted at the Datacentre de la Ville de Paris in the 13th arrondissement, cutting automated review time to under 48 hours per batch submission.
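The four approaches differ mainly in what happens once a probable duplicate is detected. The Python sketch below makes that contrast explicit. It is a simplification under stated assumptions: the `Policy` names, the outcome strings and the `max_distance` threshold are hypothetical labels for the behaviour described above, not any archive's real configuration.

```python
from enum import Enum


class Policy(Enum):
    """Hypothetical labels for each city's handling of probable duplicates."""
    PURGE_AFTER_INGEST = "London"
    RETAIN_AS_VERSION = "Berlin"
    FLAG_FOR_CURATOR = "Amsterdam"
    BLOCK_AT_UPLOAD = "Paris"


def handle_submission(distance: int, policy: Policy, max_distance: int = 5) -> str:
    """Decide what happens to an image given its hash distance to the
    closest existing record. max_distance is an assumed threshold."""
    if distance > max_distance:
        return "ingest"  # not a probable duplicate under any policy
    if policy is Policy.BLOCK_AT_UPLOAD:
        return "reject at upload"
    if policy is Policy.FLAG_FOR_CURATOR:
        return "queue for human review"
    if policy is Policy.RETAIN_AS_VERSION:
        return "ingest as versioned record"
    return "ingest, purge in later clean-up"


for policy in Policy:
    print(policy.value, "->", handle_submission(2, policy))
```

Only the human-review branch depends on staff availability, which is consistent with the backlog reported in Amsterdam; the preventive branch shifts the cost to the submitter instead.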
What This Means for Researchers and the Public
The practical stakes are concrete. The Atelier Parisien d'Urbanisme — known as APUR — regularly submits aerial and street-level surveys of neighbourhoods undergoing transformation, including the Seine-Saint-Denis riverbanks and the Porte de la Chapelle district. Under the new protocol, APUR must run its own pre-screening before sending files to the Bibliothèque historique. The organisation confirmed the additional step in a March 2026 press release on its website, noting it had adjusted internal workflow to accommodate the requirement.
For independent researchers and documentary photographers, the change has a less comfortable edge. Images submitted by freelancers or community groups, such as those documenting the rue du Faubourg-Saint-Antoine market corridor or housing conditions in Clichy-sous-Bois, now face the same automated filter as professional institutional submissions. A photograph flagged as a duplicate of an existing archive record is returned to the submitter with a rejection code, and there is currently no formal appeals process. The city says it is developing one, with a target launch date of October 2026.
Archivists and heritage researchers who want to engage with the new system should consult the submission guidelines published directly on paris.fr, where the Bibliothèque historique has posted its technical specifications for image resolution, metadata formatting and hash-comparison thresholds. Institutions with large backlogs are advised to contact the directorate before the autumn submission window opens in September, when the volume of Grand Paris Express documentation alone is expected to spike sharply.