Paris's municipal archive directorate, the Direction des Archives de Paris on the Rue des Francs-Bourgeois in the 4th arrondissement, began a systematic duplicate-image purge across its digitised collections in January 2026, targeting an estimated 340,000 redundant image files accumulated since the Grand Numérique digitisation push that started in 2019. The effort is the most structured among comparable European capitals, according to comparative assessments circulated at the February 2026 Digital Heritage Forum in Brussels.
The timing is deliberate. Paris 2024 Olympics documentation generated an unprecedented volume of photographic records — hundreds of thousands of images from venues including the Stade de France in Saint-Denis and the temporary urban installations along the Seine between the Pont d'Iéna and the Pont de l'Alma. Many of those images were submitted by multiple accredited photographers and press agencies, creating cascading duplication in the city's public image repositories. Managing that flood is now a live administrative problem, not a theoretical one.
How Paris Compares to London, Berlin and Tokyo
London's equivalent body, the London Metropolitan Archives in Clerkenwell, has acknowledged duplication as a growing problem but had not launched a systematic automated deduplication programme as of mid-2026. Berlin's Landesarchiv adopted a hash-matching protocol in 2023, but it applies only to newly ingested material rather than retrospectively cleaning legacy collections. Tokyo's National Archives has invested heavily in AI-assisted deduplication since 2022, making it arguably the most advanced system globally, though its collections are predominantly document-based rather than photographic, which limits direct comparison.
Paris sits in a distinct middle position. The Direction des Archives has contracted with a French public-sector IT consortium to run perceptual hashing software across the full photographic catalogue, a process expected to cost roughly €1.2 million over 18 months. The contract was awarded in March 2026. Unlike Berlin's forward-only approach, Paris is applying the tool retrospectively — scanning collections going back to 1995, when the first batch of born-digital municipal photographs entered the system.
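For readers unfamiliar with perceptual hashing, the sketch below illustrates the general idea using average hash (aHash), one of the simplest variants. It is an illustration only: the article's sources do not say which algorithm or software the consortium uses, and the function names, the 8x8 grid and the distance threshold of 5 are assumptions chosen for the example. Unlike a cryptographic hash, which changes completely if a single pixel changes, a perceptual hash stays nearly the same when an image is re-exposed or re-compressed, which is what makes it suited to finding near-identical photographs.

```python
from typing import List

# A grayscale image as rows of pixel values (0-255). Hypothetical stand-in
# for decoded image data; a real pipeline would load files with an imaging
# library first.
Image = List[List[int]]


def shrink(image: Image, size: int = 8) -> List[List[float]]:
    """Downsample to size x size by averaging rectangular blocks.

    Assumes the image is at least size pixels wide and tall.
    """
    height, width = len(image), len(image[0])
    grid = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        row = []
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        grid.append(row)
    return grid


def average_hash(image: Image, size: int = 8) -> int:
    """Return a size*size-bit hash: one bit per cell, set if above the mean."""
    cells = [value for row in shrink(image, size) for value in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def likely_duplicates(a: Image, b: Image, threshold: int = 5) -> bool:
    """Flag two images whose hashes differ in at most `threshold` bits.

    The threshold is an assumed value for illustration, not a documented one.
    """
    return hamming(average_hash(a), average_hash(b)) <= threshold


# A horizontal gradient, a slightly brightened copy, and a mirrored version.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[min(255, v + 10) for v in row] for row in original]
inverted = [[240 - v for v in row] for row in original]

print(likely_duplicates(original, brighter))
print(likely_duplicates(original, inverted))
```

In this toy case the brightened copy is flagged as a likely duplicate while the mirrored gradient is not. Any production system would need to tune the threshold against a sample of known duplicates, since a looser value catches more re-edits but also produces more false matches.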
The Bibliothèque nationale de France, headquartered on the Quai François-Mauriac in the 13th arrondissement, is running a parallel but separate programme focused on press photographs held in the Agence Roger-Viollet collection. The two projects are not formally coordinated, which critics within the archiving profession have described as a structural gap. The BnF programme targets roughly 180,000 images and is scheduled for completion by the end of 2026.
What the Backlog Means for Urban Planning and Research
Duplicate images create real problems beyond storage costs. Urban planners working on the Grand Paris Express metro extension — the largest infrastructure project in Europe by track length — rely on archived site photographs to track changes along corridors including the future Line 16 route through Clichy-sous-Bois and Montfermeil. When the same image appears under multiple catalogue entries with different metadata, researchers waste hours reconciling records. One technical report circulated to Paris City Hall in April 2026 flagged the duplication rate in Seine-Saint-Denis site photography at above 28 percent, complicating documentation for several Grand Paris Express planning inquiries.
Housing researchers face a similar headache. The rental market along corridors slated for new metro stations has been extensively photographed for urban planning impact studies, with images deposited by multiple agencies including the Atelier Parisien d'Urbanisme, known as APUR. When duplicate submissions go unchecked, legal disputes over planning decisions can be complicated by inconsistent photographic records.
For institutions and researchers working with Paris's public image collections, the practical advice is to check submission metadata carefully before depositing new material through the Archives de Paris online portal, which asks for creation date, device identifier and rights holder — three data points that allow automated systems to flag likely duplicates before ingestion rather than after. The retrospective clean-up is necessary, but prevention at the point of deposit is where cities like Tokyo have pulled ahead. Paris's new intake protocols, updated in May 2026, move in that direction. Whether the legacy backlog is resolved before the next wave of urban documentation arrives is the question archivists along the Rue des Francs-Bourgeois are already asking themselves.
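As a rough illustration of how the three data points requested at deposit could support duplicate flagging before ingestion, the sketch below groups records by a normalised key built from creation date, device identifier and rights holder. It is a minimal sketch under stated assumptions, not a description of the Archives de Paris portal: the `Deposit` class, its field names and the catalogue identifiers are hypothetical, and it assumes the creation date is recorded as a timestamp precise enough to distinguish separate shots from the same device.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


def normalise(value: str) -> str:
    """Collapse whitespace and ignore case so trivial variants still match."""
    return " ".join(value.split()).casefold()


@dataclass(frozen=True)
class Deposit:
    # All field names are hypothetical, chosen for this example only.
    catalogue_id: str
    created: str        # creation timestamp, assumed ISO 8601 to the second
    device_id: str      # camera or scanner identifier
    rights_holder: str

    def key(self) -> Tuple[str, str, str]:
        """Matching key built from the three deposit data points."""
        return (
            normalise(self.created),
            normalise(self.device_id),
            normalise(self.rights_holder),
        )


def flag_likely_duplicates(deposits: List[Deposit]) -> List[List[str]]:
    """Return groups of catalogue IDs that share the same metadata key."""
    groups: Dict[Tuple[str, str, str], List[str]] = defaultdict(list)
    for deposit in deposits:
        groups[deposit.key()].append(deposit.catalogue_id)
    return [ids for ids in groups.values() if len(ids) > 1]


# Invented sample records for demonstration.
sample = [
    Deposit("A-1", "2024-07-26T21:04:11", "CAM-0042", "Agence Exemple"),
    Deposit("A-2", "2024-07-26T21:04:11", "cam-0042 ", "agence  exemple"),
    Deposit("A-3", "2024-07-26T21:04:12", "CAM-0042", "Agence Exemple"),
]
print(flag_likely_duplicates(sample))
```

A metadata match of this kind only marks candidates for review. The same image submitted by two agencies under different rights holders would not share a key here, which is one reason a content-based check would still be needed alongside it.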