What is it about?

The destruction of the Bamiyan Buddhas in Afghanistan removed one of the world’s most significant cultural heritage monuments. In recent years, generative artificial intelligence (AI) has been promoted as a way to digitally “reconstruct” lost heritage, but how reliable are these reconstructions? This study examines how well generative AI can recreate the Western Buddha of Bamiyan by combining computer-based similarity measures with expert archaeological judgement.

The researchers generated hundreds of possible digital reconstructions using AI image-generation systems and then evaluated them through a structured, multi-stage process. First, automated image analysis was used to compare AI outputs with historical photographs, identifying those that looked geometrically similar. Next, interdisciplinary reviewers and specialist archaeologists assessed whether the reconstructions made cultural and historical sense.

The results show that while AI is effective at producing large numbers of visually plausible images, computer-based similarity scores alone are not enough. Images that scored highly in automated tests were often judged by experts to be archaeologically implausible. Human specialists consistently focused on deeper features such as proportions, integration with the rock niche, and stylistic conventions of Gandhāran sculpture, details that AI systems struggled to reproduce accurately.

Overall, the research shows that AI should not be treated as an autonomous tool for heritage reconstruction. Instead, it is best understood as a way to generate visual hypotheses that must be carefully filtered and interpreted by experts. This approach offers a more responsible and transparent way to use AI in the reconstruction of culturally sensitive heritage sites.
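The multi-stage evaluation described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the `Candidate` class, the function names, the 0.8 threshold, and every score and verdict below are hypothetical, chosen only to show how a candidate can pass the automated similarity stage and still be rejected at expert review.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    similarity: float       # hypothetical automated similarity score in [0, 1]
    expert_plausible: bool  # hypothetical expert verdict, recorded separately


def computational_filter(candidates, threshold):
    """Stage 1: keep candidates whose automated score meets the threshold."""
    return [c for c in candidates if c.similarity >= threshold]


def expert_review(candidates):
    """Stage 2: keep only candidates that experts judged plausible."""
    return [c for c in candidates if c.expert_plausible]


# Invented example data, not results from the study.
candidates = [
    Candidate("recon_a", 0.91, False),
    Candidate("recon_b", 0.84, True),
    Candidate("recon_c", 0.42, True),
    Candidate("recon_d", 0.88, False),
]

shortlisted = computational_filter(candidates, threshold=0.8)
accepted = expert_review(shortlisted)

print([c.name for c in shortlisted])  # candidates that pass the automated stage
print([c.name for c in accepted])     # candidates that also pass expert review
```

In this invented data, three candidates clear the automated threshold but only one survives expert review, which mirrors the gap between similarity scores and archaeological plausibility that the summary describes.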

Why is it important?

This work is timely because generative AI is increasingly being used in museums, documentaries, and public history projects, often without clear evaluation standards. The study introduces a rigorous framework that combines computational filtering with expert-in-the-loop assessment, showing where AI is helpful and where it fails. The key contribution is demonstrating that high visual similarity does not equal historical accuracy. By empirically showing the gap between algorithmic scores and expert judgement, the research provides evidence-based guidance for responsible AI use in digital heritage. This has implications not only for archaeology but also for museums, cultural institutions, and policymakers seeking to use AI without misrepresenting the past.

Perspectives

Working on this paper reinforced how important human expertise remains in the age of generative AI. While AI systems are impressive at producing images quickly, this study shows that cultural understanding, historical context, and disciplinary knowledge cannot be automated away. I hope this work encourages more thoughtful collaboration between computer scientists, archaeologists, and heritage professionals. Rather than asking whether AI can replace experts, the more productive question is how AI can support expert reasoning while respecting the complexity and sensitivity of cultural heritage.

Prof Tatiana Kalganova
Brunel University

Read the Original

This page is a summary of: Evaluating Generative AI Reconstructions of the Bamiyan Buddhas: Computational Similarity and Expert Archaeological Assessment, January 2026, Elsevier, DOI: 10.2139/ssrn.6727346.
