What is it about?

This paper presents RUNE (Reasoning Using Neurosymbolic Entities), a neurosymbolic framework for text-to-image retrieval in remote sensing. RUNE combines large language models, object detection and segmentation, and symbolic reasoning to support accurate and explainable image retrieval. The system translates a user’s natural language query into a first-order logic representation, detects objects and their relationships in remote sensing images, and applies symbolic reasoning to evaluate the consistency between the entities and relations described in the query and those observed in the images. By explicitly reasoning over objects and their interactions, RUNE enables transparent retrieval decisions and significantly reduces false positives, providing a more reliable and interpretable alternative to purely neural approaches. As a use case to evaluate RUNE’s performance, we focus on post-flooding imagery, demonstrating its applicability to real-world scenarios.
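The pipeline described above can be sketched as a small toy example. Everything here is an illustrative assumption rather than RUNE's actual implementation: the `Atom` class, the hard-coded parse standing in for the LLM translation step, and the mock detections standing in for the object detection and segmentation stage.

```python
# Minimal sketch of a RUNE-style neurosymbolic retrieval pipeline.
# All names (Atom, the example query parse, the mock detections) are
# illustrative assumptions, not the paper's actual API.

from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    """A first-order-logic atom, e.g. Flooded(r1) or NextTo(r1, b1)."""
    predicate: str
    args: tuple

# Step 1 (assumed): an LLM would translate the natural language query
# into FOL atoms over variables; here the parse is hard-coded.
query = "a flooded road next to a building"
query_atoms = [
    Atom("Road", ("x",)),
    Atom("Building", ("y",)),
    Atom("Flooded", ("x",)),
    Atom("NextTo", ("x", "y")),
]

# Step 2 (assumed): detection/segmentation yields grounded atoms per image.
image_facts = {
    "img_001": {Atom("Road", ("r1",)), Atom("Flooded", ("r1",)),
                Atom("Building", ("b1",)), Atom("NextTo", ("r1", "b1"))},
    "img_002": {Atom("Road", ("r1",)), Atom("Building", ("b1",))},
}

def satisfies(facts, atoms, binding=None, vars_left=None):
    """Backtracking search for a variable binding grounding all query atoms."""
    binding = binding or {}
    if vars_left is None:
        vars_left = sorted({a for atom in atoms for a in atom.args})
    if not vars_left:
        return all(Atom(a.predicate, tuple(binding[v] for v in a.args)) in facts
                   for a in atoms)
    var, rest = vars_left[0], vars_left[1:]
    objects = {arg for fact in facts for arg in fact.args}
    return any(satisfies(facts, atoms, {**binding, var: obj}, rest)
               for obj in objects)

# Step 3: retrieve only images whose detected facts satisfy the query.
retrieved = [img for img, facts in image_facts.items()
             if satisfies(facts, query_atoms)]
print(retrieved)  # → ['img_001']
```

The key property this sketch illustrates is why the symbolic check suppresses false positives: `img_002` contains both a road and a building, so an embedding-similarity retriever might rank it highly, but it is rejected because no variable binding satisfies `Flooded(x)` and `NextTo(x, y)` simultaneously.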

Why is it important?

By employing symbolic reasoning for retrieval, our approach measures the compatibility between a natural language query and an image without relying on joint embedding generation. This design enhances explainability by making retrieval decisions directly traceable to the entities detected in the images and to those produced by the LLM when it transforms the query into a first-order logic representation. In addition, symbolic reasoning significantly reduces false positives and allows precise control over the reasoning process and resource usage, an important advantage for deployment in real-world remote sensing scenarios. As a result, our methodology achieves more accurate and interpretable retrieval while maintaining execution times comparable to fully neural systems.

Perspectives

It was a great pleasure to work on the integration of symbolic reasoning and large language models for addressing text-to-image retrieval in remote sensing. This effort demonstrates that combining symbolic and neural methods can significantly improve explainability, performance, and robustness compared to purely neural approaches. By showing how our methodology can be applied to the retrieval of post-flooding imagery, this work highlights the practical value of neurosymbolic architectures for real-world remote sensing tasks. We hope these results will encourage remote sensing specialists to adopt integrated neurosymbolic systems within their workflows, paving the way for more transparent, reliable, and effective solutions to complex operational challenges.

Emanuele Mezzi
Vrije Universiteit Amsterdam

Read the Original

This page is a summary of: Neurosymbolic Inference On Foundation Models For Remote Sensing Text-to-image Retrieval With Complex Queries, ACM Transactions on Spatial Algorithms and Systems, December 2025, ACM (Association for Computing Machinery),
DOI: 10.1145/3786350.
