What is it about?
AttBalance is a loss function for specialist Visual Grounding models that better aligns vision and language features. It concentrates attention values inside the ground-truth bounding box, while using the attention map from a momentum version of the model to rectify this regularization.
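To make the idea concrete, here is a minimal sketch in plain Python. It is not the loss from the paper: the function names, the negative-log focus term, the blending weight alpha, and the cross-entropy form are all assumptions chosen for illustration. It only shows the general pattern of rewarding attention mass inside a box and softening that target with a momentum model's attention map.

```python
import math

def normalize(grid):
    """Scale a non-negative 2D grid so its entries sum to 1 (grid must not be all zeros)."""
    total = sum(sum(row) for row in grid)
    return [[v / total for v in row] for row in grid]

def box_mask(height, width, box):
    """Binary mask: 1.0 inside box = (x0, y0, x1, y1), upper bounds exclusive."""
    x0, y0, x1, y1 = box
    return [[1.0 if x0 <= x < x1 and y0 <= y < y1 else 0.0
             for x in range(width)] for y in range(height)]

def in_box_loss(attn, mask, eps=1e-8):
    """Hypothetical focus term: -log of the attention mass that falls inside the box."""
    attn = normalize(attn)
    inside = sum(a * m for ra, rm in zip(attn, mask) for a, m in zip(ra, rm))
    return -math.log(inside + eps)

def rectified_target(mask, momentum_attn, alpha=0.5):
    """Assumed rectification: blend the box prior with the momentum model's attention."""
    prior, teacher = normalize(mask), normalize(momentum_attn)
    return [[alpha * p + (1.0 - alpha) * t for p, t in zip(rp, rt)]
            for rp, rt in zip(prior, teacher)]

def rectified_loss(attn, target, eps=1e-8):
    """Cross-entropy between the blended target and the current attention map."""
    attn = normalize(attn)
    return -sum(t * math.log(a + eps)
                for rt, ra in zip(target, attn) for t, a in zip(rt, ra))

def ema_update(momentum_params, params, decay=0.99):
    """Exponential moving average, a common way to maintain a momentum model."""
    return [decay * m + (1.0 - decay) * p for m, p in zip(momentum_params, params)]

# Toy 4x4 attention map with a 2x2 ground-truth box in the centre.
mask = box_mask(4, 4, (1, 1, 3, 3))
uniform = [[1.0] * 4 for _ in range(4)]
target = rectified_target(mask, uniform)
print(in_box_loss(uniform, mask), in_box_loss(mask, mask))
print(rectified_loss(uniform, target), rectified_loss(target, target))
```

In this toy setup the focus term falls as attention moves into the box, and the blended target keeps some weight on regions the momentum model attends to, so the box constraint is not applied as a hard rule.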
Why is it important?
It aligns visual and language features more closely, which leads to better performance in Visual Grounding.
Perspectives
Writing this paper was straightforward because our comprehensive analysis led to a clear motivation.
Weitai Kang
University of Illinois at Chicago
Read the Original
This page is a summary of: Visual Grounding with Attention-Driven Constraint Balancing, October 2025, ACM (Association for Computing Machinery),
DOI: 10.1145/3746027.3755338.