All Stories

  1. IntentVCNet: Bridging Spatio-Temporal Gaps for Intention-Oriented Controllable Video Captioning
  2. Dual-path Collaborative Generation Network for Emotional Video Captioning
  3. Exploring Visual Relationships via Transformer-based Graphs for Enhanced Image Captioning