All Stories

  1. DeepSVU: Towards In-depth Security-oriented Video Understanding via Unified Physical-world Regularized MoE
  2. Towards LLM-centric Affective Visual Customization via Efficient and Precise Emotion Manipulating
  3. Skynet-V1: Towards Early Warning of Video Abnormal Events via A Spatial-temporal Causal-enhanced MoE Framework
  4. OmniDoctor: Towards LLM-centric Lifelong Learning for New Emerging Medical VQA Tasks
  5. Omni-SILA: Towards Omni-scene Driven Visual Sentiment Identifying, Locating and Attributing in Videos
  6. Sherlock: Towards Multi-scene Video Abnormal Event Extraction and Localization via a Global-local Spatial-sensitive LLM
  7. Towards Emotion-enriched Text-to-Motion Generation via LLM-guided Limb-level Emotion Manipulating
  8. Hawkeye: Discovering and Grounding Implicit Anomalous Sentiment in Recon-videos via Scene-enhanced Video Large Language Model