All Stories

  1. MER 2025: When Affective Computing Meets Large Language Models
  2. IntentVCNet: Bridging Spatio-Temporal Gaps for Intention-Oriented Controllable Video Captioning
  3. Analytic Synaptic Dynamic Scaling Balancer for Multimodal Deepfake Continual Detection
  4. AudioSet-R: A Refined AudioSet with Multi-Stage LLM Label Reannotation
  5. Higher-Order Vision-Language Fusion for Video Popularity Prediction
  6. Preference-Strength-Aware Self-Improving Alignment with Generative Preference Models
  7. Empowering Large Language Model Agent through Step-Level Self-Critique and Self-Training
  8. Self-supervised Bidirectional Synchronization Estimation for Multimodal Deepfake Detection with Short-term Dependency
  9. MER 2024: Semi-Supervised Learning, Noise Robustness, and Open-Vocabulary Multimodal Emotion Recognition
  10. THE-FD: Task Hierarchical Emotion-aware for Fake Detection
  11. Tracing Training Progress: Dynamic Influence Based Selection for Active Learning
  12. Enhancing Unsupervised Visible-Infrared Person Re-Identification with Bidirectional-Consistency Gradual Matching
  13. IFS-SED: Incremental Few-Shot Sound Event Detection Using Explicit Learning and Calibration
  14. VTQAGen: BART-based Generative Model For Visual Text Question Answering
  15. Multi-task Pre-training Language Model for Semantic Network Completion
  16. Multiple Temporal Fusion based Weakly-supervised Pre-training Techniques for Video Categorization
  17. Seeing Speech: Magnetic Resonance Imaging-Based Vocal Tract Deformation Visualization Using Cross-Modal Transformer
  18. Squeeze-and-Excitation network-Based Radar Object Detection With Weighted Location Fusion
  19. Multimodal Deep Learning for Social Media Popularity Prediction With Attention Mechanism
  20. A Quantitative Comparison of Different Machine Learning Approaches for Human Spermatozoa Quality Prediction Using Multimodal Datasets
  21. Multi-Scale Generalized Attention-Based Regional Maximum Activation of Convolutions for Beauty Product Retrieval
  22. Ultrasound-Based Silent Speech Interface using Sequential Convolutional Auto-encoder
  23. Articulatory feature extraction from ultrasound images using pretrained convolutional neural networks
  24. Deep Convolutional Neural Network-Based Early Automated Detection of Diabetic Retinopathy Using Fundus Image
  25. Convolutional neural network-based automatic classification of midsagittal tongue gestural targets using B-mode ultrasound images
  26. Is Speckle Tracking Feasible for Ultrasound Tongue Images?
  27. Cyclic-feature based Doppler scale estimation for orthogonal frequency-division multiplexing (OFDM) signals over doubly selective underwater acoustic channels
  28. An Articulatory-Based Singing Voice Synthesis Using Tongue and Lips Imaging
  29. SU-F-J-04: Automated Detection of Diabetic Retinopathy Using Deep Convolutional Neural Networks
  30. SU-F-J-226: Structural Similarity-Based Ultrasound Image Similarity Measurement
  31. SU-G-JeP1-03: Automatic Motion Tracking Reset in Ultrasound Liver Image Sequences
  32. SU-G-IeP3-08: Image Reconstruction for Scanning Imaging System Based On Shape-Modulated Point Spreading Function
  33. Contour-based 3D tongue motion visualization using ultrasound image sequences