All Stories

  1. Introduction to the Special Issue on Deep Multimodal Generation and Retrieval
  2. Domain-Agnostic Neural Oil Painting via Normalization Affine Test-Time Adaptation
  3. UniAD: Integrating Geometric and Semantic Cues for Unified Anomaly Detection
  4. Proceedings of the 3rd International Workshop on UAVs in Multimedia: Capturing the World from a New Perspective
  5. The 3rd Workshop on UAVs in Multimedia: Capturing the World from a New Perspective
  6. MORE'25 Multimedia Object Re-ID: Advancements, Challenges, and Opportunities
  7. From Data Deluge to Data Curation: A Filtering-WoRA Paradigm for Efficient Text-based Person Search
  8. CAMeL: Cross-Modality Adaptive Meta-Learning for Text-Based Person Retrieval
  9. EQ-TAA: Equivariant Traffic Accident Anticipation via Diffusion-Based Accident Video Synthesis
  10. Scale Up Composed Image Retrieval Learning via Modification Text Generation
  11. Approaching Outside: Scaling Unsupervised 3D Object Detection from 2D Scene
  12. Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching
  13. The 2nd Workshop on UAVs in Multimedia: Capturing the World from a New Perspective
  14. The 2nd International Workshop on Deep Multi-modal Generation and Retrieval
  15. Transferring to Real-World Layouts: A Depth-aware Framework for Scene Adaptation
  16. Depth-Aware Blind Image Decomposition for Real-World Adverse Weather Recovery
  17. Self-ensembling depth completion via density-aware consistency
  18. Collaborative group: Composed image retrieval via consensus learning from noisy annotations
  19. Multiple-environment Self-adaptive Network for aerial-view geo-localization
  20. MORE'24 Multimedia Object Re-ID: Advancements, Challenges, and Opportunities
  21. High Fidelity Makeup via 2D and 3D Identity Preservation Net
  22. StepNet: Spatial-temporal Part-aware Network for Isolated Sign Language Recognition
  23. Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation
  24. Active Discovering New Slots for Task-Oriented Conversation
  25. Learning Cross-View Geo-Localization Embeddings via Dynamic Weighted Decorrelation Regularization
  26. UAVM '23: 2023 Workshop on UAVs in Multimedia: Capturing the World from a New Perspective
  27. Deep Multimodal Learning for Information Retrieval
  28. PiPa: Pixel- and Patch-wise Self-supervised Learning for Domain Adaptative Semantic Segmentation
  29. Towards Unified Text-based Person Retrieval: A Large-scale Multi-Attribute and Language Search Benchmark
  30. Learnable Pillar-based Re-ranking for Image-Text Retrieval
  31. Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-based Active Learning
  32. Context-Aware Pretraining for Efficient Blind Image Decomposition
  33. Multi-view Consistent Generative Adversarial Networks for Compositional 3D-Aware Image Synthesis
  34. Align and Tell: Boosting Text-Video Retrieval With Local Alignment and Fine-Grained Supervision
  35. Progressive Local Filter Pruning for Image Retrieval Acceleration
  36. U-Turn: Crafting Adversarial Queries with Opposite-Direction Features
  37. Soft Person Reidentification Network Pruning via Blockwise Adjacent Filter Decaying
  38. Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis
  39. Each Part Matters: Local Patterns Facilitate Cross-View Geo-Localization
  40. Adaptive Boosting for Domain Adaptation: Toward Robust Predictions in Scene Segmentation
  41. DMRNet++: Learning Discriminative Features with Decoupled Networks and Enriched Pairs for One-Step Person Search
  42. Joint Representation Learning and Keypoint Detection for Cross-View Geo-Localization
  43. Parameter-Efficient Person Re-Identification in the 3D Space
  44. SPG-VTON: Semantic Prediction Guidance for Multi-Pose Virtual Try-on
  45. Self-supervised Point Cloud Representation Learning via Separating Mixed Shapes
  46. Decoupled and Memory-Reinforced Networks: Towards Effective Feature Learning for One-Step Person Search
  47. Rectifying Pseudo Label Learning via Uncertainty Estimation for Domain Adaptive Semantic Segmentation
  48. VehicleNet: Learning Robust Visual Representation for Vehicle Re-Identification
  49. University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization
  50. Real-World Automatic Makeup via Identity Preservation Makeup Net
  51. Unsupervised Scene Adaptation with Memory Regularization in vivo
  52. Dual-path Convolutional Image-Text Embeddings with Instance Loss
  53. Going Beyond Real Data: A Robust Visual Representation for Vehicle Re-identification
  54. Thorax disease classification with attention guided convolutional neural network
  55. Bayesian query expansion for multi-camera person re-identification
  56. Unsupervised Eyeglasses Removal in the Wild
  57. Improving person re-identification by attribute and identity learning
  58. Pedestrian Alignment Network for Large-scale Person Re-Identification
  59. Joint Discriminative and Generative Learning for Person Re-Identification
  60. CamStyle: A Novel Data Augmentation Method for Person Re-Identification
  61. Multi-Pseudo Regularized Label for Generated Data in Person Re-Identification
  62. Camera Style Adaptation for Person Re-identification
  63. An improved artificial intelligence scheme for person reidentification
  64. Macro-Micro Adversarial Network for Human Parsing
  65. Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in vitro