All Stories

  1. Minimizing the pretraining gap: Domain-aligned text-based person retrieval
  2. Harnessing weak pair uncertainty for text-based person search
  3. Look, Compare and Draw: Differential Query Transformer for Automatic Oil Painting
  4. Understanding Image Retrieval Re-Ranking: A Graph Neural Network Perspective
  5. Event-based Lip Reading with Triplane Fusion Network
  6. Progressive Text-to-3D Generation for Automatic 3D Prototyping
  7. CLIP-SR: Collaborative Linguistic and Image Processing for Super-Resolution
  8. FANet: Fovea Attention Network for Robust Aerial Geo-Localization Across Diverse Weather Conditions
  9. Joint Attribute Graph Reasoning and Aggregation for Composed Image Retrieval
  10. RIGI: Rectifying Image-to-3D Generation Inconsistency via Uncertainty-aware Learning
  11. Introduction to the Special Issue on Deep Multimodal Generation and Retrieval
  12. Domain-Agnostic Neural Oil Painting via Normalization Affine Test-Time Adaptation
  13. UniAD: Integrating Geometric and Semantic Cues for Unified Anomaly Detection
  14. Proceedings of the 3rd International Workshop on UAVs in Multimedia: Capturing the World from a New Perspective
  15. The 3rd Workshop on UAVs in Multimedia: Capturing the World from a New Perspective
  16. MORE'25 Multimedia Object Re-ID: Advancements, Challenges, and Opportunities
  17. From Data Deluge to Data Curation: A Filtering-WoRA Paradigm for Efficient Text-based Person Search
  18. CAMeL: Cross-Modality Adaptive Meta-Learning for Text-Based Person Retrieval
  19. EQ-TAA: Equivariant Traffic Accident Anticipation via Diffusion-Based Accident Video Synthesis
  20. Scale Up Composed Image Retrieval Learning via Modification Text Generation
  21. Approaching Outside: Scaling Unsupervised 3D Object Detection from 2D Scene
  22. Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching
  23. The 2nd Workshop on UAVs in Multimedia: Capturing the World from a New Perspective
  24. The 2nd International Workshop on Deep Multi-modal Generation and Retrieval
  25. Transferring to Real-World Layouts: A Depth-aware Framework for Scene Adaptation
  26. Depth-Aware Blind Image Decomposition for Real-World Adverse Weather Recovery
  27. Self-ensembling depth completion via density-aware consistency
  28. Collaborative group: Composed image retrieval via consensus learning from noisy annotations
  29. Multiple-environment Self-adaptive Network for aerial-view geo-localization
  30. MORE'24 Multimedia Object Re-ID: Advancements, Challenges, and Opportunities
  31. High Fidelity Makeup via 2D and 3D Identity Preservation Net
  32. StepNet: Spatial-temporal Part-aware Network for Isolated Sign Language Recognition
  33. Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation
  34. Active Discovering New Slots for Task-Oriented Conversation
  35. Learning Cross-View Geo-Localization Embeddings via Dynamic Weighted Decorrelation Regularization
  36. UAVM '23: 2023 Workshop on UAVs in Multimedia: Capturing the World from a New Perspective
  37. Deep Multimodal Learning for Information Retrieval
  38. PiPa: Pixel- and Patch-wise Self-supervised Learning for Domain Adaptative Semantic Segmentation
  39. Towards Unified Text-based Person Retrieval: A Large-scale Multi-Attribute and Language Search Benchmark
  40. Learnable Pillar-based Re-ranking for Image-Text Retrieval
  41. Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-based Active Learning
  42. Context-Aware Pretraining for Efficient Blind Image Decomposition
  43. Multi-view Consistent Generative Adversarial Networks for Compositional 3D-Aware Image Synthesis
  44. Align and Tell: Boosting Text-Video Retrieval With Local Alignment and Fine-Grained Supervision
  45. Progressive Local Filter Pruning for Image Retrieval Acceleration
  46. U-Turn: Crafting Adversarial Queries with Opposite-Direction Features
  47. Soft Person Reidentification Network Pruning via Blockwise Adjacent Filter Decaying
  48. Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis
  49. Each Part Matters: Local Patterns Facilitate Cross-View Geo-Localization
  50. Adaptive Boosting for Domain Adaptation: Toward Robust Predictions in Scene Segmentation
  51. DMRNet++: Learning Discriminative Features with Decoupled Networks and Enriched Pairs for One-Step Person Search
  52. Joint Representation Learning and Keypoint Detection for Cross-View Geo-Localization
  53. Parameter-Efficient Person Re-Identification in the 3D Space
  54. SPG-VTON: Semantic Prediction Guidance for Multi-Pose Virtual Try-on
  55. Self-supervised Point Cloud Representation Learning via Separating Mixed Shapes
  56. Decoupled and Memory-Reinforced Networks: Towards Effective Feature Learning for One-Step Person Search
  57. Rectifying Pseudo Label Learning via Uncertainty Estimation for Domain Adaptive Semantic Segmentation
  58. VehicleNet: Learning Robust Visual Representation for Vehicle Re-Identification
  59. University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization
  60. Real-World Automatic Makeup via Identity Preservation Makeup Net
  61. Unsupervised Scene Adaptation with Memory Regularization in vivo
  62. Dual-path Convolutional Image-Text Embeddings with Instance Loss
  63. Going Beyond Real Data: A Robust Visual Representation for Vehicle Re-identification
  64. Thorax disease classification with attention guided convolutional neural network
  65. Bayesian query expansion for multi-camera person re-identification
  66. Unsupervised Eyeglasses Removal in the Wild
  67. Improving person re-identification by attribute and identity learning
  68. Pedestrian Alignment Network for Large-scale Person Re-Identification
  69. Joint Discriminative and Generative Learning for Person Re-Identification
  70. CamStyle: A Novel Data Augmentation Method for Person Re-Identification
  71. Multi-Pseudo Regularized Label for Generated Data in Person Re-Identification
  72. Camera Style Adaptation for Person Re-identification
  73. An improved artificial intelligence scheme for person reidentification
  74. Macro-Micro Adversarial Network for Human Parsing
  75. Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in Vitro