All Stories

  1. Progressive Text-to-3D Generation for Automatic 3D Prototyping
  2. CLIP-SR: Collaborative Linguistic and Image Processing for Super-Resolution
  3. Introduction to the Special Issue on Deep Multimodal Generation and Retrieval
  4. Domain-Agnostic Neural Oil Painting via Normalization Affine Test-Time Adaptation
  5. UniAD: Integrating Geometric and Semantic Cues for Unified Anomaly Detection
  6. Proceedings of the 3rd International Workshop on UAVs in Multimedia: Capturing the World from a New Perspective
  7. The 3rd Workshop on UAVs in Multimedia: Capturing the World from a New Perspective
  8. MORE'25 Multimedia Object Re-ID: Advancements, Challenges, and Opportunities
  9. From Data Deluge to Data Curation: A Filtering-WoRA Paradigm for Efficient Text-based Person Search
  10. CAMeL: Cross-Modality Adaptive Meta-Learning for Text-Based Person Retrieval
  11. EQ-TAA: Equivariant Traffic Accident Anticipation via Diffusion-Based Accident Video Synthesis
  12. Scale Up Composed Image Retrieval Learning via Modification Text Generation
  13. Approaching Outside: Scaling Unsupervised 3D Object Detection from 2D Scene
  14. Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching
  15. The 2nd Workshop on UAVs in Multimedia: Capturing the World from a New Perspective
  16. The 2nd International Workshop on Deep Multi-modal Generation and Retrieval
  17. Transferring to Real-World Layouts: A Depth-aware Framework for Scene Adaptation
  18. Depth-Aware Blind Image Decomposition for Real-World Adverse Weather Recovery
  19. Self-ensembling depth completion via density-aware consistency
  20. Collaborative group: Composed image retrieval via consensus learning from noisy annotations
  21. Multiple-environment Self-adaptive Network for aerial-view geo-localization
  22. MORE'24 Multimedia Object Re-ID: Advancements, Challenges, and Opportunities
  23. High Fidelity Makeup via 2D and 3D Identity Preservation Net
  24. StepNet: Spatial-temporal Part-aware Network for Isolated Sign Language Recognition
  25. Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation
  26. Active Discovering New Slots for Task-Oriented Conversation
  27. Learning Cross-View Geo-Localization Embeddings via Dynamic Weighted Decorrelation Regularization
  28. UAVM '23: 2023 Workshop on UAVs in Multimedia: Capturing the World from a New Perspective
  29. Deep Multimodal Learning for Information Retrieval
  30. PiPa: Pixel- and Patch-wise Self-supervised Learning for Domain Adaptative Semantic Segmentation
  31. Towards Unified Text-based Person Retrieval: A Large-scale Multi-Attribute and Language Search Benchmark
  32. Learnable Pillar-based Re-ranking for Image-Text Retrieval
  33. Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-based Active Learning
  34. Context-Aware Pretraining for Efficient Blind Image Decomposition
  35. Multi-view Consistent Generative Adversarial Networks for Compositional 3D-Aware Image Synthesis
  36. Align and Tell: Boosting Text-Video Retrieval With Local Alignment and Fine-Grained Supervision
  37. Progressive Local Filter Pruning for Image Retrieval Acceleration
  38. U-Turn: Crafting Adversarial Queries with Opposite-Direction Features
  39. Soft Person Reidentification Network Pruning via Blockwise Adjacent Filter Decaying
  40. Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis
  41. Each Part Matters: Local Patterns Facilitate Cross-View Geo-Localization
  42. Adaptive Boosting for Domain Adaptation: Toward Robust Predictions in Scene Segmentation
  43. DMRNet++: Learning Discriminative Features with Decoupled Networks and Enriched Pairs for One-Step Person Search
  44. Joint Representation Learning and Keypoint Detection for Cross-View Geo-Localization
  45. Parameter-Efficient Person Re-Identification in the 3D Space
  46. SPG-VTON: Semantic Prediction Guidance for Multi-Pose Virtual Try-on
  47. Self-supervised Point Cloud Representation Learning via Separating Mixed Shapes
  48. Decoupled and Memory-Reinforced Networks: Towards Effective Feature Learning for One-Step Person Search
  49. Rectifying Pseudo Label Learning via Uncertainty Estimation for Domain Adaptive Semantic Segmentation
  50. VehicleNet: Learning Robust Visual Representation for Vehicle Re-Identification
  51. University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization
  52. Real-World Automatic Makeup via Identity Preservation Makeup Net
  53. Unsupervised Scene Adaptation with Memory Regularization in vivo
  54. Dual-path Convolutional Image-Text Embeddings with Instance Loss
  55. Going Beyond Real Data: A Robust Visual Representation for Vehicle Re-identification
  56. Thorax disease classification with attention guided convolutional neural network
  57. Bayesian query expansion for multi-camera person re-identification
  58. Unsupervised Eyeglasses Removal in the Wild
  59. Improving person re-identification by attribute and identity learning
  60. Pedestrian Alignment Network for Large-scale Person Re-Identification
  61. Joint Discriminative and Generative Learning for Person Re-Identification
  62. CamStyle: A Novel Data Augmentation Method for Person Re-Identification
  63. Multi-Pseudo Regularized Label for Generated Data in Person Re-Identification
  64. Camera Style Adaptation for Person Re-identification
  65. An improved artificial intelligence scheme for person reidentification
  66. Macro-Micro Adversarial Network for Human Parsing
  67. Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in Vitro