All Stories

  1. PeeriScope: A Multi-Faceted Framework for Evaluating Peer Review Quality
  2. QueryGym: A Toolkit for Reproducible LLM-Based Query Reformulation
  3. Can LLMs Uphold Research Integrity? Evaluating the Role of LLMs in Peer Review Quality
  4. Query Performance Prediction Using Neural Query Space Proximity
  5. ProActLLM: Proactive Conversational Information Seeking with Large Language Models
  6. Building Trustworthy Peer Review Quality Assessment Systems
  7. RottenReviews: Benchmarking Review Quality with Human and LLM-Based Judgments
  8. A Human-AI Comparative Analysis of Prompt Sensitivity in LLM-Based Relevance Judgment
  9. VAP3: Variation-Aware Prompt Performance Prediction
  10. IDAT: A Multi-Modal Dataset and Toolkit for Building and Evaluating Interactive Task-Solving Agents
  11. Benchmarking LLM-based Relevance Judgment Methods
  12. Query Performance Prediction: Theory, Techniques and Applications
  13. Query Performance Prediction: Techniques and Applications in Modern Information Retrieval
  14. Evaluating Relative Retrieval Effectiveness with Normalized Residual Gain
  15. Offline Evaluation of Set-Based Text-to-Image Generation
  16. Reviewerly: Modeling the Reviewer Assignment Task as an Information Retrieval Problem
  17. Enhanced Retrieval Effectiveness through Selective Query Generation
  18. Retrieving Supporting Evidence for Generative Question Answering
  19. Noisy Perturbations for Estimating Query Difficulty in Dense Retrievers
  20. A is for Adele: An Offline Evaluation Metric for Instant Search
  21. Quantifying Ranker Coverage of Different Query Subspaces
  22. A Preference Judgment Tool for Authoritative Assessment
  23. Gender Fairness in Information Retrieval Systems
  24. Addressing Gender-related Performance Disparities in Neural Rankers
  25. Predicting Efficiency/Effectiveness Trade-offs for Dense vs. Sparse Retrieval Strategy Selection
  26. MS MARCO Chameleons: Challenging the MS MARCO Leaderboard with Extremely Obstinate Queries
  27. Matches Made in Heaven: Toolkit and Large-Scale Datasets for Supervised Query Reformulation
  28. BERT-QPP: Contextualized Pre-trained transformers for Query Performance Prediction
  29. On the Orthogonality of Bias and Utility in Ad hoc Retrieval
  30. Geometric Estimation of Specificity within Embedding Spaces