All Stories

  1. LLM-Assisted Relevance Assessments: When Should We Ask LLMs for Help?
  2. A Journey of Language Tasks Evaluation: A Keynote at SIGIR 2024
  3. Browsing and Searching Metadata of TREC
  4. Human Preferences as Dueling Bandits
  5. Too Many Relevants
  6. TREC Deep Learning Track: Reusable Test Collections in the Large Data Regime
  7. On the Quality of the TREC-COVID IR Test Collections
  8. Coopetition in IR Research
  9. Coopetition in IR Research
  10. TREC-COVID
  11. On Building Fair and Reusable Test Collections using Bandit Techniques
  12. Evaluating Evaluation Measure Stability
  13. On the Behavior of PRES Using Incomplete Judgment Sets