All Stories

  1. GrowthHacker: Automated Off-Policy Evaluation Optimization Using Code-Modifying LLM Agents
  2. A Survey on LLM-based Code Generation for Low-Resource and Domain-Specific Programming Languages
  3. HumanEvalComm: Benchmarking the Communication Competence of Code Generation for LLMs and LLM Agent
  4. AutoOffAB: Toward Automated Offline A/B Testing for Data-Driven Requirement Engineering
  5. V-Model in Building ML-Enabled Software