AI Agents Weekly: Evaluating AGENTS.md & More
From Elvis Saravia's AI Newsletter - February 28, 2026
Main Story: Evaluating AGENTS.md - Do Context Files Help Coding Agents?
Researchers from UIUC and Microsoft Research investigated whether repository-level context files like AGENTS.md, CLAUDE.md, and similar instruction files actually improve coding agent performance, and the results are surprising.
Key Finding: Context Files Hurt Performance
Counter to widespread industry practice, the study found that context files reduce task success rates compared to giving agents no context at all, while also increasing inference costs by over 20%.
Detailed Findings
- Lower success rates: Both LLM-generated and human-written context files caused agents to solve fewer tasks on the SWE-bench benchmark compared to a no-context baseline.
- Broader but less effective exploration: Context files prompted agents to explore repositories more thoroughly (more testing, more file traversal), but the additional constraints made tasks harder rather than easier.
- Minimal is better: The authors recommend context files describe only minimal, essential requirements rather than comprehensive specifications; unnecessary constraints actively harm agent performance.
Practical Takeaways
- Developers should rethink how they write AGENTS.md, CLAUDE.md, and similar files.
- Focus on essential guardrails only; avoid exhaustive instructions.
- More detail ≠ better performance; over-specification is a real cost.
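One way to act on the takeaways above is to audit a context file for sheer volume before committing it. The sketch below is illustrative only: the function name `audit_context_file`, the `ContextFileReport` type, the sample file contents, and the default budget of 15 instruction lines are all assumptions made for this example, not values or tooling from the study.

```python
from dataclasses import dataclass


@dataclass
class ContextFileReport:
    instruction_lines: int  # non-empty lines that are not markdown headings
    word_count: int         # whitespace-separated tokens across those lines
    over_budget: bool       # True when instruction_lines exceeds the budget


def audit_context_file(text: str, max_instruction_lines: int = 15) -> ContextFileReport:
    """Count instruction lines in a context file and flag it if it exceeds a budget.

    The budget is an arbitrary, hypothetical threshold; tune it for your project.
    """
    lines = [ln.strip() for ln in text.splitlines()]
    # Treat every non-empty, non-heading line as one instruction.
    instructions = [ln for ln in lines if ln and not ln.startswith("#")]
    words = sum(len(ln.split()) for ln in instructions)
    return ContextFileReport(
        instruction_lines=len(instructions),
        word_count=words,
        over_budget=len(instructions) > max_instruction_lines,
    )


if __name__ == "__main__":
    # Hypothetical minimal AGENTS.md with two essential guardrails.
    sample = """# AGENTS.md
- Run the test suite with `pytest` before finishing.
- Do not edit files under `vendor/`.
"""
    print(audit_context_file(sample))
```

A line count is a crude proxy: it says nothing about whether an individual instruction is essential, only that a long file deserves a second look.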
Other Headlines (Paywalled)
The full issue also covers the following topics (summaries not accessible due to paywall):
- Perplexity Computer: end-to-end task automation launch
- Google Nano Banana 2: free model release
- Sakana AI Doc-to-LoRA & Text-to-LoRA: document and text-based LoRA fine-tuning tools
- Notion Custom Agents 3.3: new agent capabilities in Notion
- Nous Research Hermes Agent: open-source agent model release
- GPT-5.3-Codex: now available to all developers
- Mercury 2: reasoning diffusion LLM release
- Qwen 3.5 medium model series: new model drop
- Claude Code auto-memory: persistent memory across sessions
- RoguePilot: GitHub Copilot vulnerability exposure
- Vercel Chat SDK: open-source multi-platform bot framework
Papers
- Paper: "Evaluating AGENTS.md: Are Context Files Helpful for Coding Agents?" (UIUC & Microsoft Research)
Bottom Line
The headline insight from this issue is a counterintuitive but important one for AI developers: less is more when it comes to agent context files. Detailed AGENTS.md instructions may feel like good engineering practice, but the evidence suggests they can actively work against the agents they're meant to guide.

