AI Agents Weekly - Evaluating Agents

AI Agents Weekly: Evaluating AGENTS.md & More

From Elvis Saravia’s AI Newsletter — February 28, 2026

🔍 Main Story: Evaluating AGENTS.md — Do Context Files Help Coding Agents?

Researchers from UIUC and Microsoft Research investigated whether repository-level context files such as AGENTS.md and CLAUDE.md actually improve coding agent performance. The results run against common practice.

Key Finding: Context Files Hurt Performance

Counter to widespread industry practice, the study found that context files reduce task success rates compared with running agents without any context file, while also increasing inference costs by over 20%.

Detailed Findings

Lower success rates: Both LLM-generated and human-written context files led agents to solve fewer tasks on the SWE-bench benchmark than a baseline with no context file.
Broader but less effective exploration: Context files prompted agents to explore repositories more thoroughly (more testing, more file traversal), but the additional constraints made tasks harder rather than easier.
Minimal is better: The authors recommend context files describe only minimal, essential requirements rather than comprehensive specifications — unnecessary constraints actively harm agent performance.
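
To check whether a context file helps on your own repository, you can compare success rate and per-task cost with and without it, which is the comparison the findings above describe. The following is a minimal sketch, not the authors' evaluation harness: the RunSummary schema, the compare function, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Aggregate results for one agent configuration (hypothetical schema)."""
    tasks_attempted: int
    tasks_solved: int
    total_cost_usd: float

    @property
    def success_rate(self) -> float:
        # Fraction of attempted tasks the agent solved.
        return self.tasks_solved / self.tasks_attempted

    @property
    def cost_per_task(self) -> float:
        # Average inference spend per attempted task.
        return self.total_cost_usd / self.tasks_attempted


def compare(baseline: RunSummary, with_context: RunSummary) -> dict:
    """Return the change in success rate (percentage points) and the
    relative change in per-task cost when a context file is added."""
    return {
        "success_delta_pp": 100 * (with_context.success_rate - baseline.success_rate),
        "cost_increase_pct": 100 * (with_context.cost_per_task / baseline.cost_per_task - 1),
    }


# Toy numbers for illustration only; they are NOT results from the paper.
baseline = RunSummary(tasks_attempted=100, tasks_solved=50, total_cost_usd=40.0)
with_context = RunSummary(tasks_attempted=100, tasks_solved=48, total_cost_usd=50.0)
result = compare(baseline, with_context)
print(result)
```

A negative success_delta_pp together with a positive cost_increase_pct would match the pattern the study reports: fewer tasks solved at a higher inference cost.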

Practical Takeaways

Developers should rethink how they write AGENTS.md, CLAUDE.md, and similar files.
Focus on essential guardrails only — avoid exhaustive instructions.
More detail ≠ better performance; over-specification is a real cost.

📰 Other Headlines (Paywalled)

The full issue also covers the following topics (summaries not accessible due to paywall):

Perplexity Computer — end-to-end task automation launch
Google Nano Banana 2 — free model release
Sakana AI Doc-to-LoRA & Text-to-LoRA — document and text-based LoRA fine-tuning tools
Notion Custom Agents 3.3 — new agent capabilities in Notion
Nous Research Hermes Agent — open-source agent model release
GPT-5.3-Codex — now available to all developers
Mercury 2 — reasoning diffusion LLM release
Qwen 3.5 medium model series — new model drop
Claude Code auto-memory — persistent memory across sessions
RoguePilot — GitHub Copilot vulnerability exposure
Vercel Chat SDK — open-source multi-platform bot framework

📄 Papers

Paper — Evaluating AGENTS.md: Are Context Files Helpful for Coding Agents? (UIUC & Microsoft Research)

💡 Bottom Line

The headline insight from this issue is counterintuitive but important for AI developers: less is more when it comes to agent context files. Detailed AGENTS.md instructions may feel like good engineering practice, but the evidence suggests they can work against the agents they’re meant to guide.
