Module 8: Debugging · Lesson 4 of 4

Diagnosing Bad Design

Learn to identify and fix systemic issues.

Diagnostic Quiz

For each symptom, identify the likely cause:

Symptom 1: Agent gives generic responses

Agent never references your specific context

Likely cause:

  • A) Bad model
  • B) SOUL.md and USER.md not loaded
  • C) Network issues
<details> <summary>Answer</summary> **B) SOUL.md and USER.md not loaded**

The agent doesn't have identity/user context. Check:

  1. Files exist in workspace
  2. AGENTS.md tells agent to load them
  3. Files are properly formatted
</details>
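
The three checks in the answer above can be scripted. What follows is a minimal sketch, assuming the workspace is a plain directory and that AGENTS.md refers to the context files by name. The function name `check_context_files` is hypothetical, and "properly formatted" is approximated here as "not empty".

```python
from pathlib import Path

REQUIRED = ("SOUL.md", "USER.md")  # context files named in this lesson


def check_context_files(workspace, required=REQUIRED):
    """Return a list of problems found; an empty list means every check passed."""
    root = Path(workspace)
    problems = []

    agents = root / "AGENTS.md"
    agents_text = agents.read_text(encoding="utf-8").strip() if agents.is_file() else ""
    if not agents_text:
        problems.append("AGENTS.md is missing or empty")

    for name in required:
        path = root / name
        if not path.is_file():
            problems.append(f"{name} does not exist in the workspace")
            continue
        if not path.read_text(encoding="utf-8").strip():
            problems.append(f"{name} is empty")
        # Assumption: AGENTS.md loads a file by mentioning its name.
        if agents_text and name not in agents_text:
            problems.append(f"AGENTS.md never mentions {name}")
    return problems
```

Calling `check_context_files(".")` from the workspace root returns one line per problem, which points at the check that failed.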

Symptom 2: Agent forgets things between sessions

Something it knew in one session is gone in the next

Likely cause:

  • A) Agent is lying
  • B) Memory not being saved to files
  • C) Model is too small
<details> <summary>Answer</summary> **B) Memory not being saved to files**

Check:

  1. Daily logs being written?
  2. MEMORY.md being updated?
  3. Pre-compaction saves happening?
</details>
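
The first two checks in the answer above come down to file freshness. The sketch below assumes daily logs live in a `memory/` folder as one `.md` file per day; that layout, the 24-hour threshold, and the function name `stale_memory_files` are illustrative assumptions, not something this lesson specifies. It does not cover pre-compaction saves, which depend on your agent runtime.

```python
import time
from pathlib import Path


def stale_memory_files(workspace, max_age_hours=24.0, now=None):
    """Return the memory artifacts that are missing or older than max_age_hours."""
    root = Path(workspace)
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    stale = []

    memory = root / "MEMORY.md"
    if not memory.is_file() or memory.stat().st_mtime < cutoff:
        stale.append("MEMORY.md")

    # Assumed layout: memory/<date>.md, one file per day.
    logs = list((root / "memory").glob("*.md"))
    newest = max((p.stat().st_mtime for p in logs), default=None)
    if newest is None or newest < cutoff:
        stale.append("daily log")
    return stale
```

An empty result after a working session suggests saves are happening; a non-empty one names the artifact to investigate.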

Symptom 3: Agent uses wrong tools

Searches the web for info that's in your files

Likely cause:

  • A) Agent prefers web
  • B) Tool descriptions are unclear
  • C) Files are corrupted
<details> <summary>Answer</summary> **B) Tool descriptions are unclear**

The agent doesn't know when to use which tool. Fix:

  1. Better tool descriptions
  2. Examples in AGENTS.md
  3. Explicit rules ("check local files before web")
</details>
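
As an illustration of the fixes in the answer above, the sketch below pairs tool descriptions that say when to use each tool with a helper that applies the "check local files before web" rule in code. The tool name `search_files`, the `TOOLS` registry, and the `lookup` helper are hypothetical; in practice the rule may live only in AGENTS.md as an instruction.

```python
# Hypothetical tool registry: each description states when to use the tool.
TOOLS = {
    "search_files": "Search the local workspace. Use this first for anything "
                    "about the user's notes, projects, or past sessions.",
    "web_search": "Search the public web. Use only when the workspace has no "
                  "answer or the question needs current external data.",
}


def lookup(query, search_files, web_search):
    """Apply the 'check local files before web' rule.

    search_files and web_search are caller-supplied callables that take a
    query string and return a list of results.
    """
    hits = search_files(query)
    if hits:
        return "search_files", hits
    # Fall back to the web only when nothing was found locally.
    return "web_search", web_search(query)
```

Returning the tool name alongside the results makes it easy to log which source actually answered a question.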

Symptom 4: Agent hallucinates data

Makes up statistics, cites non-existent sources

Likely cause:

  • A) Agent is malicious
  • B) No tools for real data retrieval
  • C) Context window too small
<details> <summary>Answer</summary> **B) No tools for real data retrieval**

The agent has no way to retrieve real data, so it invents it. Fix:

  1. Add web_search tool
  2. Add API integrations
  3. Instruction: "Use tools for current data, never guess"
</details>

Diagnostic Exercise

Run this diagnostic on your agent:

Test 1: Identity

Ask: "What's your name and purpose?"
  • ✅ Answers with specific name/purpose = Good
  • ❌ Generic "I'm an AI assistant" = SOUL.md issue

Test 2: Memory

  1. Tell: "Remember that my dog is named Max"
  2. End the session and start a new one.
  3. Ask: "What's my dog's name?"
  • ✅ Remembers Max = Good
  • ❌ Doesn't know = Memory save issue

Test 3: Tools

Ask: "What's the current Bitcoin price?"
  • ✅ Uses tool, gives current price = Good
  • ❌ Gives outdated/made-up price = Tool use issue

Test 4: Context

Ask: "What did we discuss yesterday?"
  • ✅ References daily log = Good
  • ❌ Makes things up or says "I don't know" = Context loading issue
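
Tests 1 and 2 above can be automated once you can send prompts to your agent from code. The sketch below assumes two hypothetical hooks that you would wire to your own setup: `ask(prompt)`, which returns the agent's reply as a string, and `new_session()`, which ends the current session and starts a fresh one. Tests 3 and 4 are left out because they need access to tool-call traces or daily logs, which vary by runtime.

```python
def run_diagnostics(ask, new_session, agent_name):
    """Run Tests 1 and 2 and return {test name: passed}.

    ask and new_session are hypothetical hooks supplied by the caller.
    """
    results = {}

    # Test 1: a specific identity should mention the configured name.
    reply = ask("What's your name and purpose?")
    results["identity"] = agent_name.lower() in reply.lower()

    # Test 2: a fact stated in one session should survive into the next.
    ask("Remember that my dog is named Max")
    new_session()
    reply = ask("What's my dog's name?")
    # Crude substring match; good enough for a smoke test.
    results["memory"] = "max" in reply.lower()

    return results
```

A `False` under `identity` points toward the SOUL.md issue and a `False` under `memory` toward the memory save issue described above.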

Module Summary

You've learned:

  • ✅ The five failure categories
  • ✅ How to distinguish real actions from hallucinations
  • ✅ The mindset shift: fix systems, not prompts
  • ✅ How to diagnose common design problems

Key Insight: "If your agent is wrong, your system design is wrong — not the AI."

Next: Taking your agent to production.