Module 8: Debugging • Lesson 4 of 4
Diagnosing Bad Design
Learn to identify and fix systemic issues.
Diagnostic Quiz
For each symptom, identify the likely cause:
Symptom 1: Agent gives generic responses
Agent never references your specific context
Likely cause:
- A) Bad model
- B) SOUL.md and USER.md not loaded
- C) Network issues
Answer: B. The agent doesn't have its identity or user context. Check:
- SOUL.md and USER.md exist in the workspace
- AGENTS.md tells the agent to load them
- Both files are properly formatted
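The first two checks can be automated. The following is a minimal sketch, assuming all three files sit at the top level of a workspace directory; the function name `check_context_files` is hypothetical, and "properly formatted" is only approximated here as "not empty".

```python
from pathlib import Path

# File names come from the lesson; everything else is an assumption.
CONTEXT_FILES = ("SOUL.md", "USER.md")


def check_context_files(workspace):
    """Return a list of problems found; an empty list means the checks passed."""
    workspace = Path(workspace)
    problems = []

    # AGENTS.md is where the agent is told which files to load.
    agents_path = workspace / "AGENTS.md"
    agents_text = ""
    if agents_path.is_file():
        agents_text = agents_path.read_text(encoding="utf-8")
    if not agents_text.strip():
        problems.append("AGENTS.md is missing or empty")

    for name in CONTEXT_FILES:
        path = workspace / name
        if not path.is_file():
            problems.append(f"{name} does not exist in the workspace")
        elif not path.read_text(encoding="utf-8").strip():
            # Rough stand-in for "properly formatted".
            problems.append(f"{name} is empty")
        # Only meaningful when AGENTS.md has content to search.
        if agents_text.strip() and name not in agents_text:
            problems.append(f"AGENTS.md never mentions {name}")

    return problems
```

Calling `check_context_files("path/to/workspace")` and printing each returned string gives a quick list of what to fix before blaming the model.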
Symptom 2: Agent forgets things between sessions
Tells you something one day, forgets the next
Likely cause:
- A) Agent is lying
- B) Memory not being saved to files
- C) Model is too small
Answer: B. Memory is not being saved to files. Check:
- Are daily logs being written?
- Is MEMORY.md being updated?
- Are pre-compaction saves happening?
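The first two questions can be answered by looking at the files themselves. Here is a rough sketch, assuming daily logs are stored as `memory/YYYY-MM-DD.md` inside the workspace; that layout, the 7-day staleness threshold, and the name `check_memory` are all assumptions, so adjust them to your setup. Pre-compaction saves cannot be verified from file timestamps alone and still need a manual check.

```python
from datetime import date, datetime, timedelta
from pathlib import Path


def check_memory(workspace, today=None, log_dir="memory", max_age_days=7):
    """Return a list of problems with memory persistence; empty means OK."""
    workspace = Path(workspace)
    today = today or date.today()
    problems = []

    # Assumed layout: one log file per day, named by ISO date.
    daily_log = workspace / log_dir / f"{today.isoformat()}.md"
    if not daily_log.is_file():
        problems.append(f"no daily log for {today.isoformat()}")

    memory_file = workspace / "MEMORY.md"
    if not memory_file.is_file():
        problems.append("MEMORY.md does not exist")
    else:
        # A long-untouched MEMORY.md suggests saves are not happening.
        modified = datetime.fromtimestamp(memory_file.stat().st_mtime).date()
        if today - modified > timedelta(days=max_age_days):
            problems.append(f"MEMORY.md last updated on {modified.isoformat()}")

    return problems
```

Run it at the end of a working day: if it reports a missing daily log after a session in which you gave the agent something to remember, the save step is the place to look.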
Symptom 3: Agent uses wrong tools
Searches the web for info that's in your files
Likely cause:
- A) Agent prefers web
- B) Tool descriptions are unclear
- C) Files are corrupted
Answer: B. The agent doesn't know when to use which tool. Fix:
- Better tool descriptions
- Examples in AGENTS.md
- Explicit rules ("check local files before web")
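If instructions alone don't change the behavior, the "check local files before web" rule can also be enforced in code. The sketch below is one possible way to do that, not a prescribed implementation: `search_local_files` and `answer` are hypothetical names, the search is a plain substring match over top-level Markdown files, and `web_search` is any callable you supply.

```python
from pathlib import Path


def search_local_files(workspace, query):
    """Return (file name, line) pairs from Markdown files that contain the query."""
    hits = []
    needle = query.lower()
    for path in sorted(Path(workspace).glob("*.md")):
        for line in path.read_text(encoding="utf-8").splitlines():
            if needle in line.lower():
                hits.append((path.name, line.strip()))
    return hits


def answer(workspace, query, web_search):
    """Apply the rule: look in local files first, fall back to the web."""
    hits = search_local_files(workspace, query)
    if hits:
        return {"source": "local", "results": hits}
    # Only reached when nothing local matched.
    return {"source": "web", "results": web_search(query)}
```

The `source` field makes it easy to confirm during testing that questions about your own files never reach the web tool.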
Symptom 4: Agent hallucinates data
Makes up statistics, cites non-existent sources
Likely cause:
- A) Agent is malicious
- B) No tools for real data retrieval
- C) Context window too small
Answer: B. The agent can't retrieve real data, so it makes it up. Fix:
- Add a web_search tool
- Add API integrations
- Add an instruction: "Use tools for current data, never guess"
Diagnostic Exercise
Run this diagnostic on your agent:
Test 1: Identity
Ask: "What's your name and purpose?"
- ✅ Answers with specific name/purpose = Good
- ❌ Generic "I'm an AI assistant" = SOUL.md issue
Test 2: Memory
Tell: "Remember that my dog is named Max"
End session. Start new session.
Ask: "What's my dog's name?"
- ✅ Remembers Max = Good
- ❌ Doesn't know = Memory save issue
Test 3: Tools
Ask: "What's the current Bitcoin price?"
- ✅ Uses tool, gives current price = Good
- ❌ Gives outdated/made-up price = Tool use issue
Test 4: Context
Ask: "What did we discuss yesterday?"
- ✅ References daily log = Good
- ❌ Makes things up or says "I don't know" = Context loading issue
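Tests 1 and 2 are simple enough to script if your agent can be called from code. The sketch below assumes a hypothetical `new_session()` factory that starts a fresh session and returns an `ask(prompt)` callable returning the reply as a string; neither is a real API, so wire them to whatever interface your agent exposes. The pass/fail checks are crude substring heuristics (an agent that introduces itself as "Atlas, an AI assistant for research" would be flagged as generic), and Tests 3 and 4 are left manual because they need you to inspect tool calls and logs.

```python
def run_diagnostics(new_session):
    """Run the identity and memory tests; return a dict of test name -> passed."""
    results = {}

    # Test 1: Identity. A generic self-description points to a SOUL.md issue.
    ask = new_session()
    reply = ask("What's your name and purpose?")
    results["identity"] = "ai assistant" not in reply.lower()

    # Test 2: Memory. Tell the agent a fact, then ask in a brand-new session.
    ask("Remember that my dog is named Max")
    ask = new_session()  # stands in for "End session. Start new session."
    reply = ask("What's my dog's name?")
    results["memory"] = "max" in reply.lower()

    return results
```

A result such as `{"identity": True, "memory": False}` tells you to skip the identity files and go straight to the memory save path.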
Module Summary
You've learned:
- ✅ The five failure categories
- ✅ How to distinguish real actions from hallucinations
- ✅ The mindset shift: fix systems, not prompts
- ✅ How to diagnose common design problems
Key Insight: "If your agent is wrong, your system design is wrong — not the AI."
Next: Taking your agent to production.