Module 9: From Toy to System • Lesson 4 of 6

Cost Control

AI agents can get expensive. Here's how to manage costs.

Understanding Costs

API Costs

  • Input tokens: What you send (prompts, context)
  • Output tokens: What the model generates
  • Tool calls: Tool definitions and results count as tokens; some hosted tools carry an extra per-use fee

Cost by Model (per 1M tokens)

Model            Input     Output
Claude Opus      $15       $75
Claude Sonnet    $3        $15
Claude Haiku     $0.25     $1.25
GPT-4o           $2.50     $10
GPT-4o-mini      $0.15     $0.60
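
To make the table concrete, here is a minimal sketch that turns token counts into an estimated dollar cost. The prices are copied from the table above; the dictionary keys and the function name `estimate_cost` are hypothetical labels chosen for this example, not official model identifiers.

```python
# Prices in USD per 1M tokens, copied from the table above.
# The keys are illustrative labels, not official API model names.
PRICES = {
    "claude-opus": (15.00, 75.00),
    "claude-sonnet": (3.00, 15.00),
    "claude-haiku": (0.25, 1.25),
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Compare the same request (20,000 input tokens, 1,000 output tokens) across models.
for name in PRICES:
    print(f"{name}: ${estimate_cost(name, 20_000, 1_000):.4f}")
```

At these prices, a request with 20,000 input tokens and 1,000 output tokens comes to about $0.375 on Opus and $0.075 on Sonnet.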

Cost Reduction Strategies

1. Right-Size Your Models

Don't use Opus for everything:

    # Coordinator: Smart (but expensive)
    default: claude-opus-4

    # Background tasks: Good enough (cheaper)
    cron:
      model: claude-sonnet-4

    # Simple tasks: Fast (very cheap)
    simple:
      model: claude-haiku

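Potential savings: 60-80%, depending on how much traffic you move off Opus. At the prices listed earlier, Sonnet costs 80% less per token than Opus.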
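
One way to apply the tiers is a small routing helper plus a back-of-the-envelope savings estimate. This is a sketch under assumptions: the task type names, the `pick_model` and `blended_savings` helpers, and the traffic split in the usage note are all hypothetical, and the prices are the per-1M input prices from the cost table.

```python
# Hypothetical task types mapped to the model tiers in the config above.
MODEL_BY_TASK = {
    "coordinator": "claude-opus-4",
    "cron": "claude-sonnet-4",
    "simple": "claude-haiku",
}

# Input price in USD per 1M tokens, taken from the cost table (assumed to apply here).
INPUT_PRICE = {
    "claude-opus-4": 15.00,
    "claude-sonnet-4": 3.00,
    "claude-haiku": 0.25,
}


def pick_model(task_type: str) -> str:
    """Return the model for a task type; unknown types fall back to the default."""
    return MODEL_BY_TASK.get(task_type, MODEL_BY_TASK["coordinator"])


def blended_savings(share_by_model: dict) -> float:
    """Fraction saved on input tokens versus sending everything to Opus.

    share_by_model maps a model name to its share of total tokens (shares sum to 1).
    """
    blended = sum(share * INPUT_PRICE[model] for model, share in share_by_model.items())
    return 1 - blended / INPUT_PRICE["claude-opus-4"]
```

With an assumed split of 20% Opus, 50% Sonnet, and 30% Haiku, `blended_savings` returns roughly 0.70, which sits inside the 60-80% range.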

2. Minimize Context

Every token you load is billed as input on each request:

    # Bad: Loading everything
    Load: SOUL.md, USER.md, MEMORY.md, all daily logs, all project files, all people files...

    # Good: Load on demand
    Always: SOUL.md, USER.md, MEMORY.md (index only)
    On demand: Specific detail files when needed
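
The load-on-demand pattern can be sketched in a few lines. The file names come from the example above; `build_context` and `rough_token_count` are hypothetical helpers, and the four-characters-per-token figure is only a rough rule of thumb for English text, not an exact tokenizer.

```python
from pathlib import Path
from typing import Iterable

# Files loaded on every request (MEMORY.md is assumed to hold the index only).
ALWAYS_LOAD = ["SOUL.md", "USER.md", "MEMORY.md"]


def build_context(root: Path, on_demand: Iterable[str] = ()) -> str:
    """Concatenate the always-loaded files plus any detail files requested for this task."""
    parts = []
    for name in ALWAYS_LOAD + list(on_demand):
        path = root / name
        if path.exists():  # skip missing files instead of failing
            parts.append(f"## {name}\n{path.read_text(encoding='utf-8')}")
    return "\n\n".join(parts)


def rough_token_count(text: str) -> int:
    """Very rough estimate, assuming about 4 characters per token."""
    return len(text) // 4
```

Comparing `rough_token_count(build_context(root))` with the count for a context that also includes every log and project file gives a quick sense of how much input each request would otherwise carry.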

3. Cache When Possible

If you repeat the same lookups:

  • Store results in files
  • Check cache before calling APIs
  • Set reasonable TTLs
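
The three points above can be combined into a small file-backed cache. This is a minimal sketch: the `.cache` directory, the `cached_lookup` name, and the one-hour default TTL are assumptions for illustration, and cached values must be JSON-serializable.

```python
import hashlib
import json
import time
from pathlib import Path

CACHE_DIR = Path(".cache")  # hypothetical location for cache files


def cached_lookup(key, fetch, ttl_seconds=3600, cache_dir=CACHE_DIR):
    """Return a fresh cached value for key, or call fetch() and store its result."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the key so any string is safe to use as a file name.
    path = cache_dir / (hashlib.sha256(key.encode("utf-8")).hexdigest() + ".json")
    if path.exists():
        entry = json.loads(path.read_text(encoding="utf-8"))
        if time.time() - entry["stored_at"] < ttl_seconds:
            return entry["value"]  # cache hit: no API call needed
    value = fetch()  # cache miss or expired entry: do the real lookup
    path.write_text(
        json.dumps({"stored_at": time.time(), "value": value}), encoding="utf-8"
    )
    return value
```

A call such as `cached_lookup("weather:berlin", lambda: call_weather_api("berlin"))` would hit the API at most once per hour, where `call_weather_api` stands in for whatever lookup you are repeating.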

4. Set Budgets

Implement spending limits:

    # Example budget config
    limits:
      daily: $10
      monthly: $200
      perSession: $1

Set up alerts that fire before you reach a limit, not after.
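
If your agent framework has no built-in limits, a guard like the following can enforce one in your own code. It is a sketch, not a feature of any particular tool: the `Budget` class is hypothetical, and the 80% alert threshold is an assumed default.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    limit: float               # spending limit in USD, e.g. 10 for the daily limit
    spent: float = 0.0         # amount spent so far in this period
    alert_ratio: float = 0.8   # assumed threshold: alert at 80% of the limit

    def record(self, cost: float) -> str:
        """Register a cost and return 'ok', 'alert', or 'blocked'."""
        if self.spent + cost > self.limit:
            return "blocked"  # refuse the call instead of overspending
        self.spent += cost
        if self.spent >= self.limit * self.alert_ratio:
            return "alert"  # still under the limit, but close to it
        return "ok"
```

Calling `record` with the estimated cost before each API request lets the agent stop, or switch to a cheaper model, as soon as it returns `"blocked"`.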

5. Monitor Usage

Track where your money goes:

  • Which sessions cost most?
  • What tasks are expensive?
  • Are there runaway processes?
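
Answering these questions mostly comes down to logging a cost per request and aggregating it. The sketch below assumes each record is a dictionary with `category` and `cost` keys; that shape, the `summarize_usage` name, and the sample numbers are invented for illustration.

```python
from collections import defaultdict


def summarize_usage(records):
    """Sum cost per category; return (categories ranked by cost, grand total)."""
    totals = defaultdict(float)
    for record in records:
        totals[record["category"]] += record["cost"]
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked, sum(totals.values())


# Sample records, made up for this example.
sample = [
    {"category": "main", "cost": 1.25},
    {"category": "cron", "cost": 0.50},
    {"category": "main", "cost": 0.75},
    {"category": "sub-agent", "cost": 0.25},
]

ranked, total = summarize_usage(sample)
for category, cost in ranked:
    print(f"- {category}: ${cost:.2f}")
print(f"Total: ${total:.2f}")
```

Sorting by cost puts the most expensive category first, which is usually the first place to look for savings.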

Cost Monitoring Example

Here is what my setup tracks:

    📊 Daily Usage Report
    - Main sessions: $2.40 (Opus)
    - Cron jobs: $0.80 (Sonnet)
    - Sub-agents: $0.30 (Sonnet)
    Total: $3.50/day
    Monthly estimate: ~$100

Red Flags

Watch for:

  • Sudden cost spikes
  • Infinite loops (an agent that keeps calling APIs)
  • Unnecessary tool calls
  • Oversized contexts
  • Wrong model for the task
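
Two of these red flags, cost spikes and runaway loops, are easy to check automatically. The following is a rough sketch under assumptions: the 2x spike factor and the 50-call cap are arbitrary defaults, and `is_cost_spike` and `CallGuard` are hypothetical names.

```python
from statistics import mean


def is_cost_spike(history, today, factor=2.0):
    """Return True if today's cost exceeds `factor` times the average of past days."""
    if not history:
        return False  # nothing to compare against yet
    return today > factor * mean(history)


class CallGuard:
    """Abort an agent run once it makes more than max_calls API calls."""

    def __init__(self, max_calls=50):
        self.max_calls = max_calls
        self.calls = 0

    def check(self):
        """Call once before every API request."""
        self.calls += 1
        if self.calls > self.max_calls:
            raise RuntimeError(
                f"Aborting: more than {self.max_calls} API calls in one run"
            )
```

Creating one `CallGuard` per agent run and calling `check()` before each request turns an infinite loop into a bounded, visible failure.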

Budget Template

Category       Model     Daily Budget
Main chat      Opus      $3
Cron jobs      Sonnet    $1
Research       Sonnet    $1
Quick tasks    Haiku     $0.50
Total                    $5.50/day

Adjust based on your actual usage patterns.