Module 7: Tools & APIs, Lesson 4 of 5

Browser, Files, APIs

Browser, file, and API/exec tools are the three most powerful tool categories. This lesson covers when to use each one and how to use it well.

Browser Tools

Control a real web browser.

When to Use

  • Dynamic websites (JavaScript-rendered)
  • Login-required pages
  • Multi-step web interactions
  • Screenshots for verification

Examples

    // Open a page
    browser({ action: "open", targetUrl: "https://example.com" })
    // Take a screenshot
    browser({ action: "screenshot", fullPage: true })
    // Click a button
    browser({ action: "act", request: { kind: "click", ref: "e12" } })
    // Fill a form
    browser({ action: "act", request: { kind: "type", ref: "e15", text: "hello" } })
    // Get page structure
    browser({ action: "snapshot" })

Best Practices

  • Use snapshot to understand page structure
  • Reference elements by their ref IDs
  • Take screenshots to verify actions
  • Close tabs when done
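One way to make "close tabs when done" hard to forget is to wrap the work in a helper that closes the tab even when a step fails. The sketch below is only an illustration: `withTab` and `fakeBrowser` are hypothetical names, the calls are written synchronously for brevity, and the `{ action: "close" }` call is an assumption, since this lesson does not show how a tab is actually closed.

```javascript
// Hypothetical helper: open a tab, run a task, and always close the tab.
// Assumption: the browser tool accepts { action: "close" }; the lesson does
// not show the real call for closing a tab.
function withTab(browser, url, task) {
  browser({ action: "open", targetUrl: url });
  try {
    return task(browser);
  } finally {
    browser({ action: "close" }); // runs even if task throws
  }
}

// Recording stub so the sketch runs without a real browser.
const actions = [];
const fakeBrowser = (call) => {
  actions.push(call.action);
};

withTab(fakeBrowser, "https://example.com", (browser) => {
  browser({ action: "snapshot" });
  browser({ action: "screenshot", fullPage: true });
});
console.log(actions.join(" > "));
```

The page is opened outside the `try` block on purpose: if opening fails there is no tab to close.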

File Tools

Read and write local files.

When to Use

  • Reading configuration
  • Saving results
  • Creating documents
  • Modifying code

Examples

    // Read a file
    Read({ path: "workspace/MEMORY.md" })
    // Write a file (creates it if it doesn't exist)
    Write({ path: "output/results.md", content: "# Results\n..." })
    // Edit a file (surgical replacement)
    Edit({ path: "config.yaml", oldText: "debug: false", newText: "debug: true" })

Best Practices

  • Use Edit for precise changes (not full rewrites)
  • Verify that a path exists before reading or editing it
  • Use relative paths from workspace
  • No need to create directories first (Write auto-creates them)
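To see why Edit suits precise changes, here is a minimal sketch of a surgical replacement on an in-memory string. `applyEdit` is a hypothetical helper, not the real Edit tool. It assumes `oldText` is matched exactly, and it chooses to reject missing or ambiguous matches; the real tool may handle those cases differently.

```javascript
// Hypothetical stand-in for Edit: replace exactly one occurrence of oldText.
// Assumption: matching is exact (whitespace and case included).
function applyEdit(content, oldText, newText) {
  const first = content.indexOf(oldText);
  if (first === -1) {
    throw new Error("oldText not found; re-read the file before editing");
  }
  // Look for a second occurrence starting just after the first one.
  if (content.indexOf(oldText, first + 1) !== -1) {
    throw new Error("oldText matches more than once; include more context");
  }
  return content.slice(0, first) + newText + content.slice(first + oldText.length);
}

const original = "name: demo\ndebug: false\nport: 8080\n";
const updated = applyEdit(original, "debug: false", "debug: true");
console.log(updated);
```

Only the matched span changes, so the rest of the file cannot be altered by accident the way it can in a full rewrite.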

API/Exec Tools

Run commands and call APIs.

When to Use

  • Running scripts
  • Git operations
  • System administration
  • Custom integrations

Examples

    // Run a command
    exec({ command: "git status" })
    // Background process
    exec({ command: "npm run dev", background: true })
    // With timeout
    exec({ command: "npm install", timeout: 120 })
    // TTY-required commands
    exec({ command: "vim file.txt", pty: true })

Best Practices

  • Set timeouts for long commands
  • Use background: true for servers
  • Check exit codes
  • Capture output for verification

Combining Tools

Real tasks often need multiple tools:

Example: Research and Save

  1. web_search("topic") → Get URLs
  2. web_fetch(url) → Get content
  3. Agent synthesizes → Creates summary
  4. Write(summary) → Saves to file

Example: Code Change

  1. Read(file) → Understand current state
  2. Agent plans changes
  3. Edit(file, old, new) → Make change
  4. exec("npm test") → Verify
  5. exec("git commit") → Save
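The code-change steps above can be sketched as a single function that takes the tools as a parameter and commits only when the tests pass. Everything here beyond the tool names is an assumption made for illustration: the `tools` object, the `{ exitCode }` result shape, the `changeAndVerify` and `makeFakeTools` helpers, and the synchronous call style.

```javascript
// Sketch of the code-change flow with injected tools. The tools object and
// the { exitCode } result shape are assumptions, not the real tool API.
function changeAndVerify(tools, { path, oldText, newText, message }) {
  const current = tools.Read({ path });                 // 1. understand current state
  if (!current.includes(oldText)) {                     // 2. plan: confirm the target exists
    return { committed: false, reason: "oldText not found" };
  }
  tools.Edit({ path, oldText, newText });               // 3. make the change
  const test = tools.exec({ command: "npm test" });     // 4. verify
  if (test.exitCode !== 0) {
    return { committed: false, reason: "tests failed" };
  }
  tools.exec({ command: `git commit -am "${message}"` }); // 5. save
  return { committed: true };
}

// In-memory stand-ins so the sketch runs without a real workspace.
function makeFakeTools(files, testsPass) {
  const commands = [];
  return {
    commands,
    Read: ({ path }) => files[path],
    Edit: ({ path, oldText, newText }) => {
      files[path] = files[path].replace(oldText, newText);
    },
    exec: ({ command }) => {
      commands.push(command);
      return { exitCode: command === "npm test" && !testsPass ? 1 : 0 };
    },
  };
}

const files = { "config.yaml": "debug: false\n" };
const tools = makeFakeTools(files, true);
const outcome = changeAndVerify(tools, {
  path: "config.yaml",
  oldText: "debug: false",
  newText: "debug: true",
  message: "Enable debug",
});
console.log(outcome, tools.commands);
```

When the tests fail, this sketch leaves the edit in place and skips the commit, so the change can be inspected before deciding what to do next.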

Example: Web Automation

  1. browser.open(url) → Load page
  2. browser.snapshot() → Understand structure
  3. browser.act(fill form) → Enter data
  4. browser.act(click submit) → Submit
  5. browser.screenshot() → Verify result
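As a rough illustration, the five steps above map onto the `browser` calls from the Browser Tools examples as follows. The `submitForm` function, the `pickRefs` callback (standing in for the agent reading the snapshot and choosing ref IDs), and the recording stub are all hypothetical, and the format of a snapshot is not specified in this lesson, so the stub returns a placeholder string.

```javascript
// Sketch of the web-automation flow. pickRefs models the agent choosing
// element refs from the snapshot; its name and shape are assumptions.
function submitForm(browser, { url, text, pickRefs }) {
  browser({ action: "open", targetUrl: url });                          // 1. load page
  const snapshot = browser({ action: "snapshot" });                     // 2. understand structure
  const { fieldRef, submitRef } = pickRefs(snapshot);
  browser({ action: "act", request: { kind: "type", ref: fieldRef, text } }); // 3. enter data
  browser({ action: "act", request: { kind: "click", ref: submitRef } });     // 4. submit
  return browser({ action: "screenshot", fullPage: true });             // 5. verify result
}

// Recording stub so the sketch runs without a real browser.
const calls = [];
function fakeBrowser(call) {
  calls.push(call);
  return call.action === "snapshot" ? "(placeholder page structure)" : undefined;
}

submitForm(fakeBrowser, {
  url: "https://example.com",
  text: "hello",
  pickRefs: () => ({ fieldRef: "e15", submitRef: "e12" }),
});
console.log(calls.map((c) => c.action).join(" > "));
```

Refs are chosen only after the snapshot is taken, which mirrors the advice to reference elements by the ref IDs found in the page structure.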