2.4 KiB

Raw Blame History

Performance Considerations

Optimizing agent efficiency and resource usage.

Cost Factors

Agent loading time
Context switching overhead
Tool invocations
Model inference

Optimization Strategies

Right-Size Models

# ❌ Heavyweight for simple task
model: opus
# Task: Format code

# ✅ Appropriate
model: haiku  # or inherit

Focused Descriptions

# ❌ Too many triggers (slow matching)
description: Does everything related to code...

# ✅ Focused (fast matching)
description: |
  SQL injection detector. Triggers on
  SQL security, injection detection, query validation.

Minimal Context

// ❌ Too much context
{
  "task": "Review code",
  "context": ["@entire-codebase", "All git history"]
}

// ✅ Focused context
{
  "task": "Review authentication code",
  "context": ["@src/auth/auth.service.ts", "Focus on JWT validation"]
}

Sequential Over Parallel

// ❌ Parallel (multiple agent contexts)
- Security agent reviewing
- Performance agent reviewing
- Quality agent reviewing

// ✅ Sequential (one at a time)
1. Security agent → results
2. Performance agent → results
3. Quality agent → results

Why: Lower memory overhead, clearer results.

Caching

Agents benefit from prompt caching:

Description and instructions cached
Repeated invocations faster
Tool restrictions cached

Maximize caching:

Keep agent instructions stable
Don't dynamically generate agent content
Reuse agents frequently

Tool Philosophy

# Default: inherit (don't over-specify)
model: inherit

# If restricting, use baseline + needed extras
tools: Glob, Grep, Read, Skill, Task, TaskCreate, TaskUpdate, TaskList, TaskGet, WebSearch

# Full bash when needed (simpler than Bash(*))
tools: Glob, Grep, Read, Skill, Task, TaskCreate, TaskUpdate, TaskList, TaskGet, Bash

Context Size Guidelines

Agent Type	Typical Context	Max Recommended
Quick review	1-3 files	5 files
Standard review	3-10 files	20 files
Deep analysis	Full module	50 files
Research	Varies	Focused queries

Latency vs Quality Tradeoffs

Priority	Model	Context	Use Case
Speed	haiku	Minimal	Quick checks
Balance	sonnet/inherit	Moderate	Standard work
Quality	opus	Full	Critical analysis

2.4 KiB Raw Blame History

Performance Considerations

Cost Factors

Optimization Strategies

Right-Size Models

Focused Descriptions

Minimal Context

Sequential Over Parallel

Caching

Tool Philosophy

Context Size Guidelines

Latency vs Quality Tradeoffs

2.4 KiB

Raw Blame History