Chrome DevTools for Claude Code Agents

Developer Observability for AI Agents

See what's happening under the hood. Benchmark agents. Measure everything.

$ bun install -g @carrier-sh/carrier

Quickstart

# Run an agent and capture full telemetry
carrier deploy code-reviewer "review auth.ts"

# View complete execution log
cat .carrier/deployed/{id}/logs/*.json | jq

# Sample output:
{
  "timestamp": "2024-10-10T10:23:45Z",
  "type": "tool_call",
  "content": {
    "name": "Read",
    "input": { "file_path": "src/auth.ts" }
  },
  "tokens": { "input": 234, "output": 156 }
}
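
A record like the one above can be consumed with a few lines of code. The following is a minimal Python sketch, assuming every log record has the shape of the sample output (a `tokens` object with `input` and `output` counts). Real Carrier logs may contain other record types or fields, and `total_tokens` is a hypothetical helper, not part of Carrier.

```python
import json

# One record copied from the sample output above; real logs may differ.
SAMPLE_LOG = """
{
  "timestamp": "2024-10-10T10:23:45Z",
  "type": "tool_call",
  "content": {
    "name": "Read",
    "input": { "file_path": "src/auth.ts" }
  },
  "tokens": { "input": 234, "output": 156 }
}
"""


def total_tokens(records):
    """Sum input and output token counts across records (hypothetical helper)."""
    totals = {"input": 0, "output": 0}
    for record in records:
        tokens = record.get("tokens", {})  # tolerate records without token data
        totals["input"] += tokens.get("input", 0)
        totals["output"] += tokens.get("output", 0)
    return totals


records = [json.loads(SAMPLE_LOG)]
print(total_tokens(records))  # {'input': 234, 'output': 156}
```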

Get observability in 60 seconds

1. Install Carrier
   bun install -g @carrier-sh/carrier
2. Run any agent with telemetry
   carrier deploy code-reviewer "review auth.ts"
3. View complete telemetry logs
   cat .carrier/deployed/*/logs/*.json | jq
4. Benchmark different agents
   carrier benchmark "task" --agents=a1,a2,a3

Read the Full Documentation →

The Problem

When you use AI agents, you have zero visibility into what they're doing. No metrics on token usage, tool calls, or performance. No way to compare agents or verify quality before deploying.

📡

Full Telemetry Capture

Every tool call, token used, turn taken. Complete execution logs with timestamps, parameters, and results. Nothing hidden.

⚡

Live Benchmarking

Run multiple agents side-by-side on the same task. See which performs best before deploying. Data-driven agent selection.

👁️

Real-time Streaming

Watch agents work live. See files accessed, commands executed, decisions made. Chrome DevTools for AI.

🗂️

Context Tracking

Automatic capture of files read/written, commands run, tools used. Complete audit trail for every execution.
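
One way to turn captured tool calls into an audit trail is to group them by tool name. This is a rough Python sketch, assuming records shaped like the quickstart sample (`type`, `content.name`, `content.input`). The `Bash` and `message` records, the `command` and `text` fields, and the `audit_trail` helper are illustrative assumptions, not documented Carrier output.

```python
from collections import defaultdict

# Hypothetical records modeled on the quickstart sample. Only the "Read"
# shape comes from that sample; the other entries are assumptions.
records = [
    {"type": "tool_call", "content": {"name": "Read", "input": {"file_path": "src/auth.ts"}}},
    {"type": "tool_call", "content": {"name": "Read", "input": {"file_path": "src/session.ts"}}},
    {"type": "tool_call", "content": {"name": "Bash", "input": {"command": "bun test"}}},
    {"type": "message", "content": {"text": "Review complete."}},
]


def audit_trail(records):
    """Group the inputs of every tool_call record by tool name."""
    trail = defaultdict(list)
    for record in records:
        if record.get("type") != "tool_call":
            continue  # skip anything that is not a tool call
        content = record.get("content", {})
        trail[content.get("name", "unknown")].append(content.get("input", {}))
    return dict(trail)


trail = audit_trail(records)
files_read = [call["file_path"] for call in trail.get("Read", [])]
print(files_read)  # ['src/auth.ts', 'src/session.ts']
```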

🔧

Agent Builder

Create agents through conversation. Configure purpose, tone, output format. Test and deploy in minutes.

💾

Structured Data Export

All telemetry stored as queryable JSON. Export to your data warehouse. Build custom analytics.
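
For warehouse loading, nested JSON records usually need to be flattened into rows first. Below is a minimal Python sketch that writes CSV, assuming the record shape from the quickstart sample. The column names, the second record, and the `to_csv` helper are hypothetical choices for illustration.

```python
import csv
import io

# The first record mirrors the quickstart sample; the second is invented.
records = [
    {
        "timestamp": "2024-10-10T10:23:45Z",
        "type": "tool_call",
        "content": {"name": "Read", "input": {"file_path": "src/auth.ts"}},
        "tokens": {"input": 234, "output": 156},
    },
    {
        "timestamp": "2024-10-10T10:23:50Z",
        "type": "tool_call",
        "content": {"name": "Read", "input": {"file_path": "src/session.ts"}},
        "tokens": {"input": 120, "output": 80},
    },
]

COLUMNS = ["timestamp", "type", "tool", "input_tokens", "output_tokens"]


def to_csv(records):
    """Flatten nested log records into CSV text, one row per record."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    for record in records:
        tokens = record.get("tokens", {})
        writer.writerow([
            record.get("timestamp", ""),
            record.get("type", ""),
            record.get("content", {}).get("name", ""),
            tokens.get("input", 0),
            tokens.get("output", 0),
        ])
    return buffer.getvalue()


csv_text = to_csv(records)
print(csv_text)
```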

Before vs After

Claude Code intentionally hides complexity. But power users need data to make informed decisions.

❌

Without Carrier

• Download random .md agents, hope they work
• Zero visibility into what agents are doing
• No idea if they're worth the token cost
• Can't compare different agents objectively
• No way to verify quality before deploying
• Creating agents means manually editing markdown

✅

With Carrier

✓ Benchmark agents side-by-side, see real metrics
✓ Full telemetry on every tool call, token, turn
✓ ROI tracking with complete token usage data
✓ Data-driven decisions based on performance metrics
✓ Quality verification before production deployment
✓ Interactive builder to create agents in minutes

100%

Execution transparency

Every

Tool call captured

JSON

Structured data export

How It Works

Carrier wraps Claude Code agents with comprehensive telemetry, then exports structured data you can query and analyze.

Capture Everything

Every tool call, token used, file accessed, command run. Complete audit trail with timestamps and parameters.

Benchmark & Compare

Run multiple agents side-by-side. See which performs best on speed, quality, and cost metrics.
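
A side-by-side comparison can be approximated from the logs themselves. The sketch below, in Python, assumes each agent's run yields records with the `timestamp` and `tokens` fields seen in the quickstart sample. The agent names `a1` and `a2` are placeholders, the numbers are invented, and `summarize` is a hypothetical helper. It covers cost (tokens) and speed (elapsed time) only; quality needs a separate, task-specific check.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # matches the sample log timestamps

# Invented log records for two placeholder agents.
runs = {
    "a1": [
        {"timestamp": "2024-10-10T10:23:45Z", "tokens": {"input": 234, "output": 156}},
        {"timestamp": "2024-10-10T10:24:15Z", "tokens": {"input": 300, "output": 110}},
    ],
    "a2": [
        {"timestamp": "2024-10-10T10:23:45Z", "tokens": {"input": 500, "output": 400}},
        {"timestamp": "2024-10-10T10:24:05Z", "tokens": {"input": 450, "output": 250}},
    ],
}


def summarize(records):
    """Return total tokens and elapsed seconds between first and last record."""
    times = [datetime.strptime(r["timestamp"], TIMESTAMP_FORMAT) for r in records]
    total = sum(r["tokens"]["input"] + r["tokens"]["output"] for r in records)
    elapsed = (max(times) - min(times)).total_seconds()
    return {"total_tokens": total, "elapsed_seconds": elapsed}


summaries = {agent: summarize(records) for agent, records in runs.items()}

# Rank agents from cheapest to most expensive by token usage.
ranked = sorted(summaries, key=lambda agent: summaries[agent]["total_tokens"])
for agent in ranked:
    print(agent, summaries[agent])
```

In this invented data the agent that uses fewer tokens is also the slower one, which is the kind of trade-off a benchmark is meant to surface.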

Analyze & Optimize

Export structured JSON logs. Query with jq, load into your data warehouse, build custom analytics dashboards.

Why This Matters

Different users, same problem: no visibility into agent behavior. Carrier solves this at every level.

👨‍💻

For Developers

❌ Before

Download random .md agents, hope they work

✓ After

Benchmark agents, see real metrics, choose the best

👥

For Teams

❌ Before

No idea what Claude costs or if it's worth it

✓ After

Full telemetry, ROI tracking, cost optimization
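
Token counts translate into a cost estimate once a price per token is known. This is a minimal Python sketch under that assumption. The prices are made-up placeholders, not real rates for any model, and `estimate_cost` is a hypothetical helper rather than a Carrier feature.

```python
# Placeholder prices in USD per million tokens. These are NOT real rates;
# substitute the pricing that applies to the model you actually use.
PRICE_PER_MILLION = {"input": 1.00, "output": 2.00}


def estimate_cost(tokens, prices=PRICE_PER_MILLION):
    """Estimate the USD cost of one record's token usage."""
    return sum(tokens.get(kind, 0) * prices[kind] / 1_000_000 for kind in prices)


# Token counts taken from the quickstart sample record.
sample_tokens = {"input": 234, "output": 156}
cost = estimate_cost(sample_tokens)
print(f"${cost:.6f}")  # $0.000546
```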

🤖

For Future AI

❌ Before

No memory of what worked before

✓ After

Historical data, proven patterns, learned optimizations

The Future is Developer-Driven

As AI agents become critical infrastructure, developers need tooling as good as what they already have for traditional software. Carrier brings that tooling to Claude Code.

🔬

Measure Everything

Just like you profile your backend or measure frontend performance, you should understand your agents. Every token, every decision, every outcome.

🧪

Test Before Deploy

You wouldn't ship code without tests. Why deploy agents without benchmarking? Compare approaches, validate quality, make data-driven choices.

📈

Build on Data

Structured telemetry becomes organizational knowledge. Learn what works, optimize over time, share insights with your team.

We're building the developer tools AI agents deserve.
Purpose-built for developers who want control, visibility, and data-driven decisions.

Stop Flying Blind

Join developers using Carrier to gain full visibility into their AI agents. See what works. Measure what matters.

Get Started | Star on GitHub