Chrome DevTools for Claude Code Agents

Developer Observability for AI Agents

See what's happening under the hood. Benchmark agents. Measure everything.

$ bun install -g @carrier-sh/carrier

Quickstart

# Run an agent and capture full telemetry
carrier deploy code-reviewer "review auth.ts"

# View complete execution log
cat .carrier/deployed/{id}/logs/*.json | jq

# Sample output:
{
  "timestamp": "2024-10-10T10:23:45Z",
  "type": "tool_call",
  "content": {
    "name": "Read",
    "input": { "file_path": "src/auth.ts" }
  },
  "tokens": { "input": 234, "output": 156 }
}
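
As a rough illustration of working with entries like the sample above, the Python sketch below totals token usage across a list of log entries. It assumes every entry has the same shape as the sample (a `type` field and a `tokens` object with `input` and `output` counts); the real log schema may differ, and `summarize_tokens` is a hypothetical helper, not part of Carrier.

```python
import json
from collections import Counter

# One entry copied from the sample output above; real logs may carry more fields.
SAMPLE_LOG = """
[
  {
    "timestamp": "2024-10-10T10:23:45Z",
    "type": "tool_call",
    "content": {"name": "Read", "input": {"file_path": "src/auth.ts"}},
    "tokens": {"input": 234, "output": 156}
  }
]
"""


def summarize_tokens(entries):
    """Return total input/output tokens and a count of entries per type."""
    totals = Counter()
    by_type = Counter()
    for entry in entries:
        tokens = entry.get("tokens", {})  # tolerate entries without token data
        totals["input"] += tokens.get("input", 0)
        totals["output"] += tokens.get("output", 0)
        by_type[entry.get("type", "unknown")] += 1
    return dict(totals), dict(by_type)


if __name__ == "__main__":
    totals, by_type = summarize_tokens(json.loads(SAMPLE_LOG))
    print(totals)
    print(by_type)
```

To run it against real logs, the entries would be loaded from the files under `.carrier/deployed/{id}/logs/` instead of the inline sample.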

Get observability in 60 seconds

  1. Install Carrier

     bun install -g @carrier-sh/carrier

  2. Run any agent with telemetry

     carrier deploy code-reviewer "review auth.ts"

  3. View complete telemetry logs

     cat .carrier/deployed/*/logs/*.json | jq

  4. Benchmark different agents

     carrier benchmark "task" --agents=a1,a2,a3

Read the Full Documentation →

The Problem

When you use AI agents, you have zero visibility into what they're doing. No metrics on token usage, tool calls, or performance. No way to compare agents or verify quality before deploying.

📡

Full Telemetry Capture

Every tool call, token used, turn taken. Complete execution logs with timestamps, parameters, and results. Nothing hidden.

⚡

Live Benchmarking

Run multiple agents side-by-side on the same task. See which performs best before deploying. Data-driven agent selection.

👁️

Real-time Streaming

Watch agents work live. See files accessed, commands executed, decisions made. Chrome DevTools for AI.

🗂️

Context Tracking

Automatic capture of files read/written, commands run, tools used. Complete audit trail for every execution.
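
One way such an audit trail could be rebuilt from the logs is sketched below in Python. It assumes tool-call entries look like the sample log entry (`type`, `content.name`, `content.input`); only the `Read` entry mirrors that sample, while the `Bash` and `Write` entries, the `command` field, and the `build_audit_trail` helper are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical entries in the shape of the sample log entry.
# Only the "Read" entry mirrors the documented sample; the others are assumed.
ENTRIES = [
    {"type": "tool_call", "content": {"name": "Read", "input": {"file_path": "src/auth.ts"}}},
    {"type": "tool_call", "content": {"name": "Bash", "input": {"command": "npm test"}}},
    {"type": "tool_call", "content": {"name": "Write", "input": {"file_path": "src/auth.ts"}}},
    {"type": "message", "content": {"text": "done"}},
]


def build_audit_trail(entries):
    """Group tool-call targets (file path or command) by tool name, in log order."""
    trail = defaultdict(list)
    for entry in entries:
        if entry.get("type") != "tool_call":
            continue  # skip anything that is not a tool call
        content = entry.get("content", {})
        tool_input = content.get("input", {})
        # Prefer a file path; fall back to a shell command if present.
        target = tool_input.get("file_path") or tool_input.get("command")
        trail[content.get("name", "unknown")].append(target)
    return dict(trail)


if __name__ == "__main__":
    for tool, targets in build_audit_trail(ENTRIES).items():
        print(tool, targets)
```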

🔧

Agent Builder

Create agents through conversation. Configure purpose, tone, output format. Test and deploy in minutes.

💾

Structured Data Export

All telemetry stored as queryable JSON. Export to your data warehouse. Build custom analytics.

Before vs After

Claude Code intentionally hides complexity. But power users need data to make informed decisions.

❌

Without Carrier

  • Download random .md agents, hope they work
  • Zero visibility into what agents are doing
  • No idea if they're worth the token cost
  • Can't compare different agents objectively
  • No way to verify quality before deploying
  • Creating agents means manually editing markdown

✅

With Carrier

  • Benchmark agents side-by-side, see real metrics
  • Full telemetry on every tool call, token, turn
  • ROI tracking with complete token usage data
  • Data-driven decisions based on performance metrics
  • Quality verification before production deployment
  • Interactive builder to create agents in minutes

100% execution transparency

Every tool call captured

JSON structured data export

How It Works

Carrier wraps Claude Code agents with comprehensive telemetry, then exports structured data you can query and analyze.

1. Capture Everything

Every tool call, token used, file accessed, command run. Complete audit trail with timestamps and parameters.

2. Benchmark & Compare

Run multiple agents side-by-side. See which performs best on speed, quality, and cost metrics.
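
A comparison like this can be pictured as a simple ranking over per-run summaries. The Python sketch below orders runs by token cost and then by duration; the `RunSummary` type, its fields, and the numbers are placeholders invented for illustration, not Carrier's output format or measured results, and quality scoring is left out because it depends on the task.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    agent: str
    total_tokens: int        # input + output tokens for the run
    duration_seconds: float  # wall-clock time for the run


def rank_runs(runs):
    """Order runs by token cost first, then by wall-clock duration."""
    return sorted(runs, key=lambda r: (r.total_tokens, r.duration_seconds))


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not measured results.
    runs = [
        RunSummary("a1", 5200, 41.0),
        RunSummary("a2", 3900, 55.5),
        RunSummary("a3", 3900, 48.2),
    ]
    for position, run in enumerate(rank_runs(runs), start=1):
        print(position, run.agent, run.total_tokens, run.duration_seconds)
```

Sorting on a tuple keeps the tie-break explicit: two agents with equal token usage are separated by speed.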

3. Analyze & Optimize

Export structured JSON logs. Query with jq, load into your data warehouse, build custom analytics dashboards.
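
For loading into a warehouse, one common approach is to flatten each JSON entry into a tabular row first. The Python sketch below turns entries shaped like the sample log entry into CSV text; the column names and the `flatten` and `to_csv` helpers are assumptions chosen for this example, not a Carrier export format.

```python
import csv
import io

# Hypothetical column layout for the flattened rows.
FIELDS = ["timestamp", "type", "tool", "input_tokens", "output_tokens"]

# One entry in the shape of the sample log entry.
ENTRY = {
    "timestamp": "2024-10-10T10:23:45Z",
    "type": "tool_call",
    "content": {"name": "Read", "input": {"file_path": "src/auth.ts"}},
    "tokens": {"input": 234, "output": 156},
}


def flatten(entry):
    """Map one nested log entry to a flat row keyed by FIELDS."""
    tokens = entry.get("tokens", {})
    return {
        "timestamp": entry.get("timestamp", ""),
        "type": entry.get("type", ""),
        "tool": entry.get("content", {}).get("name", ""),
        "input_tokens": tokens.get("input", 0),
        "output_tokens": tokens.get("output", 0),
    }


def to_csv(entries):
    """Render entries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for entry in entries:
        writer.writerow(flatten(entry))
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv([ENTRY]), end="")
```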

Why This Matters

Different users, same problem: no visibility into agent behavior. Carrier solves this at every level.

👨‍💻

For Developers

  ❌ Before: Download random .md agents, hope they work
  ✓ After: Benchmark agents, see real metrics, choose the best

👥

For Teams

  ❌ Before: No idea what Claude costs or if it's worth it
  ✓ After: Full telemetry, ROI tracking, cost optimization

🤖

For Future AI

  ❌ Before: No memory of what worked before
  ✓ After: Historical data, proven patterns, learned optimizations

The Future is Developer-Driven

As AI agents become critical infrastructure, developers need tools of the same quality they have for traditional software. Carrier brings that tooling to Claude Code.

🔬

Measure Everything

Just like you profile your backend or measure frontend performance, you should understand your agents. Every token, every decision, every outcome.

🧪

Test Before Deploy

You wouldn't ship code without tests. Why deploy agents without benchmarking? Compare approaches, validate quality, make data-driven choices.

📈

Build on Data

Structured telemetry becomes organizational knowledge. Learn what works, optimize over time, share insights with your team.

We're building the developer tools AI agents deserve.
Purpose-built for developers who want control, visibility, and data-driven decisions.

Stop Flying Blind

Join developers using Carrier to gain full visibility into their AI agents. See what works. Measure what matters.