The AI coding benchmark · free · runs locally

    Everyone says they're a 10x engineer now. Most have no idea if they are.

    You have run hundreds of AI sessions and burned real money in tokens. AgentMetrics reads your local session history and shows what you actually shipped, how much of it survived, and how you stack up against other developers.

    Free. Runs locally on your machine. Your code never leaves it.

    Reads every tool
    Claude Code · Codex · Cursor · Copilot · Gemini
    app.codelitics.com/r/you
    AgentMetrics · your report

    You spent 5.2× as much as the average developer.

    Here is what that spend actually earned over the last 8 months.

    AI spend · Watch

    $7,916

    +421% vs avg dev · last 8 months
    Sessions · Exceeding

    275

    top 12% by activity · across 9 projects
    Code Yield · Watch

    18%

    vs 31% cohort median · shipped, survived 30d
    Cost / realized change · Watch

    $162

    3.1× cohort median · spend ÷ code that stuck

    Spend vs survived

    Your spend is climbing. The share that survives is not.

    [Chart: spend ($0 to $4k) and code yield (0% to 40%) by quarter, Q1'25 to Q1'26, split into before AI and after AI. Labels: +156%, +230%, +80%, +35%.]
    Where you rank on value
    Code Yield
    1 · swift-otter · Staff · 84%
    2 · neon-kestrel · Senior · 81%
    3 · quiet-fox · Senior · 79%
    ···
    ? · You · Run the report to claim your rank
    You vs the benchmark
    Active days · 88th percentile
    Session length · 64th percentile
    Session focus · 47th percentile
    Code Yield · 22nd percentile
    Cost discipline · 14th percentile

    High on activity, low where it counts. Run the report to confirm your real percentiles.

    Code Yield is the share of your AI-authored code that shipped and survived 30 days.
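As a rough illustration of that definition, Code Yield can be sketched as a simple ratio. This is only a sketch under stated assumptions: it treats one AI-authored change as the unit, assumes each change carries a ship date and an optional removal date, and counts changes that are too recent to judge as not yet survived. The `Change` record and the `code_yield` function are hypothetical names for illustration, not part of AgentMetrics.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

SURVIVAL_WINDOW = timedelta(days=30)  # "survived 30 days", per the definition


@dataclass
class Change:
    """Hypothetical record of one AI-authored change (an assumption)."""
    shipped_on: Optional[date]  # None if the change never shipped
    removed_on: Optional[date]  # None if the change is still in the codebase


def survived(change: Change, today: date) -> bool:
    """True if the change shipped and was still present 30 days later."""
    if change.shipped_on is None:
        return False
    deadline = change.shipped_on + SURVIVAL_WINDOW
    if today < deadline:
        return False  # assumption: too recent to judge counts as not survived
    return change.removed_on is None or change.removed_on >= deadline


def code_yield(changes: list[Change], today: date) -> float:
    """Share of AI-authored changes that shipped and survived 30 days."""
    if not changes:
        return 0.0
    return sum(survived(c, today) for c in changes) / len(changes)
```

With four changes where one never shipped and one was removed after nine days, this sketch would report a yield of 50%.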

    5 tools · 1 report

    What's in your report

    Your usage dashboard shows the bill. This shows how you stack up.

    Where you rank

    Your percentile against other AI-native developers on output that actually survives, not tokens burned.

    Your Code Yield

    How much of your AI-authored code shipped, lasted, and mattered. The one number a usage dashboard cannot show you.

    Where your spend leaks

    Your real cost per change that stuck, broken down by tool and model, so you can see which ones are worth the bill.

    Code Yield, Code Half-Life, and cost per realized change are the Return on Code metrics. Read how they are defined in the glossary.
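To make "spend ÷ code that stuck" concrete, here is a minimal sketch that assumes the metric is a plain ratio of dollars spent to changes that survived. Under that assumption, the sample report's $7,916 spend and $162 per realized change would imply roughly 49 changes that stuck. The function name, the tool names, and the figures in `by_tool` are all hypothetical.

```python
def cost_per_realized_change(spend_usd: float, realized_changes: int) -> float:
    """Spend divided by the number of changes that stuck (assumed definition)."""
    if realized_changes <= 0:
        return float("inf")  # money was spent but nothing survived
    return spend_usd / realized_changes


# Hypothetical per-tool breakdown: (spend in USD, changes that survived).
by_tool = {
    "tool_a": (1200.0, 12),
    "tool_b": (900.0, 3),
}

costs = {
    tool: cost_per_realized_change(spend, kept)
    for tool, (spend, kept) in by_tool.items()
}

# The tool with the highest cost per surviving change is the biggest leak.
worst = max(costs, key=costs.get)
```

Here `tool_b` costs three times as much per surviving change as `tool_a` even though its total bill is smaller, which is the kind of leak a per-tool breakdown is meant to surface.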

    How it works

    Three steps. The first one is the only one you do today.

    01

    Leave your email

    Today. That is the whole ask. We send the rest when the tool opens.

    02

    Run one command

    Next week you get a single CLI command. It runs locally and reads the session history your AI tools already keep. Nothing uploads.

    03

    Open your report

    See your numbers instantly. Then opt in to benchmark them against every other developer who ran it.

    Local first

    The tool runs on your machine and reads the session history your AI tools already keep. Your source code is never read and nothing uploads. You share aggregate numbers only when you choose to join the benchmark.

    For engineering leaders

    Benchmark your team, not just yourself.

    See which developers and which tools turn AI spend into code that ships and survives, and which ones quietly leak the budget. Govern per-seat spend by yield, not by seat count.

    Questions

    The things people ask before they run it.

    Is it free?
    Yes. Your personal AgentMetrics report is free. You run a single command locally and get your numbers back. The benchmark against other developers is also free once you opt in to share your aggregate score.
    Does my code leave my machine?
    No. The tool runs locally and reads the session metadata your AI coding tools already store on your machine. Your source code is never read or uploaded. You only ever share aggregate numbers, and only if you choose to join the benchmark.
    Which tools does it work with?
    It is tool-neutral. It reads history across Claude Code, Codex, Cursor, Copilot, Gemini, and more, so you get one cross-tool picture instead of a separate dashboard per vendor.
    When do I get it?
    We are sending run instructions next week. Leave your email now to be first in line when it opens.
    Can a team lead benchmark the whole team?
    Yes. The team view turns per-seat AI spend into Return on Code at the team and tool level, and surfaces who is shipping value with AI versus who is mostly burning tokens. Choose the team option to join the waitlist and we will reach out as the team product opens.

    Find out where you actually stand.

    Leave your email. We send your run instructions next week.
