
Catch the tools your agent can’t trust before they touch your data.

We polygraph AI tools so you don’t have to.

AI agents plug into third-party tools and load skills that can hijack them or leak your data. We run those tools through an adversarial test, scan those skills for the same tricks, and publish a letter grade — free to read, public, evidence attached.


polygraph grade: A

We polygraph ourselves, too — see the report.

latest polygraph · litmus-v12

D

npm/@wildcard-ai/deepcontext

tool-output injection: pass
permission overreach: pass
sensitive-data handling: pass

See the full run

01 · We test it. Adversarial probes in a sandbox: does it hijack the agent, overreach, or leak?

02 · We grade it. A letter grade, A to F, published free with the evidence attached.

03 · You check it. One command before your agent installs anything.


§ 01/The problem

MCP servers and Agent Skills grew faster than anyone’s ability to polygraph them.

Agents now install and run skills off marketplaces unvetted. You need evidence anyone can check.

§ 02/Recent grades

Fresh from the harness.

The latest checks we’ve published, newest first — each one a real litmus run you can reproduce. Browse every grade, MCP servers and skills, in the index.

See all grades →

D · MCP server
litmus-v12

npm/@wildcard-ai/deepcontext

Adversarial input handling failed (C-04): the server crashed, leaked internals (a stack trace), or amplified hostile input. No injection or data leak, so the grade caps at D.

View report →

A · MCP server
litmus-v10

pypi/mcp-server-time

All four categories passed. No injection, no data leak, no egress overreach, and adversarial inputs were handled cleanly (A means no overreach, not no network).

View report →

A · MCP server
litmus-v12

pypi/obris-mcp

All four categories passed. No injection, no data leak, no egress overreach, and adversarial inputs were handled cleanly (A means no overreach, not no network).

View report →

A · MCP server
litmus-v12

npm/@meetlark/mcp-server

All four categories passed. No injection, no data leak, no egress overreach, and adversarial inputs were handled cleanly (A means no overreach, not no network).

View report →

A · MCP server
litmus-v12

npm/@autonomad1/computeback-mcp

All four categories passed. No injection, no data leak, no egress overreach, and adversarial inputs were handled cleanly (A means no overreach, not no network).

View report →

A · MCP server
litmus-v12

pypi/nudg3-mcp

All four categories passed. No injection, no data leak, no egress overreach, and adversarial inputs were handled cleanly (A means no overreach, not no network).

View report →

A · MCP server
litmus-v12

npm/bitrefill-mcp-server

All four categories passed. No injection, no data leak, no egress overreach, and adversarial inputs were handled cleanly (A means no overreach, not no network).

View report →

A · MCP server
litmus-v12

npm/@proofslip/mcp-server

All four categories passed. No injection, no data leak, no egress overreach, and adversarial inputs were handled cleanly (A means no overreach, not no network).

View report →

A · MCP server
litmus-v12

npm/@polygraphso/litmus

All four categories passed. No injection, no data leak, no egress overreach, and adversarial inputs were handled cleanly (A means no overreach, not no network).

View report →

A · MCP server
litmus-v12

npm/@polygraphso/litmus

All four categories passed. No injection, no data leak, no egress overreach, and adversarial inputs were handled cleanly (A means no overreach, not no network).

View report →

§ 03/How we polygraph

How a tool earns its grade.

Nine probes across four live checks — and one sandbox that captures every outbound call. A check we can’t run is reported as skipped — never passed — and every grade ships with the evidence: not a star rating, the actual artifacts.

C-01 · litmus-v11 · live
Does it try to hijack your agent?
tool-output injection · probe 1.1 · probe 1.2 · probe 1.3
We bait it with inputs designed to make it slip commands into its output — including one tool's output fed into another — then scan for hijack attempts: lookalike instructions, hidden text, markdown tricks.
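As a rough illustration of the scan step, the sketch below checks a tool's output for the three signal types named above: lookalike instructions, hidden text, and markdown tricks. The patterns, the `scan_output` name, and the finding labels are hypothetical and chosen for illustration; they are not the harness's actual rules.

```python
import re

# Hypothetical patterns for illustration only; not the harness's real rule set.
INSTRUCTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"^\s*system\s*:", re.IGNORECASE | re.MULTILINE),
]
# Zero-width and bidi-control characters that can hide text from a human reader.
HIDDEN_CHARS = re.compile("[\u200b\u200c\u200d\u2060\u202a-\u202e\ufeff]")
# HTML comments are invisible once markdown is rendered.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
# Markdown link or image whose URL carries a query string (a possible exfiltration path).
MARKDOWN_QUERY_LINK = re.compile(r"!?\[[^\]]*\]\(https?://[^)\s]*\?[^)\s]*\)")


def scan_output(text: str) -> list[str]:
    """Return the kinds of suspicious content found in one tool output."""
    findings = []
    if any(p.search(text) for p in INSTRUCTION_PATTERNS):
        findings.append("lookalike-instruction")
    if HIDDEN_CHARS.search(text) or HTML_COMMENT.search(text):
        findings.append("hidden-text")
    if MARKDOWN_QUERY_LINK.search(text):
        findings.append("markdown-trick")
    return findings
```

A clean reply such as `"It is 12:00 UTC."` yields an empty list. Pattern matching like this is deliberately naive and would only be one layer of a real scan.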
C-02 · litmus-v11 · live
Does it touch things it shouldn't?
permission overreach · probe 2.1 · probe 2.2
We run local tools in a sandbox that captures every outbound call, then flag any that reach beyond the hosts and ports the server declared it needs. Remote servers can't be sandboxed — there this check is marked skipped, never assumed. We also flag a tool that labels itself read-only while its name, a parameter, or its description shows it mutates — a permission lie your agent would otherwise trust.
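The two flags described above can be sketched as follows, assuming the sandbox hands back a list of captured outbound calls and the server declares its needs as (host, port) pairs. The `OutboundCall` type, the simplified tool dictionary with a `read_only` key, and the word list are all hypothetical stand-ins, not the harness's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutboundCall:
    """One outbound connection captured by the sandbox (hypothetical shape)."""
    host: str
    port: int


def undeclared_egress(calls, declared):
    """Return captured calls whose (host, port) the server did not declare."""
    allowed = {(host.lower(), port) for host, port in declared}
    return [c for c in calls if (c.host.lower(), c.port) not in allowed]


# Hypothetical word list; substring matching is intentionally naive.
MUTATING_WORDS = ("write", "delete", "update", "create", "remove", "send")


def read_only_mismatch(tool: dict) -> bool:
    """True when a tool is labeled read-only but its text suggests it mutates."""
    if not tool.get("read_only", False):
        return False
    text = " ".join(
        [tool.get("name", ""), tool.get("description", ""), *tool.get("parameters", [])]
    ).lower()
    return any(word in text for word in MUTATING_WORDS)
```

For example, a tool named `delete_file` that labels itself read-only would be flagged, while `get_time` would not.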
C-03 · litmus-v11 · live
Does it leak your data?
sensitive-data handling · probe 4.1 · probe 4.2
We plant fake secrets — keys, personal details — and watch every path out of the sandbox to see if they leave, including the tool's own replies to the agent.
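A minimal sketch of the planted-secret idea, assuming every path out of the sandbox is captured as raw bytes keyed by a channel name. The `make_canary` and `leaked_canaries` helpers and the channel names are hypothetical; the check only covers a canary sent verbatim or base64-encoded on its own, so a real detector would need to handle more encodings.

```python
import base64
import secrets


def make_canary(prefix: str = "CANARY") -> str:
    """Create a unique fake secret that cannot appear in output by accident."""
    return f"{prefix}-{secrets.token_hex(16)}"


def leaked_canaries(canaries, captured):
    """Return (channel, canary) pairs for every canary seen leaving the sandbox.

    `captured` maps a channel name (for example "network" or "tool_reply")
    to the bytes observed on that channel.
    """
    leaks = []
    for channel, payload in captured.items():
        for canary in canaries:
            raw = canary.encode()
            if raw in payload or base64.b64encode(raw) in payload:
                leaks.append((channel, canary))
    return leaks
```

Because each canary is random, any match is strong evidence that the tool moved the planted value rather than a coincidence.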
C-04 · litmus-v11 · live
How does it handle hostile input?
adversarial input handling · probe 3.1 · probe 3.2
We hit each tool with malformed and oversized inputs and known jailbreak patterns, and flag it if it crashes, spills an internal stack trace, or turns the hostile input into an attack of its own.
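To make the hostile-input check concrete, here is a small sketch that builds a few malformed and oversized payloads and tests a response for a spilled stack trace. The payload set, the `hostile_inputs` and `spills_stack_trace` names, and the trace patterns are assumptions for illustration; they do not reproduce the probes the harness actually runs.

```python
import re


def hostile_inputs(max_len: int = 1_000_000) -> dict[str, str]:
    """Build a few malformed and oversized payloads (hypothetical selection)."""
    return {
        "oversized": "A" * max_len,
        "malformed_json": '{"path": "unterminated',
        "null_bytes": "file\x00name",
        "deep_nesting": "[" * 10_000 + "]" * 10_000,
    }


# Signatures of common runtime stack traces.
TRACE_PATTERNS = [
    re.compile(r"Traceback \(most recent call last\):"),           # Python
    re.compile(r"^\s+at .+\(.+:\d+:\d+\)", re.MULTILINE),          # Node.js
    re.compile(r"^\s+at [\w.$]+\(.+\.java:\d+\)", re.MULTILINE),   # Java
]


def spills_stack_trace(response: str) -> bool:
    """True when a tool response appears to contain an internal stack trace."""
    return any(p.search(response) for p in TRACE_PATTERNS)
```

A tool that answers a malformed payload with a plain validation error passes this check; one that echoes its own call stack does not.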

Grades run A–F — capped at B when egress can’t be verified, down to D for overreach or a crash, F for a hijack or leak. Each grade is pinned to a sha256 fingerprint of the tool surface, so a later change — a rug pull — makes it stale automatically. Read the full grade rubric and methodology.
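The caps and the fingerprint described above can be sketched roughly as below. This covers only the caps stated in this paragraph, not the full rubric, and it assumes the tool surface is a list of tool definitions serialized as canonical JSON before hashing. The function names and that serialization choice are hypothetical; the published methodology defines the real ones.

```python
import hashlib
import json


def grade_cap(hijack_or_leak: bool, overreach_or_crash: bool, egress_verified: bool) -> str:
    """Best grade still reachable under the caps named in the text."""
    if hijack_or_leak:
        return "F"
    if overreach_or_crash:
        return "D"
    if not egress_verified:
        return "B"
    return "A"


def surface_fingerprint(tools: list[dict]) -> str:
    """sha256 over a canonical JSON form of the tool definitions (assumed format)."""
    canonical = json.dumps(
        sorted(tools, key=lambda t: t["name"]),
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_stale(pinned_fingerprint: str, tools: list[dict]) -> bool:
    """A grade is stale once the current surface no longer matches its pin."""
    return surface_fingerprint(tools) != pinned_fingerprint
```

Sorting the tools and their keys before hashing keeps the fingerprint stable across harmless reordering, so only a real change to a name, description, or schema invalidates the grade.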

§ 04/Install

Run polygraph in your agent — or grade a server from your terminal.

The open litmus harness grades a server A–F with reproducible, content-addressed evidence. Add it to your agent, run it yourself, or gate your CI on it.

Cursor

One click — installs the MCP server (run_litmus, verify_attestation). Prefer to edit ~/.cursor/mcp.json? Use the config below.

Terminal

$ npx -y -p @polygraphso/litmus polygraphso-litmus litmus <mcp-server>

Grade a server yourself. Or wire the MCP server into any client with npx -y -p @polygraphso/litmus polygraphso-litmus-mcp.

Claude

/plugin install polygraph@polygraphso

In Claude Code, run /plugin marketplace add polygraphso/litmus first, then the install command above. In Claude Desktop, paste the config below into claude_desktop_config.json.

Gate your CI — GitHub Action

Fail a build when an MCP server — or a skill it ships — grades D/F. On the GitHub Marketplace as polygraphso/litmus@v1:

# .github/workflows/mcp-gate.yml
name: mcp-gate
on: [pull_request]
permissions:
  contents: read
jobs:
  gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5
      - uses: polygraphso/litmus@v1

Manual setup — any MCP client

Same config everywhere — paste into ~/.cursor/mcp.json (Cursor), claude_desktop_config.json (Claude Desktop), or your client’s MCP config:

{
  "mcpServers": {
    "polygraph-litmus": {
      "command": "npx",
      "args": ["-y", "-p", "@polygraphso/litmus", "polygraphso-litmus-mcp"],
      "env": { "POLYGRAPH_API_URL": "https://polygraph.so" }
    }
  }
}

§ 05/Get a badge

Show your grade where developers look.

Maintain an MCP server we’ve graded? Put its live polygraph on your README, npm page, or docs. The badge reads the current grade — it updates itself — and links back to the reproducible report.

The card

[Image: example polygraph grade card for an MCP server]

A fuller card for a README header or a docs page.

The inline badge

[Image: example polygraph grade badge, as it sits in a README badge row]

[![polygraph](https://polygraph.so/api/badge?server=npm/@modelcontextprotocol/server-filesystem)](https://polygraph.so/mcp/npm/@modelcontextprotocol/server-filesystem)

Swap in your own registry/owner/name ref. Full embed guide →
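If you maintain several servers, a small helper can generate the badge line for each ref. This sketch simply fills the URL pattern from the example above; the `badge_markdown` name is hypothetical, and it assumes the ref is passed through unencoded exactly as in that example.

```python
def badge_markdown(ref: str, base: str = "https://polygraph.so") -> str:
    """Build the README badge line for a ref such as 'npm/@scope/name'."""
    return f"[![polygraph]({base}/api/badge?server={ref})]({base}/mcp/{ref})"


# Print one badge line per server ref.
for ref in ["npm/@modelcontextprotocol/server-filesystem", "pypi/mcp-server-time"]:
    print(badge_markdown(ref))
```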

See a live report →

§ 06/Timeline

Where this is going.

Sequenced, not dated — we publish when a step survives review, not when a calendar says so.

now · litmus-v11 harness
Built and running: nine probes across four categories, a grade from A to F, evidence attached.

next · More public grades
Next on the bench: filesystem, github, slack, puppeteer, git. Vendors hear about significant failures before the public does.

later · Verifiable proof
Grades published as timestamped records anyone can check without trusting us.

later · Verified runs
Hardware-attested runs, so a third party can prove a grade is real.

§ 07/Funding

How polygraph gets funded.

Free to read, and not paid for by anyone we grade. Here’s where the money comes from instead.

$POLYGRAPH · community-launched

The Bankr community launched $POLYGRAPH — we didn’t issue it. We claim the dev fees publicly and use them to fund the work: the harness, the grades, and the evidence stay free to read.

Nobody can pay for a grade. No graded party gets review or approval rights over their result.

View $POLYGRAPH on Bankr →

Not financial advice. The token funds the work; it doesn’t move a grade.

Follow new polygraphs as they publish.

A short email when we publish new polygraphs — no per-server tracking, no drip campaign, no “hey just checking in.”

Waiting on a specific server? Get an email when its grade publishes →