AI writes the code.
pwnkit hacks it.
Open-source agentic harness for autonomous security research.
Agents that read code, craft attacks, and re-exploit to kill false positives · Built from 7 real CVEs in packages with 40M+ weekly downloads
npx pwnkit-cli audit express
npx pwnkit-cli review ./my-repo
One toolkit. Five attack surfaces.
From LLM endpoints to npm packages to git repos — pwnkit finds what scanners miss.
LLM Endpoints
ChatGPT, Claude, Llama APIs, custom chatbots
MCP Servers
Tool schemas, validation, auth, poisoning
npm Packages
Supply chain, malicious code, dependency risk
Source Code
Local repos, GitHub URLs, deep AI audit
Web Apps
AI copilots, RAG pipelines, agent APIs
Five commands. Full coverage.
Each command is purpose-built for a different attack surface. Zero config, instant results.
Probe LLM endpoints & MCP servers
Discovers vulnerabilities in AI endpoints with 47+ test cases across prompt injection, jailbreaks, tool poisoning, data exfiltration, and more. Supports probe, deep, and MCP modes.
npx pwnkit-cli scan --target <url>
Audit npm packages for malicious code
Installs a package in a sandbox, runs semgrep static analysis plus AI-powered code review. Catches supply chain attacks, backdoors, and dependency vulnerabilities.
npx pwnkit-cli audit <package>
Deep security audit of source code
Security-focused code review of local repos or GitHub URLs. Multiple AI runtimes analyze your entire codebase and output SARIF, Markdown, and JSON reports.
npx pwnkit-cli review <repo>
Query and inspect verified findings
Filter findings by severity, category, and status. Inspect individual findings with full evidence chains and proof artifacts. Track the lifecycle from discovered to confirmed.
npx pwnkit-cli findings list
Browse past scan results
Query the local SQLite database for previous scans. See status, depth, findings count, and duration for every run. Track your security posture over time.
npx pwnkit-cli history
One command, zero config
No YAML files. No Python environments. Just npx pwnkit-cli scan and you're running.
Zero false positives
Every finding is re-exploited with proof before it hits the report. No more triaging 200 "possible prompt injections."
$0.05 per CI scan
Quick scans in under a minute. Deep audits for $1. Cheaper than one hour of manual pentesting.
LLM agnostic
Works with any model — Claude, GPT, Ollama, Gemini, or your own fine-tune. Swap providers without changing a single config line.
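The "zero false positives" claim above rests on one idea: a candidate finding only reaches the report if its exploit can be reproduced. The following is a minimal Python sketch of such a verification gate, not pwnkit's actual implementation. The `Finding` shape, the `verify` function, the `attempts` parameter, and the `fake_re_exploit` stand-in are all hypothetical; only the `discovered` and `confirmed` status names are taken from the findings lifecycle described earlier.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Finding:
    """A candidate vulnerability (hypothetical shape, not pwnkit's schema)."""
    title: str
    severity: str
    status: str = "discovered"


def verify(findings: Iterable[Finding],
           re_exploit: Callable[[Finding], bool],
           attempts: int = 2) -> List[Finding]:
    """Return only the findings whose exploit reproduces on every attempt."""
    confirmed = []
    for finding in findings:
        # all() stops at the first failed replay, so flaky candidates are dropped early.
        if all(re_exploit(finding) for _ in range(attempts)):
            finding.status = "confirmed"
            confirmed.append(finding)
    return confirmed


def fake_re_exploit(finding: Finding) -> bool:
    """Stand-in for a real replay step: only one known title reproduces."""
    return finding.title == "prompt injection leaks system prompt"


candidates = [
    Finding("prompt injection leaks system prompt", "high"),
    Finding("possible prompt injection", "medium"),
]
report = verify(candidates, fake_re_exploit)
print([f.title for f in report])
```

In this sketch, an unreproducible candidate keeps its `discovered` status and is simply left out of the report, which is the behavior the feature description implies.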
How it compares
Independent. Open source. No vendor lock-in.
| Feature | pwnkit | promptfoo (acquired by OpenAI) | garak | nuclei | Semgrep |
|---|---|---|---|---|---|
| Autonomous multi-agent | Agentic pipeline | — | — | — | — |
| Verification (no false positives) | Re-exploits | — | — | — | — |
| LLM endpoint scanning | ✓ | ✓ | ✓ | — | — |
| MCP server security | ✓ | — | — | — | — |
| npm package audit | ✓ | — | — | — | Rules |
| Source code review | AI-powered | — | — | — | Rules |
| Web/API scanning | ✓ | — | — | ✓ | — |
| AI attack coverage | 30+ agentic | Partial | Partial | — | — |
| Zero config | npx | YAML | Python | Templates | Config |
| Independent | ✓ | Acquired | ✓ | ✓ | VC-backed |
| Open source | MIT | OpenAI-owned | OSS | MIT | LGPL |
Drops into your CI/CD
Findings show up directly in GitHub's Security tab.
name: AI Security Scan
on: [push, pull_request]
jobs:
  pwnkit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run pwnkit
        uses: peaktwilight/pwnkit/action@v1
        with:
          target: ${{ secrets.STAGING_API_URL }}
      - name: Upload SARIF
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: pwnkit-report/report.sarif

pwnkit reviews its own source code
pwnkit runs pwnkit review . on its own repository. The same agentic pipeline that found 7 CVEs — pointed at itself. If it finds something, you'll see it here.
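The CI workflow shown earlier uploads `pwnkit-report/report.sarif` to GitHub. If you also want to fail a build when serious findings appear, one option is to summarize the SARIF file yourself. The sketch below relies only on the standard SARIF 2.1.0 layout (`runs[].results[].level`). The helper names and the inline sample document are assumptions made for illustration; the exact contents of pwnkit's SARIF output are not documented here.

```python
import json
from collections import Counter


def load_sarif(path: str) -> dict:
    """Read a SARIF file from disk and parse it as JSON."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize_sarif(sarif: dict) -> Counter:
    """Count SARIF results by level across all runs."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # A result with no explicit level is counted as "warning" here,
            # a simplification of SARIF's default-level rules.
            counts[result.get("level", "warning")] += 1
    return counts


# Invented sample document, used instead of a real report for illustration.
sample = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "example-tool"}},
        "results": [
            {"ruleId": "example/rule-a", "level": "error",
             "message": {"text": "example finding"}},
            {"ruleId": "example/rule-b",
             "message": {"text": "example finding without a level"}},
        ],
    }],
}

counts = summarize_sarif(sample)
should_fail = counts["error"] > 0
print(dict(counts), should_fail)
```

In a real pipeline you would pass `load_sarif("pwnkit-report/report.sarif")` to `summarize_sarif` and exit non-zero when `should_fail` is true. Which levels should block a merge is a policy choice for your team.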
Set it up on your repo in 2 minutes:
1. Add to your GitHub Actions workflow:
- run: npx pwnkit-cli review . --format json > pwnkit-report.json
2. Add the badge to your README:
[](https://pwnkit.com)
Built from real security research
pwnkit started as an internal framework. It found 7 CVEs in packages with 40M+ weekly downloads before I open-sourced it.
Stop guessing.
Start proving.
Five commands. Real vulnerabilities. Proof of exploitability.
npx pwnkit-cli scan --target <url>