Funny coincidence, I'm working on a benchmark showcasing AI capabilities in binary analysis.
Actually, AI has huge potential for superhuman capabilities in reverse engineering. This is an extremely tedious job with low productivity. Currently reserved, primarily when there is no other option (e.g., malware analysis). AI can make binary analysis go mainstream for proactive audits to secure against supply-chain attacks.
I built this because reverse engineering software across multiple versions is painful. You spend hours annotating functions in version 1.07, then version 1.08 drops and every address has shifted — all your work invisible.
The core idea is a normalized function hashing system. It hashes functions by their logical structure — mnemonics, operand categories, control flow — not raw bytes or absolute addresses. When a binary is recompiled or rebased, the same function produces the same hash. All your documentation (names, types, comments) transfers automatically.
Beyond that, it's a full MCP bridge with 110 tools for Ghidra: decompilation, disassembly, cross-referencing, annotation, batch analysis, and headless/Docker deployment. It integrates with Claude, Claude Code, or any MCP-compliant client.
For context, the most popular Ghidra MCP server (LaurieWired's, 7K+ stars) has about 15 tools. This started as a fork of that project but grew into 28,600 lines of substantially different code.
Architecture:
Java Ghidra Plugin (22K LOC) → embeds HTTP server inside Ghidra
Python MCP Bridge (6.5K LOC) → 110 tools with batch optimization
Any MCP client → Claude, scripts, CI pipelines
I validated the hashing against Diablo II — dozens of patch versions, each rebuilding DLLs at different base addresses. The hash registry holds 154K+ entries, and I can propagate 1,300+ function annotations from one version to the next automatically.
The headless mode runs in Docker (docker compose up) for batch processing and CI integration — no GUI required.
Funny coincidence, I'm working on a benchmark showcasing AI capabilities in binary analysis.
Actually, AI has huge potential for superhuman capabilities in reverse engineering. This is an extremely tedious job with low productivity. Currently reserved, primarily when there is no other option (e.g., malware analysis). AI can make binary analysis go mainstream for proactive audits to secure against supply-chain attacks.
Hi HN,
I built this because reverse engineering software across multiple versions is painful. You spend hours annotating functions in version 1.07, then version 1.08 drops and every address has shifted — all your work invisible.
The core idea is a normalized function hashing system. It hashes functions by their logical structure — mnemonics, operand categories, control flow — not raw bytes or absolute addresses. When a binary is recompiled or rebased, the same function produces the same hash. All your documentation (names, types, comments) transfers automatically.
Beyond that, it's a full MCP bridge with 110 tools for Ghidra: decompilation, disassembly, cross-referencing, annotation, batch analysis, and headless/Docker deployment. It integrates with Claude, Claude Code, or any MCP-compliant client.
For context, the most popular Ghidra MCP server (LaurieWired's, 7K+ stars) has about 15 tools. This started as a fork of that project but grew into 28,600 lines of substantially different code.
Architecture:
I validated the hashing against Diablo II — dozens of patch versions, each rebuilding DLLs at different base addresses. The hash registry holds 154K+ entries, and I can propagate 1,300+ function annotations from one version to the next automatically.The headless mode runs in Docker (docker compose up) for batch processing and CI integration — no GUI required.
v2.0.0 adds localhost-only binding (security), configurable timeouts, label deletion tools, and .env-based configuration.
Happy to discuss the hashing approach, MCP protocol design decisions, or how this fits into modern RE workflows.
Have you had any issues with models "refusing" to do reverse engineering work?