The Citadel: Neuro-Symbolic Security Gateway
Deploy the intelligent Stdio Firewall to sanitize MCP tool interactions, blocking injection attacks and data exfiltration.
Project Description
The Model Context Protocol (MCP) turns LLMs into agents, but it also introduces a large attack surface: it acts as a “dumb pipe” that connects untrusted inputs directly to sensitive OS tools. Recent research (Palo Alto Unit 42, 2025) reports that Visual Injection (attacks hidden in images) reaches a 90% Attack Success Rate (ASR) because traditional text filters cannot see the embedded instructions. In addition, Indirect Injection via documents and Session Hijacking can be used to exfiltrate proprietary data.
The Citadel is the first Defense-in-Depth solution for MCP, operating as a “One-Pass” Security Kernel at the transport layer (Stdio). It implements a Tier 2 Detection Strategy (Inference-time) without the latency of heavy LLM-based guardrails:
Visual Injection Defense (OCR Sidecar): We use PaddleOCR (optimized for speed) to extract and scan text from images before the Agent processes them. This targets the “90% ASR” vector, in which attackers hide prompts in an image's white space or metadata.
Data Sovereignty (DLP Sanitization): Instead of binary blocking, our Output Sanitizer (Microsoft Presidio + Regex) implements “Information Flow Control”: it redacts secrets (API keys, PII) from tool outputs in real time, before they reach the Agent. The Agent remains functional but cannot leak a secret it never receives.
Zero Trust Identity Graph: Addressing “Session Hijacking,” our Go Kernel tracks stateful risk. If an agent’s behavior drifts (e.g., rapid file enumeration), the system’s “Trust Score” degrades, triggering a dynamic Lockdown Mode.
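The redaction and trust-score ideas above can be illustrated with a minimal Python sketch. It is not The Citadel's implementation: the real kernel is written in Go and the sanitizer pairs regex with Microsoft Presidio, whereas this sketch uses only the standard library. The pattern set, the names `redact` and `TrustTracker`, and every threshold (window, call limit, penalty, lockdown cutoff) are hypothetical values chosen for illustration.

```python
import re
import time
from collections import deque

# Hypothetical patterns for illustration; a real rule set would be broader
# and, per the description above, combined with Microsoft Presidio.
SECRET_PATTERNS = {
    "AWS_ACCESS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact(text):
    """Replace each match with a typed placeholder; return (clean_text, hit_count)."""
    hits = 0
    for label, pattern in SECRET_PATTERNS.items():
        text, count = pattern.subn(f"<{label}_REDACTED>", text)
        hits += count
    return text, hits


class TrustTracker:
    """Sliding-window rate check that lowers a trust score on rapid file access."""

    def __init__(self, window_seconds=10.0, max_calls=5, penalty=0.25,
                 lockdown_below=0.5):
        # All defaults are assumed values, not taken from the project.
        self.window_seconds = window_seconds
        self.max_calls = max_calls
        self.penalty = penalty
        self.lockdown_below = lockdown_below
        self.score = 1.0
        self.calls = deque()

    def record_file_access(self, now=None):
        """Record one file-access tool call and return the updated score."""
        now = time.monotonic() if now is None else now
        self.calls.append(now)
        # Drop calls that have fallen out of the sliding window.
        while self.calls and now - self.calls[0] > self.window_seconds:
            self.calls.popleft()
        # Each call beyond the per-window limit costs a fixed penalty.
        if len(self.calls) > self.max_calls:
            self.score = max(0.0, self.score - self.penalty)
        return self.score

    @property
    def locked_down(self):
        return self.score < self.lockdown_below


if __name__ == "__main__":
    clean, hits = redact("key=AKIAABCDEFGHIJKLMNOP owner=alice@example.com")
    print(clean, hits)

    tracker = TrustTracker()
    for i in range(8):  # eight calls in under one second: rapid enumeration
        tracker.record_file_access(now=i * 0.1)
    print(tracker.score, tracker.locked_down)
```

In this sketch the tool output is rewritten rather than dropped, so the Agent still receives a usable response, and the lockdown decision depends on accumulated session behavior rather than on any single call.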
Prior Work
Startup founder building in the secure cross-domain data requisition space, with a focus on enterprise.