2026 · Autonomous AI security orchestrator

Rudra

An autonomous multi-agent offensive security platform for creative, non-templated vulnerability exploitation.

In progress · 2026

Why I built this

Tools like Metasploit work from rigid, templated modules: you pick an exploit and fire it. I wanted a system that reasons about a specific target surface, writes custom exploit code for that surface, tests it in a sandbox, and iterates on failures the way a skilled human would. The interesting engineering problem was keeping that autonomy safe: scope enforcement that cannot be overridden by an LLM, sandboxed execution with network isolation, and a full audit trail.

4 Agent types designed

2 / 4 Agents fully implemented

22 GB Total cluster VRAM

Current status

In active development. Two agents are fully implemented with LangGraph orchestration, LiteLLM model routing across local models, and Qdrant vector memory; the exploit and reporting loop is being expanded.

Architecture

System overview

Client to Orchestrator

The CLI (rakshak) or the FastAPI endpoint accepts a target hostname or IP and a scope definition. Scope is validated at input: RFC1918, loopback, and link-local addresses are rejected before any agent is spawned. A pre-scan health check verifies the Ray cluster, Ollama, Docker, Redis, Kafka, Cassandra, and Qdrant.
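A minimal sketch of the input-time scope check, using only the standard library. The function and exception names are hypothetical, the IPv6 ranges are an assumption added for completeness, and hostname resolution (which would have to happen before this check) is left out.

```python
from __future__ import annotations

import ipaddress

# Ranges named on this page: RFC1918, loopback, link-local.
# The IPv6 entries are an assumption, not something the page states.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8",      # RFC1918
        "172.16.0.0/12",   # RFC1918
        "192.168.0.0/16",  # RFC1918
        "127.0.0.0/8",     # IPv4 loopback
        "169.254.0.0/16",  # IPv4 link-local
        "::1/128",         # IPv6 loopback (assumption)
        "fe80::/10",       # IPv6 link-local (assumption)
    )
]


class ScopeError(ValueError):
    """Raised when a target falls inside a blocked range."""


def validate_target_ip(raw: str) -> ipaddress.IPv4Address | ipaddress.IPv6Address:
    """Parse an IP literal and reject it if it sits in a blocked range."""
    addr = ipaddress.ip_address(raw.strip())
    for net in BLOCKED_NETWORKS:
        if addr.version == net.version and addr in net:
            raise ScopeError(f"{addr} is inside blocked range {net}")
    return addr
```

Because the check is plain Python with a fixed table, no prompt or model output can widen it; a caller either gets a parsed address back or an exception.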

Orchestrator (FastAPI + Ray actor)

Manages agent lifecycle, concurrency budget (Redis counter), and heartbeat monitoring (120s timeout). Scope enforcement is hard-coded Python, never delegated to an LLM instruction.
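The heartbeat side of this can be sketched as below. This is an illustration, not the orchestrator's actual code: the class name is hypothetical, state lives in a plain dict rather than Redis, and the clock is injectable so the 120s timeout can be exercised without waiting.

```python
from __future__ import annotations

import time
from typing import Callable

HEARTBEAT_TIMEOUT_S = 120  # timeout stated on this page


class HeartbeatMonitor:
    """Tracks the last heartbeat per agent and reports agents that went silent."""

    def __init__(
        self,
        timeout_s: float = HEARTBEAT_TIMEOUT_S,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._timeout_s = timeout_s
        self._clock = clock
        self._last_seen: dict[str, float] = {}

    def beat(self, agent_id: str) -> None:
        """Record a heartbeat from an agent."""
        self._last_seen[agent_id] = self._clock()

    def stale_agents(self) -> list[str]:
        """Return agents whose last heartbeat is older than the timeout."""
        now = self._clock()
        return sorted(
            agent for agent, seen in self._last_seen.items()
            if now - seen > self._timeout_s
        )
```

A monotonic clock is used so that wall-clock adjustments on the host cannot make a live agent look stale.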

Shared Bus

Kafka · event routing between agents
Redis · concurrency budget and live state
Cassandra · persistent findings and attempt history
Qdrant · semantic CVE search and partial-win similarity

Recon Agent ✅ Complete

Fingerprints open ports, services, versions, frameworks, auth mechanisms, and API endpoints. Writes surface map to Redis and publishes to Kafka rudra.recon.discovered. Results flow into Cassandra target_intelligence.
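As an illustration of what a surface-map event could look like on the bus, here is a small sketch. Only the topic name comes from this page; the dataclasses and their field names are hypothetical, and the function stops at producing bytes so it does not assume any particular Kafka client.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field
from typing import Optional

RECON_TOPIC = "rudra.recon.discovered"  # topic name from this page


@dataclass
class Service:
    # Hypothetical fields for one fingerprinted service.
    port: int
    protocol: str
    name: str
    version: Optional[str] = None


@dataclass
class SurfaceMap:
    target: str
    services: list[Service] = field(default_factory=list)

    def to_event(self) -> tuple[str, bytes, bytes]:
        """Return (topic, key, value) ready to hand to a Kafka producer."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        # Keying by target keeps all events for one host on one partition.
        return RECON_TOPIC, self.target.encode("utf-8"), payload
```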

Analyst Agent ✅ Complete

Consumes the surface map, queries the Qdrant CVE knowledge base, and ranks candidate CVEs by confidence. CVSS scores are always fetched from the NVD API, never estimated by an LLM.
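One way to honour the "never estimated" rule in code is to treat a missing NVD score as missing rather than filling it in. The sketch below does that; the `Candidate` type is hypothetical and the confidence formula (similarity times CVSS over 10) is an assumption for illustration, not the Analyst Agent's actual scoring.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    cve_id: str
    similarity: float      # vector-search similarity, assumed to be in [0, 1]
    cvss: Optional[float]  # base score from NVD, None if the lookup failed


def rank(candidates):
    """Split candidates into (scored, unscored).

    scored is a list of (confidence, candidate) pairs, highest first.
    Candidates without an NVD score are returned separately, never guessed.
    """
    scored, unscored = [], []
    for c in candidates:
        if c.cvss is None:
            unscored.append(c)
        else:
            # Assumed formula: weight similarity by normalised severity.
            scored.append((round(c.similarity * c.cvss / 10.0, 4), c))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored, unscored
```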

Exploit Agent (Parked)

Reason · LLM analyzes CVE and target surface to plan approach
Write · generates Python exploit, AST-validated before execution
Execute · runs in isolated Docker sandbox, max 300s
Interpret · LLM reads output, classifies success or failure mode
Iterate · up to 5 retries with failure context; extended if progress_score > 0.7
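The loop above can be sketched as a bounded retry with a one-time extension. The limit of 5 and the 0.7 threshold come from this page; the size of the extension, the `Attempt` type, and treating the 5 as a cap on attempts are assumptions. `attempt_fn` stands in for the whole reason/write/execute/interpret pass.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

BASE_ATTEMPTS = 5          # from this page
PROGRESS_THRESHOLD = 0.7   # from this page
EXTRA_ATTEMPTS = 3         # assumption: the page does not say how far the budget extends


@dataclass
class Attempt:
    success: bool
    progress_score: float
    failure_context: str = ""


def run_exploit_loop(attempt_fn: Callable[[list], Attempt]) -> tuple[bool, int]:
    """Run attempts until one succeeds or the budget is spent.

    attempt_fn receives the history of earlier attempts so it can use their
    failure context. Returns (succeeded, attempts_made).
    """
    history: list[Attempt] = []
    budget = BASE_ATTEMPTS
    extended = False
    while len(history) < budget:
        result = attempt_fn(history)
        history.append(result)
        if result.success:
            return True, len(history)
        # Extend the budget once if an attempt shows real progress.
        if not extended and result.progress_score > PROGRESS_THRESHOLD:
            budget += EXTRA_ATTEMPTS
            extended = True
    return False, len(history)
```

Extending only once keeps the loop bounded even if every attempt reports a high progress score.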

Sandbox

Ephemeral container · fresh Docker instance per exploit run
300s TTL · auto-destroyed on completion or timeout
iptables whitelist · on the Linux machine (M1), only target IPs are reachable
tinyproxy · filtering proxy for outbound traffic on the Windows machines (M2/M3)
Scope breach kill · any out-of-scope connection terminates container immediately
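For the iptables whitelist, a sketch of how the rule set could be generated is shown below. The function name is hypothetical and the rules are an assumption about the shape of the whitelist (accept the targets, drop everything else on the OUTPUT chain); DNS, IPv6, and the step that actually applies the rules inside the container's network namespace are out of scope here.

```python
import ipaddress


def egress_rules(target_ips):
    """Build iptables argv lists that allow egress only to the given IPv4 targets."""
    rules = []
    for raw in target_ips:
        # Parsing first means only a well-formed IPv4 literal ever reaches argv.
        ip = ipaddress.IPv4Address(raw)
        rules.append(["iptables", "-A", "OUTPUT", "-d", str(ip), "-j", "ACCEPT"])
    # Default policy: anything not explicitly accepted above is dropped.
    rules.append(["iptables", "-P", "OUTPUT", "DROP"])
    return rules
```

Returning argv lists instead of shell strings means the rules can be passed to `subprocess.run` without a shell, so a malformed target cannot inject extra commands.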

Tech stack

Technologies used

core

Python 3.11 · FastAPI · Ray (distributed agents) · LiteLLM + Ollama · LangGraph

infra

Kafka (event bus) · Redis (state) · Cassandra (findings) · Qdrant (semantic search) · Docker (sandbox)

tools

impacket · scapy · pwntools · paramiko · NVD API (CVSS) · ruff · mypy · pytest

Key highlights

Proof points

01
Recon Agent fully implemented: fingerprints ports, services, versions, auth mechanisms, and API endpoints.
02
Analyst Agent fully implemented: maps findings to CVEs via Qdrant semantic search with CVSS scores from NVD, never LLM-estimated.
03
Scope enforcement is hard-coded Python: RFC1918 and loopback always blocked regardless of target configuration.
04
AST-based code validator checks every generated exploit for syntax, import whitelist, and blocked patterns before sandbox execution.
05
3-machine Ray cluster provides 22 GB total VRAM (RTX 3080 Ti + A1000 + T600) for distributed agent workloads.
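The validator in point 04 can be illustrated with the standard `ast` module. This is a minimal sketch, not the project's validator: the import whitelist and the blocked-call list are hypothetical, and a real check would need to cover more patterns (attribute calls, aliasing, dynamic imports through other routes).

```python
import ast

# Hypothetical lists for illustration; the page does not enumerate them.
ALLOWED_IMPORTS = {"socket", "struct", "impacket", "scapy", "pwn", "paramiko"}
BLOCKED_CALLS = {"eval", "exec", "compile", "__import__"}


def validate_source(source: str) -> list:
    """Return a list of violations; an empty list means the code passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg} (line {exc.lineno})"]

    problems = []
    for node in ast.walk(tree):
        # Collect the module names this node imports, if any.
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or ""]  # empty string for "from . import x"
        else:
            modules = []
        for mod in modules:
            if mod.split(".")[0] not in ALLOWED_IMPORTS:
                problems.append(f"import not allowed: {mod or '(relative)'}")
        # Flag direct calls to blocked builtins by name.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in BLOCKED_CALLS
        ):
            problems.append(f"blocked call: {node.func.id}")
    return problems
```

Parsing without executing is the point: the generated exploit is inspected as a tree, so nothing in it runs until it has passed and been handed to the sandbox.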

Focus areas

AI security · Sandbox design · Distributed systems · Event-driven architecture · API integration
