AgentGuard

Runtime Governance for AI Agent Tool Execution · v2.0 · 24 tools · 21 oracles protected

AgentGuard is FeedOracle's security and policy enforcement layer for MCP (Model Context Protocol) tool execution. It sits between AI agents and compliance-critical tools, enforcing policies, detecting threats, managing approvals, and producing immutable audit trails — in real time, for every tool call.

Why AgentGuard

As AI agents increasingly operate autonomously in regulated financial environments, uncontrolled tool execution creates compliance risk, audit gaps, and liability exposure. AgentGuard addresses five critical enterprise needs:

Fewer Wrong Decisions

Policy-based preflight checks catch dangerous tool calls before execution. SSRF, injection, and secret exposure are blocked in real time — not discovered in post-incident reviews.

Controlled Escalation

High-risk operations don't silently execute. T4 tools trigger human approval workflows with webhook notifications, creating a clear chain of responsibility.

Audit-Ready Evidence

Every tool call produces an ES256K-signed audit record with input hash, output hash, risk score, policy decision, and timestamp. Ready for DORA Art. 6 and MiCA compliance audits.

Reduced Integration Risk

New tools are automatically classified into policy tiers. Unknown tools default to T2 (strict, logged). No tool executes without passing through the guard layer first.

AgentGuard as a Product Layer

AgentGuard is not just internal security. It is a deployable runtime governance layer for any MCP-based infrastructure. Financial institutions, compliance platforms, and AI agent orchestrators can integrate AgentGuard to enforce policies, manage approvals, and produce audit trails — without building their own security stack. Available as a standalone MCP server or integrated with the full FeedOracle compliance evidence platform.

24 Guard Tools · 21 Oracles Protected · 228 Handlers Guarded · 111 Tools Classified · 29/29 Red-Team Tests

Threat Coverage

AgentGuard detects and blocks the following threat classes in tool arguments, payloads, and outputs:

SSRF Protection

Localhost, RFC1918, link-local, IMDS (AWS/GCP/Alibaba), IPv6 loopback, decimal/hex IP obfuscation, embedded credentials in URLs.
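
The checks listed above can be sketched with Python's standard library. This is a minimal illustration, not AgentGuard's actual detector: the function names, the `METADATA_HOSTS` set, and the decision to leave DNS resolution out of scope are all assumptions made for the example.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumed list of cloud metadata hosts: AWS and GCP use 169.254.169.254,
# Alibaba Cloud uses 100.100.100.200.
METADATA_HOSTS = {"169.254.169.254", "100.100.100.200", "metadata.google.internal"}


def parse_host_ip(host):
    """Return an ipaddress object for dotted, decimal, or hex hosts, else None."""
    try:
        return ipaddress.ip_address(host)
    except ValueError:
        pass
    try:
        # int(..., 0) accepts obfuscated forms such as "2130706433" or "0x7f000001".
        return ipaddress.ip_address(int(host, 0))
    except ValueError:
        return None


def is_ssrf_risk(url):
    """Hypothetical check: True if the URL targets an internal or metadata address."""
    parts = urlsplit(url)
    if parts.username or parts.password:
        return True  # embedded credentials in the URL
    host = (parts.hostname or "").lower()
    if not host or host == "localhost" or host in METADATA_HOSTS:
        return True
    ip = parse_host_ip(host)
    if ip is None:
        return False  # plain hostname; resolving it is out of scope for this sketch
    return ip.is_loopback or ip.is_private or ip.is_link_local
```

A production check would also need to resolve hostnames and re-validate the resolved address, since a public name can point at an internal IP.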

Prompt Injection

Role hijack, instruction override, jailbreak patterns, constraint bypass attempts. Regex + heuristic detection.

Code Injection

Python __import__, os.system, subprocess, eval(), shell command chaining, SQL UNION/DROP/DELETE.

Secret Exposure

AWS keys (AKIA...), Bearer/JWT tokens, private keys, API keys, Slack tokens, GitHub PATs, Ethereum private keys.
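
Pattern-based secret detection of this kind might look like the sketch below. The regular expressions are common public heuristics, not AgentGuard's actual rule set, and the `find_secrets` helper and its labels are hypothetical.

```python
import re

# Illustrative patterns only; real scanners use larger, tuned rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "slack_token": re.compile(r"\bxox[abprs]-[A-Za-z0-9-]{10,}"),
    "jwt": re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
    "private_key_pem": re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----"),
}


def find_secrets(text):
    """Return the labels of all patterns that match somewhere in the text."""
    return [label for label, pattern in SECRET_PATTERNS.items() if pattern.search(text)]
```

Returning labels rather than the matched strings keeps the secret itself out of logs and audit records.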

Path Traversal

URL-encoded traversal, double encoding, system file access (/etc/passwd, /proc/self).
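
One way to catch encoded and double-encoded traversal is to decode repeatedly until the value stops changing, then inspect the path segments. The sketch below assumes a small hypothetical list of sensitive paths and a cap of three decoding rounds; neither comes from AgentGuard itself.

```python
from urllib.parse import unquote

# Assumed examples of sensitive locations, taken from the description above.
SENSITIVE_PATHS = ("/etc/passwd", "/proc/self")


def fully_decode(value, max_rounds=3):
    """Percent-decode repeatedly so double-encoded input is normalized."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return value


def is_path_traversal(value):
    """Hypothetical check for '..' segments or sensitive system paths."""
    decoded = fully_decode(value).replace("\\", "/")
    if ".." in decoded.split("/"):
        return True
    return any(path in decoded for path in SENSITIVE_PATHS)
```

Checking whole segments rather than the substring `..` avoids flagging harmless names such as `file..name.txt`.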

Payload Safety

XSS, null byte injection, oversized payloads, unicode exploits, tool output poisoning.

Replay Attacks

Duplicate request fingerprinting within time windows. Prevents identical tool calls from being re-executed.
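
Request fingerprinting within a time window can be sketched as follows. The choice of SHA-256 over canonical JSON, the 60-second default window, and the `ReplayGuard` class are assumptions for illustration; the source does not specify how fingerprints are computed.

```python
import hashlib
import json
import time


class ReplayGuard:
    """Hypothetical in-memory replay detector keyed by request fingerprint."""

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self._window = window_seconds
        self._clock = clock
        self._seen = {}  # fingerprint -> time first seen

    @staticmethod
    def fingerprint(agent_id, tool, args):
        # Canonical JSON (sorted keys) so argument order does not change the hash.
        canonical = json.dumps(
            {"agent": agent_id, "tool": tool, "args": args},
            sort_keys=True,
            separators=(",", ":"),
        )
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def is_replay(self, agent_id, tool, args):
        now = self._clock()
        # Drop fingerprints that have aged out of the window.
        self._seen = {fp: ts for fp, ts in self._seen.items() if now - ts < self._window}
        fp = self.fingerprint(agent_id, tool, args)
        if fp in self._seen:
            return True
        self._seen[fp] = now
        return False
```

The injectable `clock` exists only to make the window testable without sleeping.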

Behavioral Anomalies

Cross-tool anomaly detection, rate limiting (per-minute/hour/day), high-frequency same-tool abuse.
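
Per-minute, per-hour, and per-day limits can be enforced together with a sliding window over recent call timestamps. The limits below are placeholder values, and the `RateLimiter` class is a hypothetical sketch rather than the actual `rate_limit_check` logic.

```python
import time
from collections import defaultdict, deque

# Placeholder limits (window in seconds -> max calls); not AgentGuard's real values.
DEFAULT_LIMITS = {60: 20, 3600: 500, 86400: 5000}


class RateLimiter:
    """Hypothetical sliding-window limiter checked against several windows at once."""

    def __init__(self, limits=None, clock=time.monotonic):
        self._limits = dict(limits or DEFAULT_LIMITS)
        self._clock = clock
        self._longest = max(self._limits)
        self._calls = defaultdict(deque)  # agent_id -> timestamps of accepted calls

    def allow(self, agent_id):
        now = self._clock()
        calls = self._calls[agent_id]
        # Forget calls older than the longest window.
        while calls and now - calls[0] >= self._longest:
            calls.popleft()
        for window, limit in self._limits.items():
            recent = sum(1 for ts in calls if now - ts < window)
            if recent >= limit:
                return False
        calls.append(now)
        return True
```

Only accepted calls are recorded, so a rejected burst does not extend the lockout.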

Policy Tiers

Every tool in the FeedOracle ecosystem is classified into one of four policy tiers. Tiers determine the guard mode, risk ceiling, and escalation behavior.

T1 · Public Read-Only

29 tools · Permissive mode · Risk ceiling: 15. Free access, no authentication required. Examples: health_check, ping, eth_gas.

T2 · Compliance Read

25 tools · Strict mode · Risk ceiling: 40. Regulatory data reads. Logged, authentication recommended. Examples: mica_status, cve_search, sanctions_screen.

T3 · Sensitive Analysis

39 tools · Strict mode · Risk ceiling: 70. Processes sensitive data, generates reports. Authentication required. Examples: compliance_preflight, evidence_bundle, board_report.

T4 · Escalation

18 tools · Strict mode · Risk ceiling: 100. High-impact actions requiring human approval. Examples: wallet_transfer, emergency_kill, contract_draft.
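
The four tiers can be represented as a small lookup table. The sketch below uses the modes, ceilings, and example tools given above; the `Tier` dataclass, the `REGISTRY` excerpt, and the `classify` helper are hypothetical names, and the T2 fallback follows the documented default for unknown tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    label: str
    mode: str
    risk_ceiling: int
    requires_approval: bool


TIERS = {
    "T1": Tier("T1", "Public Read-Only", "permissive", 15, False),
    "T2": Tier("T2", "Compliance Read", "strict", 40, False),
    "T3": Tier("T3", "Sensitive Analysis", "strict", 70, False),
    "T4": Tier("T4", "Escalation", "strict", 100, True),
}

# Small excerpt of a tool registry, using only the examples named in the text.
REGISTRY = {
    "health_check": "T1", "ping": "T1", "eth_gas": "T1",
    "mica_status": "T2", "cve_search": "T2", "sanctions_screen": "T2",
    "compliance_preflight": "T3", "evidence_bundle": "T3", "board_report": "T3",
    "wallet_transfer": "T4", "emergency_kill": "T4", "contract_draft": "T4",
}


def classify(tool):
    """Look up a tool's tier; unknown tools fall back to T2 (strict, logged)."""
    return TIERS[REGISTRY.get(tool, "T2")]
```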

Strict vs. Permissive Mode

Each oracle runs in one of two guard modes, configured at the oracle level:

Behavior | Strict (18 oracles) | Permissive (3 oracles)
Guard running, request OK | ✅ Allowed | ✅ Allowed
Guard down, request arrives | Blocked (isError, risk=100) | ✅ Allowed (fail-open)
Guard denies request | ❌ Blocked | ❌ Blocked
SSRF detected in payload | ❌ Blocked (risk ≥ 95) | ❌ Blocked (risk ≥ 95)

Strict mode is mandatory for all compliance, risk, governance, and security oracles. If AgentGuard is unreachable, strict-mode oracles refuse to execute — preventing unaudited tool calls from running against regulated systems.
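
The difference between the two modes comes down to what happens when the guard cannot be reached. A minimal sketch follows, assuming a hypothetical `call_guard` transport function that raises `ConnectionError` or `TimeoutError` when AgentGuard is down; the function name and result shape are illustrative, not the real `agentguard_client.py` API.

```python
def preflight_with_mode(tool, args, mode, call_guard):
    """Hypothetical wrapper: fail closed in strict mode, fail open in permissive mode.

    call_guard(tool, args) is an assumed callable returning a dict such as
    {"decision": "allowed", "risk": 12}.
    """
    try:
        verdict = call_guard(tool, args)
    except (ConnectionError, TimeoutError):
        if mode == "strict":
            # Guard down: refuse to run unaudited calls.
            return {"allowed": False, "isError": True, "risk": 100,
                    "reason": "guard unreachable"}
        # Permissive: proceed without a guard verdict.
        return {"allowed": True, "isError": False, "risk": None,
                "reason": "guard unreachable (fail-open)"}
    allowed = verdict["decision"] == "allowed"
    return {"allowed": allowed, "isError": not allowed,
            "risk": verdict["risk"], "reason": verdict["decision"]}
```

A denial from a reachable guard blocks the call in both modes; the mode only changes the outcome when no verdict is available.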

Human-in-the-Loop: Approval Workflow

When a tool call is classified as T4 (Escalation) or exceeds risk score 80, AgentGuard triggers the human approval workflow:

1. Agent calls a T4 tool → policy_preflight returns require_approval.
2. approval_required registers a pending request → webhook notification fires.
3. Human operator reviews → calls approval_resolve with approved or denied.
4. Approved → agent state cleared, tool execution proceeds. Denied → agent placed in monitoring state for 24 hours.
5. Every decision is ES256K-signed, written to the audit log, and cryptographically verifiable.
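
The workflow above can be sketched as a small in-memory state machine. The method names mirror the approval_required and approval_resolve tools, but their signatures, the `ApprovalQueue` class, and the `needs_approval` helper are assumptions for illustration. Signing and audit logging are omitted here.

```python
import time
import uuid


def needs_approval(tier, risk_score):
    """Escalate T4 tools and any call whose risk score exceeds 80."""
    return tier == "T4" or risk_score > 80


class ApprovalQueue:
    """Hypothetical in-memory model of the pending-approval lifecycle."""

    MONITORING_SECONDS = 24 * 3600

    def __init__(self, notify=lambda request: None, clock=time.time):
        self._notify = notify      # stand-in for the webhook call
        self._clock = clock
        self.pending = {}          # request_id -> request dict
        self.monitored_until = {}  # agent_id -> timestamp when monitoring ends

    def approval_required(self, agent_id, tool):
        request_id = uuid.uuid4().hex
        request = {"id": request_id, "agent": agent_id, "tool": tool, "status": "pending"}
        self.pending[request_id] = request
        try:
            self._notify(request)
        except Exception:
            pass  # webhook is best-effort; the request stays registered
        return request_id

    def approval_resolve(self, request_id, decision):
        if decision not in ("approved", "denied"):
            raise ValueError("decision must be 'approved' or 'denied'")
        request = self.pending.pop(request_id)
        request["status"] = decision
        if decision == "denied":
            self.monitored_until[request["agent"]] = self._clock() + self.MONITORING_SECONDS
        else:
            self.monitored_until.pop(request["agent"], None)
        return request
```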

Audit Trail

Every tool call that passes through AgentGuard produces an immutable audit record containing: request ID, agent identity, tool name, risk score, policy decision, matched policies, input hash, output hash, duration, ES256K signature, and ISO 8601 timestamp.
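
A record with these fields might be assembled as shown below. This is a sketch under stated assumptions: the field names, the use of SHA-256 over canonical JSON for the hashes, and the `build_audit_record` helper are illustrative. The `sign` parameter is a caller-supplied callable; in a real deployment it would produce an ES256K signature (ECDSA over secp256k1 with SHA-256), which needs a cryptography library and is left out here.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone


def sha256_hex(obj):
    """Hash a JSON-serializable object via canonical (sorted-key) JSON."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_audit_record(agent_id, tool, args, output, risk_score,
                       decision, matched_policies, duration_ms, sign):
    """Assemble a record, then sign its canonical bytes with the supplied callable."""
    record = {
        "request_id": uuid.uuid4().hex,
        "agent": agent_id,
        "tool": tool,
        "risk_score": risk_score,
        "decision": decision,
        "matched_policies": list(matched_policies),
        "input_hash": sha256_hex(args),
        "output_hash": sha256_hex(output),
        "duration_ms": duration_ms,
        "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO 8601, UTC
    }
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    record["signature"] = sign(payload)  # signature covers every other field
    return record
```

Storing hashes instead of raw inputs and outputs lets an auditor verify what was processed without the log itself holding sensitive payloads.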

Audit entries are stored in SQLite in WAL (write-ahead logging) mode for crash resilience. They can be queried via the audit_log_query tool or exported for compliance reporting. The guard_metrics tool provides aggregated operational analytics across all audit data.
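
Enabling WAL mode and querying entries takes only a few lines with Python's built-in `sqlite3` module. The table layout, column names, and helper functions below are assumptions for the example, not AgentGuard's actual schema. The 5-second `timeout` is the connection's busy timeout.

```python
import os
import sqlite3
import tempfile


def open_audit_db(path):
    """Open (or create) a WAL-mode database with a 5 s busy timeout."""
    conn = sqlite3.connect(path, timeout=5.0)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audit_log ("
        "request_id TEXT PRIMARY KEY, tool TEXT NOT NULL, "
        "decision TEXT NOT NULL, risk_score INTEGER NOT NULL, ts TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def query_by_decision(conn, decision):
    """Return (request_id, tool, risk_score) rows for one policy decision."""
    cursor = conn.execute(
        "SELECT request_id, tool, risk_score FROM audit_log "
        "WHERE decision = ? ORDER BY ts",
        (decision,),
    )
    return cursor.fetchall()


# Usage: WAL requires an on-disk file, so the demo uses a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    conn = open_audit_db(os.path.join(tmp, "audit.db"))
    conn.executemany(
        "INSERT INTO audit_log VALUES (?, ?, ?, ?, ?)",
        [
            ("req-1", "ping", "allowed", 2, "2025-01-01T00:00:00+00:00"),
            ("req-2", "wallet_transfer", "denied", 100, "2025-01-01T00:00:01+00:00"),
        ],
    )
    conn.commit()
    journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    denied_rows = query_by_decision(conn, "denied")
    conn.close()
```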

Red-Team Validation

AgentGuard has been validated against a 29-test red-team suite covering six attack categories:

Category | Tests | Result
SSRF Bypass (decimal/hex IP, IPv6, IMDS, credentials) | 10 | 10/10 ✅
Injection Evasion (prompt, SQL, OS, Python) | 6 | 6/6 ✅
Encoding Tricks (URL, double-encoding, unicode) | 3 | 3/3 ✅
Secret Detection (AWS, JWT, Bearer, private keys) | 3 | 3/3 ✅
Tier Bypass (T1-T4 enforcement) | 4 | 4/4 ✅
Combined Attack Patterns | 3 | 3/3 ✅

Performance Under Load

AgentGuard is designed for production workloads. Because every compliance-critical tool call passes through the guard preflight, the overhead must be minimal. These benchmarks were run on the production server (Contabo VPS, 20 cores, 62 GB RAM, no GPU).

Test | Result | Details
Single preflight latency | 6ms avg | p50=5.5ms, p99=9ms, max=9ms (10 calls)
DORA tool with guard | 10ms avg | ~4ms guard overhead on top of tool execution
Burst (50 concurrent) | 244 req/s | 205ms wall time, 50/50 success, 0 errors
Sustained (200 calls) | 30ms avg, p95=58ms | Rate-limited ~20/sec, 200/200 success, 0% error rate
SSRF scan under load | 20/20 blocked | 56ms avg with full SSRF analysis, no bypass under pressure

Resilience Behavior

Scenario | Strict Mode (18 oracles) | Permissive Mode (3 oracles)
AgentGuard down | Tool calls blocked (risk=100, isError) | Tool calls proceed (fail-open, logged)
Webhook unreachable | Approval still registered, local log written | Same (webhook is best-effort)
Registry file missing | Tools default to T2 (Compliance Read) | Same (safe default)
DB locked (SQLite WAL) | Retry with 5s timeout, then fail-open | Same

Key takeaway: at roughly 6ms per preflight, guard overhead is negligible compared to the 800-3000ms typical for external API calls in compliance tools, so the guard was not the bottleneck in these benchmarks.

Architecture

AgentGuard runs as a standalone MCP server on port 12001, integrated into the FeedOracle whitelabel infrastructure. It uses the shared agentguard_client.py module to inject preflight checks into oracle handlers, and the shared quantum_sorum.py module for workflow sequencing and first-contact detection (Layer 12.2).

Integration Stack
Agent Request
    │
    ▼
Oracle Handler ──→ guard_preflight(tool, args, mode='strict')
    │                      │
    │               AgentGuard (Port 12001)
    │               ├── compute_risk_score() — SSRF, injection, secrets, tiers
    │               ├── evaluate_policies() — 7 DB policies + escalation rules
    │               ├── check_rate_limit() — per-minute/hour/day
    │               └── audit_log_write() — ES256K-signed record
    │                      │
    │               decision: allowed | denied | require_approval | flagged
    │                      │
    ▼                      ▼
Tool Execution ←── allowed ──→ proceed
                   denied  ──→ block + error response
                   approval ──→ webhook + human review
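
The routing at the bottom of the diagram can be sketched as a small dispatcher. The `run_guarded` function and its three callables are hypothetical stand-ins for the preflight client, the tool handler, and the approval registration. The diagram does not show how a flagged decision is routed, so this sketch blocks any decision it does not recognize, which is an assumption.

```python
def run_guarded(tool, args, preflight, execute, request_approval):
    """Hypothetical dispatcher: run the tool only on an explicit 'allowed' decision.

    preflight(tool, args)        -> dict with a "decision" key
    execute(tool, args)          -> tool result
    request_approval(tool, args) -> approval request id
    """
    decision = preflight(tool, args).get("decision")
    if decision == "allowed":
        return {"isError": False, "result": execute(tool, args)}
    if decision == "require_approval":
        # Tool does not run yet; a human must resolve the request first.
        return {"isError": True, "status": "pending_approval",
                "approval_id": request_approval(tool, args)}
    # "denied" and anything unrecognized are blocked (conservative assumption).
    return {"isError": True, "status": "blocked", "decision": decision}
```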

MCP Tools Reference

AgentGuard exposes 24 tools via MCP, organized into five categories:

Policy & Preflight (5 tools)

policy_preflight · tool_risk_score · decision_explain · tool_manifest_verify · policy_register

Approval Workflow (3 tools)

approval_required · approval_resolve · approval_list

Audit & Monitoring (3 tools)

audit_log_write · audit_log_query · guard_metrics

Security Scanning (8 tools)

payload_safety_check · secret_exposure_check · replay_guard_check · cross_tool_anomaly_check · scope_check · threat_intel_check · output_safety_scan · session_validate

Control & Governance (5 tools)

rate_limit_check · payment_policy_check · spend_limit_check · tenant_policy_check · emergency_kill

Endpoints

Endpoint | Description
feedoracle.io/guard-oracle/mcp/ | AgentGuard MCP endpoint (FeedOracle domain)
tooloracle.io/guard/mcp/ | AgentGuard MCP endpoint (ToolOracle domain)
feedoracle.io/guard/ | Live metrics dashboard
feedoracle.io/guard/docs.html | This documentation