THRESHOLD KEEL Available

Structural guardrails for autonomous AI agents

Your lobster has claws. Keel is the rubber band.

Keel is a safety layer for autonomous AI agents. It sits between the agent and the outside world, classifying every action by risk, requiring structured human approval before anything irreversible, and logging every action to a tamper-evident audit trail.

Constraints live in a Policy Store on disk, not in conversation context. They survive context compaction, session restarts, and model switches. The agent cannot forget a safety rule because the rule was never in memory to begin with.
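The idea of keeping constraints on disk rather than in context can be sketched as a check that re-reads the policy file every time it is asked. This is an illustrative sketch only, not Keel's actual code: the function names are hypothetical, and it reads JSON instead of YAML purely so the example runs on the standard library alone.

```python
import json
from pathlib import Path


def load_policies(path: Path) -> list[dict]:
    """Read the policy file fresh from disk; nothing is cached between calls."""
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)["policies"]


def is_blocked(scope: str, path: Path) -> bool:
    """Return True if any policy on disk blocks the given scope.

    The file is re-read on every check, so the answer does not depend on
    conversation context, session state, or which model is running.
    """
    return any(
        policy["scope"] == scope and policy["action"] == "block"
        for policy in load_policies(path)
    )
```

Because the check holds no state of its own, restarting the session or compacting the context changes nothing about its result.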

Install as an OpenClaw skill:

clawhub install threshold-keel

Also available as a Claude Code plugin and MCP server. Source at GitHub. Licence: BSL 1.1 (converts to Apache 2.0 after 4 years).

How it works

Every action your agent takes is classified into one of four risk tiers before execution:

Tier | Risk         | Examples                                   | Behaviour
T0   | Read-only    | Fetch email, list files, search            | Proceed. Log to WAL.
T1   | Reversible   | Add label, create draft, make directory    | Log, proceed with notice.
T2   | Window       | Archive, move to bin, move file            | Brief approval. Quarantine period.
T3   | Irreversible | Send email, permanent delete, post message | Full structured approval. Preview required.
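The four tiers can be pictured as a lookup from action to tier, and from tier to required behaviour. The sketch below is an assumption-laden illustration rather than Keel's classifier: the action names, the behaviour labels, and the choice to treat unknown actions as T3 are all hypothetical.

```python
from enum import IntEnum


class Tier(IntEnum):
    T0 = 0  # read-only
    T1 = 1  # reversible
    T2 = 2  # window: undoable for a limited time
    T3 = 3  # irreversible


# Hypothetical action names, loosely following the examples in the table.
ACTION_TIERS = {
    "fetch_email": Tier.T0,
    "list_files": Tier.T0,
    "search": Tier.T0,
    "add_label": Tier.T1,
    "create_draft": Tier.T1,
    "make_directory": Tier.T1,
    "archive": Tier.T2,
    "move_to_bin": Tier.T2,
    "move_file": Tier.T2,
    "send_email": Tier.T3,
    "permanent_delete": Tier.T3,
    "post_message": Tier.T3,
}

# Hypothetical labels for the behaviour column of the table.
BEHAVIOUR = {
    Tier.T0: "proceed",
    Tier.T1: "proceed_with_notice",
    Tier.T2: "brief_approval",
    Tier.T3: "full_approval",
}


def classify(action: str) -> Tier:
    # Assumption of this sketch: anything unrecognised gets the strictest tier.
    return ACTION_TIERS.get(action, Tier.T3)


def required_behaviour(action: str) -> str:
    return BEHAVIOUR[classify(action)]
```

Defaulting unknown actions to the strictest tier is a fail-closed choice made for this sketch; the page does not say how Keel handles actions it cannot classify.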

What Keel does

Example policies

Policies are human-readable YAML that you can edit in a text editor or by asking your agent. Keel ships with conservative defaults.

# ~/.keel/policies.yaml
policies:
  - id: no-email-send-without-approval
    rule: "Never send any email without structured approval"
    scope: email
    action: require_t3
  - id: protect-ssh-keys
    rule: "Never read, modify, or transmit SSH keys"
    scope: filesystem
    action: block
  - id: no-financial-actions
    rule: "Never execute any financial transaction"
    scope: financial
    action: block
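To show how policies of this shape might be applied, here is a minimal sketch that mirrors the three example policies as Python objects and resolves a decision for a given scope. The `Policy` class, the `decide` function, and the rule that `block` outranks `require_t3` are assumptions made for illustration. Matching on scope alone is also a simplification, since the email rule in the example concerns sending specifically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    id: str
    rule: str
    scope: str
    action: str  # "block" or "require_t3", as in the example file


# The three example policies, transcribed by hand for this sketch.
POLICIES = [
    Policy("no-email-send-without-approval",
           "Never send any email without structured approval",
           "email", "require_t3"),
    Policy("protect-ssh-keys",
           "Never read, modify, or transmit SSH keys",
           "filesystem", "block"),
    Policy("no-financial-actions",
           "Never execute any financial transaction",
           "financial", "block"),
]


def decide(scope: str, policies: list[Policy] = POLICIES) -> str:
    """Return "block", "require_t3", or "default" for an action in `scope`.

    Assumption of this sketch: when several policies match, the most
    restrictive outcome wins.
    """
    matched = {p.action for p in policies if p.scope == scope}
    if "block" in matched:
        return "block"
    if "require_t3" in matched:
        return "require_t3"
    return "default"
```

A result of "default" here simply means no policy matched, so the ordinary tier classification would apply.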

Why Keel exists

In February 2026, 386 malicious skills were discovered on ClawHub. Cisco researchers demonstrated data exfiltration via prompt injection on OpenClaw instances. Over 30,000 exposed installations were found on Shodan. An agent created a dating profile without its user's knowledge.

The failure mode is always the same: the agent was given broad permissions, it operated autonomously, and then it did something irreversible that the user didn't anticipate. The cause varies: an instruction was compacted away, an approval was too casual, or a malicious skill injected instructions that the agent followed.

"The model promised to be careful" is not a safety architecture. Keel is.

Threshold Cloud

Keel works entirely locally. No account, no cloud, no telemetry. For users who want more:

Threshold Cloud — €29/month

Persistent Policy Store synced across agents. Shared WAL with web dashboard. Multi-agent coordination. Compliance-ready audit exports with hash chain verification. EU-hosted, GDPR-native.

Technical details

Keel v0.1: 9,400 lines of Python, 270 tests, zero external dependencies. Policy Store with SHA-256 snapshot hashing, tiered token budget enforcement. WAL with hash-chained JSONL and chain integrity verification. Quarantine manager with delay windows and rollback. Gmail adapter with idempotent operations and blast radius caps. Driftwatch telemetry bridge. Helmsman confidence signal integration point.
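A hash-chained JSONL log with integrity verification can be sketched in a few lines of standard-library Python. This illustrates the general technique, not Keel's on-disk format: the field names (`prev`, `event`, `hash`), the all-zero genesis value, and the function names are assumptions. The log is held as a list of JSON lines so the example stays self-contained.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """SHA-256 over the previous hash plus a canonical encoding of the event."""
    payload = json.dumps(
        {"prev": prev_hash, "event": event},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(lines: list[str], event: dict) -> None:
    """Append one JSON line whose hash covers the previous entry's hash."""
    prev_hash = json.loads(lines[-1])["hash"] if lines else GENESIS
    entry = {"prev": prev_hash, "event": event, "hash": _digest(prev_hash, event)}
    lines.append(json.dumps(entry, sort_keys=True))


def verify_chain(lines: list[str]) -> bool:
    """Recompute every hash in order; any edit or reordering breaks the chain."""
    prev_hash = GENESIS
    for line in lines:
        entry = json.loads(line)
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each entry's hash depends on the one before it, changing or removing an earlier line invalidates verification from that point on. That makes tampering evident, but it does not prevent it.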

Developed by Threshold Systems, the research arm of Threshold Signalworks. Part of an independent research programme on reliability under constraint for LLM agent systems. ORCID 0009-0004-1442-1743.