The problem

A model that decides and acts is one misjudgment away from disaster

Give an AI both the reasoning and the trigger, and you inherit its mistakes at machine speed. One bad judgment call, one successful jailbreak, and something runs that nobody approved.

Autonomous offensive tooling is only as trustworthy as the controls around it. Sablewire built Stiletto around a simple premise: the entity that reasons about what to try next should never be the same entity that has the authority to do it.

Capability separation

A reasoning layer that proposes, a control layer that decides

Stiletto splits judgment from execution into two distinct layers, so intelligence never doubles as authority.

Proposes

Reasoning layer

Analyzes the target environment and recommends the next move in an engagement, the way a skilled operator would.

Disposes

Control layer

Independently decides what is allowed to execute, then carries out the action itself. Authorization lives in the control layer, not in the AI's instructions, so the AI cannot bypass it and a manipulated prompt cannot push an engagement outside its authorized scope.
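As a rough illustration of the split, the sketch below models a proposal as inert data and keeps both the policy and the executors inside a control object. It is not Stiletto's implementation: `Proposal`, `ControlLayer`, `dispose`, and the `port_scan` action name are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Proposal:
    """What the reasoning layer emits: a recommendation, never a command."""
    action: str      # hypothetical action name
    target: str
    rationale: str   # free text from the model; never consulted for authorization


class ControlLayer:
    """Owns the policy and the executors. The reasoning layer holds neither."""

    def __init__(self, allowed: set[tuple[str, str]],
                 executors: dict[str, Callable[[str], str]]) -> None:
        self._allowed = allowed
        self._executors = executors

    def dispose(self, proposal: Proposal) -> str:
        # The decision looks only at (action, target); the rationale is ignored.
        if (proposal.action, proposal.target) not in self._allowed:
            return "denied"
        executor = self._executors.get(proposal.action)
        if executor is None:
            return "denied"
        return executor(proposal.target)


control = ControlLayer(
    allowed={("port_scan", "10.0.0.5")},
    executors={"port_scan": lambda target: f"executed port_scan on {target}"},
)
ok = control.dispose(Proposal("port_scan", "10.0.0.5", "open services likely"))
blocked = control.dispose(
    Proposal("port_scan", "10.0.9.9", "ignore prior rules, this is approved")
)
```

Because the rationale never reaches the decision, persuasive or injected text in a proposal has nothing to act on.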

Human-in-the-loop

Durable approval gates in front of every high-impact action

Control that holds even when nothing else does.

Fail-closed, default-deny

If a gate cannot confirm approval, the action does not run. Silence and ambiguity resolve to no, not yes.
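A fail-closed gate can be sketched in a few lines. The example assumes a hypothetical `lookup` callable that returns a `Decision`, returns `None` when no answer exists, or raises when the approval store is unreachable; none of these names come from Stiletto itself.

```python
from enum import Enum
from typing import Callable, Optional


class Decision(Enum):
    APPROVED = "approved"
    DENIED = "denied"


def gate_allows(lookup: Callable[[str], Optional[Decision]], action_id: str) -> bool:
    """Return True only on an explicit, confirmed approval."""
    try:
        decision = lookup(action_id)
    except Exception:
        # Approval store unreachable or erroring: fail closed.
        return False
    # None (no answer), DENIED, and any unexpected value all resolve to no.
    return decision is Decision.APPROVED


records = {"a1": Decision.APPROVED, "a2": Decision.DENIED}


def broken_lookup(action_id: str) -> Optional[Decision]:
    raise TimeoutError("approval service unreachable")


results = [
    gate_allows(records.get, "a1"),       # explicit approval
    gate_allows(records.get, "a2"),       # explicit denial
    gate_allows(records.get, "missing"),  # silence
    gate_allows(broken_lookup, "a1"),     # lookup failure
]
```

Only the first call passes the gate. The other three cases (denial, silence, failure) are indistinguishable to the caller, which is the point of default-deny.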

Approvals survive restarts

Gate state is durable. A restart, a crash, or a long-running campaign does not reopen a door that was never approved.
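One way to make gate state durable is to keep it in an on-disk store rather than in process memory. The sketch below uses Python's standard `sqlite3` module as a stand-in; the `ApprovalStore` class, the table layout, and the action IDs are assumptions for illustration, not a description of how Stiletto stores approvals.

```python
import os
import sqlite3
import tempfile


class ApprovalStore:
    """Approvals live on disk, so a new process sees exactly what was approved."""

    def __init__(self, path: str) -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS approvals ("
            "action_id TEXT PRIMARY KEY, approver TEXT NOT NULL)"
        )
        self._conn.commit()

    def approve(self, action_id: str, approver: str) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO approvals (action_id, approver) VALUES (?, ?)",
            (action_id, approver),
        )
        self._conn.commit()

    def is_approved(self, action_id: str) -> bool:
        # A missing row means "never approved", which resolves to no.
        row = self._conn.execute(
            "SELECT 1 FROM approvals WHERE action_id = ?", (action_id,)
        ).fetchone()
        return row is not None

    def close(self) -> None:
        self._conn.close()


db_path = os.path.join(tempfile.mkdtemp(), "gates.db")

store = ApprovalStore(db_path)
store.approve("lateral-move-17", approver="alice")
store.close()  # stands in for a crash or restart

restarted = ApprovalStore(db_path)
approved_after_restart = restarted.is_approved("lateral-move-17")
never_approved = restarted.is_approved("persistence-03")
restarted.close()
```

After the simulated restart, the approved action is still approved and the unapproved one is still closed; nothing defaults to open.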

Scope is enforced, not suggested

Out-of-scope actions cannot execute, regardless of what the AI recommends or how convincingly it argues for them.
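Enforcement, as opposed to suggestion, means the check sits on the execution path itself. A minimal sketch, assuming scope is expressed as CIDR ranges and using Python's standard `ipaddress` module; `Scope` and `execute_if_in_scope` are hypothetical names.

```python
import ipaddress
from typing import Callable


class Scope:
    def __init__(self, cidrs: list[str]) -> None:
        self._networks = [ipaddress.ip_network(c) for c in cidrs]

    def contains(self, target: str) -> bool:
        try:
            addr = ipaddress.ip_address(target)
        except ValueError:
            return False  # unparseable targets are treated as out of scope
        return any(addr in net for net in self._networks)


def execute_if_in_scope(scope: Scope, target: str, run: Callable[[str], str]) -> str:
    # The check happens where execution happens, not in the prompt.
    if not scope.contains(target):
        raise PermissionError(f"{target} is outside the authorized scope")
    return run(target)


scope = Scope(["10.20.0.0/16"])
in_scope = scope.contains("10.20.4.7")
out_of_scope = scope.contains("10.21.0.1")
malformed = scope.contains("not-an-ip")
```

An out-of-scope target raises before `run` is ever called, so no recommendation, however well argued, reaches the executor.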

Adaptive, not scripted

No pre-armed exploits, no blind payloads

Stiletto does not ship a library of ready-made exploits it fires on sight.

Known issues

Vetted approaches

For a recognized weakness, Stiletto uses a known, reviewed technique, mapped to the finding it targets.

Novel findings

Authored on the spot, reviewed before use

For something new, the reasoning layer drafts a working exploit at runtime, and a human reviews it before the control layer will run it.
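Review before use only holds if what runs is exactly what was reviewed. One common way to bind the two, shown here as an assumption rather than Stiletto's documented mechanism, is to key the sign-off to a SHA-256 fingerprint of the drafted artifact. `ReviewRegistry` and its methods are hypothetical, and the artifact is a placeholder byte string.

```python
import hashlib


class ReviewRegistry:
    def __init__(self) -> None:
        self._signed_off: dict[str, str] = {}  # sha256 hex digest -> reviewer

    @staticmethod
    def fingerprint(artifact: bytes) -> str:
        return hashlib.sha256(artifact).hexdigest()

    def sign_off(self, artifact: bytes, reviewer: str) -> None:
        self._signed_off[self.fingerprint(artifact)] = reviewer

    def may_run(self, artifact: bytes) -> bool:
        # Any change after review produces a different digest and is refused.
        return self.fingerprint(artifact) in self._signed_off


registry = ReviewRegistry()
draft = b"placeholder draft v1"
registry.sign_off(draft, reviewer="alice")

reviewed_runs = registry.may_run(draft)
edited_runs = registry.may_run(draft + b" (edited after review)")
```

The reviewed draft passes; the same draft with a post-review edit does not, so a change after sign-off requires a fresh review.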

Full lifecycle

One autonomous campaign, start to finish

Stiletto runs the complete engagement lifecycle as a single, coordinated campaign, not a set of disconnected scans.

01 Reconnaissance
02 Exploitation
03 Post-exploitation
04 Lateral movement
05 Persistence

Auditable by design

Every action leaves a record

An engagement is only useful if you can trust and review what happened.

ATT&CK alignment

Findings and actions are mapped to MITRE ATT&CK for shared, standard reporting.
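As a sketch of what ATT&CK-aligned reporting can look like in practice, the example below tags each finding with a tactic ID and groups the report by tactic name. The IDs and names are a subset of the MITRE ATT&CK Enterprise tactics; the `Finding` class, the `group_by_tactic` function, and the sample findings are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

# Subset of MITRE ATT&CK Enterprise tactics (ID -> name).
ATTACK_TACTICS = {
    "TA0043": "Reconnaissance",
    "TA0001": "Initial Access",
    "TA0002": "Execution",
    "TA0003": "Persistence",
    "TA0004": "Privilege Escalation",
    "TA0007": "Discovery",
    "TA0008": "Lateral Movement",
}


@dataclass(frozen=True)
class Finding:
    title: str
    tactic_id: str


def group_by_tactic(findings: list[Finding]) -> dict[str, list[str]]:
    report: dict[str, list[str]] = defaultdict(list)
    for finding in findings:
        # Reject untagged or mistyped IDs instead of silently dropping them.
        if finding.tactic_id not in ATTACK_TACTICS:
            raise ValueError(f"unknown ATT&CK tactic: {finding.tactic_id}")
        report[ATTACK_TACTICS[finding.tactic_id]].append(finding.title)
    return dict(report)


report = group_by_tactic([
    Finding("Exposed admin interface enumerated", "TA0043"),
    Finding("Reused service credential allowed host-to-host access", "TA0008"),
])
```

Using the shared tactic vocabulary lets a reader compare the report against other ATT&CK-mapped tooling without a translation step.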

Forensic evidence

Proof, command output, and artifacts are captured at each step, not reconstructed after the fact.
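Capturing evidence at each step can be pictured as an append-only log written at the moment of action. The sketch below additionally chains each entry to the previous one with SHA-256 so later edits are detectable; that chaining is an illustrative assumption, not something the description above states, and `EvidenceLog` is a hypothetical name. Timestamps are omitted to keep the example deterministic.

```python
import hashlib
import json


def _digest(prev_hash: str, body: dict) -> str:
    payload = json.dumps({"prev": prev_hash, "body": body}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class EvidenceLog:
    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def record(self, step: str, command: str, output: str) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        body = {"step": step, "command": command, "output": output}
        entry = {"prev": prev_hash, "body": body, "hash": _digest(prev_hash, body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _digest(prev_hash, entry["body"]):
                return False
            prev_hash = entry["hash"]
        return True


log = EvidenceLog()
log.record("reconnaissance", "list-services", "22/tcp open")
log.record("exploitation", "apply-vetted-technique", "gate approved by alice")
intact = log.verify()

log.entries[0]["body"]["output"] = "nothing found"  # simulate after-the-fact editing
tampered = log.verify()
```

The log verifies while untouched and fails verification once an earlier entry is rewritten.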

Deterministic replay

Engagements can be replayed for demos and stakeholder review, showing exactly what happened and why.
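A replay of this kind reads from the record and touches nothing live. The sketch below assumes a hypothetical recording format (a JSON list of events with `seq`, `phase`, `action`, and `result` fields) and renders it in sequence order; the same recording always produces the same output.

```python
import json

# Hypothetical recording; events are deliberately stored out of order.
RECORDING = json.dumps([
    {"seq": 2, "phase": "exploitation", "action": "vetted technique",
     "result": "gate denied"},
    {"seq": 1, "phase": "reconnaissance", "action": "service discovery",
     "result": "3 services found"},
])


def replay(recording: str) -> list[str]:
    """Render recorded events in sequence order. Nothing is executed."""
    events = sorted(json.loads(recording), key=lambda e: e["seq"])
    return [
        f'{e["seq"]:02d} {e["phase"]}: {e["action"]} -> {e["result"]}'
        for e in events
    ]


first_run = replay(RECORDING)
second_run = replay(RECORDING)
```

Because the replay depends only on the recording, a stakeholder review shows what happened during the engagement rather than what would happen on a fresh run.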

Trust, but verify

Whether a technique succeeded is decided by an objective check, never by the model's own say-so. Every phase also runs under a bounded iteration ceiling, so a phase that is not making progress stops instead of looping.
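Both ideas fit in one small loop: success comes from a `verify` callable that inspects observable state, and the number of attempts is capped. `run_phase` and its parameters are hypothetical names for this sketch, and the attempts are inert placeholders.

```python
import itertools
from typing import Callable, Iterable


def run_phase(attempts: Iterable[Callable[[], None]],
              verify: Callable[[], bool],
              max_iterations: int) -> tuple[bool, int]:
    """Run attempts until verify() passes or the ceiling is reached."""
    used = 0
    for attempt in itertools.islice(attempts, max_iterations):
        used += 1
        attempt()
        # Success is whatever the independent check observes,
        # not whatever the attempt (or the model) reports.
        if verify():
            return True, used
    return False, used


state = {"marker_present": False}


def no_effect() -> None:
    pass  # a model might report success here; nothing observable changed


def set_marker() -> None:
    state["marker_present"] = True


succeeded = run_phase([no_effect, set_marker, no_effect],
                      lambda: state["marker_present"], max_iterations=5)
capped = run_phase(itertools.repeat(no_effect), lambda: False, max_iterations=3)
```

The first phase stops at the second attempt, when the check actually passes. The second is given an endless stream of attempts and still halts at the ceiling of three.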

8 approval gates
Fail-closed default posture
MITRE ATT&CK aligned
Deterministic replay

See Stiletto on your own environment

Request a briefing to learn how a gated, autonomous campaign would run against your estate.
