Stiletto
"The model proposes; the graph disposes."
AI-driven internal red-teaming, gated by human judgment.
A model that decides and acts is one misjudgment away from disaster
Give an AI both the reasoning and the trigger, and you inherit its mistakes at machine speed. One bad judgment call, one successful jailbreak, and something runs that nobody approved.
Autonomous offensive tooling is only as trustworthy as the controls around it. Sablewire built Stiletto around a simple premise: the entity that reasons about what to try next should never be the same entity that has the authority to do it.
A reasoning layer that proposes, a control layer that decides
Stiletto splits judgment from execution into two distinct layers, so intelligence never doubles as authority.
Reasoning layer
Analyzes the target environment and recommends the next move in an engagement, the way a skilled operator would.
Control layer
Independently decides what is allowed to execute and carries out the action. The AI never gets to bypass authorization. Authorization lives in the control layer, not the AI's instructions, so a manipulated prompt cannot push an engagement outside its authorized scope.
Durable approval gates in front of every high-impact action
Control that holds even when nothing else does.
Fail-closed, default-deny
If a gate cannot confirm approval, the action does not run. Silence and ambiguity resolve to no, not yes.
Approvals survive restarts
Gate state is durable. A restart, a crash, or a long-running campaign does not reopen a door that was never approved.
Scope is enforced, not suggested
Out-of-scope actions cannot execute, regardless of what the AI recommends or how convincingly it argues for them.
No pre-armed exploits, no blind payloads
Stiletto does not ship a library of ready-made exploits it fires on sight.
Vetted approaches
For a recognized weakness, Stiletto uses a known, reviewed technique, mapped to the finding it targets.
Authored on the spot, reviewed before use
For something new, the reasoning layer drafts a working exploit at runtime, and a human reviews it before the control layer will run it.
One autonomous campaign, start to finish
Stiletto runs the complete engagement lifecycle as a single, coordinated campaign, not a set of disconnected scans.
Every action leaves a record
An engagement is only useful if you can trust and review what happened.
ATT&CK alignment
Findings and actions are mapped to MITRE ATT&CK for shared, standard reporting.
Forensic evidence
Proof, command output, and artifacts are captured at each step, not reconstructed after the fact.
Deterministic replay
Engagements can be replayed for demos and stakeholder review, showing exactly what happened and why.
Trust, but verify
Whether a technique succeeded is decided by an objective check, never the model's own say-so, and every phase runs under a bounded iteration ceiling.
See Stiletto on your own environment
Request a briefing to learn how a gated, autonomous campaign would run against your estate.
Request a briefing