The cost of getting this wrong is not theoretical.
The 0.1% problem
An AI model that follows compliance rules 99.9% of the time violates them 0.1% of the time. In a high-volume regulated workflow processing thousands of transactions per day, that is not a rounding error. That is a compliance programme failure. A regulatory breach. Potentially a criminal matter. Probability is not a defence.
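To make the arithmetic concrete, here is a minimal sketch. The daily volume of 5,000 is an illustrative assumption, not a figure from any real workflow, and the second function assumes each transaction fails independently, which real systems may not satisfy.

```python
def expected_violations(daily_volume: int, failure_rate: float) -> float:
    """Average number of rule violations per day at a given failure rate."""
    return daily_volume * failure_rate


def prob_at_least_one(daily_volume: int, failure_rate: float) -> float:
    """Chance of at least one violation in a day, assuming independent failures."""
    return 1.0 - (1.0 - failure_rate) ** daily_volume


volume = 5_000   # assumed daily transaction count (illustrative)
rate = 0.001     # the 0.1% failure rate from the text

print(f"expected violations per day: {expected_violations(volume, rate):.1f}")
print(f"chance of at least one per day: {prob_at_least_one(volume, rate):.2%}")
```

Under these assumptions the expected count is five violations a day, and a violation-free day is the rare exception rather than the norm.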
Prompt injection is not a solved problem
An AI agent operating in a live enterprise environment receives input from many sources: user messages, API responses, document contents, other agents. Any of these can contain adversarial instructions. Guardrails and defensive prompting are a perimeter, and perimeters get breached. The only robust defence is architectural separation. The agent literally cannot take the harmful action, regardless of what it is told.
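One way to picture architectural separation is a sketch in which the agent can only reach actions that were explicitly wired in. The names `AgentSandbox`, `ToolNotAvailable`, `read_balance` and `transfer_funds` are hypothetical, and an in-process Python object is not real isolation; in practice the boundary would be enforced by process, network, or credential separation.

```python
from typing import Callable, Dict


class ToolNotAvailable(Exception):
    """Raised when the agent asks for an action that was never wired in."""


class AgentSandbox:
    """Holds the only actions the agent can reach (hypothetical illustration)."""

    def __init__(self, tools: Dict[str, Callable[..., str]]) -> None:
        self._tools = dict(tools)

    def invoke(self, name: str, **kwargs) -> str:
        # No prompt can add an entry here: the set of tools is fixed at construction.
        if name not in self._tools:
            raise ToolNotAvailable(name)
        return self._tools[name](**kwargs)


def read_balance(account_id: str) -> str:
    # Stand-in for a read-only lookup; the real lookup is out of scope here.
    return f"balance for {account_id}: (lookup elided)"


# Only a read-only capability is granted. No transfer capability exists.
sandbox = AgentSandbox({"read_balance": read_balance})

# Simulated result of a successful prompt injection: the agent "decides" to transfer funds.
injected_request = {"name": "transfer_funds", "args": {"to": "attacker", "amount": 10_000}}

try:
    sandbox.invoke(injected_request["name"], **injected_request["args"])
    blocked = None
except ToolNotAvailable as exc:
    blocked = str(exc)

print(f"blocked action: {blocked}")
```

The point of the design is that the refusal does not depend on the model noticing the attack. The harmful action fails because nothing on the agent's side of the boundary can perform it.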
A log is not a control
Most "AI governance" in production is observational: log the inputs, log the outputs, have someone review them. This is not governance. It is forensics. By the time you have detected the problem, the non-compliant state transition has already occurred. Real governance sits upstream of execution. It is the validator that decides whether the action is permitted, not the team that reviews it afterwards.
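A validator that sits upstream of execution can be sketched as follows. The states, the `ALLOWED_TRANSITIONS` set, and the function names are hypothetical choices for illustration; a real rule set would come from the organisation's own specification.

```python
from dataclasses import dataclass

# Hypothetical rule set: the only state transitions the system permits.
ALLOWED_TRANSITIONS = {
    ("draft", "submitted"),
    ("submitted", "approved"),
    ("submitted", "rejected"),
}


@dataclass(frozen=True)
class ProposedAction:
    entity_id: str
    from_state: str
    to_state: str


class TransitionRejected(Exception):
    """Raised before any state change when a proposed action is not permitted."""


def validate(action: ProposedAction) -> bool:
    """Decide whether the action is permitted, without performing it."""
    return (action.from_state, action.to_state) in ALLOWED_TRANSITIONS


def execute(action: ProposedAction, store: dict) -> None:
    """Apply the action only if it matches current state and passes validation."""
    if store.get(action.entity_id) != action.from_state:
        raise TransitionRejected("stale or unknown starting state")
    if not validate(action):
        raise TransitionRejected(f"{action.from_state} -> {action.to_state} is not allowed")
    store[action.entity_id] = action.to_state  # reached only after validation


store = {"tx-1": "draft"}
execute(ProposedAction("tx-1", "draft", "submitted"), store)

try:
    # An agent proposes skipping review by reverting the record.
    execute(ProposedAction("tx-1", "submitted", "draft"), store)
except TransitionRejected as exc:
    print(f"rejected before execution: {exc}")

print(store)
```

The rejected action never touches `store`, so there is no non-compliant state to detect afterwards.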
Where we disagree with the mainstream.
Better prompts and fine-tuned models will make AI agents safe enough for regulated use.
Probabilistic systems cannot make deterministic guarantees. No amount of prompt engineering changes this. You can reduce the probability of failure. You cannot eliminate it. In regulated systems, that is not acceptable. The answer is architectural separation, not better persuasion.
Human-in-the-loop is the responsible way to deploy agentic AI in regulated environments.
Most HITL implementations are reassurance theatre. A human approving an AI-proposed action they cannot validate in the time given, against data they cannot fully process, is not a control. It is ceremony. If the action cannot be validated at runtime, the architecture is wrong. Fix the architecture.
AI code generation reduces the need for rigorous software engineering discipline.
AI code generation demands more rigorous engineering, not less. Slop in, slop out. The constraint moves upstream: from writing correct code to specifying correct behaviour. Organisations that invest in formal specification now will generate demonstrably correct systems at 5–10× the speed of those still relying on iterative debugging.
AI governance is primarily a data and model management problem.
Agentic AI governance is an execution problem. Existing frameworks were designed for models that produce outputs. Agents take actions. The risk is not in the model's response; it is in what the agent does next. Governance must sit upstream of execution, as an architectural constraint, not downstream as observation and review.
Explainable AI means you can trace how the model reached its conclusion.
Explainability of the model is not the same as auditability of the action. When a regulator asks whether a specific compliance check was performed on a specific action at a specific time, the answer must rest on a cryptographically verifiable, hash-chained record generated in real time, not on a narrative reconstructed from logs. Those are not equivalent.
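A hash-chained record can be sketched with the Python standard library. The field names (`check`, `action_id`, `passed`) and the all-zero genesis value are assumptions made for illustration. A bare hash chain is only tamper-evident; a production system would likely also sign records or anchor the chain externally.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import List, Optional

GENESIS = "0" * 64  # assumed placeholder hash for the first record


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record body."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_record(chain: List[dict], check: str, action_id: str, passed: bool,
                  timestamp: Optional[str] = None) -> dict:
    """Append a record whose hash covers its content and the previous record's hash."""
    body = {
        "check": check,
        "action_id": action_id,
        "passed": passed,
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
        "prev_hash": chain[-1]["hash"] if chain else GENESIS,
    }
    record = {**body, "hash": _digest(body)}
    chain.append(record)
    return record


def verify_chain(chain: List[dict]) -> bool:
    """Recompute every hash and link; any edited or reordered record breaks the chain."""
    prev = GENESIS
    for record in chain:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev_hash"] != prev or _digest(body) != record["hash"]:
            return False
        prev = record["hash"]
    return True


chain: List[dict] = []
append_record(chain, "sanctions_screening", "tx-1", True)
append_record(chain, "limit_check", "tx-1", True)
print(verify_chain(chain))   # chain is intact

chain[0]["passed"] = False   # simulate after-the-fact tampering
print(verify_chain(chain))   # tampering is detected
```

Because each record commits to its predecessor, the question "was this check performed on this action at this time?" is answered by verifying the chain rather than by trusting a reconstruction.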
What this architecture means for your role.
AI-assisted delivery is changing org design, and not only tooling. If you are asking “If the machine writes more of the product, what is left for me?”, that question is reasonable.
Spec++ does not make software engineers redundant. It moves the most valuable work upstream: from typing implementation to owning the behaviour the organisation is prepared to guarantee.
Instead of reverse-engineering requirements and debugging edge cases in scattered services, you capture the rules of the system precisely enough that generated code has almost nowhere to go wrong. You move from owning files and services to owning:
- which states are allowed
- which transitions are legal
- which guarantees the firm can make to regulators and customers
That is a more senior, more accountable position, not a smaller one. AI turns specifications into working systems at speed; your leverage is the clarity and rigour of the spec, not typing faster.
Specification stewardship, the ground between “everyone prompts in chat” and “senior engineers own everything”, is a concrete, learnable craft, practised through specification-driven development (SDD) with Spec++:
| Background | What you bring | What SDD + Spec++ adds |
|---|---|---|
| Product | Outcomes, acceptance language, edge cases | Own NL clauses; review clause-level PR diffs before code lands |
| Junior developers | Vertical slices, tests, detail | Maintain specs, run pipeline in CI, triage lift and check failures |
| QA | Traceability, regression instinct | “Does every obligation still hold?”—trace clauses to source, gate on spec pipeline green |
The Spec++ pipeline targets the gap teams feel when they shrink traditional intake yet still need people, even if fewer of them, who can keep intent coherent as generation accelerates.
Your clarity becomes the control plane. Learn to operate the spec pipeline, not compete with the agent on typing.
Learning to operate Spec++ and the pipeline is a portfolio skill: visible in the repo (specs/ commits, green Actions runs), legible to engineering leads, aligned with audit demands that chat logs alone cannot satisfy.