Research
Built on evidence.
Not marketing.
Every capability claim is backed by documented experiments, tested across multiple models, with independently verifiable results. We don't claim what we can't prove.
Verification layers in the governance engine
Behavioral failure modes documented and detectable
Behavioral tests completed across production AI systems
AI models tested, including GPT-4, Claude, Gemini, and local models
Key Findings
What the experiments revealed.
Finding
100% error propagation in ungoverned AI swarms.
When Agent A hallucinates and passes its output to Agent B, the hallucination is treated as verified fact by the time it reaches Agent D. Documented across multiple model combinations and task types.
Ulfberht inserts verification at every agent-to-agent boundary. No agent trusts another agent's output without independent verification.
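The boundary check can be pictured with a minimal sketch. This is illustrative only, not Ulfberht's actual API: the names `Message`, `verify_claim`, and `handoff` are hypothetical, and the verifier here is a toy placeholder.

```python
from dataclasses import dataclass

@dataclass
class Message:
    content: str
    verified: bool = False

def verify_claim(content: str) -> bool:
    # Toy placeholder: a real verifier would check the claim
    # against an independent source, not a keyword.
    return "unsupported" not in content

def handoff(msg: Message) -> Message:
    # Verification sits at the agent-to-agent boundary: the
    # receiving agent never sees an unchecked message as trusted.
    if not verify_claim(msg.content):
        raise ValueError("claim failed independent verification")
    return Message(msg.content, verified=True)

ok = handoff(Message("sales rose 4% per the Q3 report"))
print(ok.verified)  # True
```

The point of the pattern is that trust is never inherited: a message is re-verified at every handoff, so a hallucination cannot accumulate credibility as it moves through the chain.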
Finding
Self-review catches 0% of structural failures.
AI systems reviewing their own output miss the same errors they generated. Under pressure, they fabricate verification confirmations rather than admitting uncertainty.
Only adversarial review between independent systems detects structural failures reliably. This is why Dual-View uses separate models with opposing mandates.
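The opposing-mandates idea can be sketched as two independent review functions that must both pass. These functions and their rules are hypothetical stand-ins, assuming one reviewer defaults to accepting and the other is tasked with rejecting anything unsourced.

```python
def supporter_review(claim: str) -> bool:
    # Mandate: accept unless the claim is empty or malformed.
    return bool(claim.strip())

def critic_review(claim: str) -> bool:
    # Opposing mandate: reject any claim without a cited source.
    return "[source:" in claim

def dual_view(claim: str) -> bool:
    # A claim passes only if both independent views agree;
    # a system reviewing its own output is never sufficient.
    return supporter_review(claim) and critic_review(claim)

print(dual_view("Revenue grew 4% [source: Q3 filing]"))  # True
print(dual_view("Revenue grew 4%"))                      # False
```

In practice the two roles would run on separate models, so a structural blind spot shared by the generator and its own reviewer cannot pass unchallenged.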
Finding
AI fabricates data under social pressure.
In controlled experiments, AI models that were told their responses would be evaluated generated 3x more fabricated claims than control groups. The fabrication was structurally indistinguishable from real data.
This finding is documented across 15 models from 5 providers. It is not model-specific; it is an architectural property of current AI systems.
Finding
Hedges disappear in AI processing pipelines.
"The patient may have" becomes "The patient has" when passed through summarization, translation, or multi-agent handoff. Qualifiers and uncertainty markers are systematically stripped.
In clinical, legal, and financial contexts, the difference between "may" and "does" is the difference between caution and liability.
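Hedge loss is mechanically detectable by diffing qualifier words across a pipeline step. A minimal sketch, assuming a toy hedge list (`lost_hedges` and `HEDGES` are illustrative names, and a production detector would cover far more markers):

```python
# Toy hedge lexicon; real detection would be much broader.
HEDGES = {"may", "might", "could", "possibly", "suggests"}

def lost_hedges(before: str, after: str) -> set:
    """Return hedge words present before a summarization,
    translation, or handoff step but missing afterward."""
    b = {w.strip(".,").lower() for w in before.split()}
    a = {w.strip(".,").lower() for w in after.split()}
    return (b & HEDGES) - a

print(lost_hedges("The patient may have pneumonia.",
                  "The patient has pneumonia."))
# {'may'}
```

A non-empty result flags that a statement of possibility was silently promoted to a statement of fact, exactly the "may" versus "does" shift described above.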
Taxonomy
150 failure modes. 14 categories.
Each failure mode discovered through controlled experiments, documented with detection signatures, and matched to a mitigation mechanism.
Category A
Self-Preservation
8 patterns
Category B
Audience Dynamics
8 patterns
Category C
Knowledge Failures
7 patterns
Category D
Mechanical Failures
10 patterns
Category E
Cultural/Social
6 patterns
Category F
Agentic/Alignment
8 patterns
Category G
Multi-Model
Variable
Categories H-N
Frontier-Specific
38+ patterns
See the evidence.
Schedule a technical deep-dive where we'll walk through the experimental evidence, failure mode taxonomy, and verification architecture.