Confidence Propagation in Multi-Engine Systems

Series 1 — Part 4 of 8

A behavioral score without a confidence measure is a number pretending to be knowledge. This article explains the propagation formula, the reducers that lower confidence, and why the uncertainty band is as important as the score itself.

Why One Score Is Never Enough

Imagine two work logs. Both receive a behavioral risk score of 0.78 from the same engine. The first was scored with complete data: twelve data points, low contradiction from other engines, a stable model. The second was scored with three data points, two contradicting engine signals, and a model that has been drifting over the past week. They are not the same score — but a plain 0.78 hides that entirely.

Confidence propagation makes the difference explicit.

The Propagation Formula

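// Combines a base score with four evidence-quality factors into a confidence
// value and an uncertainty band around the score.
function propagateConfidence(baseScore, evidence) {
  // Each factor lies in 0–1, so any single weak factor pulls the product down.
  const raw =
    baseScore
    * evidence.evidenceWeight       // 0–1: proportion of expected data points present
    * evidence.modelStability       // 0–1: rolling stability score for this engine
    * evidence.dataCompleteness     // 0–1: completeness of input signals
    * evidence.contradictionFactor; // 0–1: reduced when other engines contradict

  // Clamp so that out-of-range inputs cannot yield a confidence outside [0, 1].
  const confidence = Math.min(1, Math.max(0, raw));

  // The band widens linearly as confidence falls: ±0.3 at zero confidence, ±0 at full confidence.
  const halfWidth = (1 - confidence) * 0.3;
  return {
    score:           baseScore,
    confidence,
    uncertaintyLow:  baseScore - halfWidth,
    uncertaintyHigh: baseScore + halfWidth,
    reducers:        evidence.activeReducers,  // list of factors that lowered confidence
  };
}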
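
To make the contrast between the two work logs concrete, the sketch below runs both through a compact restatement of the formula above. The evidence values are illustrative assumptions (the article does not give them), chosen only to mirror the "complete" and "sparse" cases, and `scoreWithBand` is a hypothetical name for the restated helper.

```javascript
// Compact restatement of the propagation formula so this snippet runs on its own.
function scoreWithBand(baseScore, evidence) {
  const raw = baseScore
    * evidence.evidenceWeight
    * evidence.modelStability
    * evidence.dataCompleteness
    * evidence.contradictionFactor;
  const confidence = Math.min(1, Math.max(0, raw));
  const halfWidth = (1 - confidence) * 0.3;
  return {
    score: baseScore,
    confidence,
    uncertaintyLow: baseScore - halfWidth,
    uncertaintyHigh: baseScore + halfWidth,
    reducers: evidence.activeReducers,
  };
}

// Assumed evidence for the first log: all expected data, stable model, little contradiction.
const completeLog = scoreWithBand(0.78, {
  evidenceWeight: 1.0,
  modelStability: 0.95,
  dataCompleteness: 1.0,
  contradictionFactor: 0.95,
  activeReducers: [],
});

// Assumed evidence for the second log: 3 of an assumed 12 expected data points,
// a drifting model, and contradicting engines.
const sparseLog = scoreWithBand(0.78, {
  evidenceWeight: 0.25,
  modelStability: 0.6,
  dataCompleteness: 1.0,
  contradictionFactor: 0.5,
  activeReducers: ["low_evidence", "high_contradiction", "model_drift"],
});

// Print score, confidence, and band side by side for both logs.
for (const [name, r] of Object.entries({ completeLog, sparseLog })) {
  const band = `[${r.uncertaintyLow.toFixed(2)}, ${r.uncertaintyHigh.toFixed(2)}]`;
  console.log(name, "score", r.score.toFixed(2), "confidence", r.confidence.toFixed(2), "band", band);
}
```

With these assumed inputs, the complete log comes out at roughly 0.70 confidence with a band near [0.69, 0.87], while the sparse log drops to roughly 0.06 with a band near [0.50, 1.06]. The sparse log's upper bound passes 1 because the formula as written does not clamp the band; a consumer that treats scores as bounded to 0–1 may want to clamp it before display.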

Applied Reducers

Reducers are named flags attached to the confidence object. Product adapters and UI layers use them to explain confidence to end users:

low_evidence — fewer than 40% of expected signals were present
high_contradiction — two or more engines scored this dimension in opposite directions
incomplete_data — required fields were missing from the input
model_drift — this engine's outputs have been statistically unstable over the past 7 days
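
One way these flags might be derived is sketched below. Only the 40% threshold for low_evidence and the opposite-directions rule for high_contradiction come from the list above. The input shape (`presentSignals`, `expectedSignals`, `engineDirections`, `missingFields`, `driftDetected`) and the function name `collectReducers` are hypothetical, and the 7-day drift test is assumed to run elsewhere and arrive as a boolean.

```javascript
// Hypothetical input shape; only the thresholds noted in comments come from the article.
function collectReducers(input) {
  const reducers = [];

  // low_evidence: fewer than 40% of expected signals were present.
  if (input.expectedSignals > 0 &&
      input.presentSignals / input.expectedSignals < 0.4) {
    reducers.push("low_evidence");
  }

  // high_contradiction: engines scored this dimension in opposite directions.
  // Assumption: each engine's direction is encoded as +1 (raises) or -1 (lowers).
  const anyUp = input.engineDirections.some((d) => d > 0);
  const anyDown = input.engineDirections.some((d) => d < 0);
  if (anyUp && anyDown) {
    reducers.push("high_contradiction");
  }

  // incomplete_data: required fields were missing from the input.
  if (input.missingFields.length > 0) {
    reducers.push("incomplete_data");
  }

  // model_drift: instability over the past 7 days, detected by an assumed external check.
  if (input.driftDetected) {
    reducers.push("model_drift");
  }

  return reducers;
}

const flags = collectReducers({
  presentSignals: 3,
  expectedSignals: 12,
  engineDirections: [1, -1, 1],
  missingFields: [],
  driftDetected: true,
});
console.log(flags);
```

For this input the function returns low_evidence, high_contradiction, and model_drift, in that order. The resulting list is what would be passed as `evidence.activeReducers`.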

How Adapters Use the Uncertainty Band

The chatbot platform adapter shows a lead score of 0.78 with a band of [0.65, 0.91] and the reducer low_evidence. The sales rep sees: "Moderate intent signal — limited data available." That reads very differently from a high-confidence 0.78, which would prompt immediate follow-up.
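
A minimal sketch of how an adapter might turn a confidence object into that kind of message follows. The function name `describeLead`, the rule that any active reducer downgrades the headline, the "Strong intent signal" wording, and every phrase except the one for low_evidence are assumptions made for illustration.

```javascript
// Hypothetical reducer-to-phrase table; only the low_evidence wording is from the article.
const REDUCER_PHRASES = {
  low_evidence: "limited data available",
  high_contradiction: "engines disagree",
  incomplete_data: "required fields missing",
  model_drift: "model recently unstable",
};

function describeLead(result) {
  // Translate known reducer flags into user-facing phrases, skipping unknown ones.
  const reasons = result.reducers
    .map((name) => REDUCER_PHRASES[name])
    .filter(Boolean);

  // Assumption: any active reducer downgrades the headline from "Strong" to "Moderate".
  const hedged = reasons.length > 0;
  return {
    score: result.score.toFixed(2),
    band: `[${result.uncertaintyLow.toFixed(2)}, ${result.uncertaintyHigh.toFixed(2)}]`,
    headline: hedged ? "Moderate intent signal" : "Strong intent signal",
    detail: hedged ? reasons.join("; ") : "follow up now",
  };
}

// The lead from the example above; the confidence field is left out because this sketch does not read it.
const view = describeLead({
  score: 0.78,
  uncertaintyLow: 0.65,
  uncertaintyHigh: 0.91,
  reducers: ["low_evidence"],
});
console.log(`${view.headline} (${view.score}, band ${view.band}): ${view.detail}`);
```

Returning the headline and detail as separate fields leaves the final wording and punctuation to the UI layer, so each product can phrase the same reducers in its own voice.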
