Series 1 — Part 7 of 8

The governance wrapper is the last line of defence before any engine output reaches a product. It maps internal labels to legally safe language, runs harm-detection gates, and flags outputs that require human review. No adapter bypasses it.

The Problem the Governance Wrapper Solves

An engine might output score: 0.91, label: "deceptive_pattern_detected". Surfaced directly in a CRM, that label is a defamation risk; on a legal platform, it could prejudice a client's hearing. The governance wrapper replaces it with "guarded communication posture observed", a behaviorally accurate phrase that avoids language implying a clinical or legal conclusion.

The replacement is not cosmetic. It is a policy decision encoded in a version-controlled language map.

The Safe Language Map

// governance/safe-language-map.js
const SAFE_LANGUAGE_MAP = {
  'deceptive_pattern_detected':  'guarded communication posture observed',
  'emotionally_dysregulated':    'elevated stress indicators present',
  'lying_detected':              'significant inconsistency signals present',
  'manipulative_tactics_used':   'influence-seeking communication patterns noted',
  'high_litigation_risk':        'elevated procedural complexity indicators',
  'credibility_compromised':     'consistency of account warrants attention',
};

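function applyLanguageMap(engineOutput, map) {
  // Own-property check so inherited keys such as 'constructor' never match.
  if (!Object.prototype.hasOwnProperty.call(map, engineOutput.label)) {
    return engineOutput;  // unmapped labels pass through unchanged
  }
  // The raw label is replaced by a placeholder and never enters the payload.
  return { ...engineOutput, label: map[engineOutput.label], originalLabel: '[REDACTED]' };
}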

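The original label itself is stored only in the internal audit log. The payload carries the placeholder '[REDACTED]' in its originalLabel field, so the raw value never appears in any response payload or in any database column accessible to product UIs.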
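
One way to keep the raw label out of the payload is to have the mapping step produce two things: an audit record and a product-facing payload. The sketch below is illustrative only. The in-memory auditLog array, the mapWithAudit helper, and the requestId argument are hypothetical names that do not appear in the original code, and a real deployment would presumably write to an access-controlled store rather than an array.

```javascript
// Sketch only: `auditLog`, `mapWithAudit`, and `requestId` are hypothetical names.
const SAFE_LANGUAGE_MAP = {
  'deceptive_pattern_detected': 'guarded communication posture observed',
};

// Stand-in for an internal, access-controlled audit store (assumption).
const auditLog = [];

function mapWithAudit(engineOutput, map, requestId) {
  const hasEntry = Object.prototype.hasOwnProperty.call(map, engineOutput.label);
  if (!hasEntry) return engineOutput; // same pass-through as applyLanguageMap

  // The raw label goes to the audit log only.
  auditLog.push({
    requestId,
    originalLabel: engineOutput.label,
    mappedLabel: map[engineOutput.label],
  });

  // The payload carries the safe label and a placeholder, never the raw value.
  return { ...engineOutput, label: map[engineOutput.label], originalLabel: '[REDACTED]' };
}

const payload = mapWithAudit(
  { score: 0.91, label: 'deceptive_pattern_detected' },
  SAFE_LANGUAGE_MAP,
  'req-001' // hypothetical request identifier
);

console.log(payload.label); // guarded communication posture observed
console.log(JSON.stringify(payload).includes('deceptive_pattern_detected')); // false
```

Serializing the payload and searching it for the raw label, as in the last line, is a cheap check that could also run as a unit test against every entry in the map.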

Harm Detection Gates

const HARM_GATES = [
  { test: out => out.score > 0.95 && out.domain === 'psychological',
    action: 'require_human_review',
    reason: 'clinical-threshold-exceeded' },
  { test: out => /mental|disorder/i.test(out.label ?? ''),  // case-insensitive; tolerates a missing label
    action: 'block',
    reason: 'diagnostic-language-prohibited' },
  { test: out => out.domain === 'legal' && out.score > 0.85,
    action: 'require_human_review',
    reason: 'high-consequence-legal-output' },
];

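function runHarmGates(output, context) {
  // Evaluate every gate before deciding, so that a 'block' always outranks a
  // review flag regardless of gate order. `context` is accepted for callers
  // but is not used by the gates defined above.
  const triggered = HARM_GATES.filter(gate => gate.test(output));
  if (triggered.some(gate => gate.action === 'block')) return null;  // output is dropped entirely
  const review = triggered.find(gate => gate.action === 'require_human_review');
  return review
    ? { ...output, requiresHumanReview: true, reviewReason: review.reason }
    : output;
}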
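
To show how the two pieces might fit together, the sketch below chains the language map and the harm gates in the order the introduction lists them: map first, then gates. It is a minimal illustration, not the wrapper itself. The governOutput name, the 'mental_state_flag' label, and the 'sales' domain are hypothetical; the map and gate list are trimmed copies of the ones above; and treating a block as outranking a review flag is an assumption about the intended policy.

```javascript
// Sketch only: `governOutput` is a hypothetical name; map and gates are trimmed copies.
const SAFE_LANGUAGE_MAP = {
  'deceptive_pattern_detected': 'guarded communication posture observed',
  'high_litigation_risk': 'elevated procedural complexity indicators',
};

const HARM_GATES = [
  { test: out => /mental|disorder/i.test(out.label ?? ''),
    action: 'block', reason: 'diagnostic-language-prohibited' },
  { test: out => out.domain === 'legal' && out.score > 0.85,
    action: 'require_human_review', reason: 'high-consequence-legal-output' },
];

function governOutput(engineOutput) {
  // Step 1: swap the internal label for its safe phrase, if one is mapped.
  const mapped = Object.prototype.hasOwnProperty.call(SAFE_LANGUAGE_MAP, engineOutput.label)
    ? { ...engineOutput, label: SAFE_LANGUAGE_MAP[engineOutput.label], originalLabel: '[REDACTED]' }
    : engineOutput;

  // Step 2: collect all triggered gates; a block outranks a review flag (assumption).
  const triggered = HARM_GATES.filter(gate => gate.test(mapped));
  if (triggered.some(gate => gate.action === 'block')) return null;
  const review = triggered.find(gate => gate.action === 'require_human_review');
  return review
    ? { ...mapped, requiresHumanReview: true, reviewReason: review.reason }
    : mapped;
}

// Hypothetical inputs: one flagged for review, one blocked, one passed through.
const flagged = governOutput({ score: 0.9, domain: 'legal', label: 'high_litigation_risk' });
const blocked = governOutput({ score: 0.4, domain: 'psychological', label: 'mental_state_flag' });
const passed = governOutput({ score: 0.91, domain: 'sales', label: 'deceptive_pattern_detected' });

console.log(flagged.label, flagged.requiresHumanReview); // elevated procedural complexity indicators true
console.log(blocked);                                     // null
console.log(passed.label);                                // guarded communication posture observed
```

Because unmapped labels pass through the map unchanged, running the gates afterwards gives the block gate a chance to catch an unmapped diagnostic label before it reaches a product. Running the gates on the raw label first would be an equally plausible design; the source does not say which order the wrapper uses.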