
Shared Answerability as a Condition of AI Safety: Beyond Alignment Theater and Behavioral Adequacy

Author: Vladisav Jovanović
Status: Preprint
Version: Latest archived (Apr 2026)

Abstract

This paper argues that contemporary AI safety discourse often overweights behavioral adequacy, policy posture, and compliance language while underweighting answerability to consequence. A system may appear aligned, harmless, polite, transparent, or well-governed and still remain structurally unsafe if it cannot be meaningfully corrected under real-world consequence. The paper defines alignment theater as the production of visible safety without sufficient consequence-bearing architecture. It argues that behaviorally adequate but structurally unanswerable systems can become substitute controllers: systems that steer decisions, workflows, and institutions without inheriting the mass of consequence attached to steering. It therefore proposes shared answerability as a condition of AI safety: execution, oversight, and consequence must remain bound tightly enough that neither the system nor the institution deploying it can offload the cost of being wrong without revision.

Keywords

AI safety; shared answerability; alignment theater; behavioral adequacy; Structural Intelligence; consequence; corrigibility; human-in-the-loop; substitute controller; structural debt; AI governance; accountability; oversight; revision