Future Direction: Silent Failure and Observability-Evaluation Gap

Description

This is the first of six open design directions (Section 12.1). It asks whether the gap between observability adoption and evaluation adoption (89% versus 52.4% for offline evaluation, per the LangChain 2026 survey) reflects a missing tooling layer, a missing evaluation interface inside the harness, or a model-capability ceiling. Two sub-questions remain open. First, do generator-evaluator separation, sprint contracts, and post-hoc checks belong inside the harness, as additional hook events alongside the 27 already documented, or outside it as a separate evaluation layer? Second, can the existing hook pipeline host such scaffolding within its current context-cost envelope?
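
The inside-versus-outside question can be made concrete with a toy sketch. Everything in it is an assumption for illustration: the `Harness` class, the `post_step_eval`-style evaluator list, the `nonempty_output_check` evaluator, and the character-count stand-in for context cost are hypothetical and do not correspond to any documented hook event or real harness API. The sketch only shows that the same evaluator can run in either placement, and that the in-harness placement consumes context budget while the external one does not.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# All names below are hypothetical; none come from a real harness API.


@dataclass
class StepResult:
    output: str
    claimed_success: bool


@dataclass
class Verdict:
    passed: bool
    reason: str


Evaluator = Callable[[StepResult], Verdict]


def nonempty_output_check(step: StepResult) -> Verdict:
    """Post-hoc check: flag a step that claims success but produced nothing."""
    if step.claimed_success and not step.output.strip():
        return Verdict(False, "claimed success with empty output")
    return Verdict(True, "ok")


@dataclass
class Harness:
    """Toy harness with one assumed evaluation hook that fires after each step."""

    context_budget: int  # assumed budget, counted in characters for simplicity
    evaluators: List[Evaluator] = field(default_factory=list)
    context_used: int = 0

    def run_step(self, step: StepResult) -> List[Verdict]:
        verdicts = []
        for evaluate in self.evaluators:
            verdict = evaluate(step)
            # In-harness placement: the verdict text is fed back into context,
            # so every evaluator call adds to the context cost.
            self.context_used += len(verdict.reason)
            verdicts.append(verdict)
        return verdicts

    def within_budget(self) -> bool:
        return self.context_used <= self.context_budget


def evaluate_offline(
    trace: List[StepResult], evaluators: List[Evaluator]
) -> List[Verdict]:
    """External placement: the same evaluators run over a recorded trace."""
    return [evaluate(step) for step in trace for evaluate in evaluators]
```

In this sketch a silent failure is a step that reports success with empty output. The in-harness path catches it immediately but charges the verdict against the context budget; the offline path catches it later from the trace at no context cost.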

Key claims

Relations

Sources

src-20260423-0cff68d3291b