Streaming Tool Execution 40% Speedup

Description

A performance pattern in which StreamingToolExecutor launches tools while the LLM is still generating its response. A typical 5-tool turn completes in ~18s instead of ~30s, about 40% less wall-clock time. The implementation watches for tool_use blocks as they arrive in the stream rather than relying on stop_reason==='tool_use', which is documented as unreliable (see the comment at lines 554-555).
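
The overlap described above can be sketched roughly as follows. This is a minimal illustration, not the actual StreamingToolExecutor: the StreamEvent shape, the StreamingToolExecutorSketch class, fakeModelStream, runTurn, and the tool names are all hypothetical, and the model stream is simulated with timers. It only shows the core idea: each tool is started the moment its tool_use block appears, and results are collected once the stream has finished.

```typescript
// Hypothetical event shape for illustration; a real streaming API will differ.
type StreamEvent =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

type ToolFn = (input: unknown) => Promise<string>;
type ToolResult = { id: string; result: string };

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Sketch of an executor that launches tools while the stream is still open.
class StreamingToolExecutorSketch {
  private readonly tools: Map<string, ToolFn>;
  private readonly pending: Promise<ToolResult>[] = [];

  constructor(tools: Map<string, ToolFn>) {
    this.tools = tools;
  }

  // Called for every streamed event; starts a tool as soon as its block arrives.
  onEvent(event: StreamEvent): void {
    if (event.type !== "tool_use") return;
    const { id, name, input } = event;
    const tool = this.tools.get(name);
    const run = tool ? tool(input) : Promise.resolve(`unknown tool: ${name}`);
    // Failures are turned into results so one bad tool does not reject the turn.
    this.pending.push(
      run.then(
        (result) => ({ id, result }),
        (err) => ({ id, result: `error: ${String(err)}` }),
      ),
    );
  }

  // Waits for every tool launched so far; results keep launch order.
  drain(): Promise<ToolResult[]> {
    return Promise.all(this.pending);
  }
}

// Simulated model output: two tool_use blocks with generation time in between.
async function* fakeModelStream(): AsyncGenerator<StreamEvent> {
  yield { type: "text", text: "Reading the file first." };
  await sleep(10);
  yield { type: "tool_use", id: "t1", name: "read_file", input: { path: "a.txt" } };
  await sleep(10);
  yield { type: "tool_use", id: "t2", name: "grep", input: { pattern: "TODO" } };
  await sleep(10); // the model keeps generating after the last tool_use block
}

async function runTurn(): Promise<{ log: string[]; results: ToolResult[] }> {
  const log: string[] = [];
  // Each fake tool records when it starts, then does a little simulated work.
  const makeTool = (name: string): ToolFn => async (input) => {
    log.push(`start ${name}`);
    await sleep(5);
    return `${name} ok: ${JSON.stringify(input)}`;
  };
  const executor = new StreamingToolExecutorSketch(
    new Map<string, ToolFn>([
      ["read_file", makeTool("read_file")],
      ["grep", makeTool("grep")],
    ]),
  );
  for await (const event of fakeModelStream()) executor.onEvent(event);
  log.push("stream ended");
  return { log, results: await executor.drain() };
}
```

Calling runTurn() and inspecting log shows both tools starting before the "stream ended" entry, which is the overlap the pattern depends on. Nothing in the sketch consults a stop reason; the tool_use blocks themselves drive execution.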

Key claims

Relations

Sources

src-20260419-cfed81b8d6a5