Custom Shell Parser

Description

Claude Code contains a multi-layered bash parsing system. The primary parser is a pure-TypeScript bash parser in utils/bash/bashParser.ts that produces tree-sitter-bash-compatible ASTs. It was validated against a 3,449-input golden corpus generated from the WASM parser, and it enforces a 50ms wall-clock timeout and a 50,000-node budget so that it bails out on pathological or adversarial input. On top of it, the AST-based security analyzer in utils/bash/ast.ts walks the tree with an explicit allowlist of node types to extract trustworthy argv[] arrays for permission matching. The design is fail-closed: any unrecognized node type causes the command to be classified as "too-complex" and routed to the user for approval.
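The fail-closed allowlist walk can be illustrated with a minimal sketch. Everything named here (AstNode, Verdict, ALLOWED_TYPES, extractArgv) is hypothetical and chosen for illustration; the actual node shapes and allowlist in ast.ts are not shown in this note and are certainly larger.

```typescript
// Hypothetical, simplified node shape; real tree-sitter-bash nodes carry more fields.
interface AstNode {
  type: string;
  text: string;
  children: AstNode[];
}

type Verdict =
  | { kind: "simple"; argv: string[] }
  | { kind: "too-complex"; reason: string };

// Illustrative allowlist only, not the list used in ast.ts.
const ALLOWED_TYPES = new Set(["program", "command", "command_name", "word"]);

// Walk the tree in source order. The first node whose type is not on the
// allowlist aborts the walk, so unknown syntax can never yield an argv.
function extractArgv(root: AstNode): Verdict {
  const argv: string[] = [];
  const stack: AstNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop()!;
    if (!ALLOWED_TYPES.has(node.type)) {
      return { kind: "too-complex", reason: `unrecognized node type: ${node.type}` };
    }
    if (node.type === "word") {
      argv.push(node.text);
      continue;
    }
    // Push children in reverse so they are popped left to right.
    for (let i = node.children.length - 1; i >= 0; i--) {
      stack.push(node.children[i]);
    }
  }
  return { kind: "simple", argv };
}
```

The point of the design is that the default outcome is rejection: supporting a new construct means adding its node type to the allowlist deliberately, rather than remembering to block it.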

The pure-TS parser in bashParser.ts is a character-by-character lexer that tracks both JS string indices and UTF-8 byte offsets (for tree-sitter position compatibility). It handles the full range of bash syntax: single/double/ANSI-C quoting, heredocs (including tab-stripping <<-), command substitution ($()), process substitution (<() / >()), parameter expansion (${}), arithmetic expansion ($(())), backtick substitution, pipelines, list operators (&&, ||), and redirections. The lexer is context-sensitive: [ is treated as an operator in command position (test command) but as a word character in argument position (glob/subscript).
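The need to track two offsets comes from a mismatch in units: JS strings are indexed in UTF-16 code units, while tree-sitter positions are byte offsets into the UTF-8 source. The sketch below shows one way to advance both counters together. The names LexCursor, utf8Length, advance, and byteOffsetOf are assumptions made for this example and are not taken from bashParser.ts.

```typescript
// Number of bytes UTF-8 uses to encode a given Unicode code point.
function utf8Length(codePoint: number): number {
  if (codePoint <= 0x7f) return 1;
  if (codePoint <= 0x7ff) return 2;
  if (codePoint <= 0xffff) return 3;
  return 4;
}

// Hypothetical cursor holding both positions side by side.
interface LexCursor {
  index: number; // offset in UTF-16 code units (JS string index)
  byte: number; // offset in UTF-8 bytes (tree-sitter position)
}

// Advance past one code point, updating both offsets.
function advance(src: string, cur: LexCursor): LexCursor {
  const cp = src.codePointAt(cur.index);
  if (cp === undefined) return cur; // already at end of input
  const units = cp > 0xffff ? 2 : 1; // astral code points use a surrogate pair
  return { index: cur.index + units, byte: cur.byte + utf8Length(cp) };
}

// Byte offset corresponding to a JS string index, found by scanning from the start.
function byteOffsetOf(src: string, index: number): number {
  let cur: LexCursor = { index: 0, byte: 0 };
  while (cur.index < index && cur.index < src.length) {
    cur = advance(src, cur);
  }
  return cur.byte;
}
```

For pure-ASCII input the two offsets are identical; they diverge as soon as the command contains a character such as é (one code unit, two bytes) or an emoji (two code units, four bytes). A lexer that advances one character at a time can maintain the byte counter incrementally instead of rescanning as byteOffsetOf does.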

Separately, a safe wrapper around the shell-quote NPM library in utils/bash/shellQuote.ts provides tryParseShellCommand() and adds critical security hardening. hasMalformedTokens() detects when shell-quote misinterprets commands containing ambiguous patterns (such as JSON-like strings with semicolons), which closes the command injection reported in HackerOne report #3482049. hasShellQuoteSingleQuoteBug() detects a specific differential between shell-quote's and bash's handling of backslashes inside single quotes, where '\' <payload> '\' could hide payloads from security checks. The AST-based tree-sitter approach in ast.ts was built to replace these fragile differential-detection patches.
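The single-quote differential rests on one bash rule: inside single quotes a backslash is an ordinary character, so '\' is a complete one-character string. A tokenizer that instead treats \' as an escaped quote believes the string is still open and swallows whatever follows. The sketch below is a conservative detector for that pattern. It is a hypothetical illustration, not the implementation of hasShellQuoteSingleQuoteBug(), and it deliberately ignores heredocs, comments, and ANSI-C ($'...') quoting.

```typescript
// Returns true when some single-quoted span, scanned with bash's quoting rules,
// ends in a backslash. That is the `\'` sequence on which a backslash-escaping
// tokenizer and bash disagree about where the string closes.
function singleQuoteEndsWithBackslash(command: string): boolean {
  let inSingle = false;
  let inDouble = false;
  for (let i = 0; i < command.length; i++) {
    const ch = command[i];
    if (inSingle) {
      // Bash: nothing is special inside single quotes except the closing quote.
      if (ch === "'") {
        if (command[i - 1] === "\\") return true;
        inSingle = false;
      }
      continue;
    }
    if (ch === "\\") {
      i++; // outside single quotes a backslash consumes the next character
      continue;
    }
    if (inDouble) {
      if (ch === '"') inDouble = false;
      continue;
    }
    if (ch === "'") inSingle = true;
    else if (ch === '"') inDouble = true;
  }
  return false;
}
```

A caller that cannot trust its tokenizer on such input could treat a true result as grounds to ask the user for approval instead of attempting to match the command automatically, in the same fail-closed spirit as the AST analyzer.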

Key claims

Relations

Sources

src-20260409-e9925330d110 src-20260410-shell-parser-a: src/utils/bash/bashParser.ts, src/utils/bash/ast.ts, src/utils/bash/shellQuote.ts, src/utils/bash/parser.ts