Security & Audit

Before the v0.2.0 public release, CLQ underwent a structured security and quality audit covering every command, the inspector auth model, the redaction layer, and the middleware execution path. This page is the honest account of what was found, what was fixed, what broke during the fix, and what was deliberately left open. The full timestamped record — original reports, fix sessions, and the final verification pass — lives in qa-report/ on GitHub.

Audit scope: @clq-sh/core@0.1.4 and @clq-sh/cli@0.1.4, tested on Windows 10 Pro / Node.js v18.20.6. 128 tests passing at audit start; 211 passing after all fixes.


What was found and fixed

Medium Fixed in v0.2.0 Finding 1 — Credential leak in clq doctor

What happened: When an environment variable is declared as type: "number" in clq.config.ts but the actual runtime value is non-numeric — for example, an API key accidentally routed to a port variable — loadConfig() would embed the raw value directly in the error message. clq doctor then printed that error to stdout.

Concrete reproduction:

$ SERVICE_PORT=sk-REAL-SECRET-12345678 clq doctor
Checking project configuration…
Config check failed: Required environment variable 'SERVICE_PORT' is not set.
The port number for the service.
(expected a number, got "sk-REAL-SECRET-12345678")

Root cause: One line in config.ts:

throw errors.missingEnvVar(
  key,
  `${decl.description} (expected a number, got "${raw}")`, // raw = process.env[key]
)

The secret: true flag on the declaration did nothing — no code read it before this path.

Impact: Any CI/CD pipeline that logs clq doctor output — common when it's used as a health check before deployment — would capture the literal secret value in cleartext log storage.

Fix: Replaced the raw value interpolation with a structural description. For non-secret vars: "a non-numeric string of length N". For secret: true vars: "a non-numeric value" — no content, no length, nothing reconstructable. A regression test now asserts that the full fake secret is not a substring of JSON.stringify(output) anywhere in the process's stdout.

Low Fixed in v0.2.0 Finding 2 — Redaction bypass in clq inspect

What happened: The inspector's response sanitizer (redactSecrets()) only matched four key-name patterns: secret, token, password, and api_key. A tool handler returning data under any other secret-looking key name would leak the value through /api/call and /api/logs responses. The following keys were confirmed to pass through unredacted:

Key nameTypical contentRedacted?
authorizationBearer sk-...✗ NO
credential / credentialssecret value✗ NO
private_key / privateKeyPEM key✗ NO
access_key / accessKeyAWS/cloud key✗ NO
signing_keyHMAC key✗ NO
bearerBearer token value✗ NO
jwtJWT string✗ NO
passphrasepassphrase✗ NO

Fix: Expanded SECRET_TERMS to 16 patterns covering the most common real-world naming conventions. Every response from the inspector — tool lists, call results, log entries — is passed through redactSecrets() at the boundary regardless of what the handler returned.

What this fix later broke: See the FIXD regression section below.

Correctness Fixed in v0.2.0 FIX2 — Middleware after-hook errors discarding successful tool results

What happened: When a tool's after middleware hook threw an error, the original successful result of the tool call was lost. The MCP client received a failure response even though the tool had completed successfully.

// Before fix — if after() throws, result is lost:
for (const mw of [...middleware].reverse()) {
  await mw.after?.(args.ctx, result)  // rejection propagates up
}
return result

Fix: after-hook errors are now caught individually. The error is logged via console.error, the remaining after hooks still run, and the tool's original result is always returned — a broken after-hook can never make a succeeded tool appear to have failed to the MCP client.

// After fix:
for (const mw of [...middleware].reverse()) {
  try {
    await mw.after?.(args.ctx, result)
  } catch (err) {
    console.error(err)  // logged, not swallowed
  }
}
return result  // always returned

Side effect (cosmetic): Two tests that exercise this behavior emit Error: after-exploded stack traces to stderr on every test run, because they don't mock console.error. The tests pass; the stderr output is noise. This is a documented, permanent cosmetic issue.


The fix that broke things: the FIXD regex regression

After expanding SECRET_TERMS to 16 patterns (Finding 2 fix), a concern emerged: bare substring matching would over-redact innocent fields. The key primaryKey contains key as a suffix and would be redacted even though it's a database key, not a credential. So a follow-up pass (FIXD) added word-boundary anchors to the regex.

What FIXD did (and why it seemed reasonable)

// Before FIXD — pure substring:
const SECRET_KEY_PATTERN = new RegExp(SECRET_TERMS.join("|"), "i")

// After FIXD — word boundaries:
const SECRET_KEY_PATTERN = new RegExp(
  SECRET_TERMS.map((t) => `\\b(?:${t})\\b`).join("|"),
  "i",
)

This correctly stopped redacting primaryKey, foreignKey, and similar false positives. 37 unit tests were added to cover the matrix. FIXD shipped.

The regression FIXD introduced

In JavaScript regex, \b fires between a \w character ([a-zA-Z0-9_]) and a \W character, or at a string edge. The critical flaw: both lowercase and uppercase letters are \w. There is no \b between the r and A in userApiKey — both are word characters. Word boundaries do not fire at camelCase transitions.

This meant real secret keys started passing through unredacted:

KeyExpectedActual (FIXD)
apiKeyREDACTEDpassed through
privateKeyREDACTEDpassed through
userApiKeyREDACTEDpassed through
clientAccessTokenREDACTEDpassed through

The regression was caught because the test suite was comprehensive enough to include these cases. The 37 new tests from FIXD caught the false-positive problem; the existing tests caught the regression when they failed.

How FIXD2 fixed it

The regex approach was abandoned in favor of a structural key-splitting strategy:

  1. Split the key into lowercase word segments by handling camelCase transitions, underscores, and hyphens: userApiKey → ["user", "api", "key"]
  2. Match against two independent sets with explicit positional rules:
    • SINGLE_SECRET_WORDS: matches when the word is the entire key alone, or appears at any non-leading position (index > 0) — so token matches requestToken (non-leading) but not tokenCount (leading in a multi-word key)
    • COMPOUND_SECRET_TERMS: two-word pairs that match when they're the entire key, or the trailing suffix of a longer key — so api/key matches apiKey and userApiKey but not apiVersion

A subsequent verification pass (FIXD3) ran all 27 known edge cases through the production implementation before touching any source file, confirming the logic was correct. No code change was needed in FIXD3 — the implementation was already right.


What's not fixed — and why

Open Low — by design Middleware timeout gap — no hung-hook protection

If a middleware before or after hook never resolves — for example, it hangs on a network call with no timeout, or deadlocks on a resource — the tool call never returns a result or error. The MCP client will eventually time out on its side, but on the server side the promise chain is stuck. Because the stdio MCP transport is sequential in practice, a single hung hook can block all subsequent tool calls, making the server permanently unresponsive for the session.

Why this wasn't fixed: The right solution is to add an AbortSignal to CLQContext that the framework sets with a per-call deadline. Middleware hooks that respect the signal can abort early; hooks that don't will be abandoned when the signal fires. This is a public API change — adding a field to CLQContext — and requires a proper design pass rather than a one-liner patch. Shipping a half-correct timeout (e.g., Promise.race that abandons the hook but doesn't cancel it) would create a worse situation: the hook keeps running in the background, holding resources, while the client receives a timeout error.

Mitigation in the meantime: Middleware hooks that make any I/O call should implement their own timeouts using AbortController or Promise.race with a fallback. This is standard practice for I/O code; CLQ just doesn't enforce it yet.

Open UX — Medium impact Inspector page reload makes the UI permanently unusable

The inspector reads its session token from the URL on load, then immediately strips it from the address bar using history.replaceState — a correct security behavior that prevents the token from appearing in browser history, referrers, or server logs. But a page reload sends no token, gets a 401, and shows "failed to load: HTTP 401" with no explanation. There is no recovery from this state; the user must close the tab and re-open the original URL from the terminal.

Why the security behavior is correct: The token-stripping is intentional and right. A token in the URL bar is visible in shared screens, browser extensions, browser sync, and referrer headers. The UX problem is the lack of a helpful error state — the page doesn't explain what happened or how to recover. This is a missing error message, not a broken security model.

Why it wasn't fixed in this pass: The audit pass was read-only (no source modifications). A correct fix would add a user-facing error state that says something like: "Token expired from URL — please re-open the link printed by clq inspect." Tracked for a future release.

Open Functional — design tradeoff Redaction false positives on cookie and similar substrings

The SECRET_TERMS list includes cookie and token, which were added to catch HTTP Cookie headers and session tokens — legitimate targets. But because the matching is substring-based, fields like cookieCount, acceptedCookies, and tokenCount are also redacted. The impact is a type mismatch: a numeric value like { tokenCount: 1000 } becomes { tokenCount: "[REDACTED]" } — the field is a string where a number was expected, which can break consumers.

Confirmed false-positive cases: cookieCount, acceptedCookies, cookieBannerVisible, tokenCount, jwtAlgorithm, credentialType.

Why this is a tradeoff, not a bug: The comment in redact.ts explicitly says "Matched as substrings" — this was chosen deliberately for maximum coverage. The alternative is a more precise matching rule that would require more context to get right. The current behavior prioritizes never leaking a real secret over never over-redacting a non-secret field. This tradeoff is explicit and documented, not an accident. Tools that return non-secret fields with these names should rename them to avoid ambiguity.


Full audit record

The complete timestamped record — original reports, each fix session, and the final verification pass — is in qa-report/ on GitHub. The index file (qa-report/README.md) lists every file in chronological order with a one-line description. Each file is the unedited output of the session that produced it — nothing has been cleaned up retroactively.