# CodeDecay Docs Bundle

This is the concatenated Markdown bundle for the CodeDecay docs site.

---

# CodeDecay Docs

Source page: https://SubmuxHQ.github.io/CodeDecay/

Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/index.md

## Read This First

- [Getting Started](/getting-started): install the CLI and run your first PR analysis
- [GitHub Action](/github-action): add CodeDecay to pull request workflows
- [Redteam Reports](/redteam): generate merge-safety reports for yourself or your coding agent
- [Agent Task Bundles](/agent): hand deterministic evidence to Codex, Claude Code, Cursor, Pi, OpenCode, or desktop agents
- [MCP Server](/mcp): expose CodeDecay as a local MCP tool for agent clients

## For Humans

- Use the sidebar and local search to navigate product docs quickly.
- Open [Sample Reports](/sample-reports/) to see the actual Markdown, JSON, and SARIF outputs before integrating CodeDecay.
- Use the GitHub edit links to tighten docs in the same repo that ships the code.

## For Agents

- [`/llms.txt`](/llms.txt): compact map of the docs site
- [`/llms-full.txt`](/llms-full.txt): one bundled Markdown context file
- /markdown/getting-started.md: per-page raw Markdown endpoints for direct retrieval

These endpoints are generated from the same source files as the docs site, so humans and agents read the same content instead of drifting copies.

---

# Getting Started

Source page: https://SubmuxHQ.github.io/CodeDecay/getting-started

Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/getting-started.md

# Getting Started

CodeDecay analyzes pull requests for regression risk and maintainability decay. It works locally and in CI without cloud services, telemetry, API keys, LLMs, or model calls.

## Install

Use the package manager your repository already uses:

```bash
npm install -D @submuxhq/codedecay
pnpm add -D @submuxhq/codedecay
bun add -d @submuxhq/codedecay
yarn add -D @submuxhq/codedecay
```

For a no-install smoke test:

```bash
npx -y @submuxhq/codedecay --help
```

After a local install, run CodeDecay with `npx codedecay`, `pnpm codedecay`, `bunx codedecay`, or add `codedecay` to a package script.

Do not run `npm install` inside a Bun, pnpm, or Yarn workspace that uses `workspace:*` dependencies. npm may fail before CodeDecay is installed. In Bun repos with `minimumReleaseAge`, a fresh CodeDecay release may also be blocked by repo policy; for local evaluation you can override it explicitly:

```bash
bun add -d @submuxhq/codedecay --minimum-release-age 0
```

## Analyze A PR Diff

```bash
npx codedecay analyze --base main --head HEAD --format markdown
```

## Analyze Current Working Tree

```bash
npx codedecay analyze --format markdown
```

## Analyze Another Repository

```bash
npx codedecay analyze --cwd ../my-repo --format markdown
```

## Generate A Redteam Report

Use `redteam` when you want one report for yourself or your coding agent that summarizes what the PR could break, weak-test evidence, missing edge cases, and fix tasks.

```bash
npx codedecay redteam --base main --head HEAD --format markdown
```

The current redteam MVP is report-only. It does not run commands or call an LLM.

## Hand Evidence To Your Agent

Use `agent` when you want Codex, Claude Code, Cursor, a desktop agent, or another user-owned agent to act on CodeDecay's findings.

```bash
npx codedecay agent --base main --head HEAD --format markdown --output codedecay-agent.md
```

Then give `codedecay-agent.md` to your agent and ask it to:

- fix high-risk findings first,
- add tests that exercise real API, UI, database, or downstream behavior,
- cover the missing edge cases listed by CodeDecay,
- run the relevant project checks,
- rerun CodeDecay after changes.


The agent bundle is local evidence plus instructions. CodeDecay does not call Codex, Claude, Cursor, Ollama, cloud models, or CodeDecayCloud while creating it.

## Recommended Local Loop

```bash
npx codedecay analyze --base main --head HEAD --format markdown
npx codedecay redteam --base main --head HEAD --format markdown --output codedecay-redteam.md
npx codedecay agent --base main --head HEAD --format markdown --output codedecay-agent.md
```

Use the redteam report to understand the PR risk. Use the agent bundle to give your own coding agent the evidence, missing checks, and fix tasks it should work through. After the agent changes code, run your project checks and run CodeDecay again.

## Write SARIF

```bash
npx codedecay analyze --format sarif --output codedecay.sarif
```

## Inspect CodeDecay Config

Configuration is optional. Missing config uses safe defaults.

```bash
npx codedecay config --format markdown
```

## Fail CI On High Risk

```bash
npx codedecay analyze --base main --head HEAD --fail-on high
```

Risk levels:

- `0-39`: low
- `40-69`: medium
- `70-100`: high

## Try An Example

Use the example projects to see a realistic high-risk report before wiring CodeDecay into your own repository:

- [Next.js risk demo](https://github.com/SubmuxHQ/CodeDecay/blob/main/examples/nextjs-risk-demo/README.md)
- [Node API risk demo](https://github.com/SubmuxHQ/CodeDecay/blob/main/examples/node-api-risk-demo/README.md)

---

# GitHub Action

Source page: https://SubmuxHQ.github.io/CodeDecay/github-action

Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/github-action.md

# GitHub Action

CodeDecay ships a composite GitHub Action wrapper around the bundled CLI.

```yaml
name: CodeDecay

on:
  pull_request:

jobs:
  codedecay:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: SubmuxHQ/CodeDecay/packages/github-action@v0
        with:
          mode: analyze
          base: ${{ github.event.pull_request.base.sha }}
          head: ${{ github.event.pull_request.head.sha }}
          cwd: .
          format: markdown
          fail-on: high
```

## SARIF Output

```yaml
- uses: SubmuxHQ/CodeDecay/packages/github-action@v0
  with:
    mode: analyze
    base: ${{ github.event.pull_request.base.sha }}
    head: ${{ github.event.pull_request.head.sha }}
    cwd: .
    format: sarif
    output: codedecay.sarif
    fail-on: high
```

Relative `output` paths resolve from `cwd`. For example, with `cwd: packages/web` and `output: codedecay.sarif`, the SARIF file is written to `packages/web/codedecay.sarif`. Absolute `output` paths are honored exactly.

```yaml
- uses: SubmuxHQ/CodeDecay/packages/github-action@v0
  with:
    mode: analyze
    cwd: packages/web
    format: sarif
    output: codedecay.sarif
```

The MVP action writes a markdown summary to `$GITHUB_STEP_SUMMARY`. SARIF upload can be added by the workflow using GitHub's code scanning upload action.

## Redteam And Agent Modes

The action can also run report-only redteam and agent bundle modes. Redteam mode is useful as a Step Summary because it includes impact, memory, edge cases, and fix tasks for a user-owned agent:

```yaml
- uses: SubmuxHQ/CodeDecay/packages/github-action@v0
  with:
    mode: redteam
    base: ${{ github.event.pull_request.base.sha }}
    head: ${{ github.event.pull_request.head.sha }}
    cwd: .
    format: markdown
```

```yaml
- uses: SubmuxHQ/CodeDecay/packages/github-action@v0
  with:
    mode: agent
    base: ${{ github.event.pull_request.base.sha }}
    head: ${{ github.event.pull_request.head.sha }}
    cwd: .
    format: markdown
    output: codedecay-agent.md
```

Supported modes are `analyze`, `redteam`, and `agent`. The action does not expose command-executing modes.

`format: sarif` is supported only with `mode: analyze`.

`fail-on` is forwarded for `analyze` and `redteam`; `agent` mode produces a task bundle for a user-owned coding agent and does not gate the workflow by risk level.

Use `fail-on` with `analyze` when you want a deterministic CI gate. You can also add `fail-on` to `redteam` if your repository wants strict risk-score gating.

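
The `fail-on` gate can be pictured as a small mapping from score to level to exit code. The sketch below is illustrative only: it uses the risk bands documented in Getting Started (`0-39` low, `40-69` medium, `70-100` high) and the documented exit codes `0` and `1`. The names `riskLevel` and `gateExitCode` are hypothetical, and treating "meets `--fail-on`" as "at or above the threshold level" is an assumption, not confirmed CodeDecay behavior.

```typescript
// Hypothetical helper names; only the score bands and exit codes come from the docs.
type RiskLevel = "low" | "medium" | "high";

const LEVEL_ORDER: RiskLevel[] = ["low", "medium", "high"];

// Map a 0-100 score onto the documented bands: 0-39 low, 40-69 medium, 70-100 high.
function riskLevel(score: number): RiskLevel {
  if (!Number.isFinite(score) || score < 0 || score > 100) {
    throw new RangeError(`score must be between 0 and 100, got ${score}`);
  }
  if (score >= 70) return "high";
  if (score >= 40) return "medium";
  return "low";
}

// Assumption: a score "meets" the gate when its level is at or above the threshold.
function gateExitCode(score: number, failOn?: RiskLevel): 0 | 1 {
  if (failOn === undefined) return 0; // no gate configured: report only
  const meets =
    LEVEL_ORDER.indexOf(riskLevel(score)) >= LEVEL_ORDER.indexOf(failOn);
  return meets ? 1 : 0;
}
```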

The CodeDecay repository dogfoods `redteam` report-only so the Step Summary is always available while lint, typecheck, tests, build, package dry-run, and the PR safety efficacy eval remain the hard validation gates.

---

# Configuration

Source page: https://SubmuxHQ.github.io/CodeDecay/configuration

Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/configuration.md

# Configuration

CodeDecay can load repo-local configuration for red-team orchestration, tool adapter plans, and real behavior probes.

Configuration is optional. If no config file exists, CodeDecay uses safe defaults and does not run project commands.

## Supported Files

CodeDecay discovers the first matching file from the analysis working directory:

- `.codedecay/config.yml`
- `.codedecay/config.yaml`
- `codedecay.config.yml`
- `codedecay.config.yaml`

Use `--cwd` to inspect another repository:

```bash
npx codedecay config --cwd ../my-repo --format markdown
```

## Example

```yaml
version: 1
commands:
  test:
    - pnpm test
  build:
    - pnpm build
  start:
    - pnpm dev
probes:
  - name: users api
    command: curl -f http://localhost:3000/api/users
    timeoutMs: 5000
toolAdapters:
  playwright: true
  stryker:
    command: pnpm exec stryker run
  schemathesis:
    schema: docs/openapi.yaml
    baseUrl: http://127.0.0.1:3000
  pact:
    command: pnpm run test:pact
safety:
  commandTimeoutMs: 120000
  allowCommands: false
llm:
  provider: disabled
  timeoutMs: 30000
```

Optional user-owned model providers must be configured explicitly. For a local LiteLLM or other OpenAI-compatible endpoint:

```yaml
llm:
  provider: litellm
  model: gpt-4.1-mini
  endpoint: http://127.0.0.1:4000/v1
  apiKeyEnv: LITELLM_API_KEY
  timeoutMs: 30000
```

Use `apiKeyEnv` to point at an environment variable name. Do not store literal API keys in CodeDecay config.

## Safety Model

Config files make project commands explicit. CodeDecay should not guess commands from model output or run arbitrary commands by default.

Current behavior:

- `codedecay analyze` does not require config.
- `codedecay config` only loads and prints config.
- `codedecay redteam` lists configured tool adapters as planned local checks, but does not run them.
- `codedecay execute` runs only commands and probes from config, and only when `safety.allowCommands` is true.
- `codedecay differential` runs only configured probes on temporary base/head worktrees, and only when `safety.allowCommands` is true.
- missing config returns safe defaults.
- no telemetry, API keys, LLM calls, or cloud services are used.
- LLM use is disabled by default. LLM-backed commands must opt in explicitly and treat model output as untrusted suggestions.

Execution uses this config as its allowlisted command source. See [Execution probes](execution.md) and [Differential behavior checks](differential.md).

Tool adapters are also configured here. See [Tool adapters](tool-adapters.md) for Playwright, StrykerJS, Schemathesis, and Pact adapter details.

Read [LLM providers](llm-providers.md) for optional local/BYOK model adapters.

---

# Redteam Reports

Source page: https://SubmuxHQ.github.io/CodeDecay/redteam

Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/redteam.md

# Redteam Reports

`codedecay redteam` packages local PR safety evidence into a report that a developer or their own coding agent can use before merge.

It asks:

```text
What could this PR break, and are the tests actually proving it will not?
```

The command is report-only in the current MVP. It does not run configured commands, does not call an LLM, does not require API keys, does not send telemetry, and does not depend on CodeDecayCloud.

Use it when you want a local merge-safety brief for Codex, Claude Code, Cursor, desktop agents, or another user-owned agent. CodeDecay provides deterministic tool evidence; the receiving agent still has to inspect the code and prove fixes with tests or configured checks.

## Run

```bash
npx codedecay redteam --base main --head HEAD --format markdown
npx codedecay redteam --cwd ../my-repo --format json
npx codedecay redteam --format markdown --output codedecay-redteam.md
```

Exit codes:

- `0`: report generated and risk is below `--fail-on`, if provided.
- `1`: report generated and risk meets `--fail-on`.
- `2`: CLI/internal error, such as invalid git refs or invalid config.

## What The Report Includes

- changed files and impacted product/system areas
- concrete route/API impacts when CodeDecay can detect them, such as Next.js API routes, Next.js UI routes, Express handlers, or Fastify handlers
- merge-risk and decay-risk scores
- test proof audit status: `missing`, `weak`, `present`, or `not_applicable`
- weak-test and missing-test findings from deterministic test-audit rules
- deterministic missing edge-case checklist
- local memory summary from `.codedecay/memory.json`
- repo-local agent skill summaries from `.agents/skills/*/SKILL.md`
- configured test/build/start/probe commands that are available but not run
- configured Playwright, StrykerJS, Schemathesis, and Pact tool adapters that are planned but not run
- fix tasks for your coding agent
- explicit safety flags showing that commands and models were not called

## Agent-Agnostic Workflow

CodeDecay does not replace Codex, Claude Code, Cursor, Pi, OpenCode, desktop agents, or internal agents. Use it to give those tools better evidence.

Suggested workflow:

1. Run `codedecay redteam --format markdown`.
2. Start with the impacted route/API section and ask what real user/API path reaches each changed file.
3. Paste or attach the report to your coding agent.
4. Ask the agent to fix the high-risk findings and add real checks for the impacted routes, missing edge cases, and weak-test findings.
5. Run `codedecay analyze`, `codedecay execute`, or `codedecay differential` explicitly when you want static analysis, configured checks, or base/head behavior probes.


See [Agent skills](skills.md) for the local skill file format.

## Safety Model

`codedecay redteam` lists configured checks and tool adapter plans from CodeDecay config, but it does not execute them. Command execution remains explicit through `codedecay execute` and `codedecay differential`, and those commands still require `safety.allowCommands: true`.

Model use is also opt-in. The redteam MVP does not call Ollama, LiteLLM, cloud models, or any hosted CodeDecay service.

---

# Agent Task Bundles

Source page: https://SubmuxHQ.github.io/CodeDecay/agent

Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/agent.md

# Agent Task Bundles

`codedecay agent` turns a deterministic redteam report into a task bundle for a user-owned coding agent.

Use it when you want Codex, Claude Code, Cursor, Pi, OpenCode, a desktop agent, or another local agent to fix what CodeDecay found without CodeDecay making a hidden model call.

```bash
npx codedecay agent --base main --head HEAD --format markdown
npx codedecay agent --profile codex --format markdown
npx codedecay agent --cwd ../my-repo --format json --output codedecay-agent.json
```

The bundle includes:

- a copy-paste prompt for any user-owned coding agent
- changed files, impacted areas, and concrete route/API impacts when available
- weak-test and missing-test proof signals
- edge cases to check
- configured checks and tool adapters that are available but not run
- tasks for the coding agent
- repo-local skill summaries
- safety and limitation notes

## Agent Profiles

Profiles only shape the handoff instructions. They do not make CodeDecay call the selected agent, call an LLM, require API keys, or send code anywhere.

Supported profiles:

- `generic`: portable bundle for any user-owned agent.
- `codex`: handoff wording for a Codex repo session.
- `claude-code`: handoff wording for Claude Code.
- `cursor`: handoff wording for Cursor chat or agent mode.
- `pi`: handoff wording for Pi harness or Pi-compatible agent workflows.
- `opencode`: handoff wording for OpenCode.
- `desktop`: handoff wording for desktop or local agent apps.

Example:

```bash
npx codedecay agent --profile cursor --format markdown --output codedecay-agent.md
```

## How To Use

1. Run `codedecay agent`.
2. Copy the prompt from the `Copy-Paste Prompt` section.
3. Give the prompt and Markdown or JSON output to your agent.
4. Ask the agent to start from impacted routes/APIs and explain what real user, API, database, or downstream path could break.
5. Ask the agent to complete the listed tasks with real tests and behavior checks.
6. Run CodeDecay again.

Example prompt style:

```text
Use this CodeDecay agent task bundle as tool evidence. Fix the listed PR risks.
Do not assume the PR is safe because tests pass. Add or improve tests that
exercise real behavior paths. After changes, tell me what checks to run.
```

For JSON consumers, route/API evidence is available under `evidence.impactedRoutes`. Treat it as tool evidence for the agent's fix plan: the agent should map each proposed fix back to the changed file, route/API, weak test signal, and missing edge case it addresses.

## Safety

`codedecay agent` is report-only. It does not:

- call an LLM or hosted model
- execute commands
- send telemetry
- require API keys
- depend on CodeDecayCloud

Agent output is not trusted evidence by itself. Treat the agent's response as a proposal until it is verified by tests, configured checks, or manual review.

---

# MCP Server

Source page: https://SubmuxHQ.github.io/CodeDecay/mcp

Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/mcp.md

# MCP Server

CodeDecay can run as a local Model Context Protocol server so agent clients can ask it for PR risk, impact maps, weak-test audits, and deterministic edge-case suggestions. It can also run explicitly configured local checks when the caller confirms execution.

The MCP server calls local CodeDecay analysis only.
It does not call an LLM, does not require API keys, and does not send telemetry. Command execution is opt-in and limited to commands already present in CodeDecay config.

## Run Locally

```bash
npx @submuxhq/codedecay mcp --cwd /path/to/repo
```

## Example MCP Client Config

Exact config shape varies by client. The important part is that the command runs CodeDecay locally and passes the repository path with `--cwd`.

```json
{
  "mcpServers": {
    "codedecay": {
      "command": "npx",
      "args": ["-y", "@submuxhq/codedecay", "mcp", "--cwd", "/path/to/repo"]
    }
  }
}
```

## Tools

- `analyze_pr`: returns a Markdown or JSON CodeDecay report.
- `impact_map`: returns changed files, impacted areas, and concrete route/API impacts when CodeDecay can detect them.
- `audit_tests`: returns missing-test and weak-test proof findings plus recommended checks.
- `suggest_edge_cases`: returns deterministic edge-case suggestions.
- `redteam_report`: returns a deterministic merge-safety report for your agent, including impacted areas, weak-test findings, edge cases, configured checks, memory summary, fix tasks, and safety flags.
- `agent_task_bundle`: returns a deterministic task bundle that Codex, Claude Code, Cursor, Pi, OpenCode, desktop agents, or other MCP-compatible agents can use to fix PR risks. It packages a copy-paste prompt, tool evidence, weak-test signals, edge cases, suggested checks, skills, and fix tasks. It accepts an optional `profile` value: `generic`, `codex`, `claude-code`, `cursor`, `pi`, `opencode`, or `desktop`.
- `execute_configured_checks`: runs configured CodeDecay commands, probes, and enabled tool adapters. It requires `confirmExecution: true` and `safety.allowCommands: true`.

Example execution tool input:

```json
{
  "confirmExecution": true,
  "format": "markdown"
}
```

## Safety

MCP clients should treat tool output as analysis, not as permission to execute commands. The MCP server does not expose arbitrary command execution.

`redteam_report` is report-only.
It does not run configured commands, call Ollama or cloud models, send telemetry, or require CodeDecayCloud. It may include local skill summaries from `.agents/skills/*/SKILL.md`, but it does not execute skill content.

`agent_task_bundle` is also report-only. It uses the same deterministic CodeDecay evidence as `codedecay agent`, and it does not call the MCP client, Codex, Claude, Cursor, Ollama, cloud models, or CodeDecayCloud. The receiving agent should treat the bundle as tool evidence plus instructions. The included prompt is portable across Codex, Claude Code, Cursor, Pi, OpenCode, desktop agents, and other MCP clients. The optional `profile` only changes handoff wording; it does not call or authenticate with that agent. Any proposed fix still needs verification with tests or configured checks.

`execute_configured_checks` is the only MCP tool that can execute local commands. It never accepts command text from MCP input. It can only run commands from `.codedecay/config.yml`, `codedecay.config.yml`, or enabled configured tool adapters such as Playwright, StrykerJS, Schemathesis, and Pact.

Execution requires both:

- MCP input contains `confirmExecution: true`
- CodeDecay config contains `safety.allowCommands: true`

If confirmation is missing, CodeDecay returns a non-executing report. If `safety.allowCommands` is false, configured checks use the existing skip behavior and do not run.

---

# CodeDecay GitHub App

Source page: https://SubmuxHQ.github.io/CodeDecay/github-app

Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/github-app.md

# CodeDecay GitHub App

The CodeDecay GitHub App is an optional hosted surface for teams that want CodeDecay to run automatically on pull requests.

The app does not replace the CLI or GitHub Action. The open-source CLI remains local-first and useful without the hosted app.

## What the app does

For pull request events, the app:

1. receives a GitHub webhook,
2. creates an in-progress CodeDecay check run,
3. checks out the pull request into a temporary directory,
4. runs deterministic CodeDecay analysis,
5. posts or updates one PR comment,
6. completes the check run.

For the first hosted version, the app only runs deterministic PR analysis. It does not run project commands, deployment commands, LLM calls, model calls, or CodeDecayCloud services.

## GitHub App settings

Create a GitHub App in the GitHub organization that will own the hosted app.

Set the webhook URL to:

```text
https:///github/webhooks
```

Subscribe to these events:

- Pull request

Use these repository permissions:

- Metadata: read-only
- Contents: read-only
- Pull requests: read-only
- Issues: read and write
- Checks: read and write

The Issues permission is required because GitHub PR comments use the Issues API.

## Render deployment

Create a Render Web Service connected to this repository.

Build command:

```bash
pnpm install --frozen-lockfile && pnpm --filter @submuxhq/codedecay-github-app build
```

Start command:

```bash
pnpm --filter @submuxhq/codedecay-github-app start
```

Required environment variables:

```text
GITHUB_APP_ID=
GITHUB_PRIVATE_KEY=
GITHUB_WEBHOOK_SECRET=
NODE_ENV=production
```

Optional environment variables:

```text
PORT=3000
GITHUB_WEBHOOK_PATH=/github/webhooks
```

If the private key is stored with escaped newlines, the service converts `\n` back to PEM newlines at startup.

## First staging test

Before installing the app broadly:

1. install the GitHub App on a test repository,
2. open a harmless documentation-only PR,
3. confirm the CodeDecay check run appears,
4. confirm one PR comment is created,
5. push another commit to the PR,
6. confirm the existing CodeDecay comment is updated instead of duplicated.

Do not enable branch protection around the app until the staging PR behavior is verified.

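
For reference when preparing `GITHUB_PRIVATE_KEY`, the escaped-newline conversion mentioned under Render deployment can be sketched as a one-line string replacement. This is a minimal illustration, not the hosted app's actual code; the function name `normalizePrivateKey` is hypothetical.

```typescript
// Hypothetical helper: turn literal backslash-n sequences (as often stored in
// single-line environment variables) back into real newline characters.
function normalizePrivateKey(raw: string): string {
  return raw.replace(/\\n/g, "\n");
}
```

A key that already contains real newlines passes through unchanged, so the conversion is safe to apply unconditionally.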
## Safety boundary

The hosted app intentionally has a narrow v0 boundary:

- no telemetry,
- no LLM or model calls,
- no arbitrary command execution,
- no project test/start/deploy command execution,
- no persisted repository checkout,
- temporary checkout directories are removed after analysis.

Future hosted execution or red-team behavior should require a separate design and sandboxing review.

---

# Execution Probes

Source page: https://SubmuxHQ.github.io/CodeDecay/execution

Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/execution.md

# Execution Probes

CodeDecay can run explicitly configured project commands, behavior probes, and tool adapters with `codedecay execute`.

Execution is opt-in. By default, CodeDecay does not run project commands. A repo must set `safety.allowCommands: true` in CodeDecay config before commands, probes, or tool adapters execute.

## Run

```bash
npx codedecay execute --format markdown
npx codedecay execute --cwd ../my-repo --format json
npx codedecay execute --cwd ../my-repo --format json --output codedecay-execute.json
```

Exit codes:

- `0`: all configured commands passed, or all commands were safely skipped.
- `1`: one or more configured commands failed, timed out, or errored.
- `2`: CLI/internal error, such as an invalid config file.

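
The exit-code rules above can be read as a simple aggregation over per-command results. The sketch below is an assumption-laden illustration: the status names and the `executeExitCode` function are hypothetical, and a mix of passed and skipped commands is assumed to count as success. Exit code `2` covers CLI/internal errors that occur before any command result exists, so it is outside this function.

```typescript
// Hypothetical status names for one configured command, probe, or adapter run.
type CheckStatus = "passed" | "skipped" | "failed" | "timed_out" | "error";

// 1 when any check failed, timed out, or errored; otherwise 0.
function executeExitCode(statuses: CheckStatus[]): 0 | 1 {
  const anyBad = statuses.some(
    (status) => status === "failed" || status === "timed_out" || status === "error"
  );
  return anyBad ? 1 : 0;
}
```

With `safety.allowCommands` left at `false`, every check would be reported as skipped, which this sketch maps to exit code `0`.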
## Config

```yaml
version: 1
commands:
  test:
    - pnpm test
  build:
    - pnpm build
  start:
    - pnpm dev
probes:
  - name: users api
    command: curl -f http://localhost:3000/api/users
    timeoutMs: 5000
toolAdapters:
  playwright:
    command: pnpm exec playwright test
  stryker:
    command: pnpm exec stryker run
  schemathesis:
    schema: docs/openapi.yaml
    baseUrl: http://127.0.0.1:3000
  pact:
    command: pnpm run test:pact
safety:
  commandTimeoutMs: 120000
  allowCommands: true
```

CodeDecay supports these configured command groups:

- `commands.test`
- `commands.build`
- `commands.start`
- `probes`
- `toolAdapters.playwright`
- `toolAdapters.stryker`
- `toolAdapters.schemathesis`
- `toolAdapters.pact`

Each command runs from the configured `--cwd` directory. Probe-level `timeoutMs` overrides the global `safety.commandTimeoutMs`. Tool adapters use their own configured command and timeout, then return normalized tool evidence separately from raw command/probe results.

## Safety Rules

- CodeDecay only runs commands from CodeDecay config.
- CodeDecay does not run commands suggested by LLMs, MCP clients, memory files, or remote services.
- Command execution is disabled unless `safety.allowCommands` is true.
- Command output is captured locally in the execution report.
- Tool adapter evidence is reported separately from AI suggestions.
- No telemetry, API keys, cloud services, LLMs, or model calls are required.

`commands.start` should use a short-lived smoke command or a low timeout unless you intentionally want CodeDecay to verify that a long-running service starts and then times out.

---

# Differential Behavior Checks

Source page: https://SubmuxHQ.github.io/CodeDecay/differential

Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/differential.md

# Differential Behavior Checks

`codedecay differential` compares configured probe behavior between two git refs.
It creates temporary worktrees for `--base` and `--head`, runs the same configured probes in both worktrees, reports behavior differences, and removes the worktrees afterward.

Differential checks are useful when a PR looks locally tested but may change a real behavior path outside the touched files.

## Run

```bash
npx codedecay differential --base main --head HEAD --format markdown
npx codedecay differential --cwd ../my-repo --base origin/main --head HEAD --format json
npx codedecay differential --base main --head HEAD --output codedecay-differential.md
```

`--base` and `--head` are required.

Exit codes:

- `0`: configured probes behaved the same, or probes were safely skipped.
- `1`: probe behavior changed, timed out, or hit an execution error.
- `2`: CLI/internal error, such as missing refs or invalid config.

## What It Compares

CodeDecay compares each configured probe by:

- command status
- exit code
- JSON stdout when stdout is valid JSON
- text stdout when stdout is not JSON
- stderr

The report includes base/head status, exit codes, output snippets for changed or failed probes, and the exact differences detected.

## Config

Differential checks use probes from the current repo config:

```yaml
version: 1
commands: {}
probes:
  - name: users api
    command: node scripts/check-users-api.js
    timeoutMs: 5000
safety:
  commandTimeoutMs: 120000
  allowCommands: true
```

Only `probes` are used by `codedecay differential`. Test, build, and start commands are handled by `codedecay execute`.

## Safety Model

- Probes must come from CodeDecay config.
- `safety.allowCommands` must be true or probes are skipped.
- Probes run in temporary git worktrees, not by mutating the current checkout.
- Worktrees are removed after the run.
- CodeDecay does not run commands from LLMs, memory files, MCP clients, or remote services.
- No telemetry, API keys, cloud services, LLMs, or model calls are required.

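
To make the comparison under "What It Compares" concrete, here is a minimal sketch of a per-probe diff. It is not CodeDecay's implementation: the `ProbeResult` shape, the field names, and the difference labels are hypothetical. Comparing JSON stdout in a key-order-insensitive way is also an assumption of this sketch; the docs only say that stdout is compared as JSON when it parses as JSON and as text otherwise.

```typescript
// Hypothetical result shape for one probe run in one worktree.
interface ProbeResult {
  status: string; // e.g. "passed" or "failed" (illustrative values)
  exitCode: number | null;
  stdout: string;
  stderr: string;
}

type Parsed = { ok: true; value: unknown } | { ok: false };

// Try to parse stdout as JSON; fall back to text comparison when it is not JSON.
function parseJson(text: string): Parsed {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch {
    return { ok: false };
  }
}

// Serialize with sorted object keys so key order alone is not a difference.
function canonical(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(canonical).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonical(obj[key])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Return the list of fields that differ between the base and head runs.
function diffProbe(base: ProbeResult, head: ProbeResult): string[] {
  const differences: string[] = [];
  if (base.status !== head.status) differences.push("status");
  if (base.exitCode !== head.exitCode) differences.push("exitCode");

  const baseJson = parseJson(base.stdout);
  const headJson = parseJson(head.stdout);
  if (baseJson.ok && headJson.ok) {
    if (canonical(baseJson.value) !== canonical(headJson.value)) {
      differences.push("stdout (json)");
    }
  } else if (base.stdout !== head.stdout) {
    differences.push("stdout (text)");
  }

  if (base.stderr !== head.stderr) differences.push("stderr");
  return differences;
}
```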

---

# Test Proof Audit

Source page: https://SubmuxHQ.github.io/CodeDecay/test-audit

Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/test-audit.md

# Test Proof Audit

CodeDecay summarizes deterministic test signals into a test proof audit.

The audit asks:

```text
Are the changed tests actually proving the changed behavior will not break?
```

The first implementation is deterministic and uses existing analyzer findings. It does not run mutation testing, execute commands, call models, or use cloud services.

## Statuses

- `missing`: changed source behavior does not have nearby changed test proof.
- `weak`: changed tests exist, but deterministic rules found weak proof signals.
- `present`: changed tests are present and no deterministic weak-test signals were found.
- `not_applicable`: no changed source or test files require a test proof audit.

## Current Signals

The audit consumes existing analyzer findings, including:

- `missing-nearby-tests`
- `test-without-assertions`
- `snapshot-only-test`
- `mocked-changed-source`
- `unrelated-test-change`
- `copied-implementation-in-test`
- `happy-path-only-test`
- `heavy-mocking`
- `test-bloat`

## Future OSS Adapters

Future adapters such as StrykerJS can add stronger mutation-testing evidence to this audit. They should remain explicit, local-first, and opt-in.

---

# First PR Safety Efficacy Benchmark

Source page: https://SubmuxHQ.github.io/CodeDecay/evals/first-efficacy-report

Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/evals/first-efficacy-report.md

# First PR Safety Efficacy Benchmark

This benchmark is a small, deterministic proof that CodeDecay can catch seeded PR risks that ordinary passing tests miss.

It is not a claim that CodeDecay makes every PR safe. It is a regression harness for the product promise: find what a coding agent may have missed before merge.

## How to run

```bash
pnpm eval:pr-safety -- --run-id local-pr-safety-eval
```

Artifacts are written under `.codedecay/local/evals//`.

## Current benchmark result

- Status: passed
- Scenarios: 2
- Issues: 0

## Scenarios

### API/auth regression hidden by copied implementation tests

A coding agent can add tests that mirror the changed implementation while missing the real API authorization regression.

| Signal | Result |
| --- | --- |
| Scenario status | passed |
| Baseline tests | exit 0 |
| Baseline behavior probe | exit 0 |
| Risky weak tests | exit 0 |
| Risky behavior probe | exit 1 |
| CodeDecay risk | high (100/100 merge, 0/100 decay) |
| Test proof status | weak |
| Weak-test findings | 2 |
| Missing-test findings | 0 |

Expected evidence:

- Pass: baseline tests pass
- Pass: baseline behavior probe passes
- Pass: risky weak tests still pass
- Pass: risky behavior probe catches regression
- Pass: CodeDecay reports high risk
- Pass: CodeDecay reports expected impacted areas
- Pass: CodeDecay reports expected finding rules
- Pass: Redteam report classifies test proof correctly
- Pass: Redteam report contains expected weak-test evidence
- Pass: Redteam report contains expected missing-test evidence
- Pass: Redteam report suggests edge cases
- Pass: Redteam edge cases are actionable
- Pass: Redteam report creates fix tasks
- Pass: Redteam fix tasks are actionable

### Config/database runtime regression missed by normal tests

A PR can pass a narrow unit test while changing runtime defaults and database semantics that affect production behavior.


| Signal | Result |
| --- | --- |
| Scenario status | passed |
| Baseline tests | exit 0 |
| Baseline behavior probe | exit 0 |
| Risky weak tests | exit 0 |
| Risky behavior probe | exit 1 |
| CodeDecay risk | high (76/100 merge, 0/100 decay) |
| Test proof status | missing |
| Weak-test findings | 0 |
| Missing-test findings | 1 |

Expected evidence:

- Pass: baseline tests pass
- Pass: baseline behavior probe passes
- Pass: risky weak tests still pass
- Pass: risky behavior probe catches regression
- Pass: CodeDecay reports high risk
- Pass: CodeDecay reports expected impacted areas
- Pass: CodeDecay reports expected finding rules
- Pass: Redteam report classifies test proof correctly
- Pass: Redteam report contains expected weak-test evidence
- Pass: Redteam report contains expected missing-test evidence
- Pass: Redteam report suggests edge cases
- Pass: Redteam edge cases are actionable
- Pass: Redteam report creates fix tasks
- Pass: Redteam fix tasks are actionable

## Safety boundaries

- No telemetry.
- No cloud dependency.
- No API keys.
- No LLM/model calls.
- Fixtures run inside local temporary git repositories.

The benchmark uses deterministic CodeDecay reports plus explicit behavior probes. AI or agent suggestions should be evaluated separately from this tool evidence.

---

# Tool Adapters

Source page: https://SubmuxHQ.github.io/CodeDecay/tool-adapters

Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/tool-adapters.md

# Tool Adapters

CodeDecay should use existing open-source tools instead of rebuilding their capabilities. Tool adapters normalize local tool execution into CodeDecay harness evidence.

The first adapters are:

- Playwright for browser/user-flow checks.
- StrykerJS for mutation-testing evidence.
- Schemathesis for OpenAPI/GraphQL API fuzzing evidence.
- Pact for contract-testing evidence.

## Configuring Adapters

Adapters are configured in CodeDecay config. `codedecay redteam` lists adapter plans but does not run them.

```yaml
version: 1
toolAdapters:
  playwright: true
  stryker:
    command: pnpm exec stryker run
  schemathesis:
    schema: docs/openapi.yaml
    baseUrl: http://127.0.0.1:3000
  pact:
    command: pnpm run test:pact
safety:
  allowCommands: false
```

Set `safety.allowCommands: true` only for explicit execution commands. Redteam reports remain report-only even when adapter plans are configured.

## Playwright Harness

The Playwright harness is a private internal package API for now:

```ts
createPlaywrightHarness({
  command: "pnpm exec playwright test",
  allowCommands: true
});
```

Safety defaults:

- command execution is disabled unless `allowCommands: true` is provided,
- commands go through `@submuxhq/codedecay-execution`,
- unsafe commands are blocked by the shared safety policy,
- Playwright is not installed by CodeDecay,
- browsers are not installed by CodeDecay,
- no telemetry, LLM calls, API keys, or CodeDecayCloud dependency are used.

The default command is:

```bash
pnpm exec playwright test
```

Projects can override the command when they already have their own Playwright script, shard, config file, or browser setup.

## StrykerJS Harness

The StrykerJS harness is also a private internal package API for now:

```ts
createStrykerHarness({
  command: "pnpm exec stryker run",
  allowCommands: true
});
```

Safety defaults:

- command execution is disabled unless `allowCommands: true` is provided,
- commands go through `@submuxhq/codedecay-execution`,
- unsafe commands are blocked by the shared safety policy,
- StrykerJS is not installed by CodeDecay,
- no telemetry, LLM calls, API keys, or CodeDecayCloud dependency are used.

The default command is:

```bash
pnpm exec stryker run
```

Projects can override the command when they already have their own Stryker script, mutation score threshold, or package manager setup.
## Schemathesis Harness

The Schemathesis harness is also a private internal package API for now:

```ts
createSchemathesisHarness({
  schema: "openapi.yaml",
  baseUrl: "http://127.0.0.1:3000",
  allowCommands: true
});
```

Safety defaults:

- command execution is disabled unless `allowCommands: true` is provided,
- commands go through `@submuxhq/codedecay-execution`,
- unsafe commands are blocked by the shared safety policy,
- Schemathesis is not installed by CodeDecay,
- API servers are not started by CodeDecay,
- no telemetry, LLM calls, API keys, or CodeDecayCloud dependency are used.

The default command is:

```bash
st run openapi.yaml --url http://127.0.0.1:3000
```

Projects can override the full command when they already use a different Schemathesis entry point, package manager, schema location, base URL, or service startup flow:

```ts
createSchemathesisHarness({
  command: "uvx schemathesis run docs/openapi.yaml --url http://127.0.0.1:4000",
  allowCommands: true
});
```

## Pact Harness

The Pact harness is also a private internal package API for now:

```ts
createPactHarness({
  command: "pnpm run test:pact",
  allowCommands: true
});
```

Safety defaults:

- command execution is disabled unless `allowCommands: true` is provided,
- commands go through `@submuxhq/codedecay-execution`,
- unsafe commands are blocked by the shared safety policy,
- Pact is not installed by CodeDecay,
- Pact Broker or PactFlow are not required by CodeDecay,
- no telemetry, LLM calls, API keys, or CodeDecayCloud dependency are used.

The default command is:

```bash
pnpm run test:pact
```

Projects can override the command when they already have their own Pact consumer/provider test script, local pact file setup, or broker-backed CI flow.

## Future Adapters

The same package can add adapters for coverage tools and test runners. Each adapter should use safe configured execution and return evidence rather than bypassing CodeDecay safety rules.
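The shared safety default across these harnesses (a default command, an optional override, and no execution unless `allowCommands: true`) can be illustrated with a small sketch. This is not the internal package API: `AdapterOptions`, `AdapterPlan`, and `planAdapter` are hypothetical names introduced only for illustration, and the sketch only decides whether a command would run; it never executes anything.

```ts
// Hypothetical types for illustration; not part of any CodeDecay package.
interface AdapterOptions {
  command?: string;
  allowCommands?: boolean;
}

interface AdapterPlan {
  adapter: string;
  command: string;
  willExecute: boolean;
  reason: string;
}

// Resolve the command (override or default) and apply the opt-in gate.
function planAdapter(
  adapter: string,
  defaultCommand: string,
  options: AdapterOptions = {}
): AdapterPlan {
  const command = options.command ?? defaultCommand;
  // Execution requires an explicit `true`; a missing flag means "disabled".
  const willExecute = options.allowCommands === true;
  return {
    adapter,
    command,
    willExecute,
    reason: willExecute
      ? "allowCommands is true"
      : "command execution is disabled unless allowCommands is true",
  };
}

const reportOnly = planAdapter("stryker", "pnpm exec stryker run");
const optedIn = planAdapter("pact", "pnpm run test:pact", {
  command: "pnpm run test:pact:ci", // hypothetical project script
  allowCommands: true,
});
console.log(reportOnly.willExecute, optedIn.command);
```

Keeping the gate in one place means a new adapter inherits the disabled-by-default behavior instead of reimplementing it.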
--- # Agent Skills Source page: https://SubmuxHQ.github.io/CodeDecay/skills Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/skills.md # Agent Skills CodeDecay can load repo-local agent skills from: ```text .agents/skills/*/SKILL.md ``` Skills are portable review instructions for the developer or their own agent. They can help Codex, Claude Code, Cursor, MCP clients, desktop agents, or internal company agents ask better PR-safety questions. CodeDecay treats skill files as local, untrusted context: - it does not execute skill content, - it does not follow arbitrary links from skill files, - it does not fetch external skills, - it does not call an LLM, - it does not send telemetry. ## Example ```text .agents/skills/pr-red-team/SKILL.md .agents/skills/test-quality-review/SKILL.md ``` Each skill should start with a Markdown title and a short first paragraph: ```markdown # PR Red-Team Skill Find what a coding agent may have missed before merge. ``` `codedecay redteam` includes a compact `Agent Skills` section with the skill title, path, and summary. Full skill content stays in the repo-local skill file for the user's agent to read when needed. ## Current Scope The first loader only reads `.agents/skills/*/SKILL.md` from the analyzed repo. Future adapters can map the same concept to other local or user-owned skill systems, but the OSS default remains local-first. --- # Local Repo Memory Source page: https://SubmuxHQ.github.io/CodeDecay/memory Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/memory.md # Local Repo Memory CodeDecay can read repo-local memory from `.codedecay/memory.json` and use it to enrich PR risk reports with project-specific flows, commands, invariants, architecture notes, and past regressions. Memory is optional. If no memory file exists, CodeDecay uses empty defaults. The memory file is local to the repository, is never uploaded by CodeDecay, and does not require telemetry, API keys, LLMs, model calls, or a hosted service. 
## Inspect Memory

```bash
npx codedecay memory --format markdown
npx codedecay memory --cwd ../my-repo --format json
```

`codedecay analyze` automatically applies memory when `.codedecay/memory.json` exists in the analyzed repository.

## File Format

```json
{
  "version": 1,
  "flows": [
    {
      "name": "Checkout",
      "description": "Customer checkout from cart to payment confirmation.",
      "areas": ["api", "ui"],
      "checks": [
        "failed card retry",
        "missing shipping address",
        "duplicate webhook delivery"
      ]
    }
  ],
  "commands": [
    {
      "name": "Checkout smoke tests",
      "command": "pnpm test checkout",
      "areas": ["api", "ui"]
    }
  ],
  "invariants": [
    {
      "name": "Auth fails closed",
      "description": "Missing or invalid users must not become admins.",
      "areas": ["auth"],
      "severity": "high"
    }
  ],
  "architecture": [
    {
      "title": "Session boundary",
      "note": "Session parsing feeds all API routes.",
      "files": ["src/auth/*"]
    }
  ],
  "regressions": [
    {
      "title": "Anonymous admin fallback",
      "description": "A previous fallback user path granted admin access.",
      "areas": ["auth"],
      "check": "request protected routes without a token",
      "severity": "high"
    }
  ]
}
```

All top-level arrays are optional. Unknown fields are ignored by v1.

## Matchers

Memory entries can match changed code by impacted area, file path, or both.

Supported `areas` values:

- `api`
- `ui`
- `database`
- `auth`
- `config`
- `test`
- `source`
- `docs`

Supported `files` values are simple path patterns:

- exact path: `src/auth/session.ts`
- contains match: `auth`
- wildcard match: `src/auth/*`

## Report Behavior

When memory matches a PR, CodeDecay may add:

- findings for impacted invariants
- findings for past regression areas
- findings for matching architecture notes
- recommended checks for flows
- recommended commands from the memory file

CodeDecay does not run memory commands automatically. They are reported as project-specific checks for the user or future execution adapters.
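To make the three `files` pattern forms from the Matchers section concrete, here is a minimal sketch of how such a matcher could behave. It is an illustration under stated assumptions, not CodeDecay's matcher: `matchesFilePattern` and `escapeRegExp` are hypothetical helpers, and the sketch assumes `*` matches any run of characters, including `/`, which the docs do not specify.

```ts
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical matcher for the documented `files` pattern forms.
function matchesFilePattern(pattern: string, changedPath: string): boolean {
  if (pattern.includes("*")) {
    // Wildcard match: assume `*` stands for any characters (assumption).
    const source = pattern.split("*").map(escapeRegExp).join(".*");
    return new RegExp(`^${source}$`).test(changedPath);
  }
  // Contains match; an exact path is the special case where the
  // pattern equals the whole changed path.
  return changedPath.includes(pattern);
}

const changed = "src/auth/session.ts";
console.log(matchesFilePattern("src/auth/session.ts", changed)); // exact path
console.log(matchesFilePattern("auth", changed)); // contains match
console.log(matchesFilePattern("src/auth/*", changed)); // wildcard match
```

A contains match is deliberately broad, so short patterns such as `auth` will also match paths like `docs/authoring.md`; prefer a wildcard or exact path when that is not intended.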
## Future Adapters The v1 memory provider is the local `.codedecay/memory.json` file. CodeDecay formalizes this behind a `MemoryProvider` interface so future adapters can map the same provider shape to open-source or user-owned memory systems such as Mem0 or Supermemory, while preserving the local-first default. Any future hosted or external memory adapter should be opt-in, never required for `codedecay analyze`, and must not change deterministic baseline scoring. The built-in provider is: ```text id: local name: Local .codedecay memory kind: local ``` External providers are not enabled by default. They must not add telemetry, hidden network calls, API key requirements, LLM calls, or CodeDecayCloud dependencies to the OSS workflow. --- # LLM Providers Source page: https://SubmuxHQ.github.io/CodeDecay/llm-providers Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/llm-providers.md # LLM Providers CodeDecay is deterministic by default. The default configuration does not call an LLM, does not require API keys, and does not use a hosted CodeDecay model. Future or opt-in red-team commands can use user-owned providers for edge-case reasoning. Model output must be treated as untrusted suggestions, not commands to execute. ## Disabled By Default ```yaml llm: provider: disabled timeoutMs: 30000 ``` This is the default when no config file exists. ## Local Ollama Ollama support is designed for local models running on the user's machine. ```yaml llm: provider: ollama model: qwen2.5-coder endpoint: http://127.0.0.1:11434 timeoutMs: 30000 ``` CodeDecay should only call this provider from commands that explicitly opt into LLM assistance. The current deterministic `codedecay analyze` command does not call an LLM. ## LiteLLM / OpenAI-Compatible BYOK CodeDecay can construct a LiteLLM/OpenAI-compatible provider for local or BYOK setups. It does not default to a hosted endpoint; you must provide the endpoint and model explicitly. 
```yaml llm: provider: litellm model: gpt-4.1-mini endpoint: http://127.0.0.1:4000/v1 apiKeyEnv: LITELLM_API_KEY timeoutMs: 30000 ``` `apiKeyEnv` is the name of an environment variable. Do not put literal API keys in config files. The provider uses an OpenAI-compatible `/chat/completions` request. Responses are parsed into untrusted suggestions when possible. CodeDecay must not execute commands from model output. ## Future Providers The provider interface leaves room for additional adapters later. Those adapters should remain optional and must not change the default local-first behavior. --- # Sample CodeDecay Reports Source page: https://SubmuxHQ.github.io/CodeDecay/sample-reports/ Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/sample-reports/index.md # Sample CodeDecay Reports These reports show the output CodeDecay produces for a realistic JavaScript and TypeScript pull request. The sample diff changes: - a UI route: `app/dashboard/page.tsx` - an API route: `src/api/users.ts` - an auth/session file: `src/auth/session.ts` - a Prisma schema file: `prisma/schema.prisma` - a build/runtime config file: `vite.config.ts` The reports were generated with the local CodeDecay CLI from a temporary git repository containing those changes. ## What To Read First Start with the Markdown report: - [sample-report.md](/sample-reports/sample-report) Look at these sections first: - **Overall risk**: the high-level merge risk and decay scores. - **Likely Impacted Areas**: the app surfaces CodeDecay thinks may be affected. - **Likely Impacted Routes And APIs**: concrete user/API paths to verify when framework-aware route evidence is available. - **High Risk Findings**: the findings most likely to need reviewer attention. - **Recommended Checks**: tests or manual checks to run before merge. Prefer checks that exercise the real route, API, UI, database, or downstream path instead of only helper-level behavior. 
For automation and integrations: - sample-report.json is the stable machine-readable report. - sample-report.sarif is the code-scanning-oriented report. CodeDecay generated these reports locally without telemetry, API keys, LLMs, or model calls. --- # Sample CodeDecay Markdown Report Source page: https://SubmuxHQ.github.io/CodeDecay/sample-reports/sample-report Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/sample-reports/sample-report.md ## CodeDecay Report **Overall risk:** High | Score | Value | | --- | ---: | | Merge risk | 100/100 | | Decay risk | 62/100 | | Findings | Count | | --- | ---: | | High | 5 | | Medium | 4 | | Low | 0 | ### Changed Files - `app/dashboard/page.tsx` modified (+1/-1) - `prisma/schema.prisma` modified (+3/-1) - `src/api/users.ts` modified (+5/-1) - `src/auth/session.ts` modified (+6/-1) - `vite.config.ts` modified (+4/-1) ### Likely Impacted Areas - High **API surface** (api): `src/api/users.ts` - High **Authentication and authorization** (auth): `src/auth/session.ts` - High **Database and schema** (database): `prisma/schema.prisma` - Medium **Build and runtime configuration** (config): `vite.config.ts` - Medium **UI route** (ui): `app/dashboard/page.tsx` ### Likely Impacted Routes And APIs - Medium `/dashboard` (Next.js UI route): `app/dashboard/page.tsx` ### High Risk Findings - **Risky source changes without changed tests** (`app/dashboard/page.tsx:2`): This PR changes risky source areas but does not change any obvious test files. - **Api area changed** (`src/api/users.ts:1`): src/api/users.ts touches a api area and should be reviewed for regression impact. - **Auth area changed** (`src/auth/session.ts:2`): src/auth/session.ts touches a auth area and should be reviewed for regression impact. - **Database area changed** (`prisma/schema.prisma:2`): prisma/schema.prisma touches a database area and should be reviewed for regression impact. 
- **Potential silent failure path** (`src/auth/session.ts:5`): src/auth/session.ts adds code that can hide type, lint, or runtime failures. ### Medium Risk Findings - **Broad unrelated change set**: This PR changes 5 files across 4 top-level areas and 5 risk categories. - **Config area changed** (`vite.config.ts:1`): vite.config.ts touches a config area and should be reviewed for regression impact. - **Ui area changed** (`app/dashboard/page.tsx:2`): app/dashboard/page.tsx touches a ui area and should be reviewed for regression impact. - **New unchecked TypeScript escape hatch** (`src/api/users.ts:1`): src/api/users.ts adds code that can hide type, lint, or runtime failures. ### Recommended Checks - `Add or run tests covering app/dashboard/page.tsx` - `Add or run tests covering src/api/users.ts` - `Add or run tests covering src/auth/session.ts` - `Add or run tests covering vite.config.ts` ### Notes CodeDecay is deterministic and local-first. This report was generated without telemetry, API keys, LLMs, or model calls. --- # Scoring Model Source page: https://SubmuxHQ.github.io/CodeDecay/scoring Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/scoring.md # Scoring Model CodeDecay produces two scores from 0 to 100. ## Merge Risk Merge risk estimates how likely the PR is to break behavior that reviewers or CI should care about before merge. Signals include: - API route changes - auth/session/security changes - database/schema changes - config/build/deployment changes - risky source changes without nearby test changes - heavy mocking that may weaken regression confidence ## Decay Score Decay score estimates whether the PR makes the codebase harder to maintain. 
Signals include:

- duplicated added logic
- large changed functions
- high function complexity
- compiler or linter suppressions
- unchecked TypeScript escape hatches
- broad unrelated change sets
- large test changes weakly connected to source changes

## Thresholds

- `0-39`: low
- `40-69`: medium
- `70-100`: high

Scores are capped by the highest relevant finding severity. A report with only low-severity merge-risk findings stays low, even if many low findings are present. A report with only medium-severity merge-risk findings stays at most medium. High risk requires high-severity evidence.

The v1 scoring model is deterministic. The same diff should produce the same score.

## No LLM Required

CodeDecay does not call a model to decide risk. It uses git diff data, path-based impact detection, local JS/TS source analysis, and deterministic rules.

---

# Research Basis

Source page: https://SubmuxHQ.github.io/CodeDecay/research

Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/research.md

# Research Basis

CodeDecay is motivated by research on software evolution, pull request impact, and AI-era code quality risks.

## SlopCodeBench

SlopCodeBench studies how coding agents degrade over long-horizon iterative tasks. It tracks verbosity and structural erosion, including duplicated code and complexity concentration in high-complexity functions.

Reference: https://arxiv.org/html/2603.24755v1

## Does Code Decay?

“Does Code Decay? Assessing the Evidence from Change Management Data” connects software evolution data to code decay and maintenance risk.

References:

- https://www.niss.org/sites/default/files/technicalreports/tr81.pdf
- https://dl.acm.org/doi/10.1109/32.895984

## Pull Request Change Impact

Pull request change impact research supports using code structure and changed artifact relationships to improve review focus and risk awareness.
Reference: https://link.springer.com/article/10.1007/s10664-024-10600-2 ## AI Code Quality, Churn, And Duplication AI-assisted development can increase code volume, churn, and duplicated code when teams optimize only for immediate output. CodeDecay turns those concerns into local PR checks. Reference: https://www.gitclear.com/ai_assistant_code_quality_2025_research --- # Releasing Source page: https://SubmuxHQ.github.io/CodeDecay/releasing Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/releasing.md # Releasing CodeDecay publishes one npm package for v1: ```text @submuxhq/codedecay ``` The package source is `packages/cli`, and the installed binary remains `codedecay`. CodeDecay also publishes an optional GitHub Packages npm mirror with the same package name. GitHub Packages scopes packages by GitHub user or organization owner, and this repository is owned by `SubmuxHQ`, so every release from v0.2.0 onward uses `@submuxhq/codedecay` for both npmjs and the GitHub Packages mirror. npmjs is the default public install path for users: ```bash npm install -D @submuxhq/codedecay ``` GitHub Packages is an authenticated mirror for GitHub-based workflows and requires registry authentication for installs. ## Patch Release Checklist Before opening the release PR, bump the published version in: - `packages/cli/package.json` - `packages/core/src/index.ts` After the release PR is merged, release only from a clean `main` branch at the commit that will be tagged and published. Do not publish npm contents from a different commit than the Git tag. 
Run: ```bash pnpm install pnpm run lint pnpm typecheck pnpm test pnpm build pnpm --filter @submuxhq/codedecay pack --dry-run ``` Inspect the tarball before publishing: ```bash pnpm --filter @submuxhq/codedecay pack tar -tzf submuxhq-codedecay-.tgz ``` The tarball must include: ```text package/LICENSE package/README.md package/package.json package/dist/index.js package/dist/index.d.ts ``` Before publishing, run the installed-package smoke against the packed tarball: ```bash pnpm demo:published-package --tarball ./submuxhq-codedecay-.tgz --run-id v-tarball-smoke ``` This creates a fresh temp install, materializes the Next.js and Node API demo repos, runs the installed `codedecay` binary, verifies JSON/Markdown/SARIF outputs, and writes logs under: ```text .codedecay/local/published-package-demo//run.json .codedecay/local/published-package-demo//summary.md ``` Publish the scoped package with public access: ```bash pnpm --filter @submuxhq/codedecay publish --access public ``` If npm requires a one-time password in a non-interactive shell, publish from the package directory: ```bash cd packages/cli npm publish --access public --otp ``` After publishing, verify the public install path: ```bash tmpdir=$(mktemp -d) cd "$tmpdir" npm install @submuxhq/codedecay@ node_modules/.bin/codedecay --help ``` After publishing, run the same smoke against the registry package: ```bash pnpm demo:published-package --package @submuxhq/codedecay@ --run-id v-published-smoke ``` Create the GitHub release for the same tag and verify the release surfaces stay in sync: ```bash git show --no-patch --decorate --oneline v gh release view v npm view @submuxhq/codedecay version dist-tags --json ``` The package version, npm `latest` dist-tag, Git tag, and GitHub release should all refer to the same released version before the release is considered done. ## GitHub Packages Mirror The GitHub Packages mirror is published from the same built CLI package. 
It uses the same package name, `@submuxhq/codedecay`, and sets the registry to `https://npm.pkg.github.com`. Use npmjs for public end-user installs. Use GitHub Packages only when a workflow or organization policy specifically needs a GitHub-hosted package mirror. Prepare the mirror package locally after `pnpm build:packages`: ```bash pnpm package:github --out /tmp/codedecay-ghpkg cd /tmp/codedecay-ghpkg npm pack --dry-run ``` The dry run should include: ```text package/LICENSE package/README.md package/package.json package/dist/index.js package/dist/index.d.ts ``` Publish through the `Publish GitHub Packages` workflow. It uses the repository `GITHUB_TOKEN` with `packages: write` permission and skips publishing if the same mirror version already exists. Use the exact release tag as the workflow `ref`. Do not publish from `main` after unreleased commits have landed, because that can create a GitHub Packages version whose contents differ from the npmjs package with the same version. Manual dispatch: ```bash gh workflow run publish-github-packages.yml -f ref=v ``` Install from GitHub Packages by adding the GitHub owner scope and an authenticated token to `.npmrc`: ```text @submuxhq:registry=https://npm.pkg.github.com //npm.pkg.github.com/:_authToken= ``` Then install: ```bash npm install -D @submuxhq/codedecay@ node_modules/.bin/codedecay version ``` GitHub Packages requires a personal access token classic with `read:packages` for local installs. GitHub Actions can use `GITHUB_TOKEN` when the package is associated with this repository and the workflow has package access. If local verification fails with `403 permission_denied`, check the token scope before changing package metadata. 
The default public npmjs install path should still work without GitHub authentication: ```bash npm install -D @submuxhq/codedecay ``` --- # Framework-Aware Route/API Impact Map Source page: https://SubmuxHQ.github.io/CodeDecay/proposals/framework-aware-impact-map Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/proposals/framework-aware-impact-map.md # Framework-Aware Route/API Impact Map Status: implementation started Proposal issue: [#35](https://github.com/SubmuxHQ/CodeDecay/issues/35) Implementation issue: [#144](https://github.com/SubmuxHQ/CodeDecay/issues/144) ## Goal CodeDecay should make regression risk more actionable by translating changed JavaScript and TypeScript files into affected routes, API endpoints, and request surfaces where that can be done deterministically. The first implementation should focus on Next.js and Node API projects because they are common adoption paths for CodeDecay and are already represented by the example projects. ## Non-Goals - No LLM, model, cloud, telemetry, or API-key dependency. - No runtime tracing or server startup. - No attempt to prove whether a change was AI-generated. - No broad generic code review comments. - No scoring change until the extracted impact map is covered by fixtures and report tests. 
## Proposed Report Field

Add an optional top-level field to the JSON report:

```ts
interface ImpactedRoute {
  framework: "nextjs" | "express" | "fastify" | "node";
  kind: "ui-route" | "api-route" | "middleware" | "route-handler";
  route: string;
  methods: string[];
  files: string[];
  risk: "low" | "medium" | "high";
  reasons: string[];
  recommendedTests: string[];
}
```

Example:

```json
{
  "impactedRoutes": [
    {
      "framework": "nextjs",
      "kind": "api-route",
      "route": "/api/users",
      "methods": ["GET"],
      "files": ["src/app/api/users/route.ts"],
      "risk": "high",
      "reasons": ["API route changed", "No nearby test changed"],
      "recommendedTests": ["Add or run tests covering src/app/api/users/route.ts"]
    }
  ]
}
```

Markdown reports should add a compact section after `Likely Impacted Areas`:

```markdown
### Likely Impacted Routes And APIs

- High `GET /api/users` (Next.js API route): `src/app/api/users/route.ts`
- Medium `/dashboard` (Next.js UI route): `src/app/dashboard/page.tsx`
```

SARIF should stay minimal for now. It can continue emitting file/line findings; route data can be added later through SARIF `properties` only if GitHub code scanning handles it cleanly.

## Supported Patterns

### Next.js

Supported first:

- `app/**/page.{js,jsx,ts,tsx}` -> UI route
- `src/app/**/page.{js,jsx,ts,tsx}` -> UI route
- `app/api/**/route.{js,ts}` -> API route
- `src/app/api/**/route.{js,ts}` -> API route
- `pages/api/**/*.{js,ts}` -> API route
- `src/pages/api/**/*.{js,ts}` -> API route
- `middleware.{js,ts}` and `src/middleware.{js,ts}` -> middleware

Route normalization:

- Remove `src/`, `app/`, and `pages/` prefixes.
- Remove `page`, `route`, and file extensions.
- Convert route groups like `(admin)` to no path segment.
- Preserve dynamic segments such as `[id]` and `[...slug]`.
- Convert `index` pages to the parent route.
- For `app/api/users/route.ts`, report `/api/users`.
- For `app/dashboard/page.tsx`, report `/dashboard`.
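The route normalization rules above can be sketched as a small pure function. This is a minimal sketch of the proposal, not the analyzer's implementation: `normalizeNextRoute` is a hypothetical name, and the sketch assumes it is only called on files that already matched one of the supported page or route patterns (middleware files are out of scope here).

```ts
// Hypothetical helper illustrating the proposed normalization rules.
function normalizeNextRoute(filePath: string): string {
  const segments = filePath.split("/").filter((part) => part.length > 0);

  // Remove the optional `src/` prefix, then the `app/` or `pages/` root.
  if (segments[0] === "src") segments.shift();
  if (segments[0] === "app" || segments[0] === "pages") segments.shift();

  // Strip the file extension; `page`, `route`, and `index` files
  // resolve to their parent route instead of adding a segment.
  const fileName = segments.pop();
  if (fileName !== undefined) {
    const base = fileName.replace(/\.(js|jsx|ts|tsx)$/, "");
    if (base !== "page" && base !== "route" && base !== "index") {
      segments.push(base);
    }
  }

  // Route groups such as `(admin)` contribute no path segment.
  // Dynamic segments such as `[id]` and `[...slug]` are kept as-is.
  const kept = segments.filter((part) => !/^\(.*\)$/.test(part));
  return "/" + kept.join("/");
}

console.log(normalizeNextRoute("app/api/users/route.ts"));
console.log(normalizeNextRoute("src/app/(admin)/dashboard/page.tsx"));
console.log(normalizeNextRoute("src/app/users/[id]/page.tsx"));
```

Because the function only manipulates path strings, it fits the proposal's constraint that route extraction never starts a server or executes application code.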
HTTP methods:

- For Next.js route handlers, detect exported functions named `GET`, `POST`, `PUT`, `PATCH`, `DELETE`, `HEAD`, or `OPTIONS`.
- If no method is found, use `["*"]`.
- UI routes should use an empty method list.

### Express

Supported first:

- `app.get("/path", ...)`
- `app.post("/path", ...)`
- `router.get("/path", ...)`
- `router.post("/path", ...)`
- equivalent `put`, `patch`, `delete`, `head`, and `options`

File patterns to inspect:

- `src/routes/**/*.{js,ts}`
- `src/api/**/*.{js,ts}`
- `src/controllers/**/*.{js,ts}`
- `routes/**/*.{js,ts}`
- `api/**/*.{js,ts}`
- `server.{js,ts}`
- `app.{js,ts}`

The extractor should use AST parsing where practical and fall back to simple literal-string matching only for route call expressions. It should not execute application code.

### Fastify

Supported first:

- `fastify.get("/path", ...)`
- `fastify.post("/path", ...)`
- `server.get("/path", ...)`
- `server.route({ method: "GET", url: "/path" })`
- array methods in route objects, for example `method: ["GET", "POST"]`

Use the same file patterns as Express.

## Risk Mapping

Route/API impact risk should derive from existing deterministic signals:

- API route changed -> high
- auth/session/security file changed and route imports or lives near auth code -> high
- database/schema file changed and route imports DB/model code -> high
- UI route changed -> medium
- middleware changed -> high
- route changed with no nearby tests -> add reason, do not duplicate the existing `missing-nearby-tests` finding

The first implementation should not add new score weights. It should make the report more specific while preserving the current scoring behavior.
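As an illustration of the risk mapping above, the rules can be expressed as a lookup that collects reasons and keeps the highest matched level. This is a sketch under assumptions, not the proposed implementation: `RouteSignals` and `mapRouteRisk` are hypothetical, the boolean inputs stand in for import and proximity analysis that the proposal leaves open, treating `route-handler` like an API route is an assumption, and every reason string other than `API route changed` and `No nearby test changed` is invented for the example.

```ts
type RouteKind = "ui-route" | "api-route" | "middleware" | "route-handler";
type Risk = "low" | "medium" | "high";

// Hypothetical, precomputed signals for one impacted route.
interface RouteSignals {
  kind: RouteKind;
  touchesAuth: boolean; // route imports or lives near changed auth code
  touchesDatabase: boolean; // route imports changed DB/model code
  nearbyTestsChanged: boolean;
}

const ORDER: Risk[] = ["low", "medium", "high"];

function maxRisk(a: Risk, b: Risk): Risk {
  return ORDER.indexOf(a) >= ORDER.indexOf(b) ? a : b;
}

function mapRouteRisk(signals: RouteSignals): { risk: Risk; reasons: string[] } {
  const hits: Array<[Risk, string]> = [];

  // Assumption: route handlers are treated like API routes.
  if (signals.kind === "api-route" || signals.kind === "route-handler") {
    hits.push(["high", "API route changed"]);
  }
  if (signals.kind === "middleware") hits.push(["high", "Middleware changed"]);
  if (signals.kind === "ui-route") hits.push(["medium", "UI route changed"]);
  if (signals.touchesAuth) hits.push(["high", "Auth-related code changed"]);
  if (signals.touchesDatabase) hits.push(["high", "Database-related code changed"]);

  const risk = hits.reduce<Risk>((acc, [level]) => maxRisk(acc, level), "low");
  const reasons = hits.map(([, reason]) => reason);

  // Missing tests only add a reason; they do not raise risk or
  // create a second finding.
  if (!signals.nearbyTestsChanged) reasons.push("No nearby test changed");
  return { risk, reasons };
}

console.log(
  mapRouteRisk({
    kind: "api-route",
    touchesAuth: false,
    touchesDatabase: false,
    nearbyTestsChanged: false,
  })
);
```

The function returns only a level and reasons, so it adds no score weights, which matches the constraint stated above.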
## Required Tests Analyzer fixtures: - Next.js App Router UI route: `src/app/dashboard/page.tsx` - Next.js App Router API route with exported `GET` - Next.js dynamic route: `src/app/users/[id]/page.tsx` - Next.js route group: `src/app/(admin)/dashboard/page.tsx` - Next.js Pages API route: `src/pages/api/users.ts` - Express router method calls for `GET` and `POST` - Fastify shorthand calls and `server.route({ method, url })` - No false route for non-route utility files Report tests: - JSON includes `impactedRoutes` when present. - Markdown renders the route/API impact section. - SARIF remains valid when route impact data exists. CLI tests: - Existing CLI output remains backward compatible. - Snapshot or assertion covers a fixture PR with route impact data. Example fixtures: - Extend `examples/nextjs-risk-demo` expected summary once implemented. - Extend `examples/node-api-risk-demo` expected summary once implemented. ## Implementation Plan 1. Add optional `impactedRoutes` types to `packages/core`. 2. Render `impactedRoutes` in JSON and Markdown reports. 3. Add Next.js deterministic route extraction in `packages/analyzer-js`. 4. Add Express and Fastify route extraction in `packages/analyzer-js`. 5. Add fixtures and tests before changing any scoring behavior. 6. Update sample reports and example README summaries. ## Open Questions - Should dynamic routes stay as framework-native paths like `/users/[id]`, or should CodeDecay normalize them to `/users/:id`? - Should route impact eventually influence scoring, or remain explanatory only? - Should imports be analyzed deeply enough to connect DB/auth changes to routes, or should v1 keep that relationship path-based? 
--- # RFC 0001: Agent-Agnostic Redteam Harness Architecture Source page: https://SubmuxHQ.github.io/CodeDecay/rfcs/0001-agent-agnostic-redteam-harness Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/rfcs/0001-agent-agnostic-redteam-harness.md # RFC 0001: Agent-Agnostic Redteam Harness Architecture Status: proposed ## Summary CodeDecay should evolve from deterministic PR risk analysis into an agent-agnostic PR safety harness for AI-assisted development. CodeDecay should not replace Codex, Claude Code, Cursor, Pi, OpenCode, desktop agents, or internal company agents. It should give those agents better evidence, skills, memory, execution results, weak-test findings, and merge-safety reports. Positioning: ```text CodeDecay is an open-source PR safety harness that works with your existing AI agents and open-source tools. Use Codex, Claude Code, Cursor, Pi, OpenCode, desktop agents, or any MCP-compatible workflow. CodeDecay orchestrates evidence, memory, skills, test audits, and runtime checks so your agent can find what it missed before merge. ``` Core question: ```text What could this PR break, and are the tests actually proving it will not? 
```

## Product Boundaries

CodeDecay owns:

- PR orchestration
- normalized evidence
- safety policy
- impact mapping
- test-quality audit
- tool-adapter coordination
- merge-safety reporting
- fix-task generation for coding agents

CodeDecay does not own:

- the user's main coding agent
- hosted model inference
- mandatory cloud memory
- every test runner or fuzzing engine
- every language parser

The default OSS experience must remain:

- local-first
- deterministic baseline
- no telemetry
- no required API keys
- no required LLM/model calls
- no required CodeDecayCloud

## Architecture

```text
User agent / IDE / harness
  Codex | Claude Code | Cursor | Pi | OpenCode | Desktop app | Custom MCP client
        |
        v
CodeDecay CLI / MCP / GitHub Action / GitHub App
        |
        v
redteam orchestrator
        |
        +--> git diff and current deterministic analyzer
        +--> impact map
        +--> memory providers
        +--> skill selection
        +--> agent/harness adapters
        +--> safe execution
        +--> OSS tool adapters
        +--> test audit
        +--> base/head differential checks
        |
        v
merge-safety report + evidence bundle + fix tasks
        |
        v
back to the user's agent for implementation
```

## `codedecay redteam` Flow

1. Analyze the PR diff with the existing deterministic engine.
2. Build an impact map for routes, APIs, UI flows, jobs, database/schema, auth, and config.
3. Load repo memory from `.codedecay/memory.json` and optional user-owned memory providers.
4. Select relevant review skills.
5. Ask an optional user-owned agent or harness for missed risks and edge cases.
6. Run selected open-source tools through safe adapters.
7. Audit changed and nearby tests for weak or fake confidence.
8. Generate an edge-case checklist grounded in impacted areas.
9. Compare base/head behavior through configured probes when explicitly enabled.
10. Produce a merge-safety report that separates tool evidence from AI suggestions.
11. Generate fix tasks for Codex, Claude Code, Cursor, Pi, OpenCode, or any MCP-compatible agent.
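One way to read the flow above is that most steps are always safe to run, while steps 5, 6, and 9 depend on explicit opt-ins. The sketch below models that reading; it is an interpretation, not the RFC's design. `Gate`, `FlowStep`, `FlowOptions`, and `selectSteps` are hypothetical names, and tying step 6 to an `allowCommands` flag and step 9 to a `probesEnabled` flag is an assumption. The `mode` values mirror the `"deterministic" | "assisted"` union used by the RFC's `RedteamInput`.

```ts
// Hypothetical gating model for the eleven flow steps.
type Gate = "always" | "agent" | "commands" | "probes";

interface FlowStep {
  id: number;
  label: string;
  gate: Gate;
}

interface FlowOptions {
  mode: "deterministic" | "assisted";
  allowCommands: boolean; // assumption: gates step 6
  probesEnabled: boolean; // assumption: gates step 9
}

const REDTEAM_FLOW: FlowStep[] = [
  { id: 1, label: "analyze PR diff", gate: "always" },
  { id: 2, label: "build impact map", gate: "always" },
  { id: 3, label: "load repo memory", gate: "always" },
  { id: 4, label: "select review skills", gate: "always" },
  { id: 5, label: "ask user-owned agent for missed risks", gate: "agent" },
  { id: 6, label: "run open-source tools through safe adapters", gate: "commands" },
  { id: 7, label: "audit changed and nearby tests", gate: "always" },
  { id: 8, label: "generate edge-case checklist", gate: "always" },
  { id: 9, label: "compare base/head behavior through probes", gate: "probes" },
  { id: 10, label: "produce merge-safety report", gate: "always" },
  { id: 11, label: "generate fix tasks", gate: "always" },
];

// Keep only the steps that the current options permit.
function selectSteps(options: FlowOptions): FlowStep[] {
  return REDTEAM_FLOW.filter((step) => {
    switch (step.gate) {
      case "agent":
        return options.mode === "assisted";
      case "commands":
        return options.allowCommands;
      case "probes":
        return options.probesEnabled;
      default:
        return true;
    }
  });
}

const baseline = selectSteps({
  mode: "deterministic",
  allowCommands: false,
  probesEnabled: false,
});
console.log(baseline.map((step) => step.id).join(","));
```

With every opt-in disabled, the remaining steps still end in a report and fix tasks, which is consistent with the requirement that the deterministic baseline works without agents, models, or command execution.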
## Module Plan

### `packages/redteam`

Purpose: owns the end-to-end `codedecay redteam` orchestration.

Inputs:

- cwd
- base/head refs
- CodeDecay config
- selected adapters
- optional agent/harness provider

Outputs:

- merge-safety report
- normalized evidence bundle
- fix tasks

Public API:

```ts
interface RedteamInput {
  cwd: string;
  base?: string;
  head?: string;
  mode: "deterministic" | "assisted";
}

interface RedteamResult {
  report: MergeSafetyReport;
  evidence: Evidence[];
  fixTasks: FixTask[];
}
```

Safety:

- deterministic mode must work without agents or models
- no command execution unless explicitly configured

MVP:

- compose existing analyze, memory, config, and Markdown report data into a redteam-shaped report

### `packages/harness`

Purpose: registry and interface for agent and tool harnesses.

Inputs:

- redteam task
- repo context
- selected skills
- evidence collected so far

Outputs:

- harness plan
- harness run result
- normalized evidence
- summary

Public API:

```ts
interface CodeDecayHarness {
  name: string;
  capabilities: HarnessCapability[];
  requiredConfig: ConfigRequirement[];
  plan(input: HarnessPlanInput): Promise;
  run(plan: HarnessPlan, context: HarnessRunContext): Promise;
  collectEvidence(result: HarnessRunResult): Promise;
  summarize(evidence: Evidence[]): Promise;
}
```

Safety:

- adapters must declare whether they can execute commands, call models, or use network access
- all failures become structured failure modes

MVP:

- generic process harness adapter
- in-memory registry
- evidence schema

### `packages/agent`

Purpose: optional user-owned AI provider interface.

Inputs:

- prompt/task
- selected skills
- evidence bundle
- model/provider config

Outputs:

- AI suggestions
- missing edge cases
- fix tasks

Public API:

```ts
interface AgentProvider {
  name: string;
  availability(): Promise;
  complete(input: AgentCompletionInput): Promise;
}
```

Safety:

- provider is disabled by default
- no hidden model calls
- cloud providers require explicit user config
- AI output is suggestions, never proof

OSS integrations:

- Ollama local models
- LiteLLM/OpenAI-compatible BYOK endpoints
- Codex/Claude/Pi/OpenCode via prompt/task adapters where possible

MVP:

- disabled provider
- OpenAI-compatible provider shape for later Ollama/LiteLLM support

### `packages/mcp`

Purpose: expose CodeDecay as tools to MCP-compatible agents.

Inputs:

- MCP tool calls
- cwd/base/head/options

Outputs:

- redteam plans
- analyze results
- impact maps
- test audit findings
- memory context
- evidence summaries

Safety:

- tool descriptions must say when commands may execute
- command-running tools require explicit config

MVP:

- extend the current MCP package instead of creating a parallel server package
- expose `redteam_plan`, `redteam_run`, `impact_map`, and `test_audit`

### `packages/skills`

Purpose: portable review instructions for agents and harnesses.

Inputs:

- impacted areas
- project language/framework
- user-selected skill pack

Outputs:

- selected skills
- prompts/checklists
- fix-task templates

Safety:

- skills are instructions, not executable authority
- skills cannot override command safety

MVP:

- filesystem skill loader for `.agents/skills`
- built-in skills for API, auth, database, frontend flow, test quality, and GitHub App review

### `packages/memory`

Purpose: project-specific context provider.

Inputs:

- local `.codedecay/memory.json`
- optional external memory config
- changed files and impacted areas

Outputs:

- relevant flows
- invariants
- past regressions
- architecture notes
- recommended checks

Safety:

- memory is untrusted context
- local file remains default
- external providers must be opt-in

OSS integrations:

- local `.codedecay/memory.json`
- Mem0
- Supermemory

MVP:

- formalize provider interface around the existing local memory package

### `packages/execution`

Purpose: safe command and probe execution.

Inputs:

- explicit configured command
- cwd/temp worktree
- timeout
- redaction rules

Outputs:

- stdout/stderr
- exit code
- duration
- structured failure mode

Safety:

- no guessed commands
- no destructive commands
- no production deploys
- no migrations
- no secret printing
- timeout required

MVP:

- extract and harden the current configured command runner

### `packages/test-audit`

Purpose: detect tests that look reassuring but do not prove real behavior.

Inputs:

- changed tests
- nearby tests
- changed source files
- optional coverage/mutation evidence

Outputs:

- weak-test findings
- missing-test recommendations
- evidence explaining why confidence is weak

Safety:

- static audit is allowed by default
- dynamic test execution requires explicit config

MVP:

- no assertions
- snapshot-only
- excessive mocks
- tests unrelated to changed source
- copied implementation logic heuristics

### `packages/impact-map`

Purpose: map PR changes to product/system areas.

Inputs:

- changed files
- AST/code map
- config/memory

Outputs:

- impacted routes
- API endpoints
- UI flows
- jobs
- database/schema areas
- auth/config boundaries

Safety:

- static mapping only by default
- no code execution

OSS integrations:

- TypeScript compiler API for JS/TS first
- Tree-sitter later for multi-language parsing

MVP:

- framework-aware JS/TS route/API/config map

### `packages/tool-adapters`

Purpose: normalize OSS tool execution and output.

Inputs:

- tool config
- cwd/base/head
- selected impacted areas

Outputs:

- evidence records
- artifact paths
- failure modes

Safety:

- adapters must use `packages/execution`
- adapters cannot bypass command allowlists

OSS integrations:

- Playwright
- StrykerJS
- Schemathesis
- Pact
- Vitest/Jest/Pytest/Bun
- c8/nyc/Istanbul

MVP:

- adapter interface
- Playwright adapter plan
- StrykerJS adapter plan
- Schemathesis adapter plan

## Evidence Model

Evidence should be the common currency between deterministic analysis, tools, agents, and reports.

```ts
interface Evidence {
  id: string;
  source: EvidenceSource;
  kind:
    | "diff"
    | "impact"
    | "test"
    | "coverage"
    | "mutation"
    | "api-fuzz"
    | "contract"
    | "browser-flow"
    | "memory"
    | "agent-suggestion"
    | "execution";
  severity: "info" | "low" | "medium" | "high";
  summary: string;
  file?: string;
  line?: number;
  command?: string;
  artifactPath?: string;
  trusted: boolean;
}
```

Rules:

- tool evidence and AI suggestions must be rendered separately
- memory and agent output are untrusted by default
- evidence should reference files/lines/artifacts when available

## Harness Failure Modes

```ts
type HarnessFailureMode =
  | "missing-tool"
  | "missing-config"
  | "command-denied"
  | "timeout"
  | "nonzero-exit"
  | "network-required"
  | "unsafe-command"
  | "model-unavailable"
  | "no-evidence";
```

Failures should not disappear into generic logs. They should be visible in the merge-safety report as missing evidence or blocked checks.

## OSS Integration Sequence

1. MCP server tools for any compatible agent.
2. Generic process harness for local commands and CLI-based agents.
3. Portable CodeDecay skills for Codex, Claude Code, Cursor, Pi, OpenCode, and internal agents.
4. Local memory provider, then optional Mem0/Supermemory providers.
5. Ollama and LiteLLM/OpenAI-compatible provider interfaces.
6. Playwright adapter for browser/user-flow evidence.
7. StrykerJS adapter for mutation-testing evidence.
8. Schemathesis adapter for API fuzzing evidence.
9. Pact adapter for contract-testing evidence.
10. Coverage adapters for c8/nyc/Istanbul.
11. TypeScript compiler API impact map, then Tree-sitter for multi-language.

## Implementation Issues

1. `docs(rfc): define agent-agnostic redteam harness architecture`
2. `feat(harness): add harness registry and evidence schema`
3. `feat(execution): add safe command runner`
4. `feat(mcp): expose redteam tools for MCP-compatible agents`
5. `feat(skills): add portable redteam skill loader`
6. `feat(memory): formalize local memory provider interface`
7. `feat(test-audit): detect weak and fake-looking tests`
8. `feat(tool-adapter): add Playwright harness`
9. `feat(tool-adapter): add StrykerJS harness`
10. `feat(tool-adapter): add Schemathesis harness`
11. `feat(agent): add optional Ollama provider`
12. `feat(agent): add optional LiteLLM provider`
13. `feat(harness): add Pi convenience adapter`

## First Three PRs

1. RFC PR: this document only.
2. Harness PR: add `packages/harness` with registry, evidence schema, and tests.
3. Execution PR: extract/harden safe configured command runner.

## Open Questions

- Should `codedecay redteam` default to deterministic-only mode and require `--assist` for any agent/model call?
- Should external memory providers live behind one provider interface or separate adapter packages?
- Should GitHub App redteam execution remain disabled until sandboxed workers exist?
- Should adapter artifacts be stored only in `.codedecay/artifacts/` or in a user-specified output directory?
- Which first external agent workflow should get convenience docs: Codex, Claude Code, Cursor, Pi, or OpenCode?

---

# CodeDecay v0.1.1 Launch Post

Source page: https://SubmuxHQ.github.io/CodeDecay/launch-post

Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/launch-post.md

# CodeDecay v0.1.1 Launch Post

We released CodeDecay v0.1.1: an open-source, local-first CLI/GitHub Action for detecting PR regression risk and maintainability decay.
No API keys, no LLM calls, no telemetry.

Install:

```bash
npm install -D @submuxhq/codedecay
```