# CodeDecay Docs Bundle
This is the concatenated Markdown bundle for the CodeDecay docs site.
---
# CodeDecay Docs
Source page: https://SubmuxHQ.github.io/CodeDecay/
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/index.md
## Read This First
- [Getting Started](/getting-started): install the CLI and run your first PR analysis
- [GitHub Action](/github-action): add CodeDecay to pull request workflows
- [Redteam Reports](/redteam): generate merge-safety reports for yourself or your coding agent
- [Agent Task Bundles](/agent): hand deterministic evidence to Codex, Claude Code, Cursor, Pi, OpenCode, or desktop agents
- [MCP Server](/mcp): expose CodeDecay as a local MCP tool for agent clients
## For Humans
- Use the sidebar and local search to navigate product docs quickly.
- Open [Sample Reports](/sample-reports/) to see the actual Markdown, JSON, and SARIF outputs before integrating CodeDecay.
- Use the GitHub edit links to tighten docs in the same repo that ships the code.
## For Agents
- [`/llms.txt`](/llms.txt): compact map of the docs site
- [`/llms-full.txt`](/llms-full.txt): one bundled Markdown context file
- `/markdown/getting-started.md`: one example of the per-page raw Markdown endpoints for direct retrieval
These endpoints are generated from the same source files as the docs site, so humans and agents read the same content instead of drifting copies.
---
# Getting Started
Source page: https://SubmuxHQ.github.io/CodeDecay/getting-started
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/getting-started.md
CodeDecay analyzes pull requests for regression risk and maintainability decay.
It works locally and in CI without cloud services, telemetry, API keys, LLMs, or
model calls.
## Install
Use the package manager your repository already uses:
```bash
npm install -D @submuxhq/codedecay
pnpm add -D @submuxhq/codedecay
bun add -d @submuxhq/codedecay
yarn add -D @submuxhq/codedecay
```
For a no-install smoke test:
```bash
npx -y @submuxhq/codedecay --help
```
After a local install, run CodeDecay with `npx codedecay`, `pnpm codedecay`,
`bunx codedecay`, or add `codedecay` to a package script.
Do not run `npm install` inside a Bun, pnpm, or Yarn workspace that uses
`workspace:*` dependencies. npm does not resolve the `workspace:` protocol, so
the install may fail before CodeDecay is added. In Bun repos with
`minimumReleaseAge`, a fresh CodeDecay release may also be blocked by repo
policy; for local evaluation you can override it explicitly:
```bash
bun add -d @submuxhq/codedecay --minimum-release-age 0
```
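To avoid running the wrong package manager, a script can pick the install command from the lockfile it finds. This is a minimal sketch: the function name `codedecay_install_cmd` and the lockfile-to-manager mapping are assumptions of this example, not part of CodeDecay.

```bash
# Hypothetical helper: print the install command that matches the repo's lockfile.
# Falls back to npm only when no pnpm, Bun, or Yarn lockfile is present.
codedecay_install_cmd() {
  dir="${1:-.}"
  if [ -f "$dir/pnpm-lock.yaml" ]; then
    echo "pnpm add -D @submuxhq/codedecay"
  elif [ -f "$dir/bun.lock" ] || [ -f "$dir/bun.lockb" ]; then
    echo "bun add -d @submuxhq/codedecay"
  elif [ -f "$dir/yarn.lock" ]; then
    echo "yarn add -D @submuxhq/codedecay"
  else
    echo "npm install -D @submuxhq/codedecay"
  fi
}
```

Print the command first and run it only after checking that it matches your repository's policy.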
## Analyze A PR Diff
```bash
npx codedecay analyze --base main --head HEAD --format markdown
```
## Analyze Current Working Tree
```bash
npx codedecay analyze --format markdown
```
## Analyze Another Repository
```bash
npx codedecay analyze --cwd ../my-repo --format markdown
```
## Generate A Redteam Report
Use `redteam` when you want one report for yourself or your coding agent that
summarizes what the PR could break, weak-test evidence, missing edge cases, and
fix tasks.
```bash
npx codedecay redteam --base main --head HEAD --format markdown
```
The current redteam MVP is report-only. It does not run commands or call an LLM.
## Hand Evidence To Your Agent
Use `agent` when you want Codex, Claude Code, Cursor, a desktop agent, or
another user-owned agent to act on CodeDecay's findings.
```bash
npx codedecay agent --base main --head HEAD --format markdown --output codedecay-agent.md
```
Then give `codedecay-agent.md` to your agent and ask it to:
- fix high-risk findings first,
- add tests that exercise real API, UI, database, or downstream behavior,
- cover the missing edge cases listed by CodeDecay,
- run the relevant project checks,
- rerun CodeDecay after changes.
The agent bundle is local evidence plus instructions. CodeDecay does not call
Codex, Claude, Cursor, Ollama, cloud models, or CodeDecayCloud while creating
it.
## Recommended Local Loop
```bash
npx codedecay analyze --base main --head HEAD --format markdown
npx codedecay redteam --base main --head HEAD --format markdown --output codedecay-redteam.md
npx codedecay agent --base main --head HEAD --format markdown --output codedecay-agent.md
```
Use the redteam report to understand the PR risk. Use the agent bundle to give
your own coding agent the evidence, missing checks, and fix tasks it should
work through. After the agent changes code, run your project checks and run
CodeDecay again.
## Write SARIF
```bash
npx codedecay analyze --format sarif --output codedecay.sarif
```
## Inspect CodeDecay Config
Configuration is optional. If no config file exists, CodeDecay uses safe defaults.
```bash
npx codedecay config --format markdown
```
## Fail CI On High Risk
```bash
npx codedecay analyze --base main --head HEAD --fail-on high
```
Risk levels:
- `0-39`: low
- `40-69`: medium
- `70-100`: high
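The score bands above can be expressed as a small lookup. This is an illustrative sketch, not CodeDecay's implementation; the function name `risk_level` is hypothetical.

```bash
# Hypothetical helper: map a 0-100 risk score to the documented level.
risk_level() {
  score="$1"
  if [ "$score" -lt 0 ] || [ "$score" -gt 100 ]; then
    echo "score out of range: $score" >&2
    return 2
  fi
  if [ "$score" -ge 70 ]; then
    echo "high"
  elif [ "$score" -ge 40 ]; then
    echo "medium"
  else
    echo "low"
  fi
}
```

With `--fail-on high`, only scores in the `70-100` band would be expected to fail the run.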
## Try An Example
Use the example projects to see a realistic high-risk report before wiring
CodeDecay into your own repository:
- [Next.js risk demo](https://github.com/SubmuxHQ/CodeDecay/blob/main/examples/nextjs-risk-demo/README.md)
- [Node API risk demo](https://github.com/SubmuxHQ/CodeDecay/blob/main/examples/node-api-risk-demo/README.md)
---
# GitHub Action
Source page: https://SubmuxHQ.github.io/CodeDecay/github-action
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/github-action.md
CodeDecay ships a composite GitHub Action wrapper around the bundled CLI.
```yaml
name: CodeDecay
on:
  pull_request:
jobs:
  codedecay:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: SubmuxHQ/CodeDecay/packages/github-action@v0
        with:
          mode: analyze
          base: ${{ github.event.pull_request.base.sha }}
          head: ${{ github.event.pull_request.head.sha }}
          cwd: .
          format: markdown
          fail-on: high
```
## SARIF Output
```yaml
- uses: SubmuxHQ/CodeDecay/packages/github-action@v0
  with:
    mode: analyze
    base: ${{ github.event.pull_request.base.sha }}
    head: ${{ github.event.pull_request.head.sha }}
    cwd: .
    format: sarif
    output: codedecay.sarif
    fail-on: high
```
Relative `output` paths resolve from `cwd`. For example, with
`cwd: packages/web` and `output: codedecay.sarif`, the SARIF file is written to
`packages/web/codedecay.sarif`. Absolute `output` paths are honored exactly.
```yaml
- uses: SubmuxHQ/CodeDecay/packages/github-action@v0
  with:
    mode: analyze
    cwd: packages/web
    format: sarif
    output: codedecay.sarif
```
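The path rule for `output` can be sketched in shell to predict where a file will land. The helper name `resolve_output` is hypothetical, and this only mirrors the documented rule; it is not the action's own code.

```bash
# Hypothetical helper: apply the documented rule for the action's `output` input.
# Absolute paths are kept as-is; relative paths are joined onto `cwd`.
resolve_output() {
  cwd="$1"
  output="$2"
  case "$output" in
    /*) echo "$output" ;;
    *)  echo "${cwd%/}/$output" ;;   # strip one trailing slash from cwd before joining
  esac
}
```

For example, `resolve_output packages/web codedecay.sarif` prints `packages/web/codedecay.sarif`, which is the path a later upload step would need to reference.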
The MVP action writes a Markdown summary to `$GITHUB_STEP_SUMMARY`. It does not
upload SARIF itself; a workflow can add a later step that uses GitHub's code
scanning upload action (`github/codeql-action/upload-sarif`).
## Redteam And Agent Modes
The action can also run report-only redteam and agent bundle modes. Redteam
mode is useful as a Step Summary because it includes impact, memory, edge cases,
and fix tasks for a user-owned agent:
```yaml
- uses: SubmuxHQ/CodeDecay/packages/github-action@v0
  with:
    mode: redteam
    base: ${{ github.event.pull_request.base.sha }}
    head: ${{ github.event.pull_request.head.sha }}
    cwd: .
    format: markdown
```
```yaml
- uses: SubmuxHQ/CodeDecay/packages/github-action@v0
  with:
    mode: agent
    base: ${{ github.event.pull_request.base.sha }}
    head: ${{ github.event.pull_request.head.sha }}
    cwd: .
    format: markdown
    output: codedecay-agent.md
```
Supported modes are `analyze`, `redteam`, and `agent`. The action does not
expose command-executing modes. `format: sarif` is supported only with
`mode: analyze`. `fail-on` is forwarded for `analyze` and `redteam`; `agent`
mode produces a task bundle for a user-owned coding agent and does not gate the
workflow by risk level.
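The mode and format constraints above can be checked before a workflow change is committed. The following pre-flight check is a sketch under the rules stated in this section; `check_action_inputs` is a hypothetical name and the action may validate its inputs differently.

```bash
# Hypothetical pre-flight check for the action's `mode` and `format` inputs.
# Returns 0 when the combination is documented as supported, 1 otherwise.
check_action_inputs() {
  mode="$1"
  format="$2"
  case "$mode" in
    analyze|redteam|agent) ;;
    *) echo "unsupported mode: $mode" >&2; return 1 ;;
  esac
  if [ "$format" = "sarif" ] && [ "$mode" != "analyze" ]; then
    echo "format sarif requires mode analyze" >&2
    return 1
  fi
  return 0
}
```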
Use `fail-on` with `analyze` when you want a deterministic CI gate. You can also
add `fail-on` to `redteam` if your repository wants strict risk-score gating.
The CodeDecay repository itself runs `redteam` in report-only mode so the Step
Summary is always available, while lint, typecheck, tests, build, package
dry-run, and the PR safety efficacy eval remain the hard validation gates.
---
# Configuration
Source page: https://SubmuxHQ.github.io/CodeDecay/configuration
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/configuration.md
CodeDecay can load repo-local configuration for red-team orchestration, tool
adapter plans, and real behavior probes.
Configuration is optional. If no config file exists, CodeDecay uses safe
defaults and does not run project commands.
## Supported Files
CodeDecay discovers the first matching file from the analysis working directory:
- `.codedecay/config.yml`
- `.codedecay/config.yaml`
- `codedecay.config.yml`
- `codedecay.config.yaml`
Use `--cwd` to inspect another repository:
```bash
npx codedecay config --cwd ../my-repo --format markdown
```
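To see which file would win in a repository that has more than one, the discovery rule from Supported Files can be approximated in shell. This sketch assumes the files are tried in the order listed above; `find_codedecay_config` is a hypothetical helper, not a CodeDecay command.

```bash
# Hypothetical helper: print the first config file that exists under a directory.
# Assumes the lookup order matches the list in "Supported Files".
find_codedecay_config() {
  dir="${1:-.}"
  for candidate in \
    ".codedecay/config.yml" \
    ".codedecay/config.yaml" \
    "codedecay.config.yml" \
    "codedecay.config.yaml"
  do
    if [ -f "$dir/$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1   # no config file: CodeDecay would fall back to safe defaults
}
```

`npx codedecay config` remains the authoritative way to see what CodeDecay actually loaded.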
## Example
```yaml
version: 1
commands:
  test:
    - pnpm test
  build:
    - pnpm build
  start:
    - pnpm dev
probes:
  - name: users api
    command: curl -f http://localhost:3000/api/users
    timeoutMs: 5000
toolAdapters:
  playwright: true
  stryker:
    command: pnpm exec stryker run
  schemathesis:
    schema: docs/openapi.yaml
    baseUrl: http://127.0.0.1:3000
  pact:
    command: pnpm run test:pact
safety:
  commandTimeoutMs: 120000
  allowCommands: false
llm:
  provider: disabled
  timeoutMs: 30000
```
Optional user-owned model providers must be configured explicitly. For a local
LiteLLM or other OpenAI-compatible endpoint:
```yaml
llm:
  provider: litellm
  model: gpt-4.1-mini
  endpoint: http://127.0.0.1:4000/v1
  apiKeyEnv: LITELLM_API_KEY
  timeoutMs: 30000
```
Use `apiKeyEnv` to point at an environment variable name. Do not store literal
API keys in CodeDecay config.
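Because the config holds only a variable name, a quick shell check can confirm the key is exported before an LLM-backed command runs. This is a sketch; `api_key_available` is a hypothetical helper, and it deliberately never prints the key.

```bash
# Hypothetical helper: succeed only if the variable named by `apiKeyEnv`
# is exported and non-empty. The value itself is never echoed.
api_key_available() {
  var_name="$1"
  # printenv exits non-zero when the variable is not in the environment.
  value="$(printenv "$var_name")" || return 1
  [ -n "$value" ]
}
```

For the config above, `api_key_available LITELLM_API_KEY` would be the matching check.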
## Safety Model
Config files make project commands explicit. CodeDecay should not guess commands
from model output or run arbitrary commands by default.
Current behavior:
- `codedecay analyze` does not require config.
- `codedecay config` only loads and prints config.
- `codedecay redteam` lists configured tool adapters as planned local checks,
but does not run them.
- `codedecay execute` runs only commands and probes from config, and only when
`safety.allowCommands` is true.
- `codedecay differential` runs only configured probes on temporary base/head
worktrees, and only when `safety.allowCommands` is true.
- missing config returns safe defaults.
- no telemetry, API keys, LLM calls, or cloud services are used.
- LLM use is disabled by default. LLM-backed commands must opt in
explicitly and treat model output as untrusted suggestions.
Execution uses this config as its allowlisted command source. See
[Execution probes](execution.md) and
[Differential behavior checks](differential.md).
Tool adapters are also configured here. See [Tool adapters](tool-adapters.md)
for Playwright, StrykerJS, Schemathesis, and Pact adapter details.
Read [LLM providers](llm-providers.md) for optional local/BYOK model adapters.
---
# Redteam Reports
Source page: https://SubmuxHQ.github.io/CodeDecay/redteam
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/redteam.md
`codedecay redteam` packages local PR safety evidence into a report that a
developer or their own coding agent can use before merge.
It asks:
```text
What could this PR break, and are the tests actually proving it will not?
```
The command is report-only in the current MVP. It does not run configured
commands, does not call an LLM, does not require API keys, does not send
telemetry, and does not depend on CodeDecayCloud.
Use it when you want a local merge-safety brief for Codex, Claude Code, Cursor,
desktop agents, or another user-owned agent. CodeDecay provides deterministic
tool evidence; the receiving agent still has to inspect the code and prove fixes
with tests or configured checks.
## Run
```bash
npx codedecay redteam --base main --head HEAD --format markdown
npx codedecay redteam --cwd ../my-repo --format json
npx codedecay redteam --format markdown --output codedecay-redteam.md
```
Exit codes:
- `0`: report generated and risk is below `--fail-on`, if provided.
- `1`: report generated and risk meets `--fail-on`.
- `2`: CLI/internal error, such as invalid git refs or invalid config.
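In CI scripts it helps to separate a risk gate (exit `1`) from a tool failure (exit `2`). The helper below is a sketch of that branching; `describe_redteam_exit` is a hypothetical name and the messages are this example's own wording.

```bash
# Hypothetical helper: translate a `codedecay redteam` exit code into a message.
describe_redteam_exit() {
  case "$1" in
    0) echo "ok: report generated, risk below --fail-on" ;;
    1) echo "gate: report generated, risk meets --fail-on" ;;
    2) echo "error: CLI or internal error" ;;
    *) echo "unknown exit code: $1" ;;
  esac
}
```

Call it as `describe_redteam_exit "$?"` on the line immediately after the `codedecay redteam` command, before any other command resets `$?`.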
## What The Report Includes
- changed files and impacted product/system areas
- concrete route/API impacts when CodeDecay can detect them, such as Next.js
API routes, Next.js UI routes, Express handlers, or Fastify handlers
- merge-risk and decay-risk scores
- test proof audit status: `missing`, `weak`, `present`, or `not_applicable`
- weak-test and missing-test findings from deterministic test-audit rules
- deterministic missing edge-case checklist
- local memory summary from `.codedecay/memory.json`
- repo-local agent skill summaries from `.agents/skills/*/SKILL.md`
- configured test/build/start/probe commands that are available but not run
- configured Playwright, StrykerJS, Schemathesis, and Pact tool adapters that
are planned but not run
- fix tasks for your coding agent
- explicit safety flags showing that commands and models were not called
## Agent-Agnostic Workflow
CodeDecay does not replace Codex, Claude Code, Cursor, Pi, OpenCode, desktop
agents, or internal agents. Use it to give those tools better evidence.
Suggested workflow:
1. Run `codedecay redteam --format markdown`.
2. Start with the impacted route/API section and ask what real user/API path
reaches each changed file.
3. Paste or attach the report to your coding agent.
4. Ask the agent to fix the high-risk findings and add real checks for the
impacted routes, missing edge cases, and weak-test findings.
5. Run `codedecay analyze`, `codedecay execute`, or `codedecay differential`
explicitly when you want static analysis, configured checks, or base/head
behavior probes.
See [Agent skills](skills.md) for the local skill file format.
## Safety Model
`codedecay redteam` lists configured checks and tool adapter plans from
CodeDecay config, but it does not execute them. Command execution remains
explicit through `codedecay execute` and `codedecay differential`, and those
commands still require `safety.allowCommands: true`.
Model use is also opt-in. The redteam MVP does not call Ollama, LiteLLM, cloud
models, or any hosted CodeDecay service.
---
# Agent Task Bundles
Source page: https://SubmuxHQ.github.io/CodeDecay/agent
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/agent.md
`codedecay agent` turns a deterministic redteam report into a task bundle for a
user-owned coding agent.
Use it when you want Codex, Claude Code, Cursor, Pi, OpenCode, a desktop agent,
or another local agent to fix what CodeDecay found without CodeDecay making a
hidden model call.
```bash
npx codedecay agent --base main --head HEAD --format markdown
npx codedecay agent --profile codex --format markdown
npx codedecay agent --cwd ../my-repo --format json --output codedecay-agent.json
```
The bundle includes:
- a copy-paste prompt for any user-owned coding agent
- changed files, impacted areas, and concrete route/API impacts when available
- weak-test and missing-test proof signals
- edge cases to check
- configured checks and tool adapters that are available but not run
- tasks for the coding agent
- repo-local skill summaries
- safety and limitation notes
## Agent Profiles
Profiles only shape the handoff instructions. They do not make CodeDecay call
the selected agent, call an LLM, require API keys, or send code anywhere.
Supported profiles:
- `generic`: portable bundle for any user-owned agent.
- `codex`: handoff wording for a Codex repo session.
- `claude-code`: handoff wording for Claude Code.
- `cursor`: handoff wording for Cursor chat or agent mode.
- `pi`: handoff wording for Pi harness or Pi-compatible agent workflows.
- `opencode`: handoff wording for OpenCode.
- `desktop`: handoff wording for desktop or local agent apps.
Example:
```bash
npx codedecay agent --profile cursor --format markdown --output codedecay-agent.md
```
## How To Use
1. Run `codedecay agent`.
2. Copy the prompt from the `Copy-Paste Prompt` section.
3. Give the prompt and Markdown or JSON output to your agent.
4. Ask the agent to start from impacted routes/APIs and explain what real user,
API, database, or downstream path could break.
5. Ask the agent to complete the listed tasks with real tests and behavior
checks.
6. Run CodeDecay again.
Example prompt style:
```text
Use this CodeDecay agent task bundle as tool evidence.
Fix the listed PR risks.
Do not assume the PR is safe because tests pass.
Add or improve tests that exercise real behavior paths.
After changes, tell me what checks to run.
```
For JSON consumers, route/API evidence is available under
`evidence.impactedRoutes`. Treat it as tool evidence for the agent's fix plan:
the agent should map each proposed fix back to the changed file, route/API, weak
test signal, and missing edge case it addresses.
## Safety
`codedecay agent` is report-only.
It does not:
- call an LLM or hosted model
- execute commands
- send telemetry
- require API keys
- depend on CodeDecayCloud
Agent output is not trusted evidence by itself. Treat the agent's response as a
proposal until it is verified by tests, configured checks, or manual review.
---
# MCP Server
Source page: https://SubmuxHQ.github.io/CodeDecay/mcp
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/mcp.md
CodeDecay can run as a local Model Context Protocol server so agent clients can
ask it for PR risk, impact maps, weak-test audits, and deterministic edge-case
suggestions. It can also run explicitly configured local checks when the caller
confirms execution.
The MCP server calls local CodeDecay analysis only. It does not call an LLM,
does not require API keys, and does not send telemetry. Command execution is
opt-in and limited to commands already present in CodeDecay config.
## Run Locally
```bash
npx @submuxhq/codedecay mcp --cwd /path/to/repo
```
## Example MCP Client Config
Exact config shape varies by client. The important part is that the command
runs CodeDecay locally and passes the repository path with `--cwd`.
```json
{
  "mcpServers": {
    "codedecay": {
      "command": "npx",
      "args": ["-y", "@submuxhq/codedecay", "mcp", "--cwd", "/path/to/repo"]
    }
  }
}
```
## Tools
- `analyze_pr`: returns a Markdown or JSON CodeDecay report.
- `impact_map`: returns changed files, impacted areas, and concrete route/API
impacts when CodeDecay can detect them.
- `audit_tests`: returns missing-test and weak-test proof findings plus
recommended checks.
- `suggest_edge_cases`: returns deterministic edge-case suggestions.
- `redteam_report`: returns a deterministic merge-safety report for your agent,
including impacted areas, weak-test findings, edge cases, configured checks,
memory summary, fix tasks, and safety flags.
- `agent_task_bundle`: returns a deterministic task bundle that Codex, Claude
Code, Cursor, Pi, OpenCode, desktop agents, or other MCP-compatible agents can
use to fix PR risks. It packages a copy-paste prompt, tool evidence, weak-test
signals, edge cases, suggested checks, skills, and fix tasks. It accepts an
optional `profile` value: `generic`, `codex`, `claude-code`, `cursor`, `pi`,
`opencode`, or `desktop`.
- `execute_configured_checks`: runs configured CodeDecay commands, probes, and
enabled tool adapters. It requires `confirmExecution: true` and
`safety.allowCommands: true`.
Example execution tool input:
```json
{
  "confirmExecution": true,
  "format": "markdown"
}
```
## Safety
MCP clients should treat tool output as analysis, not as permission to execute
commands. The MCP server does not expose arbitrary command execution.
`redteam_report` is report-only. It does not run configured commands, call
Ollama or cloud models, send telemetry, or require CodeDecayCloud. It may include
local skill summaries from `.agents/skills/*/SKILL.md`, but it does not execute
skill content.
`agent_task_bundle` is also report-only. It uses the same deterministic
CodeDecay evidence as `codedecay agent`, and it does not call the MCP client,
Codex, Claude, Cursor, Ollama, cloud models, or CodeDecayCloud. The receiving
agent should treat the bundle as tool evidence plus instructions. The included
prompt is portable across Codex, Claude Code, Cursor, Pi, OpenCode, desktop
agents, and other MCP clients. The optional `profile` only changes handoff
wording; it does not call or authenticate with that agent. Any proposed fix
still needs verification with tests or configured checks.
`execute_configured_checks` is the only MCP tool that can execute local commands.
It never accepts command text from MCP input. It can only run commands from
`.codedecay/config.yml`, `codedecay.config.yml`, or enabled configured tool
adapters such as Playwright, StrykerJS, Schemathesis, and Pact.
Execution requires both:
- MCP input contains `confirmExecution: true`
- CodeDecay config contains `safety.allowCommands: true`
If confirmation is missing, CodeDecay returns a non-executing report. If
`safety.allowCommands` is false, configured checks use the existing skip behavior
and do not run.
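The two-condition gate can be summarized as a small decision function. This is a sketch of the documented behavior, not the server's code: `mcp_execution_decision` and its three output labels are hypothetical, and checking confirmation before the config flag is an assumption of this example.

```bash
# Hypothetical helper: decide what `execute_configured_checks` would do.
#   $1: confirmExecution from the MCP input ("true" or anything else)
#   $2: safety.allowCommands from CodeDecay config ("true" or anything else)
mcp_execution_decision() {
  confirm="$1"
  allow="$2"
  if [ "$confirm" != "true" ]; then
    echo "report-only"   # confirmation missing: non-executing report
  elif [ "$allow" != "true" ]; then
    echo "skipped"       # config forbids commands: checks are skipped
  else
    echo "execute"       # both conditions met
  fi
}
```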
---
# CodeDecay GitHub App
Source page: https://SubmuxHQ.github.io/CodeDecay/github-app
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/github-app.md
The CodeDecay GitHub App is an optional hosted surface for teams that want
CodeDecay to run automatically on pull requests.
The app does not replace the CLI or GitHub Action. The open-source CLI remains
local-first and useful without the hosted app.
## What the app does
For pull request events, the app:
1. receives a GitHub webhook,
2. creates an in-progress CodeDecay check run,
3. checks out the pull request into a temporary directory,
4. runs deterministic CodeDecay analysis,
5. posts or updates one PR comment,
6. completes the check run.
For the first hosted version, the app runs only deterministic PR analysis. It
does not run project or deployment commands, make LLM or model calls, or use
CodeDecayCloud services.
## GitHub App settings
Create a GitHub App in the GitHub organization that will own the hosted app.
Set the webhook URL to:
```text
https://<your-app-host>/github/webhooks
```
Subscribe to these events:
- Pull request
Use these repository permissions:
- Metadata: read-only
- Contents: read-only
- Pull requests: read-only
- Issues: read and write
- Checks: read and write
The Issues permission is required because GitHub PR comments use the Issues API.
## Render deployment
Create a Render Web Service connected to this repository.
Build command:
```bash
pnpm install --frozen-lockfile && pnpm --filter @submuxhq/codedecay-github-app build
```
Start command:
```bash
pnpm --filter @submuxhq/codedecay-github-app start
```
Required environment variables:
```text
GITHUB_APP_ID=
GITHUB_PRIVATE_KEY=
GITHUB_WEBHOOK_SECRET=
NODE_ENV=production
```
Optional environment variables:
```text
PORT=3000
GITHUB_WEBHOOK_PATH=/github/webhooks
```
If the private key is stored with escaped newlines, the service converts `\n`
back to PEM newlines at startup.
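To check locally what an escaped key looks like after conversion, the same transformation can be reproduced in shell. This is only an illustration of the `\n` handling described above; `restore_pem_newlines` is a hypothetical helper and is not how the service is implemented.

```bash
# Hypothetical helper: turn literal "\n" sequences into real newlines.
# printf's %b interprets backslash escapes in its argument. PEM bodies are
# base64, so no other backslash sequences are expected in the input.
restore_pem_newlines() {
  printf '%b\n' "$1"
}
```

Avoid echoing a real private key to a terminal or CI log; pipe the result into a tool such as `wc -l` if you only need to confirm that the line breaks were restored.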
## First staging test
Before installing the app broadly:
1. install the GitHub App on a test repository,
2. open a harmless documentation-only PR,
3. confirm the CodeDecay check run appears,
4. confirm one PR comment is created,
5. push another commit to the PR,
6. confirm the existing CodeDecay comment is updated instead of duplicated.
Do not enable branch protection around the app until the staging PR behavior is
verified.
## Safety boundary
The hosted app intentionally has a narrow v0 boundary:
- no telemetry,
- no LLM or model calls,
- no arbitrary command execution,
- no project test/start/deploy command execution,
- no persisted repository checkout,
- temporary checkout directories are removed after analysis.
Future hosted execution or red-team behavior should require a separate design
and sandboxing review.
---
# Execution Probes
Source page: https://SubmuxHQ.github.io/CodeDecay/execution
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/execution.md
CodeDecay can run explicitly configured project commands, behavior probes, and
tool adapters with `codedecay execute`.
Execution is opt-in. By default, CodeDecay does not run project commands. A repo
must set `safety.allowCommands: true` in CodeDecay config before commands,
probes, or tool adapters execute.
## Run
```bash
npx codedecay execute --format markdown
npx codedecay execute --cwd ../my-repo --format json
npx codedecay execute --cwd ../my-repo --format json --output codedecay-execute.json
```
Exit codes:
- `0`: all configured commands passed, or all commands were safely skipped.
- `1`: one or more configured commands failed, timed out, or errored.
- `2`: CLI/internal error, such as an invalid config file.
## Config
```yaml
version: 1
commands:
  test:
    - pnpm test
  build:
    - pnpm build
  start:
    - pnpm dev
probes:
  - name: users api
    command: curl -f http://localhost:3000/api/users
    timeoutMs: 5000
toolAdapters:
  playwright:
    command: pnpm exec playwright test
  stryker:
    command: pnpm exec stryker run
  schemathesis:
    schema: docs/openapi.yaml
    baseUrl: http://127.0.0.1:3000
  pact:
    command: pnpm run test:pact
safety:
  commandTimeoutMs: 120000
  allowCommands: true
```
CodeDecay supports these configured command groups:
- `commands.test`
- `commands.build`
- `commands.start`
- `probes`
- `toolAdapters.playwright`
- `toolAdapters.stryker`
- `toolAdapters.schemathesis`
- `toolAdapters.pact`
Each command runs from the configured `--cwd` directory. Probe-level
`timeoutMs` overrides the global `safety.commandTimeoutMs`. Tool adapters use
their own configured command and timeout, then return normalized tool evidence
separately from raw command/probe results.
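The timeout precedence for probes amounts to "use the probe value if set, otherwise the global one". A minimal sketch, with `effective_timeout_ms` as a hypothetical helper name:

```bash
# Hypothetical helper: pick the timeout that applies to one probe.
#   $1: the probe's own timeoutMs (may be empty when not configured)
#   $2: the global safety.commandTimeoutMs
effective_timeout_ms() {
  probe_timeout="$1"
  global_timeout="$2"
  echo "${probe_timeout:-$global_timeout}"
}
```

With the config above, the `users api` probe would resolve to `5000`, while a probe without `timeoutMs` would resolve to `120000`.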
## Safety Rules
- CodeDecay only runs commands from CodeDecay config.
- CodeDecay does not run commands suggested by LLMs, MCP clients, memory files,
or remote services.
- Command execution is disabled unless `safety.allowCommands` is true.
- Command output is captured locally in the execution report.
- Tool adapter evidence is reported separately from AI suggestions.
- No telemetry, API keys, cloud services, LLMs, or model calls are required.
`commands.start` should use a short-lived smoke command or a low timeout unless
you intentionally want CodeDecay to verify that a long-running service starts
and then times out.
---
# Differential Behavior Checks
Source page: https://SubmuxHQ.github.io/CodeDecay/differential
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/differential.md
`codedecay differential` compares configured probe behavior between two git
refs. It creates temporary worktrees for `--base` and `--head`, runs the same
configured probes in both worktrees, reports behavior differences, and removes
the worktrees afterward.
Differential checks are useful when a PR looks locally tested but may change a
real behavior path outside the touched files.
## Run
```bash
npx codedecay differential --base main --head HEAD --format markdown
npx codedecay differential --cwd ../my-repo --base origin/main --head HEAD --format json
npx codedecay differential --base main --head HEAD --output codedecay-differential.md
```
`--base` and `--head` are required.
Exit codes:
- `0`: configured probes behaved the same, or probes were safely skipped.
- `1`: probe behavior changed, timed out, or hit an execution error.
- `2`: CLI/internal error, such as missing refs or invalid config.
## What It Compares
CodeDecay compares each configured probe by:
- command status
- exit code
- JSON stdout when stdout is valid JSON
- text stdout when stdout is not JSON
- stderr
The report includes base/head status, exit codes, output snippets for changed
or failed probes, and the exact differences detected.
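The comparison idea can be tried by hand on two checkouts. The sketch below is a simplified stand-in, not CodeDecay's implementation: `compare_probe` is a hypothetical helper, it compares stdout byte for byte rather than parsing JSON, and it expects the two directories to exist already instead of creating git worktrees.

```bash
# Hypothetical helper: run one probe command in two directories and report
# differences in exit code, stdout, and stderr. Returns 0 when nothing changed.
compare_probe() {
  base_dir="$1"
  head_dir="$2"
  probe_cmd="$3"
  work="$(mktemp -d)"

  (cd "$base_dir" && sh -c "$probe_cmd") >"$work/base.out" 2>"$work/base.err"
  base_code=$?
  (cd "$head_dir" && sh -c "$probe_cmd") >"$work/head.out" 2>"$work/head.err"
  head_code=$?

  result="same"
  [ "$base_code" -eq "$head_code" ] || { echo "exit code: $base_code -> $head_code"; result="changed"; }
  cmp -s "$work/base.out" "$work/head.out" || { echo "stdout differs"; result="changed"; }
  cmp -s "$work/base.err" "$work/head.err" || { echo "stderr differs"; result="changed"; }

  rm -rf "$work"
  [ "$result" = "same" ]
}
```

Only pass probe commands you would be willing to list in CodeDecay config, since the helper runs them through `sh -c`.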
## Config
Differential checks use probes from the current repo config:
```yaml
version: 1
commands: {}
probes:
  - name: users api
    command: node scripts/check-users-api.js
    timeoutMs: 5000
safety:
  commandTimeoutMs: 120000
  allowCommands: true
```
Only `probes` are used by `codedecay differential`. Test, build, and start
commands are handled by `codedecay execute`.
## Safety Model
- Probes must come from CodeDecay config.
- `safety.allowCommands` must be true or probes are skipped.
- Probes run in temporary git worktrees, not by mutating the current checkout.
- Worktrees are removed after the run.
- CodeDecay does not run commands from LLMs, memory files, MCP clients, or
remote services.
- No telemetry, API keys, cloud services, LLMs, or model calls are required.
---
# Test Proof Audit
Source page: https://SubmuxHQ.github.io/CodeDecay/test-audit
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/test-audit.md
CodeDecay summarizes deterministic test signals into a test proof audit.
The audit asks:
```text
Are the changed tests actually proving the changed behavior will not break?
```
The first implementation is deterministic and uses existing analyzer findings.
It does not run mutation testing, execute commands, call models, or use cloud
services.
## Statuses
- `missing`: changed source behavior does not have nearby changed test proof.
- `weak`: changed tests exist, but deterministic rules found weak proof
signals.
- `present`: changed tests are present and no deterministic weak-test signals
were found.
- `not_applicable`: no changed source or test files require a test proof audit.
## Current Signals
The audit consumes existing analyzer findings, including:
- `missing-nearby-tests`
- `test-without-assertions`
- `snapshot-only-test`
- `mocked-changed-source`
- `unrelated-test-change`
- `copied-implementation-in-test`
- `happy-path-only-test`
- `heavy-mocking`
- `test-bloat`
## Future OSS Adapters
Future adapters such as StrykerJS can add stronger mutation-testing evidence to
this audit. They should remain explicit, local-first, and opt-in.
---
# First PR Safety Efficacy Benchmark
Source page: https://SubmuxHQ.github.io/CodeDecay/evals/first-efficacy-report
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/evals/first-efficacy-report.md
This benchmark is a small, deterministic proof that CodeDecay can catch seeded PR risks that ordinary passing tests miss.
It is not a claim that CodeDecay makes every PR safe. It is a regression harness for the product promise: find what a coding agent may have missed before merge.
## How to run
```bash
pnpm eval:pr-safety -- --run-id local-pr-safety-eval
```
Artifacts are written under `.codedecay/local/evals/<run-id>/`.
## Current benchmark result
- Status: passed
- Scenarios: 2
- Issues: 0
## Scenarios
### API/auth regression hidden by copied implementation tests
A coding agent can add tests that mirror the changed implementation while missing the real API authorization regression.
| Signal | Result |
| --- | --- |
| Scenario status | passed |
| Baseline tests | exit 0 |
| Baseline behavior probe | exit 0 |
| Risky weak tests | exit 0 |
| Risky behavior probe | exit 1 |
| CodeDecay risk | high (100/100 merge, 0/100 decay) |
| Test proof status | weak |
| Weak-test findings | 2 |
| Missing-test findings | 0 |
Expected evidence:
- Pass: baseline tests pass
- Pass: baseline behavior probe passes
- Pass: risky weak tests still pass
- Pass: risky behavior probe catches regression
- Pass: CodeDecay reports high risk
- Pass: CodeDecay reports expected impacted areas
- Pass: CodeDecay reports expected finding rules
- Pass: Redteam report classifies test proof correctly
- Pass: Redteam report contains expected weak-test evidence
- Pass: Redteam report contains expected missing-test evidence
- Pass: Redteam report suggests edge cases
- Pass: Redteam edge cases are actionable
- Pass: Redteam report creates fix tasks
- Pass: Redteam fix tasks are actionable
### Config/database runtime regression missed by normal tests
A PR can pass a narrow unit test while changing runtime defaults and database semantics that affect production behavior.
| Signal | Result |
| --- | --- |
| Scenario status | passed |
| Baseline tests | exit 0 |
| Baseline behavior probe | exit 0 |
| Risky weak tests | exit 0 |
| Risky behavior probe | exit 1 |
| CodeDecay risk | high (76/100 merge, 0/100 decay) |
| Test proof status | missing |
| Weak-test findings | 0 |
| Missing-test findings | 1 |
Expected evidence:
- Pass: baseline tests pass
- Pass: baseline behavior probe passes
- Pass: risky weak tests still pass
- Pass: risky behavior probe catches regression
- Pass: CodeDecay reports high risk
- Pass: CodeDecay reports expected impacted areas
- Pass: CodeDecay reports expected finding rules
- Pass: Redteam report classifies test proof correctly
- Pass: Redteam report contains expected weak-test evidence
- Pass: Redteam report contains expected missing-test evidence
- Pass: Redteam report suggests edge cases
- Pass: Redteam edge cases are actionable
- Pass: Redteam report creates fix tasks
- Pass: Redteam fix tasks are actionable
## Safety boundaries
- No telemetry.
- No cloud dependency.
- No API keys.
- No LLM/model calls.
- Fixtures run inside local temporary git repositories.
The benchmark uses deterministic CodeDecay reports plus explicit behavior probes. AI or agent suggestions should be evaluated separately from this tool evidence.
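The exit-code pattern in the scenario tables above can be read as a single predicate. The following is a minimal sketch for illustration only: the `ScenarioRun` shape and the `demonstratesMissedRegression` name are assumptions and are not taken from the benchmark's code.

```ts
// Hypothetical shape mirroring the "Signal | Result" rows in the scenario tables.
interface ScenarioRun {
  baselineTestsExit: number;
  baselineProbeExit: number;
  riskyWeakTestsExit: number;
  riskyProbeExit: number;
  codeDecayRisk: "low" | "medium" | "high";
}

// A scenario shows the targeted gap when the baseline is clean, the weak tests
// still pass on the risky change, the behavior probe fails, and the reported
// risk is high.
function demonstratesMissedRegression(run: ScenarioRun): boolean {
  return (
    run.baselineTestsExit === 0 &&
    run.baselineProbeExit === 0 &&
    run.riskyWeakTestsExit === 0 &&
    run.riskyProbeExit !== 0 &&
    run.codeDecayRisk === "high"
  );
}

// Values copied from the API/auth scenario table.
const apiAuthScenario: ScenarioRun = {
  baselineTestsExit: 0,
  baselineProbeExit: 0,
  riskyWeakTestsExit: 0,
  riskyProbeExit: 1,
  codeDecayRisk: "high"
};
```

Both scenarios listed above fit this pattern: the weak tests exit 0 while the behavior probe exits 1.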
---
# Tool Adapters
Source page: https://SubmuxHQ.github.io/CodeDecay/tool-adapters
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/tool-adapters.md
# Tool Adapters
CodeDecay should use existing open-source tools instead of rebuilding their
capabilities. Tool adapters normalize local tool execution into CodeDecay
harness evidence.
The first adapters are:
- Playwright for browser/user-flow checks.
- StrykerJS for mutation-testing evidence.
- Schemathesis for OpenAPI/GraphQL API fuzzing evidence.
- Pact for contract-testing evidence.
## Configuring Adapters
Adapters are configured in CodeDecay config. `codedecay redteam` lists adapter
plans but does not run them.
```yaml
version: 1
toolAdapters:
  playwright: true
  stryker:
    command: pnpm exec stryker run
  schemathesis:
    schema: docs/openapi.yaml
    baseUrl: http://127.0.0.1:3000
  pact:
    command: pnpm run test:pact
safety:
  allowCommands: false
```
Set `safety.allowCommands: true` only for explicit execution commands. Redteam
reports remain report-only even when adapter plans are configured.
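The gating rule described here can be sketched as a small decision function. This is an illustrative sketch, not CodeDecay's internal API: `AdapterPlan`, `AdapterDecision`, `decideAdapterAction`, and the `invokedBy` values are hypothetical names chosen for the example.

```ts
// Hypothetical types for illustration; the real adapter API is private.
interface AdapterPlan {
  adapter: string;
  command: string;
}

type AdapterDecision =
  | { adapter: string; action: "report-only"; reason: string }
  | { adapter: string; action: "execute"; command: string };

function decideAdapterAction(
  plan: AdapterPlan,
  safety: { allowCommands: boolean },
  invokedBy: "redteam" | "explicit-execution"
): AdapterDecision {
  // Redteam lists adapter plans but never runs them, regardless of config.
  if (invokedBy === "redteam") {
    return { adapter: plan.adapter, action: "report-only", reason: "redteam is report-only" };
  }
  // Explicit execution still requires the safety opt-in.
  if (!safety.allowCommands) {
    return { adapter: plan.adapter, action: "report-only", reason: "safety.allowCommands is false" };
  }
  return { adapter: plan.adapter, action: "execute", command: plan.command };
}

const strykerPlan: AdapterPlan = { adapter: "stryker", command: "pnpm exec stryker run" };
const fromRedteam = decideAdapterAction(strykerPlan, { allowCommands: true }, "redteam");
```

In this sketch `fromRedteam.action` is `"report-only"` even though `allowCommands` is `true`, which matches the statement that redteam reports stay report-only.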
## Playwright Harness
The Playwright harness is a private internal package API for now:
```ts
createPlaywrightHarness({
  command: "pnpm exec playwright test",
  allowCommands: true
});
```
Safety defaults:
- command execution is disabled unless `allowCommands: true` is provided,
- commands go through `@submuxhq/codedecay-execution`,
- unsafe commands are blocked by the shared safety policy,
- Playwright is not installed by CodeDecay,
- browsers are not installed by CodeDecay,
- no telemetry, LLM calls, API keys, or CodeDecayCloud dependency are used.
The default command is:
```bash
pnpm exec playwright test
```
Projects can override the command when they already have their own Playwright
script, shard, config file, or browser setup.
## StrykerJS Harness
The StrykerJS harness is also a private internal package API for now:
```ts
createStrykerHarness({
  command: "pnpm exec stryker run",
  allowCommands: true
});
```
Safety defaults:
- command execution is disabled unless `allowCommands: true` is provided,
- commands go through `@submuxhq/codedecay-execution`,
- unsafe commands are blocked by the shared safety policy,
- StrykerJS is not installed by CodeDecay,
- no telemetry, LLM calls, API keys, or CodeDecayCloud dependency are used.
The default command is:
```bash
pnpm exec stryker run
```
Projects can override the command when they already have their own Stryker
script, mutation score threshold, or package manager setup.
## Schemathesis Harness
The Schemathesis harness is also a private internal package API for now:
```ts
createSchemathesisHarness({
  schema: "openapi.yaml",
  baseUrl: "http://127.0.0.1:3000",
  allowCommands: true
});
```
Safety defaults:
- command execution is disabled unless `allowCommands: true` is provided,
- commands go through `@submuxhq/codedecay-execution`,
- unsafe commands are blocked by the shared safety policy,
- Schemathesis is not installed by CodeDecay,
- API servers are not started by CodeDecay,
- no telemetry, LLM calls, API keys, or CodeDecayCloud dependency are used.
The default command is:
```bash
st run openapi.yaml --url http://127.0.0.1:3000
```
Projects can override the full command when they already use a different
Schemathesis entry point, package manager, schema location, base URL, or
service startup flow:
```ts
createSchemathesisHarness({
  command: "uvx schemathesis run docs/openapi.yaml --url http://127.0.0.1:4000",
  allowCommands: true
});
```
## Pact Harness
The Pact harness is also a private internal package API for now:
```ts
createPactHarness({
  command: "pnpm run test:pact",
  allowCommands: true
});
```
Safety defaults:
- command execution is disabled unless `allowCommands: true` is provided,
- commands go through `@submuxhq/codedecay-execution`,
- unsafe commands are blocked by the shared safety policy,
- Pact is not installed by CodeDecay,
- Pact Broker and PactFlow are not required by CodeDecay,
- no telemetry, LLM calls, API keys, or CodeDecayCloud dependency are used.
The default command is:
```bash
pnpm run test:pact
```
Projects can override the command when they already have their own Pact
consumer/provider test script, local pact file setup, or broker-backed CI flow.
## Future Adapters
The same package can add adapters for coverage tools and test runners. Each
adapter should use safe configured execution and return evidence rather than
bypassing CodeDecay safety rules.
---
# Agent Skills
Source page: https://SubmuxHQ.github.io/CodeDecay/skills
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/skills.md
# Agent Skills
CodeDecay can load repo-local agent skills from:
```text
.agents/skills/*/SKILL.md
```
Skills are portable review instructions for the developer or their own agent.
They can help Codex, Claude Code, Cursor, MCP clients, desktop agents, or
internal company agents ask better PR-safety questions.
CodeDecay treats skill files as local, untrusted context:
- it does not execute skill content,
- it does not follow arbitrary links from skill files,
- it does not fetch external skills,
- it does not call an LLM,
- it does not send telemetry.
## Example
```text
.agents/skills/pr-red-team/SKILL.md
.agents/skills/test-quality-review/SKILL.md
```
Each skill should start with a Markdown title and a short first paragraph:
```markdown
# PR Red-Team Skill
Find what a coding agent may have missed before merge.
```
`codedecay redteam` includes a compact `Agent Skills` section with the skill
title, path, and summary. Full skill content stays in the repo-local skill file
for the user's agent to read when needed.
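As a rough illustration of how a title and summary could be derived from a skill file, the sketch below takes the first Markdown H1 as the title and the first paragraph after it as the summary. The `SkillSummary` type and `summarizeSkill` function are hypothetical, and the parsing rules are assumptions rather than the loader's actual behavior. The skill content is only read as text and is never executed.

```ts
// Hypothetical summary shape: title, path, and summary as listed in the report.
interface SkillSummary {
  title: string;
  path: string;
  summary: string;
}

function summarizeSkill(path: string, markdown: string): SkillSummary {
  let title = "";
  const paragraph: string[] = [];
  for (const raw of markdown.split(/\r?\n/)) {
    const line = raw.trim();
    if (title === "") {
      // Skip everything until the first "# Title" line.
      const match = /^#\s+(.+)$/.exec(line);
      if (match) title = (match[1] ?? "").trim();
      continue;
    }
    if (line === "") {
      if (paragraph.length > 0) break; // blank line ends the first paragraph
      continue;
    }
    if (line.startsWith("#")) break; // next heading ends the summary
    paragraph.push(line);
  }
  return { title, path, summary: paragraph.join(" ") };
}

const example = summarizeSkill(
  ".agents/skills/pr-red-team/SKILL.md",
  "# PR Red-Team Skill\nFind what a coding agent may have missed before merge.\n\nDetails follow."
);
```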
## Current Scope
The first loader only reads `.agents/skills/*/SKILL.md` from the analyzed repo.
Future adapters can map the same concept to other local or user-owned skill
systems, but the OSS default remains local-first.
---
# Local Repo Memory
Source page: https://SubmuxHQ.github.io/CodeDecay/memory
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/memory.md
# Local Repo Memory
CodeDecay can read repo-local memory from `.codedecay/memory.json` and use it
to enrich PR risk reports with project-specific flows, commands, invariants,
architecture notes, and past regressions.
Memory is optional. If no memory file exists, CodeDecay uses empty defaults.
The memory file is local to the repository, is never uploaded by CodeDecay, and
does not require telemetry, API keys, LLMs, model calls, or a hosted service.
## Inspect Memory
```bash
npx codedecay memory --format markdown
npx codedecay memory --cwd ../my-repo --format json
```
`codedecay analyze` automatically applies memory when `.codedecay/memory.json`
exists in the analyzed repository.
## File Format
```json
{
  "version": 1,
  "flows": [
    {
      "name": "Checkout",
      "description": "Customer checkout from cart to payment confirmation.",
      "areas": ["api", "ui"],
      "checks": [
        "failed card retry",
        "missing shipping address",
        "duplicate webhook delivery"
      ]
    }
  ],
  "commands": [
    {
      "name": "Checkout smoke tests",
      "command": "pnpm test checkout",
      "areas": ["api", "ui"]
    }
  ],
  "invariants": [
    {
      "name": "Auth fails closed",
      "description": "Missing or invalid users must not become admins.",
      "areas": ["auth"],
      "severity": "high"
    }
  ],
  "architecture": [
    {
      "title": "Session boundary",
      "note": "Session parsing feeds all API routes.",
      "files": ["src/auth/*"]
    }
  ],
  "regressions": [
    {
      "title": "Anonymous admin fallback",
      "description": "A previous fallback user path granted admin access.",
      "areas": ["auth"],
      "check": "request protected routes without a token",
      "severity": "high"
    }
  ]
}
```
All top-level arrays are optional. Unknown fields are ignored by v1.
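The "optional arrays, unknown fields ignored" rule can be sketched as a normalization step. This is a minimal sketch under assumptions: `normalizeMemory` and `NormalizedMemory` are hypothetical names, and entry-level validation is left out.

```ts
// The five top-level arrays shown in the file format above.
const ARRAY_KEYS = ["flows", "commands", "invariants", "architecture", "regressions"] as const;
type ArrayKey = (typeof ARRAY_KEYS)[number];
type NormalizedMemory = { version: number } & Record<ArrayKey, unknown[]>;

function normalizeMemory(raw: unknown): NormalizedMemory {
  const source: Record<string, unknown> =
    typeof raw === "object" && raw !== null ? (raw as Record<string, unknown>) : {};
  // Start from empty defaults so a missing file or missing arrays are fine.
  const result: NormalizedMemory = {
    version: 1,
    flows: [],
    commands: [],
    invariants: [],
    architecture: [],
    regressions: []
  };
  const version = source["version"];
  if (typeof version === "number") result.version = version;
  // Copy only the known arrays; any other top-level field is dropped.
  for (const key of ARRAY_KEYS) {
    const value = source[key];
    if (Array.isArray(value)) result[key] = value;
  }
  return result;
}

const emptyMemory = normalizeMemory(undefined);
const partialMemory = normalizeMemory({ version: 1, flows: [{ name: "Checkout" }], extra: true });
```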
## Matchers
Memory entries can match changed code by impacted area, file path, or both.
Supported `areas` values:
- `api`
- `ui`
- `database`
- `auth`
- `config`
- `test`
- `source`
- `docs`
Supported `files` values are simple path patterns:
- exact path: `src/auth/session.ts`
- contains match: `auth`
- wildcard match: `src/auth/*`
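The three pattern forms above could be implemented along these lines. This is a sketch under stated assumptions, not the actual matcher: it assumes `*` matches any run of characters (including `/`), and that a pattern without `*` matches when it equals the path or appears anywhere inside it. `matchesFilePattern` is a hypothetical name.

```ts
function matchesFilePattern(pattern: string, filePath: string): boolean {
  if (!pattern.includes("*")) {
    // Exact path or "contains" match.
    return filePath === pattern || filePath.includes(pattern);
  }
  // Wildcard match: escape regex metacharacters, then turn each "*" into ".*".
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`).test(filePath);
}

const exact = matchesFilePattern("src/auth/session.ts", "src/auth/session.ts");
const contains = matchesFilePattern("auth", "src/auth/session.ts");
const wildcard = matchesFilePattern("src/auth/*", "src/auth/session.ts");
const unrelated = matchesFilePattern("src/auth/*", "src/api/users.ts");
```

Under these assumptions the first three calls return `true` and the last returns `false`.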
## Report Behavior
When memory matches a PR, CodeDecay may add:
- findings for impacted invariants
- findings for past regression areas
- findings for matching architecture notes
- recommended checks for flows
- recommended commands from the memory file
CodeDecay does not run memory commands automatically. They are reported as
project-specific checks for the user or future execution adapters.
## Future Adapters
The v1 memory provider is the local `.codedecay/memory.json` file. CodeDecay
formalizes this behind a `MemoryProvider` interface so future adapters can map
the same provider shape to open-source or user-owned memory systems such as
Mem0 or Supermemory, while preserving the local-first default.
Any future hosted or external memory adapter should be opt-in, never required
for `codedecay analyze`, and must not change deterministic baseline scoring.
The built-in provider is:
```text
id: local
name: Local .codedecay memory
kind: local
```
External providers are not enabled by default. They must not add telemetry,
hidden network calls, API key requirements, LLM calls, or CodeDecayCloud
dependencies to the OSS workflow.
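One way to picture the opt-in rule is a filter over provider descriptors. The sketch below is hypothetical: the doc names a `MemoryProvider` interface but does not show its shape, so `MemoryProviderInfo`, the `"external"` kind, and `enabledProviders` are assumptions made for illustration.

```ts
// Hypothetical descriptor; only id, name, and kind of the built-in provider
// are taken from the text above.
interface MemoryProviderInfo {
  id: string;
  name: string;
  kind: "local" | "external";
}

const localProvider: MemoryProviderInfo = {
  id: "local",
  name: "Local .codedecay memory",
  kind: "local"
};

// Local providers are always available; external ones only when opted in.
function enabledProviders(
  available: MemoryProviderInfo[],
  optedInIds: string[]
): MemoryProviderInfo[] {
  return available.filter((p) => p.kind === "local" || optedInIds.includes(p.id));
}

// "example-external" is a made-up id, not a real adapter.
const exampleExternal: MemoryProviderInfo = {
  id: "example-external",
  name: "Example external memory",
  kind: "external"
};
const defaults = enabledProviders([localProvider, exampleExternal], []);
```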
---
# LLM Providers
Source page: https://SubmuxHQ.github.io/CodeDecay/llm-providers
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/llm-providers.md
# LLM Providers
CodeDecay is deterministic by default. The default configuration does not call
an LLM, does not require API keys, and does not use a hosted CodeDecay model.
Future or opt-in red-team commands can use user-owned providers for edge-case
reasoning. Model output must be treated as untrusted suggestions, not commands
to execute.
## Disabled By Default
```yaml
llm:
  provider: disabled
  timeoutMs: 30000
```
This is the default when no config file exists.
## Local Ollama
Ollama support is designed for local models running on the user's machine.
```yaml
llm:
  provider: ollama
  model: qwen2.5-coder
  endpoint: http://127.0.0.1:11434
  timeoutMs: 30000
```
CodeDecay should only call this provider from commands that explicitly opt into
LLM assistance. The current deterministic `codedecay analyze` command does not
call an LLM.
## LiteLLM / OpenAI-Compatible BYOK
CodeDecay can construct a LiteLLM/OpenAI-compatible provider for local or BYOK
setups. It does not default to a hosted endpoint; you must provide the endpoint
and model explicitly.
```yaml
llm:
  provider: litellm
  model: gpt-4.1-mini
  endpoint: http://127.0.0.1:4000/v1
  apiKeyEnv: LITELLM_API_KEY
  timeoutMs: 30000
```
`apiKeyEnv` is the name of an environment variable. Do not put literal API keys
in config files.
The provider uses an OpenAI-compatible `/chat/completions` request. Responses
are parsed into untrusted suggestions when possible. CodeDecay must not execute
commands from model output.
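To make the `apiKeyEnv` indirection concrete, the sketch below builds an OpenAI-compatible `/chat/completions` request from a config like the one above without sending it. `LlmConfig`, `ChatRequest`, and `buildChatRequest` are hypothetical names, the sketch covers only the `disabled` and `litellm` providers, and the environment is passed in as a plain object so the example stays self-contained.

```ts
interface LlmConfig {
  provider: "disabled" | "litellm";
  model?: string;
  endpoint?: string;
  apiKeyEnv?: string;
  timeoutMs?: number;
}

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
  timeoutMs: number;
}

function buildChatRequest(
  config: LlmConfig,
  env: Record<string, string | undefined>,
  prompt: string
): ChatRequest | null {
  if (config.provider === "disabled") return null; // default: no model call
  if (!config.endpoint || !config.model) {
    throw new Error("endpoint and model must be provided explicitly");
  }
  const headers: Record<string, string> = { "content-type": "application/json" };
  // The config stores only the variable name; the key itself comes from env.
  const apiKey = config.apiKeyEnv ? env[config.apiKeyEnv] : undefined;
  if (apiKey) headers["authorization"] = `Bearer ${apiKey}`;
  return {
    url: `${config.endpoint.replace(/\/+$/, "")}/chat/completions`,
    headers,
    body: JSON.stringify({ model: config.model, messages: [{ role: "user", content: prompt }] }),
    timeoutMs: config.timeoutMs ?? 30000
  };
}

const request = buildChatRequest(
  { provider: "litellm", model: "gpt-4.1-mini", endpoint: "http://127.0.0.1:4000/v1", apiKeyEnv: "LITELLM_API_KEY" },
  { LITELLM_API_KEY: "example-key" },
  "List edge cases for this diff."
);
```

Whatever text comes back from such a request would still be treated as an untrusted suggestion and never executed.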
## Future Providers
The provider interface leaves room for additional adapters later. Those adapters
should remain optional and must not change the default local-first behavior.
---
# Sample CodeDecay Reports
Source page: https://SubmuxHQ.github.io/CodeDecay/sample-reports/
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/sample-reports/index.md
# Sample CodeDecay Reports
These reports show the output CodeDecay produces for a realistic JavaScript and
TypeScript pull request.
The sample diff changes:
- a UI route: `app/dashboard/page.tsx`
- an API route: `src/api/users.ts`
- an auth/session file: `src/auth/session.ts`
- a Prisma schema file: `prisma/schema.prisma`
- a build/runtime config file: `vite.config.ts`
The reports were generated with the local CodeDecay CLI from a temporary git
repository containing those changes.
## What To Read First
Start with the Markdown report:
- [sample-report.md](/sample-reports/sample-report)
Look at these sections first:
- **Overall risk**: the high-level merge risk and decay scores.
- **Likely Impacted Areas**: the app surfaces CodeDecay thinks may be affected.
- **Likely Impacted Routes And APIs**: concrete user/API paths to verify when
framework-aware route evidence is available.
- **High Risk Findings**: the findings most likely to need reviewer attention.
- **Recommended Checks**: tests or manual checks to run before merge. Prefer
checks that exercise the real route, API, UI, database, or downstream path
instead of only helper-level behavior.
For automation and integrations:
- `sample-report.json` is the stable machine-readable report.
- `sample-report.sarif` is the code-scanning-oriented report.
CodeDecay generated these reports locally without telemetry, API keys, LLMs, or
model calls.
---
# Sample CodeDecay Markdown Report
Source page: https://SubmuxHQ.github.io/CodeDecay/sample-reports/sample-report
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/sample-reports/sample-report.md
## CodeDecay Report
**Overall risk:** High
| Score | Value |
| --- | ---: |
| Merge risk | 100/100 |
| Decay risk | 62/100 |
| Findings | Count |
| --- | ---: |
| High | 5 |
| Medium | 4 |
| Low | 0 |
### Changed Files
- `app/dashboard/page.tsx` modified (+1/-1)
- `prisma/schema.prisma` modified (+3/-1)
- `src/api/users.ts` modified (+5/-1)
- `src/auth/session.ts` modified (+6/-1)
- `vite.config.ts` modified (+4/-1)
### Likely Impacted Areas
- High **API surface** (api): `src/api/users.ts`
- High **Authentication and authorization** (auth): `src/auth/session.ts`
- High **Database and schema** (database): `prisma/schema.prisma`
- Medium **Build and runtime configuration** (config): `vite.config.ts`
- Medium **UI route** (ui): `app/dashboard/page.tsx`
### Likely Impacted Routes And APIs
- Medium `/dashboard` (Next.js UI route): `app/dashboard/page.tsx`
### High Risk Findings
- **Risky source changes without changed tests** (`app/dashboard/page.tsx:2`): This PR changes risky source areas but does not change any obvious test files.
- **Api area changed** (`src/api/users.ts:1`): src/api/users.ts touches a api area and should be reviewed for regression impact.
- **Auth area changed** (`src/auth/session.ts:2`): src/auth/session.ts touches a auth area and should be reviewed for regression impact.
- **Database area changed** (`prisma/schema.prisma:2`): prisma/schema.prisma touches a database area and should be reviewed for regression impact.
- **Potential silent failure path** (`src/auth/session.ts:5`): src/auth/session.ts adds code that can hide type, lint, or runtime failures.
### Medium Risk Findings
- **Broad unrelated change set**: This PR changes 5 files across 4 top-level areas and 5 risk categories.
- **Config area changed** (`vite.config.ts:1`): vite.config.ts touches a config area and should be reviewed for regression impact.
- **Ui area changed** (`app/dashboard/page.tsx:2`): app/dashboard/page.tsx touches a ui area and should be reviewed for regression impact.
- **New unchecked TypeScript escape hatch** (`src/api/users.ts:1`): src/api/users.ts adds code that can hide type, lint, or runtime failures.
### Recommended Checks
- `Add or run tests covering app/dashboard/page.tsx`
- `Add or run tests covering src/api/users.ts`
- `Add or run tests covering src/auth/session.ts`
- `Add or run tests covering vite.config.ts`
### Notes
CodeDecay is deterministic and local-first. This report was generated without telemetry, API keys, LLMs, or model calls.
---
# Scoring Model
Source page: https://SubmuxHQ.github.io/CodeDecay/scoring
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/scoring.md
# Scoring Model
CodeDecay produces two scores from 0 to 100.
## Merge Risk
Merge risk estimates how likely the PR is to break behavior that reviewers or CI
should care about before merge.
Signals include:
- API route changes
- auth/session/security changes
- database/schema changes
- config/build/deployment changes
- risky source changes without nearby test changes
- heavy mocking that may weaken regression confidence
## Decay Score
Decay score estimates whether the PR makes the codebase harder to maintain.
Signals include:
- duplicated added logic
- large changed functions
- high function complexity
- compiler or linter suppressions
- unchecked TypeScript escape hatches
- broad unrelated change sets
- large test changes weakly connected to source changes
## Thresholds
- `0-39`: low
- `40-69`: medium
- `70-100`: high
Scores are capped by the highest relevant finding severity. A report with only
low-severity merge-risk findings stays low, even if many low findings are
present. A report with only medium-severity merge-risk findings stays at most
medium. High risk requires high-severity evidence.
The v1 scoring model is deterministic. The same diff should produce the same
score.
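The thresholds and the severity cap can be sketched as two small pure functions. This is an illustration, not the scoring implementation: the numeric caps of 39 and 69 are assumptions derived from the threshold bands, and treating a report with no findings as capped at low is also an assumption.

```ts
type Level = "low" | "medium" | "high";

// Bands from the Thresholds list: 0-39 low, 40-69 medium, 70-100 high.
function levelForScore(score: number): Level {
  if (score >= 70) return "high";
  if (score >= 40) return "medium";
  return "low";
}

// Assumed caps: the top of each band.
const MAX_SCORE_FOR_SEVERITY: Record<Level, number> = { low: 39, medium: 69, high: 100 };
const RANK: Record<Level, number> = { low: 0, medium: 1, high: 2 };

function capScore(rawScore: number, findingSeverities: Level[]): number {
  const clamped = Math.min(100, Math.max(0, rawScore));
  // Find the highest severity present; no findings is treated as "low" here.
  let highest: Level = "low";
  for (const severity of findingSeverities) {
    if (RANK[severity] > RANK[highest]) highest = severity;
  }
  return Math.min(clamped, MAX_SCORE_FOR_SEVERITY[highest]);
}

// Many low findings cannot push the score past the low band.
const manyLow = capScore(85, ["low", "low", "low"]);
const onlyMedium = capScore(85, ["medium", "low"]);
const withHigh = capScore(85, ["high", "low"]);
```

Because both functions are pure, the same inputs always give the same result, which is the determinism property described above.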
## No LLM Required
CodeDecay does not call a model to decide risk. It uses git diff data,
path-based impact detection, local JS/TS source analysis, and deterministic
rules.
---
# Research Basis
Source page: https://SubmuxHQ.github.io/CodeDecay/research
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/research.md
# Research Basis
CodeDecay is motivated by research on software evolution, pull request impact,
and AI-era code quality risks.
## SlopCodeBench
SlopCodeBench studies how coding agents degrade over long-horizon iterative
tasks. It tracks verbosity and structural erosion, including duplicated code and
complexity concentration in high-complexity functions.
Reference:
https://arxiv.org/html/2603.24755v1
## Does Code Decay?
“Does Code Decay? Assessing the Evidence from Change Management Data” connects
software evolution data to code decay and maintenance risk.
References:
- https://www.niss.org/sites/default/files/technicalreports/tr81.pdf
- https://dl.acm.org/doi/10.1109/32.895984
## Pull Request Change Impact
Pull request change impact research supports using code structure and changed
artifact relationships to improve review focus and risk awareness.
Reference:
https://link.springer.com/article/10.1007/s10664-024-10600-2
## AI Code Quality, Churn, And Duplication
AI-assisted development can increase code volume, churn, and duplicated code
when teams optimize only for immediate output. CodeDecay turns those concerns
into local PR checks.
Reference:
https://www.gitclear.com/ai_assistant_code_quality_2025_research
---
# Releasing
Source page: https://SubmuxHQ.github.io/CodeDecay/releasing
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/releasing.md
# Releasing
CodeDecay publishes one npm package for v1:
```text
@submuxhq/codedecay
```
The package source is `packages/cli`, and the installed binary remains
`codedecay`.
CodeDecay also publishes an optional GitHub Packages npm mirror with the same
package name. GitHub Packages scopes packages by GitHub user or organization
owner, and this repository is owned by `SubmuxHQ`, so every release from v0.2.0
onward uses `@submuxhq/codedecay` for both npmjs and the GitHub Packages mirror.
npmjs is the default public install path for users:
```bash
npm install -D @submuxhq/codedecay
```
GitHub Packages is an authenticated mirror for GitHub-based workflows and
requires registry authentication for installs.
## Patch Release Checklist
Before opening the release PR, bump the published version in:
- `packages/cli/package.json`
- `packages/core/src/index.ts`
After the release PR is merged, release only from a clean `main` branch at the
commit that will be tagged and published. Do not publish npm contents from a
different commit than the Git tag.
Run:
```bash
pnpm install
pnpm run lint
pnpm typecheck
pnpm test
pnpm build
pnpm --filter @submuxhq/codedecay pack --dry-run
```
Inspect the tarball before publishing:
```bash
pnpm --filter @submuxhq/codedecay pack
tar -tzf submuxhq-codedecay-.tgz
```
The tarball must include:
```text
package/LICENSE
package/README.md
package/package.json
package/dist/index.js
package/dist/index.d.ts
```
Before publishing, run the installed-package smoke test against the packed tarball:
```bash
pnpm demo:published-package --tarball ./submuxhq-codedecay-.tgz --run-id v-tarball-smoke
```
This creates a fresh temp install, materializes the Next.js and Node API demo
repos, runs the installed `codedecay` binary, verifies JSON/Markdown/SARIF
outputs, and writes logs under:
```text
.codedecay/local/published-package-demo//run.json
.codedecay/local/published-package-demo//summary.md
```
Publish the scoped package with public access:
```bash
pnpm --filter @submuxhq/codedecay publish --access public
```
If npm requires a one-time password in a non-interactive shell, publish from the
package directory:
```bash
cd packages/cli
npm publish --access public --otp
```
After publishing, verify the public install path:
```bash
tmpdir=$(mktemp -d)
cd "$tmpdir"
npm install @submuxhq/codedecay@
node_modules/.bin/codedecay --help
```
After publishing, run the same smoke test against the registry package:
```bash
pnpm demo:published-package --package @submuxhq/codedecay@ --run-id v-published-smoke
```
Create the GitHub release for the same tag and verify the release surfaces stay
in sync:
```bash
git show --no-patch --decorate --oneline v
gh release view v
npm view @submuxhq/codedecay version dist-tags --json
```
The package version, npm `latest` dist-tag, Git tag, and GitHub release should
all refer to the same released version before the release is considered done.
## GitHub Packages Mirror
The GitHub Packages mirror is published from the same built CLI package. It uses
the same package name, `@submuxhq/codedecay`, and sets the registry to
`https://npm.pkg.github.com`.
Use npmjs for public end-user installs. Use GitHub Packages only when a workflow
or organization policy specifically needs a GitHub-hosted package mirror.
Prepare the mirror package locally after `pnpm build:packages`:
```bash
pnpm package:github --out /tmp/codedecay-ghpkg
cd /tmp/codedecay-ghpkg
npm pack --dry-run
```
The dry run should include:
```text
package/LICENSE
package/README.md
package/package.json
package/dist/index.js
package/dist/index.d.ts
```
Publish through the `Publish GitHub Packages` workflow. It uses the repository
`GITHUB_TOKEN` with `packages: write` permission and skips publishing if the
same mirror version already exists.
Use the exact release tag as the workflow `ref`. Do not publish from `main`
after unreleased commits have landed, because that can create a GitHub Packages
version whose contents differ from the npmjs package with the same version.
Manual dispatch:
```bash
gh workflow run publish-github-packages.yml -f ref=v
```
Install from GitHub Packages by adding the GitHub owner scope and an
authenticated token to `.npmrc`:
```text
@submuxhq:registry=https://npm.pkg.github.com
//npm.pkg.github.com/:_authToken=
```
Then install:
```bash
npm install -D @submuxhq/codedecay@
node_modules/.bin/codedecay version
```
GitHub Packages requires a personal access token (classic) with `read:packages`
for local installs. GitHub Actions can use `GITHUB_TOKEN` when the package is
associated with this repository and the workflow has package access.
If local verification fails with `403 permission_denied`, check the token scope
before changing package metadata. The default public npmjs install path should
still work without GitHub authentication:
```bash
npm install -D @submuxhq/codedecay
```
---
# Framework-Aware Route/API Impact Map
Source page: https://SubmuxHQ.github.io/CodeDecay/proposals/framework-aware-impact-map
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/proposals/framework-aware-impact-map.md
# Framework-Aware Route/API Impact Map
Status: implementation started
Proposal issue: [#35](https://github.com/SubmuxHQ/CodeDecay/issues/35)
Implementation issue: [#144](https://github.com/SubmuxHQ/CodeDecay/issues/144)
## Goal
CodeDecay should make regression risk more actionable by translating changed
JavaScript and TypeScript files into affected routes, API endpoints, and request
surfaces where that can be done deterministically.
The first implementation should focus on Next.js and Node API projects because
they are common adoption paths for CodeDecay and are already represented by the
example projects.
## Non-Goals
- No LLM, model, cloud, telemetry, or API-key dependency.
- No runtime tracing or server startup.
- No attempt to prove whether a change was AI-generated.
- No broad generic code review comments.
- No scoring change until the extracted impact map is covered by fixtures and
report tests.
## Proposed Report Field
Add an optional top-level field to the JSON report:
```ts
interface ImpactedRoute {
  framework: "nextjs" | "express" | "fastify" | "node";
  kind: "ui-route" | "api-route" | "middleware" | "route-handler";
  route: string;
  methods: string[];
  files: string[];
  risk: "low" | "medium" | "high";
  reasons: string[];
  recommendedTests: string[];
}
```
Example:
```json
{
  "impactedRoutes": [
    {
      "framework": "nextjs",
      "kind": "api-route",
      "route": "/api/users",
      "methods": ["GET"],
      "files": ["src/app/api/users/route.ts"],
      "risk": "high",
      "reasons": ["API route changed", "No nearby test changed"],
      "recommendedTests": ["Add or run tests covering src/app/api/users/route.ts"]
    }
  ]
}
```
Markdown reports should add a compact section after `Likely Impacted Areas`:
```markdown
### Likely Impacted Routes And APIs
- High `GET /api/users` (Next.js API route): `src/app/api/users/route.ts`
- Medium `/dashboard` (Next.js UI route): `src/app/dashboard/page.tsx`
```
SARIF should stay minimal for now. It can continue emitting file/line findings;
route data can be added later through SARIF `properties` only if GitHub code
scanning handles it cleanly.
## Supported Patterns
### Next.js
Supported first:
- `app/**/page.{js,jsx,ts,tsx}` -> UI route
- `src/app/**/page.{js,jsx,ts,tsx}` -> UI route
- `app/api/**/route.{js,ts}` -> API route
- `src/app/api/**/route.{js,ts}` -> API route
- `pages/api/**/*.{js,ts}` -> API route
- `src/pages/api/**/*.{js,ts}` -> API route
- `middleware.{js,ts}` and `src/middleware.{js,ts}` -> middleware
Route normalization:
- Remove `src/`, `app/`, and `pages/` prefixes.
- Remove `page`, `route`, and file extensions.
- Drop route groups such as `(admin)`; they add no path segment.
- Preserve dynamic segments such as `[id]` and `[...slug]`.
- Convert `index` pages to the parent route.
- For `app/api/users/route.ts`, report `/api/users`.
- For `app/dashboard/page.tsx`, report `/dashboard`.
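The normalization rules above can be sketched as a pure path transform. This is a minimal sketch, not the extractor: `normalizeNextRoute` is a hypothetical name, it assumes the input already matched one of the supported page, route, or Pages API patterns, and it treats any trailing `page`, `route`, or `index` file as a marker for the parent route.

```ts
function normalizeNextRoute(filePath: string): string {
  const parts = filePath.split("/").filter((part) => part.length > 0);
  const fileName = parts.pop() ?? "";
  let dirs = parts;
  // Remove the src/, app/, and pages/ prefixes.
  if (dirs[0] === "src") dirs = dirs.slice(1);
  if (dirs[0] === "app" || dirs[0] === "pages") dirs = dirs.slice(1);
  // Drop route groups such as (admin); dynamic segments like [id] are kept.
  const routeDirs = dirs.filter((dir) => !/^\(.+\)$/.test(dir));
  // Remove the file extension, then drop page/route/index markers.
  const stem = fileName.replace(/\.(js|jsx|ts|tsx)$/, "");
  const isMarker = stem === "page" || stem === "route" || stem === "index";
  const segments = isMarker ? routeDirs : [...routeDirs, stem];
  return "/" + segments.join("/");
}

const apiRoute = normalizeNextRoute("app/api/users/route.ts");
const uiRoute = normalizeNextRoute("app/dashboard/page.tsx");
const grouped = normalizeNextRoute("src/app/(admin)/dashboard/page.tsx");
const dynamic = normalizeNextRoute("src/app/users/[id]/page.tsx");
const pagesApi = normalizeNextRoute("src/pages/api/users.ts");
```

With these inputs the sketch yields `/api/users`, `/dashboard`, `/dashboard`, `/users/[id]`, and `/api/users` respectively.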
HTTP methods:
- For Next.js route handlers, detect exported functions named `GET`, `POST`,
`PUT`, `PATCH`, `DELETE`, `HEAD`, or `OPTIONS`.
- If no method is found, use `["*"]`.
- UI routes should use an empty method list.
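For route handlers, method detection could look roughly like the sketch below. It is a regex-based simplification that only recognizes `export function` and `export async function` declarations; an AST-based approach would be more robust. `detectRouteHandlerMethods` is a hypothetical name, and UI routes would skip this step and use an empty list.

```ts
const HTTP_METHODS = ["GET", "POST", "PUT", "PATCH", "DELETE", "HEAD", "OPTIONS"] as const;

function detectRouteHandlerMethods(source: string): string[] {
  // Keep each method that has a matching exported function declaration.
  const found = HTTP_METHODS.filter((method) =>
    new RegExp(`export\\s+(?:async\\s+)?function\\s+${method}\\b`).test(source)
  );
  // Fall back to the wildcard when no method is found.
  return found.length > 0 ? [...found] : ["*"];
}

const handlerSource = [
  "export async function GET() { return new Response('ok'); }",
  "export function POST() { return new Response('created'); }"
].join("\n");

const detected = detectRouteHandlerMethods(handlerSource);
const fallback = detectRouteHandlerMethods("const helper = 1;");
```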
### Express
Supported first:
- `app.get("/path", ...)`
- `app.post("/path", ...)`
- `router.get("/path", ...)`
- `router.post("/path", ...)`
- equivalent `put`, `patch`, `delete`, `head`, and `options`
File patterns to inspect:
- `src/routes/**/*.{js,ts}`
- `src/api/**/*.{js,ts}`
- `src/controllers/**/*.{js,ts}`
- `routes/**/*.{js,ts}`
- `api/**/*.{js,ts}`
- `server.{js,ts}`
- `app.{js,ts}`
The extractor should use AST parsing where practical and fall back to simple
literal-string matching only for route call expressions. It should not execute
application code.
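The literal-string fallback could be sketched as below. This covers only the fallback path, not the AST parsing the proposal prefers, and `ExtractedRoute` and `extractExpressRoutes` are hypothetical names. It reads source text only and never executes it.

```ts
interface ExtractedRoute {
  method: string;
  route: string;
}

function extractExpressRoutes(source: string): ExtractedRoute[] {
  // Matches app.<method>("...") or router.<method>('...') with a literal path.
  const pattern =
    /\b(?:app|router)\.(get|post|put|patch|delete|head|options)\(\s*(["'])([^"']+)\2/g;
  const routes: ExtractedRoute[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    routes.push({ method: (match[1] ?? "").toUpperCase(), route: match[3] ?? "" });
  }
  return routes;
}

const expressSource = [
  'router.get("/users", listUsers);',
  "app.post('/login', login);",
  "app.use(authMiddleware);"
].join("\n");

const expressRoutes = extractExpressRoutes(expressSource);
```

Here `app.use(...)` is ignored because it is not one of the listed method calls and has no literal path argument.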
### Fastify
Supported first:
- `fastify.get("/path", ...)`
- `fastify.post("/path", ...)`
- `server.get("/path", ...)`
- `server.route({ method: "GET", url: "/path" })`
- array methods in route objects, for example `method: ["GET", "POST"]`
Use the same file patterns as Express.
## Risk Mapping
Route/API impact risk should derive from existing deterministic signals:
- API route changed -> high
- auth/session/security file changed and the route imports, or lives near, auth code
-> high
- database/schema file changed and the route imports DB/model code -> high
- UI route changed -> medium
- middleware changed -> high
- route changed with no nearby tests -> add reason, do not duplicate the
existing `missing-nearby-tests` finding
The first implementation should not add new score weights. It should make the
report more specific while preserving the current scoring behavior.
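The mapping above can be sketched as a function from signals to a risk level plus reasons. This is an illustrative sketch only: `RouteSignals`, `RouteRisk`, and `mapRouteRisk` are hypothetical names, and apart from "API route changed" and "No nearby test changed" (which appear in the example report field) the reason strings are made up for the example.

```ts
type Risk = "low" | "medium" | "high";

interface RouteSignals {
  kind: "ui-route" | "api-route" | "middleware";
  touchesAuth: boolean; // auth file changed and the route imports or lives near auth code
  touchesDatabase: boolean; // schema file changed and the route imports DB/model code
  hasNearbyTestChange: boolean;
}

interface RouteRisk {
  risk: Risk;
  reasons: string[];
}

const RISK_RANK: Record<Risk, number> = { low: 0, medium: 1, high: 2 };

function mapRouteRisk(signals: RouteSignals): RouteRisk {
  const hits: Array<{ level: Risk; reason: string }> = [];
  if (signals.kind === "api-route") hits.push({ level: "high", reason: "API route changed" });
  if (signals.kind === "middleware") hits.push({ level: "high", reason: "Middleware changed" });
  if (signals.kind === "ui-route") hits.push({ level: "medium", reason: "UI route changed" });
  if (signals.touchesAuth) hits.push({ level: "high", reason: "Auth code changed near route" });
  if (signals.touchesDatabase) hits.push({ level: "high", reason: "Database code changed near route" });

  // The route risk is the highest level among the matched signals.
  let risk: Risk = "low";
  for (const hit of hits) {
    if (RISK_RANK[hit.level] > RISK_RANK[risk]) risk = hit.level;
  }
  const reasons = hits.map((hit) => hit.reason);
  // Missing tests only add a reason; they do not create a second finding.
  if (!signals.hasNearbyTestChange) reasons.push("No nearby test changed");
  return { risk, reasons };
}

const apiRisk = mapRouteRisk({
  kind: "api-route",
  touchesAuth: false,
  touchesDatabase: false,
  hasNearbyTestChange: false
});
```

The function only labels routes and adds no score weights, in line with the constraint that the first implementation preserves current scoring.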
## Required Tests
Analyzer fixtures:
- Next.js App Router UI route: `src/app/dashboard/page.tsx`
- Next.js App Router API route with exported `GET`
- Next.js dynamic route: `src/app/users/[id]/page.tsx`
- Next.js route group: `src/app/(admin)/dashboard/page.tsx`
- Next.js Pages API route: `src/pages/api/users.ts`
- Express router method calls for `GET` and `POST`
- Fastify shorthand calls and `server.route({ method, url })`
- No false route for non-route utility files
Report tests:
- JSON includes `impactedRoutes` when present.
- Markdown renders the route/API impact section.
- SARIF remains valid when route impact data exists.
CLI tests:
- Existing CLI output remains backward compatible.
- Snapshot or assertion covers a fixture PR with route impact data.
Example fixtures:
- Extend `examples/nextjs-risk-demo` expected summary once implemented.
- Extend `examples/node-api-risk-demo` expected summary once implemented.
## Implementation Plan
1. Add optional `impactedRoutes` types to `packages/core`.
2. Render `impactedRoutes` in JSON and Markdown reports.
3. Add Next.js deterministic route extraction in `packages/analyzer-js`.
4. Add Express and Fastify route extraction in `packages/analyzer-js`.
5. Add fixtures and tests before changing any scoring behavior.
6. Update sample reports and example README summaries.
## Open Questions
- Should dynamic routes stay as framework-native paths like `/users/[id]`, or
should CodeDecay normalize them to `/users/:id`?
- Should route impact eventually influence scoring, or remain explanatory only?
- Should imports be analyzed deeply enough to connect DB/auth changes to routes,
or should v1 keep that relationship path-based?
---
# RFC 0001: Agent-Agnostic Redteam Harness Architecture
Source page: https://SubmuxHQ.github.io/CodeDecay/rfcs/0001-agent-agnostic-redteam-harness
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/rfcs/0001-agent-agnostic-redteam-harness.md
# RFC 0001: Agent-Agnostic Redteam Harness Architecture
Status: proposed
## Summary
CodeDecay should evolve from deterministic PR risk analysis into an
agent-agnostic PR safety harness for AI-assisted development.
CodeDecay should not replace Codex, Claude Code, Cursor, Pi, OpenCode, desktop
agents, or internal company agents. It should give those agents better evidence,
skills, memory, execution results, weak-test findings, and merge-safety reports.
Positioning:
```text
CodeDecay is an open-source PR safety harness that works with your existing AI
agents and open-source tools.
Use Codex, Claude Code, Cursor, Pi, OpenCode, desktop agents, or any
MCP-compatible workflow. CodeDecay orchestrates evidence, memory, skills, test
audits, and runtime checks so your agent can find what it missed before merge.
```
Core question:
```text
What could this PR break, and are the tests actually proving it will not?
```
## Product Boundaries
CodeDecay owns:
- PR orchestration
- normalized evidence
- safety policy
- impact mapping
- test-quality audit
- tool-adapter coordination
- merge-safety reporting
- fix-task generation for coding agents
CodeDecay does not own:
- the user's main coding agent
- hosted model inference
- mandatory cloud memory
- every test runner or fuzzing engine
- every language parser
The default OSS experience must remain:
- local-first
- deterministic baseline
- no telemetry
- no required API keys
- no required LLM/model calls
- no required CodeDecay Cloud
## Architecture
```text
User agent / IDE / harness
  Codex | Claude Code | Cursor | Pi | OpenCode | Desktop app | Custom MCP client
        |
        v
CodeDecay CLI / MCP / GitHub Action / GitHub App
        |
        v
redteam orchestrator
        |
        +--> git diff and current deterministic analyzer
        +--> impact map
        +--> memory providers
        +--> skill selection
        +--> agent/harness adapters
        +--> safe execution
        +--> OSS tool adapters
        +--> test audit
        +--> base/head differential checks
        |
        v
merge-safety report + evidence bundle + fix tasks
        |
        v
back to the user's agent for implementation
```
## `codedecay redteam` Flow
1. Analyze the PR diff with the existing deterministic engine.
2. Build an impact map for routes, APIs, UI flows, jobs, database/schema, auth,
and config.
3. Load repo memory from `.codedecay/memory.json` and optional user-owned memory
providers.
4. Select relevant review skills.
5. Ask an optional user-owned agent or harness for missed risks and edge cases.
6. Run selected open-source tools through safe adapters.
7. Audit changed and nearby tests for weak or fake confidence.
8. Generate an edge-case checklist grounded in impacted areas.
9. Compare base/head behavior through configured probes when explicitly enabled.
10. Produce a merge-safety report that separates tool evidence from AI
suggestions.
11. Generate fix tasks for Codex, Claude Code, Cursor, Pi, OpenCode, or any
MCP-compatible agent.
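The flow above can be sketched as an ordered stage list in which optional steps are gated on explicit opt-in. This is a rough illustration, not the planned implementation: the stage names, `StageOptions`, and `planStages` are hypothetical, and gating step 5 on `"assisted"` mode is an assumption that the Open Questions section leaves undecided.

```ts
type RedteamMode = "deterministic" | "assisted";

// Hypothetical options controlling which optional stages run.
interface StageOptions {
  mode: RedteamMode;
  toolsConfigured: boolean; // at least one OSS tool adapter is configured
  probesEnabled: boolean; // base/head probes were explicitly enabled
}

interface Stage {
  step: number;
  name: string;
  enabled: (options: StageOptions) => boolean;
}

const always = () => true;

// One entry per step in the numbered flow.
const STAGES: Stage[] = [
  { step: 1, name: "analyze-diff", enabled: always },
  { step: 2, name: "impact-map", enabled: always },
  { step: 3, name: "load-memory", enabled: always },
  { step: 4, name: "select-skills", enabled: always },
  { step: 5, name: "ask-agent", enabled: (o) => o.mode === "assisted" },
  { step: 6, name: "run-tools", enabled: (o) => o.toolsConfigured },
  { step: 7, name: "test-audit", enabled: always },
  { step: 8, name: "edge-case-checklist", enabled: always },
  { step: 9, name: "differential-probes", enabled: (o) => o.probesEnabled },
  { step: 10, name: "merge-safety-report", enabled: always },
  { step: 11, name: "fix-tasks", enabled: always },
];

// Return the names of the stages that would run, in order.
function planStages(options: StageOptions): string[] {
  return STAGES.filter((stage) => stage.enabled(options)).map((stage) => stage.name);
}
```

With `mode: "deterministic"` and no tools or probes configured, the plan contains only the static stages, which matches the requirement that the baseline run works without agents, models, or command execution.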
## Module Plan
### `packages/redteam`
Purpose: owns the end-to-end `codedecay redteam` orchestration.
Inputs:
- cwd
- base/head refs
- CodeDecay config
- selected adapters
- optional agent/harness provider
Outputs:
- merge-safety report
- normalized evidence bundle
- fix tasks
Public API:
```ts
interface RedteamInput {
  cwd: string;
  base?: string;
  head?: string;
  mode: "deterministic" | "assisted";
}

interface RedteamResult {
  report: MergeSafetyReport;
  evidence: Evidence[];
  fixTasks: FixTask[];
}
```
Safety:
- deterministic mode must work without agents or models
- no command execution unless explicitly configured
MVP:
- compose existing analyze, memory, config, and Markdown report data into a
redteam-shaped report
### `packages/harness`
Purpose: registry and interface for agent and tool harnesses.
Inputs:
- redteam task
- repo context
- selected skills
- evidence collected so far
Outputs:
- harness plan
- harness run result
- normalized evidence
- summary
Public API:
```ts
interface CodeDecayHarness {
  name: string;
  capabilities: HarnessCapability[];
  requiredConfig: ConfigRequirement[];
  plan(input: HarnessPlanInput): Promise<HarnessPlan>;
  run(plan: HarnessPlan, context: HarnessRunContext): Promise<HarnessRunResult>;
  collectEvidence(result: HarnessRunResult): Promise<Evidence[]>;
  summarize(evidence: Evidence[]): Promise<string>;
}
```
Safety:
- adapters must declare whether they can execute commands, call models, or use
network access
- all failures become structured failure modes
MVP:
- generic process harness adapter
- in-memory registry
- evidence schema
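A minimal sketch of the in-memory registry, assuming harnesses are looked up by name and by declared capability. `HarnessDescriptor`, `InMemoryHarnessRegistry`, and the capability strings are hypothetical stand-ins; a real registry would store full `CodeDecayHarness` instances.

```ts
// Minimal stand-in for CodeDecayHarness: only the fields the registry needs.
interface HarnessDescriptor {
  name: string;
  capabilities: string[];
}

class InMemoryHarnessRegistry {
  private readonly harnesses = new Map<string, HarnessDescriptor>();

  // Reject duplicate names so a later adapter cannot silently replace an earlier one.
  register(harness: HarnessDescriptor): void {
    if (this.harnesses.has(harness.name)) {
      throw new Error(`Harness already registered: ${harness.name}`);
    }
    this.harnesses.set(harness.name, harness);
  }

  get(name: string): HarnessDescriptor | undefined {
    return this.harnesses.get(name);
  }

  // Find every harness that declares a given capability.
  withCapability(capability: string): HarnessDescriptor[] {
    return Array.from(this.harnesses.values()).filter(
      (harness) => harness.capabilities.indexOf(capability) !== -1
    );
  }
}
```

Selecting by declared capability lets the orchestrator skip any harness that would execute commands or call models when those are not enabled.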
### `packages/agent`
Purpose: optional user-owned AI provider interface.
Inputs:
- prompt/task
- selected skills
- evidence bundle
- model/provider config
Outputs:
- AI suggestions
- missing edge cases
- fix tasks
Public API:
```ts
interface AgentProvider {
  name: string;
  // Result shapes are not yet specified in this RFC.
  availability(): Promise<unknown>;
  complete(input: AgentCompletionInput): Promise<unknown>;
}
```
Safety:
- provider is disabled by default
- no hidden model calls
- cloud providers require explicit user config
- AI output is suggestions, never proof
OSS integrations:
- Ollama local models
- LiteLLM/OpenAI-compatible BYOK endpoints
- Codex/Claude/Pi/OpenCode via prompt/task adapters where possible
MVP:
- disabled provider
- OpenAI-compatible provider shape for later Ollama/LiteLLM support
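The disabled provider could be as small as the sketch below. The result types `AgentAvailability` and `AgentSuggestion`, and the fields on `AgentCompletionInput`, are assumptions made for illustration; the RFC does not define them.

```ts
// Hypothetical result and input shapes, not defined by the RFC.
interface AgentAvailability {
  available: boolean;
  reason?: string;
}

interface AgentCompletionInput {
  prompt: string;
}

interface AgentSuggestion {
  summary: string;
}

interface AgentProvider {
  name: string;
  availability(): Promise<AgentAvailability>;
  complete(input: AgentCompletionInput): Promise<AgentSuggestion[]>;
}

// Default provider: reports itself unavailable and never calls a model.
const disabledProvider: AgentProvider = {
  name: "disabled",
  async availability() {
    return { available: false, reason: "No agent provider configured." };
  },
  async complete(_input) {
    // No model call happens here; callers simply receive no suggestions.
    return [];
  },
};
```

Making the disabled provider a real implementation rather than `null` means the orchestrator can take the same code path in both modes.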
### `packages/mcp`
Purpose: expose CodeDecay as tools to MCP-compatible agents.
Inputs:
- MCP tool calls
- cwd/base/head/options
Outputs:
- redteam plans
- analyze results
- impact maps
- test audit findings
- memory context
- evidence summaries
Safety:
- tool descriptions must say when commands may execute
- command-running tools require explicit config
MVP:
- extend the current MCP package instead of creating a parallel server package
- expose `redteam_plan`, `redteam_run`, `impact_map`, and `test_audit`
### `packages/skills`
Purpose: portable review instructions for agents and harnesses.
Inputs:
- impacted areas
- project language/framework
- user-selected skill pack
Outputs:
- selected skills
- prompts/checklists
- fix-task templates
Safety:
- skills are instructions, not executable authority
- skills cannot override command safety
MVP:
- filesystem skill loader for `.agents/skills`
- built-in skills for API, auth, database, frontend flow, test quality, and
GitHub App review
### `packages/memory`
Purpose: project-specific context provider.
Inputs:
- local `.codedecay/memory.json`
- optional external memory config
- changed files and impacted areas
Outputs:
- relevant flows
- invariants
- past regressions
- architecture notes
- recommended checks
Safety:
- memory is untrusted context
- local file remains default
- external providers must be opt-in
OSS integrations:
- local `.codedecay/memory.json`
- Mem0
- Supermemory
MVP:
- formalize provider interface around the existing local memory package
### `packages/execution`
Purpose: safe command and probe execution.
Inputs:
- explicit configured command
- cwd/temp worktree
- timeout
- redaction rules
Outputs:
- stdout/stderr
- exit code
- duration
- structured failure mode
Safety:
- no guessed commands
- no destructive commands
- no production deploys
- no migrations
- no secret printing
- timeout required
MVP:
- extract and harden the current configured command runner
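The safety rules above suggest a policy check that runs before any process is spawned, plus redaction of the captured output. The sketch below covers only those two pure steps. `CommandRequest`, `ExecutionPolicy`, `checkCommand`, and `redact` are hypothetical names, and the deny patterns are examples rather than a complete list.

```ts
// Denial reasons reuse names from the RFC's harness failure modes.
type ExecutionDenial = "missing-config" | "unsafe-command" | "command-denied";

interface CommandRequest {
  command: string; // must match a configured command exactly
  timeoutMs?: number;
}

interface ExecutionPolicy {
  allowedCommands: string[]; // explicit commands from config; nothing is guessed
  deniedPatterns: RegExp[]; // hypothetical deny rules for destructive commands
}

// Return null when the command may run, otherwise the reason it was refused.
function checkCommand(request: CommandRequest, policy: ExecutionPolicy): ExecutionDenial | null {
  // A timeout is mandatory.
  if (request.timeoutMs === undefined || request.timeoutMs <= 0) return "missing-config";
  // Deny rules win over the allowlist.
  if (policy.deniedPatterns.some((pattern) => pattern.test(request.command))) {
    return "unsafe-command";
  }
  if (policy.allowedCommands.indexOf(request.command) === -1) return "command-denied";
  return null;
}

// Replace every occurrence of each known secret before output is stored or printed.
function redact(output: string, secrets: string[]): string {
  return secrets
    .filter((secret) => secret.length > 0)
    .reduce((text, secret) => text.split(secret).join("[REDACTED]"), output);
}

const examplePolicy: ExecutionPolicy = {
  allowedCommands: ["pnpm test"],
  deniedPatterns: [/\brm\s+-rf\b/, /\bmigrate\b/, /\bdeploy\b/],
};
```

Matching the allowlist by exact string is deliberately strict in this sketch: it avoids parsing shell syntax, at the cost of requiring each command variant to be configured.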
### `packages/test-audit`
Purpose: detect tests that look reassuring but do not prove real behavior.
Inputs:
- changed tests
- nearby tests
- changed source files
- optional coverage/mutation evidence
Outputs:
- weak-test findings
- missing-test recommendations
- evidence explaining why confidence is weak
Safety:
- static audit is allowed by default
- dynamic test execution requires explicit config
MVP:
- tests with no assertions
- snapshot-only tests
- excessive mocking
- tests unrelated to the changed source
- heuristics for copied implementation logic
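As a rough illustration of the static checks, the sketch below flags three of the MVP cases using regular expressions over Jest/Vitest-style test source. `auditTestSource`, the finding names, and the mock threshold are hypothetical; a real audit would more likely walk the AST than match text.

```ts
type WeakTestFinding = "no-assertions" | "snapshot-only" | "excessive-mocks";

// Text-based heuristics for Jest/Vitest-style tests. Purely static: nothing is executed.
function auditTestSource(source: string, maxMocks = 5): WeakTestFinding[] {
  const findings: WeakTestFinding[] = [];
  const assertions = source.match(/\b(expect|assert)\b/g) ?? [];
  const snapshots = source.match(/\.toMatch(Inline)?Snapshot\(/g) ?? [];
  const mocks = source.match(/\b(jest|vi)\.(mock|fn|spyOn)\(/g) ?? [];

  if (assertions.length === 0) {
    findings.push("no-assertions");
  } else if (snapshots.length >= assertions.length) {
    // Every assertion found is a snapshot comparison.
    findings.push("snapshot-only");
  }
  if (mocks.length > maxMocks) findings.push("excessive-mocks");
  return findings;
}
```

Each finding should be reported with the evidence that triggered it, so a reviewer can dismiss false positives quickly.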
### `packages/impact-map`
Purpose: map PR changes to product/system areas.
Inputs:
- changed files
- AST/code map
- config/memory
Outputs:
- impacted routes
- API endpoints
- UI flows
- jobs
- database/schema areas
- auth/config boundaries
Safety:
- static mapping only by default
- no code execution
OSS integrations:
- TypeScript compiler API for JS/TS first
- Tree-sitter later for multi-language parsing
MVP:
- framework-aware JS/TS route/API/config map
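Before any AST work, a path-based map already gives useful signal. The sketch below assigns changed files to areas with simple path rules. `AREA_RULES` and `mapImpact` are hypothetical, and the patterns assume a Next.js-like layout purely for illustration.

```ts
type ImpactArea = "api" | "route" | "database" | "auth" | "config";

// Hypothetical path rules; a real map would be framework-aware and configurable.
const AREA_RULES: Array<{ area: ImpactArea; pattern: RegExp }> = [
  { area: "api", pattern: /(^|\/)(pages|app)\/api\// },
  { area: "route", pattern: /(^|\/)(pages|app)\/(?!api\/).+\.(tsx?|jsx?)$/ },
  { area: "database", pattern: /(^|\/)(prisma|migrations|db)\// },
  { area: "auth", pattern: /(^|\/)auth(\/|\.)/ },
  { area: "config", pattern: /(^|\/)[^/]+\.config\.(js|cjs|mjs|ts)$/ },
];

// Group changed files by every area whose rule matches. Static only: no code is executed.
function mapImpact(changedFiles: string[]): Map<ImpactArea, string[]> {
  const impact = new Map<ImpactArea, string[]>();
  for (const file of changedFiles) {
    for (const { area, pattern } of AREA_RULES) {
      if (pattern.test(file)) {
        impact.set(area, [...(impact.get(area) ?? []), file]);
      }
    }
  }
  return impact;
}
```

A file can land in more than one area (for example an auth handler under `app/api/`), which is intentional: the report should show every boundary a change touches.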
### `packages/tool-adapters`
Purpose: normalize OSS tool execution and output.
Inputs:
- tool config
- cwd/base/head
- selected impacted areas
Outputs:
- evidence records
- artifact paths
- failure modes
Safety:
- adapters must use `packages/execution`
- adapters cannot bypass command allowlists
OSS integrations:
- Playwright
- StrykerJS
- Schemathesis
- Pact
- Vitest/Jest/Pytest/Bun
- c8/nyc/Istanbul
MVP:
- adapter interface
- Playwright adapter plan
- StrykerJS adapter plan
- Schemathesis adapter plan
## Evidence Model
Evidence should be the common currency between deterministic analysis, tools,
agents, and reports.
```ts
interface Evidence {
  id: string;
  source: EvidenceSource;
  kind:
    | "diff"
    | "impact"
    | "test"
    | "coverage"
    | "mutation"
    | "api-fuzz"
    | "contract"
    | "browser-flow"
    | "memory"
    | "agent-suggestion"
    | "execution";
  severity: "info" | "low" | "medium" | "high";
  summary: string;
  file?: string;
  line?: number;
  command?: string;
  artifactPath?: string;
  trusted: boolean;
}
```
Rules:
- tool evidence and AI suggestions must be rendered separately
- memory and agent output are untrusted by default
- evidence should reference files/lines/artifacts when available
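One way to enforce the first two rules is to normalize trust when evidence is recorded and to split the list before rendering. The sketch below uses a trimmed-down evidence shape. `normalizeTrust` and `partitionForReport` are hypothetical helpers, and treating every non-agent kind as one bucket is a simplification.

```ts
// Trimmed-down evidence shape for illustration; `kind` uses values from the Evidence model.
interface EvidenceRecord {
  id: string;
  kind: string;
  summary: string;
  trusted: boolean;
}

type EvidenceInput = Omit<EvidenceRecord, "trusted"> & { trusted?: boolean };

// Kinds that are untrusted unless the caller explicitly says otherwise.
const UNTRUSTED_BY_DEFAULT = ["memory", "agent-suggestion"];

function normalizeTrust(input: EvidenceInput): EvidenceRecord {
  const untrusted = UNTRUSTED_BY_DEFAULT.indexOf(input.kind) !== -1;
  return { ...input, trusted: input.trusted ?? !untrusted };
}

// Split evidence so a report can render AI suggestions in their own section.
function partitionForReport(evidence: EvidenceRecord[]) {
  return {
    aiSuggestions: evidence.filter((item) => item.kind === "agent-suggestion"),
    otherEvidence: evidence.filter((item) => item.kind !== "agent-suggestion"),
  };
}
```

Partitioning on `kind` rather than on `trusted` keeps the two rules independent: an agent suggestion stays in its own section even if someone later marks it as trusted.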
## Harness Failure Modes
```ts
type HarnessFailureMode =
  | "missing-tool"
  | "missing-config"
  | "command-denied"
  | "timeout"
  | "nonzero-exit"
  | "network-required"
  | "unsafe-command"
  | "model-unavailable"
  | "no-evidence";
```
Failures should not disappear into generic logs. They should be visible in the
merge-safety report as missing evidence or blocked checks.
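A small sketch of how a failure could surface in the report. Which modes count as a blocked check and which count as missing evidence is an assumption here, as are the `HarnessFailure` shape and the `renderFailure` helper.

```ts
type HarnessFailureMode =
  | "missing-tool"
  | "missing-config"
  | "command-denied"
  | "timeout"
  | "nonzero-exit"
  | "network-required"
  | "unsafe-command"
  | "model-unavailable"
  | "no-evidence";

// Hypothetical failure record attached to a harness run.
interface HarnessFailure {
  harness: string;
  mode: HarnessFailureMode;
}

// Assumed classification: policy refusals are "blocked", everything else is "missing".
const BLOCKED_MODES: HarnessFailureMode[] = [
  "command-denied",
  "unsafe-command",
  "network-required",
];

// Render one Markdown bullet for the merge-safety report.
function renderFailure(failure: HarnessFailure): string {
  const blocked = BLOCKED_MODES.indexOf(failure.mode) !== -1;
  const label = blocked ? "Blocked check" : "Missing evidence";
  return `- ${label}: ${failure.harness} (${failure.mode})`;
}
```

Rendering failures as first-class report lines makes a skipped mutation run visibly different from a mutation run that found nothing.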
## OSS Integration Sequence
1. MCP server tools for any compatible agent.
2. Generic process harness for local commands and CLI-based agents.
3. Portable CodeDecay skills for Codex, Claude Code, Cursor, Pi, OpenCode, and
internal agents.
4. Local memory provider, then optional Mem0/Supermemory providers.
5. Ollama and LiteLLM/OpenAI-compatible provider interfaces.
6. Playwright adapter for browser/user-flow evidence.
7. StrykerJS adapter for mutation-testing evidence.
8. Schemathesis adapter for API fuzzing evidence.
9. Pact adapter for contract-testing evidence.
10. Coverage adapters for c8/nyc/Istanbul.
11. TypeScript compiler API impact map, then Tree-sitter for multi-language.
## Implementation Issues
1. `docs(rfc): define agent-agnostic redteam harness architecture`
2. `feat(harness): add harness registry and evidence schema`
3. `feat(execution): add safe command runner`
4. `feat(mcp): expose redteam tools for MCP-compatible agents`
5. `feat(skills): add portable redteam skill loader`
6. `feat(memory): formalize local memory provider interface`
7. `feat(test-audit): detect weak and fake-looking tests`
8. `feat(tool-adapter): add Playwright harness`
9. `feat(tool-adapter): add StrykerJS harness`
10. `feat(tool-adapter): add Schemathesis harness`
11. `feat(agent): add optional Ollama provider`
12. `feat(agent): add optional LiteLLM provider`
13. `feat(harness): add Pi convenience adapter`
## First Three PRs
1. RFC PR: this document only.
2. Harness PR: add `packages/harness` with registry, evidence schema, and tests.
3. Execution PR: extract and harden the safe configured command runner.
## Open Questions
- Should `codedecay redteam` default to deterministic-only mode and require
`--assist` for any agent/model call?
- Should external memory providers live behind one provider interface or
separate adapter packages?
- Should GitHub App redteam execution remain disabled until sandboxed workers
exist?
- Should adapter artifacts be stored only in `.codedecay/artifacts/` or in a
user-specified output directory?
- Which first external agent workflow should get convenience docs: Codex,
Claude Code, Cursor, Pi, or OpenCode?
---
# CodeDecay v0.1.1 Launch Post
Source page: https://SubmuxHQ.github.io/CodeDecay/launch-post
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/launch-post.md
# CodeDecay v0.1.1 Launch Post
We released CodeDecay v0.1.1: an open-source, local-first CLI/GitHub Action
for detecting PR regression risk and maintainability decay. No API keys, no LLM
calls, no telemetry.
Install:
```bash
npm install -D @submuxhq/codedecay
```