# CodeDecay Docs Bundle

This is the concatenated Markdown bundle for the CodeDecay docs site.

---

# CodeDecay Docs

Source page: https://SubmuxHQ.github.io/CodeDecay/
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/index.md

## Read This First

- [Getting Started](/getting-started): install the CLI and run your first PR analysis
- [GitHub Action](/github-action): add CodeDecay to pull request workflows
- [Redteam Reports](/redteam): generate merge-safety reports for yourself or your coding agent
- [Agent Task Bundles](/agent): hand deterministic evidence to Codex, Claude Code, Cursor, Pi, OpenCode, or desktop agents
- [MCP Server](/mcp): expose CodeDecay as a local MCP tool for agent clients

## For Humans

- Use the sidebar and local search to navigate product docs quickly.
- Open [Sample Reports](/sample-reports/) to see the actual Markdown, JSON, and SARIF outputs before integrating CodeDecay.
- Use the GitHub edit links to tighten docs in the same repo that ships the code.

## For Agents

- [`/llms.txt`](/llms.txt): compact map of the docs site
- [`/llms-full.txt`](/llms-full.txt): one bundled Markdown context file
- [`/markdown/getting-started.md`](./markdown/getting-started.md): per-page raw Markdown endpoints for direct retrieval

These endpoints are generated from the same source files as the docs site, so humans and agents read the same content instead of drifting copies.

---

# Getting Started

Source page: https://SubmuxHQ.github.io/CodeDecay/getting-started
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/getting-started.md

CodeDecay analyzes pull requests for regression risk and maintainability decay.
It works locally and in CI without cloud services, telemetry, API keys, LLMs, or
model calls.

## Install

Use the package manager your repository already uses:

```bash
# Run only the command that matches your package manager.
npm install -D @submuxhq/codedecay
pnpm add -D @submuxhq/codedecay
bun add -d @submuxhq/codedecay
yarn add -D @submuxhq/codedecay
```

For a no-install smoke test:

```bash
npx -y @submuxhq/codedecay --help
```

After a local install, run CodeDecay with `npx codedecay`, `pnpm codedecay`,
`bunx codedecay`, or add `codedecay` to a package script.

Do not run `npm install` inside a Bun, pnpm, or Yarn workspace that uses
`workspace:*` dependencies. npm may fail before CodeDecay is installed. In Bun
repos with `minimumReleaseAge`, a fresh CodeDecay release may also be blocked by
repo policy; for local evaluation you can override it explicitly:

```bash
bun add -d @submuxhq/codedecay --minimum-release-age 0
```

## Analyze A PR Diff

```bash
npx codedecay analyze --base main --head HEAD --format markdown
```

## Analyze Current Working Tree

```bash
npx codedecay analyze --format markdown
```

## Analyze Another Repository

```bash
npx codedecay analyze --cwd ../my-repo --format markdown
```

## Generate A Redteam Report

Use `redteam` when you want one report for yourself or your coding agent that
summarizes what the PR could break, weak-test evidence, missing edge cases, and
fix tasks.

```bash
npx codedecay redteam --base main --head HEAD --format markdown
```

The current redteam MVP is report-only. It does not run commands or call an LLM.

## Hand Evidence To Your Agent

Use `agent` when you want Codex, Claude Code, Cursor, a desktop agent, or
another user-owned agent to act on CodeDecay's findings.

```bash
npx codedecay agent --base main --head HEAD --format markdown --output codedecay-agent.md
```

Then give `codedecay-agent.md` to your agent and ask it to:

- fix high-risk findings first,
- add tests that exercise real API, UI, database, or downstream behavior,
- cover the missing edge cases listed by CodeDecay,
- run the relevant project checks,
- rerun CodeDecay after changes.

The agent bundle is local evidence plus instructions. CodeDecay does not call
Codex, Claude, Cursor, Ollama, cloud models, or CodeDecayCloud while creating
it.

## Recommended Local Loop

```bash
npx codedecay analyze --base main --head HEAD --format markdown
npx codedecay redteam --base main --head HEAD --format markdown --output codedecay-redteam.md
npx codedecay agent --base main --head HEAD --format markdown --output codedecay-agent.md
```

Use the redteam report to understand the PR risk. Use the agent bundle to give
your own coding agent the evidence, missing checks, and fix tasks it should
work through. After the agent changes code, run your project checks and run
CodeDecay again.

## Write SARIF

```bash
npx codedecay analyze --format sarif --output codedecay.sarif
```

## Inspect CodeDecay Config

Configuration is optional. If no config file exists, CodeDecay uses safe defaults.

```bash
npx codedecay config --format markdown
```

## Fail CI On High Risk

```bash
npx codedecay analyze --base main --head HEAD --fail-on high
```

Risk levels:

- `0-39`: low
- `40-69`: medium
- `70-100`: high
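
The bands above can be sketched as a small lookup. This is a minimal illustration, not CodeDecay's API: the helper names are hypothetical, and it assumes that a `--fail-on` threshold also trips on any higher level (for example, `--fail-on medium` failing on `high`), which this page does not state.

```python
# Illustrative sketch only: these helpers are hypothetical, not CodeDecay's API.
LEVELS = ["low", "medium", "high"]


def risk_level(score: int) -> str:
    """Map a 0-100 risk score to the documented band."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= 70:
        return "high"
    if score >= 40:
        return "medium"
    return "low"


def meets_fail_on(score: int, fail_on: str) -> bool:
    """Assumption: a threshold trips at its own level and at any higher level."""
    return LEVELS.index(risk_level(score)) >= LEVELS.index(fail_on)
```

Under these assumptions, a merge-risk score of 76 would trip `--fail-on high`, while a score of 69 would not.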

## Try An Example

Use the example projects to see a realistic high-risk report before wiring
CodeDecay into your own repository:

- [Next.js risk demo](https://github.com/SubmuxHQ/CodeDecay/blob/main/examples/nextjs-risk-demo/README.md)
- [Node API risk demo](https://github.com/SubmuxHQ/CodeDecay/blob/main/examples/node-api-risk-demo/README.md)

---

# GitHub Action

Source page: https://SubmuxHQ.github.io/CodeDecay/github-action
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/github-action.md

CodeDecay ships a composite GitHub Action wrapper around the bundled CLI.

```yaml
name: CodeDecay

on:
  pull_request:

jobs:
  codedecay:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: SubmuxHQ/CodeDecay/packages/github-action@v0
        with:
          mode: analyze
          base: ${{ github.event.pull_request.base.sha }}
          head: ${{ github.event.pull_request.head.sha }}
          cwd: .
          format: markdown
          fail-on: high
```

## SARIF Output

```yaml
- uses: SubmuxHQ/CodeDecay/packages/github-action@v0
  with:
    mode: analyze
    base: ${{ github.event.pull_request.base.sha }}
    head: ${{ github.event.pull_request.head.sha }}
    cwd: .
    format: sarif
    output: codedecay.sarif
    fail-on: high
```

Relative `output` paths resolve from `cwd`. For example, with `cwd:
packages/web` and `output: codedecay.sarif`, the SARIF file is written to
`packages/web/codedecay.sarif`. Absolute `output` paths are honored exactly.

```yaml
- uses: SubmuxHQ/CodeDecay/packages/github-action@v0
  with:
    mode: analyze
    cwd: packages/web
    format: sarif
    output: codedecay.sarif
```
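
The path rule described above can be sketched as follows. The function name is hypothetical and the sketch uses POSIX-style paths, as on an `ubuntu-latest` runner; it illustrates the documented behavior and is not the action's actual code.

```python
from pathlib import PurePosixPath


def resolve_output(cwd: str, output: str) -> PurePosixPath:
    """Resolve an `output` value relative to `cwd`, keeping absolute paths as-is."""
    out = PurePosixPath(output)
    if out.is_absolute():
        # Absolute paths are honored exactly.
        return out
    # Relative paths resolve from the action's `cwd` input.
    return PurePosixPath(cwd) / out
```

With `cwd` set to `packages/web` and `output` set to `codedecay.sarif`, this yields `packages/web/codedecay.sarif`.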

The MVP action writes a Markdown summary to `$GITHUB_STEP_SUMMARY`. It does not
upload SARIF itself; add a separate workflow step with GitHub's code scanning
upload action (`github/codeql-action/upload-sarif`) if you want the results in
code scanning.

## Redteam And Agent Modes

The action can also run report-only redteam and agent bundle modes. Redteam
mode is useful as a Step Summary because it includes impact, memory, edge cases,
and fix tasks for a user-owned agent:

```yaml
- uses: SubmuxHQ/CodeDecay/packages/github-action@v0
  with:
    mode: redteam
    base: ${{ github.event.pull_request.base.sha }}
    head: ${{ github.event.pull_request.head.sha }}
    cwd: .
    format: markdown
```

```yaml
- uses: SubmuxHQ/CodeDecay/packages/github-action@v0
  with:
    mode: agent
    base: ${{ github.event.pull_request.base.sha }}
    head: ${{ github.event.pull_request.head.sha }}
    cwd: .
    format: markdown
    output: codedecay-agent.md
```

Supported modes are `analyze`, `redteam`, and `agent`. The action does not
expose command-executing modes. `format: sarif` is supported only with
`mode: analyze`. `fail-on` is forwarded for `analyze` and `redteam`; `agent`
mode produces a task bundle for a user-owned coding agent and does not gate the
workflow by risk level.

Use `fail-on` with `analyze` when you want a deterministic CI gate. You can also
add `fail-on` to `redteam` if your repository wants strict risk-score gating.
The CodeDecay repository dogfoods `redteam` report-only so the Step Summary is
always available while lint, typecheck, tests, build, package dry-run, and the
PR safety efficacy eval remain the hard validation gates.
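
The mode, format, and `fail-on` rules above can be summarized in a short sketch. The function is hypothetical, and whether the real action rejects or silently ignores an unsupported combination is not stated here; the sketch simply reports an error for it.

```python
# Hypothetical input check mirroring the documented rules; not the action's code.
SUPPORTED_MODES = {"analyze", "redteam", "agent"}
FAIL_ON_MODES = {"analyze", "redteam"}


def check_action_inputs(mode, fmt="markdown", fail_on=None):
    """Return (errors, forwarded_fail_on) for the given action inputs."""
    errors = []
    if mode not in SUPPORTED_MODES:
        errors.append(f"unsupported mode: {mode}")
    if fmt == "sarif" and mode != "analyze":
        errors.append("format: sarif requires mode: analyze")
    # fail-on is forwarded only for analyze and redteam; agent mode never gates.
    forwarded = fail_on if mode in FAIL_ON_MODES else None
    return errors, forwarded
```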

---

# Configuration

Source page: https://SubmuxHQ.github.io/CodeDecay/configuration
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/configuration.md

CodeDecay can load repo-local configuration for red-team orchestration, tool
adapter plans, and real behavior probes.

Configuration is optional. If no config file exists, CodeDecay uses safe
defaults and does not run project commands.

## Supported Files

CodeDecay discovers the first matching file from the analysis working directory:

- `.codedecay/config.yml`
- `.codedecay/config.yaml`
- `codedecay.config.yml`
- `codedecay.config.yaml`

Use `--cwd` to inspect another repository:

```bash
npx codedecay config --cwd ../my-repo --format markdown
```
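
A minimal sketch of first-match discovery, assuming the list order above is also the precedence order. The function name is hypothetical and this is not CodeDecay's own loader.

```python
from pathlib import Path
from typing import Optional

# Candidate names in the order listed above (assumed to be the precedence order).
CONFIG_CANDIDATES = [
    ".codedecay/config.yml",
    ".codedecay/config.yaml",
    "codedecay.config.yml",
    "codedecay.config.yaml",
]


def find_config(cwd: str) -> Optional[Path]:
    """Return the first candidate that exists under cwd, or None if there is no config."""
    for name in CONFIG_CANDIDATES:
        candidate = Path(cwd) / name
        if candidate.is_file():
            return candidate
    # No file found: the caller would fall back to safe defaults.
    return None
```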

## Example

```yaml
version: 1

commands:
  test:
    - pnpm test
  build:
    - pnpm build
  start:
    - pnpm dev

probes:
  - name: users api
    command: curl -f http://localhost:3000/api/users
    timeoutMs: 5000

toolAdapters:
  playwright: true
  stryker:
    command: pnpm exec stryker run
  schemathesis:
    schema: docs/openapi.yaml
    baseUrl: http://127.0.0.1:3000
  pact:
    command: pnpm run test:pact

safety:
  commandTimeoutMs: 120000
  allowCommands: false

llm:
  provider: disabled
  timeoutMs: 30000
```

Optional user-owned model providers must be configured explicitly. For a local
LiteLLM or other OpenAI-compatible endpoint:

```yaml
llm:
  provider: litellm
  model: gpt-4.1-mini
  endpoint: http://127.0.0.1:4000/v1
  apiKeyEnv: LITELLM_API_KEY
  timeoutMs: 30000
```

Use `apiKeyEnv` to point at an environment variable name. Do not store literal
API keys in CodeDecay config.
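
The indirection can be sketched like this: the config carries only the variable name, and the secret is read from the environment at run time. The function name is hypothetical, and the dictionary stands in for the parsed `llm` section.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(llm_config: Mapping[str, str],
                    env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Look up the key by the variable name in `apiKeyEnv`; the config never holds the key."""
    env_name = llm_config.get("apiKeyEnv")
    if not env_name:
        return None
    # Returns None when the variable is not set in the environment.
    return env.get(env_name)
```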

## Safety Model

Config files make project commands explicit. CodeDecay does not guess commands
from model output, and it runs only commands listed in config.

Current behavior:

- `codedecay analyze` does not require config.
- `codedecay config` only loads and prints config.
- `codedecay redteam` lists configured tool adapters as planned local checks,
  but does not run them.
- `codedecay execute` runs only commands and probes from config, and only when
  `safety.allowCommands` is true.
- `codedecay differential` runs only configured probes on temporary base/head
  worktrees, and only when `safety.allowCommands` is true.
- missing config returns safe defaults.
- no telemetry or cloud services are used; API keys and LLM calls are not used
  unless an LLM provider is explicitly configured.
- LLM use is disabled by default. LLM-backed commands must opt in
  explicitly and treat model output as untrusted suggestions.

Execution uses this config as its allowlisted command source. See
[Execution probes](execution.md) and
[Differential behavior checks](differential.md).

Tool adapters are also configured here. See [Tool adapters](tool-adapters.md)
for Playwright, StrykerJS, Schemathesis, and Pact adapter details.

Read [LLM providers](llm-providers.md) for optional local/BYOK model adapters.

---

# Redteam Reports

Source page: https://SubmuxHQ.github.io/CodeDecay/redteam
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/redteam.md

`codedecay redteam` packages local PR safety evidence into a report that a
developer or their own coding agent can use before merge.

It asks:

```text
What could this PR break, and are the tests actually proving it will not?
```

The command is report-only in the current MVP. It does not run configured
commands, does not call an LLM, does not require API keys, does not send
telemetry, and does not depend on CodeDecayCloud.

Use it when you want a local merge-safety brief for Codex, Claude Code, Cursor,
desktop agents, or another user-owned agent. CodeDecay provides deterministic
tool evidence; the receiving agent still has to inspect the code and prove fixes
with tests or configured checks.

## Run

```bash
npx codedecay redteam --base main --head HEAD --format markdown
npx codedecay redteam --cwd ../my-repo --format json
npx codedecay redteam --format markdown --output codedecay-redteam.md
```

Exit codes:

- `0`: report generated and risk is below `--fail-on`, if provided.
- `1`: report generated and risk meets `--fail-on`.
- `2`: CLI/internal error, such as invalid git refs or invalid config.
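
A wrapper script can use these codes to tell a risk gate apart from a tool failure. The sketch below is illustrative: the helper names and the merge policy are assumptions, not part of CodeDecay.

```python
# Hypothetical helpers for a CI wrapper; the meanings mirror the list above.
EXIT_MEANINGS = {
    0: "report generated; risk below --fail-on, or no threshold set",
    1: "report generated; risk meets --fail-on",
    2: "CLI or internal error, such as invalid git refs or invalid config",
}


def describe_redteam_exit(code: int) -> str:
    """Translate a `codedecay redteam` exit code into a readable message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def is_risk_gate_failure(code: int) -> bool:
    """True only when the report was generated and the risk threshold was met."""
    return code == 1
```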

## What The Report Includes

- changed files and impacted product/system areas
- concrete route/API impacts when CodeDecay can detect them, such as Next.js
  API routes, Next.js UI routes, Express handlers, or Fastify handlers
- merge-risk and decay-risk scores
- test proof audit status: `missing`, `weak`, `present`, or `not_applicable`
- weak-test and missing-test findings from deterministic test-audit rules
- deterministic missing edge-case checklist
- local memory summary from `.codedecay/memory.json`
- repo-local agent skill summaries from `.agents/skills/*/SKILL.md`
- configured test/build/start/probe commands that are available but not run
- configured Playwright, StrykerJS, Schemathesis, and Pact tool adapters that
  are planned but not run
- fix tasks for your coding agent
- explicit safety flags showing that commands and models were not called

## Agent-Agnostic Workflow

CodeDecay does not replace Codex, Claude Code, Cursor, Pi, OpenCode, desktop
agents, or internal agents. Use it to give those tools better evidence.

Suggested workflow:

1. Run `codedecay redteam --format markdown`.
2. Start with the impacted route/API section and ask what real user/API path
   reaches each changed file.
3. Paste or attach the report to your coding agent.
4. Ask the agent to fix the high-risk findings and add real checks for the
   impacted routes, missing edge cases, and weak-test findings.
5. Run `codedecay analyze`, `codedecay execute`, or `codedecay differential`
   explicitly when you want static analysis, configured checks, or base/head
   behavior probes.

See [Agent skills](skills.md) for the local skill file format.

## Safety Model

`codedecay redteam` lists configured checks and tool adapter plans from
CodeDecay config, but it does not execute them. Command execution remains
explicit through `codedecay execute` and `codedecay differential`, and those
commands still require `safety.allowCommands: true`.

Model use is also opt-in. The redteam MVP does not call Ollama, LiteLLM, cloud
models, or any hosted CodeDecay service.

---

# Agent Task Bundles

Source page: https://SubmuxHQ.github.io/CodeDecay/agent
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/agent.md

`codedecay agent` turns a deterministic redteam report into a task bundle for a
user-owned coding agent.

Use it when you want Codex, Claude Code, Cursor, Pi, OpenCode, a desktop agent,
or another local agent to fix what CodeDecay found without CodeDecay making a
hidden model call.

```bash
npx codedecay agent --base main --head HEAD --format markdown
npx codedecay agent --profile codex --format markdown
npx codedecay agent --cwd ../my-repo --format json --output codedecay-agent.json
```

The bundle includes:

- a copy-paste prompt for any user-owned coding agent
- changed files, impacted areas, and concrete route/API impacts when available
- weak-test and missing-test proof signals
- edge cases to check
- configured checks and tool adapters that are available but not run
- tasks for the coding agent
- repo-local skill summaries
- safety and limitation notes

## Agent Profiles

Profiles only shape the handoff instructions. They do not make CodeDecay call
the selected agent, call an LLM, require API keys, or send code anywhere.

Supported profiles:

- `generic`: portable bundle for any user-owned agent.
- `codex`: handoff wording for a Codex repo session.
- `claude-code`: handoff wording for Claude Code.
- `cursor`: handoff wording for Cursor chat or agent mode.
- `pi`: handoff wording for Pi harness or Pi-compatible agent workflows.
- `opencode`: handoff wording for OpenCode.
- `desktop`: handoff wording for desktop or local agent apps.

Example:

```bash
npx codedecay agent --profile cursor --format markdown --output codedecay-agent.md
```

## How To Use

1. Run `codedecay agent`.
2. Copy the prompt from the `Copy-Paste Prompt` section.
3. Give the prompt and Markdown or JSON output to your agent.
4. Ask the agent to start from impacted routes/APIs and explain what real user,
   API, database, or downstream path could break.
5. Ask the agent to complete the listed tasks with real tests and behavior
   checks.
6. Run CodeDecay again.

Example prompt style:

```text
Use this CodeDecay agent task bundle as tool evidence.
Fix the listed PR risks.
Do not assume the PR is safe because tests pass.
Add or improve tests that exercise real behavior paths.
After changes, tell me what checks to run.
```

For JSON consumers, route/API evidence is available under
`evidence.impactedRoutes`. Treat it as tool evidence for the agent's fix plan:
the agent should map each proposed fix back to the changed file, route/API, weak
test signal, and missing edge case it addresses.
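
A JSON consumer might read that key as follows. Only the `evidence.impactedRoutes` path comes from this page; the function name and the sample entry shape are hypothetical, so entries are treated as opaque values.

```python
import json
from typing import Any, List


def impacted_routes(bundle_json: str) -> List[Any]:
    """Return `evidence.impactedRoutes`, or an empty list if the key is absent."""
    bundle = json.loads(bundle_json)
    return bundle.get("evidence", {}).get("impactedRoutes", [])


# Hypothetical sample; the real entry shape is not specified on this page.
sample = '{"evidence": {"impactedRoutes": [{"route": "/api/users"}]}}'
```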

## Safety

`codedecay agent` is report-only.

It does not:

- call an LLM or hosted model
- execute commands
- send telemetry
- require API keys
- depend on CodeDecayCloud

Agent output is not trusted evidence by itself. Treat the agent's response as a
proposal until it is verified by tests, configured checks, or manual review.

---

# MCP Server

Source page: https://SubmuxHQ.github.io/CodeDecay/mcp
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/mcp.md

CodeDecay can run as a local Model Context Protocol server so agent clients can
ask it for PR risk, impact maps, weak-test audits, and deterministic edge-case
suggestions. It can also run explicitly configured local checks when the caller
confirms execution.

The MCP server calls local CodeDecay analysis only. It does not call an LLM,
does not require API keys, and does not send telemetry. Command execution is
opt-in and limited to commands already present in CodeDecay config.

## Run Locally

```bash
npx @submuxhq/codedecay mcp --cwd /path/to/repo
```

## Example MCP Client Config

Exact config shape varies by client. The important part is that the command
runs CodeDecay locally and passes the repository path with `--cwd`.

```json
{
  "mcpServers": {
    "codedecay": {
      "command": "npx",
      "args": ["-y", "@submuxhq/codedecay", "mcp", "--cwd", "/path/to/repo"]
    }
  }
}
```

## Tools

- `analyze_pr`: returns a Markdown or JSON CodeDecay report.
- `impact_map`: returns changed files, impacted areas, and concrete route/API
  impacts when CodeDecay can detect them.
- `audit_tests`: returns missing-test and weak-test proof findings plus
  recommended checks.
- `suggest_edge_cases`: returns deterministic edge-case suggestions.
- `redteam_report`: returns a deterministic merge-safety report for your agent,
  including impacted areas, weak-test findings, edge cases, configured checks,
  memory summary, fix tasks, and safety flags.
- `agent_task_bundle`: returns a deterministic task bundle that Codex, Claude
  Code, Cursor, Pi, OpenCode, desktop agents, or other MCP-compatible agents can
  use to fix PR risks. It packages a copy-paste prompt, tool evidence, weak-test
  signals, edge cases, suggested checks, skills, and fix tasks. It accepts an
  optional `profile` value: `generic`, `codex`, `claude-code`, `cursor`, `pi`,
  `opencode`, or `desktop`.
- `execute_configured_checks`: runs configured CodeDecay commands, probes, and
  enabled tool adapters. It requires `confirmExecution: true` and
  `safety.allowCommands: true`.

Example execution tool input:

```json
{
  "confirmExecution": true,
  "format": "markdown"
}
```

## Safety

MCP clients should treat tool output as analysis, not as permission to execute
commands. The MCP server does not expose arbitrary command execution.

`redteam_report` is report-only. It does not run configured commands, call
Ollama or cloud models, send telemetry, or require CodeDecayCloud. It may include
local skill summaries from `.agents/skills/*/SKILL.md`, but it does not execute
skill content.

`agent_task_bundle` is also report-only. It uses the same deterministic
CodeDecay evidence as `codedecay agent`, and it does not call the MCP client,
Codex, Claude, Cursor, Ollama, cloud models, or CodeDecayCloud. The receiving
agent should treat the bundle as tool evidence plus instructions. The included
prompt is portable across Codex, Claude Code, Cursor, Pi, OpenCode, desktop
agents, and other MCP clients. The optional `profile` only changes handoff
wording; it does not call or authenticate with that agent. Any proposed fix
still needs verification with tests or configured checks.

`execute_configured_checks` is the only MCP tool that can execute local commands.
It never accepts command text from MCP input. It can only run commands, probes,
and enabled tool adapters (such as Playwright, StrykerJS, Schemathesis, and Pact)
that are defined in the discovered CodeDecay config file, for example
`.codedecay/config.yml` or `codedecay.config.yml`.

Execution requires both:

- MCP input contains `confirmExecution: true`
- CodeDecay config contains `safety.allowCommands: true`

If confirmation is missing, CodeDecay returns a non-executing report. If
`safety.allowCommands` is false, configured checks use the existing skip behavior
and do not run.
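
The two gates can be sketched as a small decision function. The function name and return labels are hypothetical, and checking confirmation before `safety.allowCommands` is an assumption about ordering that only matters when both are false.

```python
def execution_outcome(confirm_execution: bool, allow_commands: bool) -> str:
    """Describe what a call to the execution tool would do under the two gates."""
    if not confirm_execution:
        # Missing confirmation: nothing is executed, only a report is returned.
        return "non-executing report"
    if not allow_commands:
        # Confirmed, but config forbids commands: checks are skipped.
        return "checks skipped"
    return "checks run"
```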

---

# CodeDecay GitHub App

Source page: https://SubmuxHQ.github.io/CodeDecay/github-app
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/github-app.md

The CodeDecay GitHub App is an optional hosted surface for teams that want
CodeDecay to run automatically on pull requests.

The app does not replace the CLI or GitHub Action. The open-source CLI remains
local-first and useful without the hosted app.

## What the app does

For pull request events, the app:

1. receives a GitHub webhook,
2. creates an in-progress CodeDecay check run,
3. checks out the pull request into a temporary directory,
4. runs deterministic CodeDecay analysis,
5. posts or updates one PR comment,
6. completes the check run.

For the first hosted version, the app only runs deterministic PR analysis. It
does not run project or deployment commands, make LLM or model calls, or use
CodeDecayCloud services.

## GitHub App settings

Create a GitHub App in the GitHub organization that will own the hosted app.

Set the webhook URL to:

```text
https://<render-service-host>/github/webhooks
```

Subscribe to these events:

- Pull request

Use these repository permissions:

- Metadata: read-only
- Contents: read-only
- Pull requests: read-only
- Issues: read and write
- Checks: read and write

The Issues permission is required because GitHub PR comments use the Issues API.

## Render deployment

Create a Render Web Service connected to this repository.

Build command:

```bash
pnpm install --frozen-lockfile && pnpm --filter @submuxhq/codedecay-github-app build
```

Start command:

```bash
pnpm --filter @submuxhq/codedecay-github-app start
```

Required environment variables:

```text
GITHUB_APP_ID=<numeric app id>
GITHUB_PRIVATE_KEY=<GitHub App private key PEM>
GITHUB_WEBHOOK_SECRET=<webhook secret configured in GitHub>
NODE_ENV=production
```

Optional environment variables:

```text
PORT=3000
GITHUB_WEBHOOK_PATH=/github/webhooks
```

If the private key is stored with escaped newlines, the service converts `\n`
back to PEM newlines at startup.
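
The conversion amounts to replacing the two-character sequence backslash-`n` with a real newline. A minimal sketch, with a hypothetical function name (this is not the service's actual code):

```python
def normalize_private_key(value: str) -> str:
    """Turn literal backslash-n sequences into real newlines; other text is unchanged."""
    return value.replace("\\n", "\n")
```

A key that already contains real newlines passes through unchanged, so the step is safe to apply in both cases.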

## First staging test

Before installing the app broadly:

1. install the GitHub App on a test repository,
2. open a harmless documentation-only PR,
3. confirm the CodeDecay check run appears,
4. confirm one PR comment is created,
5. push another commit to the PR,
6. confirm the existing CodeDecay comment is updated instead of duplicated.

Do not enable branch protection around the app until the staging PR behavior is
verified.

## Safety boundary

The hosted app intentionally has a narrow v0 boundary:

- no telemetry,
- no LLM or model calls,
- no arbitrary command execution,
- no project test/start/deploy command execution,
- no persisted repository checkout,
- temporary checkout directories are removed after analysis.

Future hosted execution or red-team behavior should require a separate design
and sandboxing review.

---

# Execution Probes

Source page: https://SubmuxHQ.github.io/CodeDecay/execution
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/execution.md

CodeDecay can run explicitly configured project commands, behavior probes, and
tool adapters with `codedecay execute`.

Execution is opt-in. By default, CodeDecay does not run project commands. A repo
must set `safety.allowCommands: true` in CodeDecay config before commands,
probes, or tool adapters execute.

## Run

```bash
npx codedecay execute --format markdown
npx codedecay execute --cwd ../my-repo --format json
npx codedecay execute --cwd ../my-repo --format json --output codedecay-execute.json
```

Exit codes:

- `0`: all configured commands passed, or all commands were safely skipped.
- `1`: one or more configured commands failed, timed out, or errored.
- `2`: CLI/internal error, such as an invalid config file.

## Config

```yaml
version: 1

commands:
  test:
    - pnpm test
  build:
    - pnpm build
  start:
    - pnpm dev

probes:
  - name: users api
    command: curl -f http://localhost:3000/api/users
    timeoutMs: 5000

toolAdapters:
  playwright:
    command: pnpm exec playwright test
  stryker:
    command: pnpm exec stryker run
  schemathesis:
    schema: docs/openapi.yaml
    baseUrl: http://127.0.0.1:3000
  pact:
    command: pnpm run test:pact

safety:
  commandTimeoutMs: 120000
  allowCommands: true
```

CodeDecay supports these configured command groups:

- `commands.test`
- `commands.build`
- `commands.start`
- `probes`
- `toolAdapters.playwright`
- `toolAdapters.stryker`
- `toolAdapters.schemathesis`
- `toolAdapters.pact`

Each command runs from the configured `--cwd` directory. Probe-level
`timeoutMs` overrides the global `safety.commandTimeoutMs`. Tool adapters use
their own configured command and timeout, then return normalized tool evidence
separately from raw command/probe results.
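
The timeout precedence for probes can be sketched as follows. The function name is hypothetical, and the dictionaries stand in for a parsed probe entry and the parsed `safety` section.

```python
from typing import Any, Mapping, Optional


def effective_timeout_ms(probe: Mapping[str, Any],
                         safety: Mapping[str, Any]) -> Optional[int]:
    """Probe-level timeoutMs wins; otherwise fall back to safety.commandTimeoutMs."""
    if probe.get("timeoutMs") is not None:
        return probe["timeoutMs"]
    return safety.get("commandTimeoutMs")
```

With the config above, the `users api` probe would time out after 5000 ms, while a probe without its own `timeoutMs` would use 120000 ms.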

## Safety Rules

- CodeDecay only runs commands from CodeDecay config.
- CodeDecay does not run commands suggested by LLMs, MCP clients, memory files,
  or remote services.
- Command execution is disabled unless `safety.allowCommands` is true.
- Command output is captured locally in the execution report.
- Tool adapter evidence is reported separately from AI suggestions.
- No telemetry, API keys, cloud services, LLMs, or model calls are required.

`commands.start` should use a short-lived smoke command or a low timeout unless
you intentionally want CodeDecay to verify that a long-running service starts
and then times out.

---

# Differential Behavior Checks

Source page: https://SubmuxHQ.github.io/CodeDecay/differential
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/differential.md

`codedecay differential` compares configured probe behavior between two git
refs. It creates temporary worktrees for `--base` and `--head`, runs the same
configured probes in both worktrees, reports behavior differences, and removes
the worktrees afterward.

Differential checks are useful when a PR looks locally tested but may change a
real behavior path outside the touched files.

## Run

```bash
npx codedecay differential --base main --head HEAD --format markdown
npx codedecay differential --cwd ../my-repo --base origin/main --head HEAD --format json
npx codedecay differential --base main --head HEAD --output codedecay-differential.md
```

`--base` and `--head` are required.

Exit codes:

- `0`: configured probes behaved the same, or probes were safely skipped.
- `1`: probe behavior changed, timed out, or hit an execution error.
- `2`: CLI/internal error, such as missing refs or invalid config.

## What It Compares

CodeDecay compares each configured probe by:

- command status
- exit code
- JSON stdout when stdout is valid JSON
- text stdout when stdout is not JSON
- stderr

The report includes base/head status, exit codes, output snippets for changed
or failed probes, and the exact differences detected.
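
The comparison above can be sketched as follows. The result field names (`status`, `exitCode`, `stdout`, `stderr`) and the function names are hypothetical. The sketch shows one reasonable way to compare stdout as JSON when it parses and as text otherwise; it is not CodeDecay's implementation.

```python
import json
from typing import Any, Dict, List, Tuple


def parse_stdout(stdout: str) -> Tuple[str, Any]:
    """Tag stdout as parsed JSON when it is valid JSON, else as raw text."""
    try:
        return ("json", json.loads(stdout))
    except ValueError:
        return ("text", stdout)


def diff_probe(base: Dict[str, Any], head: Dict[str, Any]) -> List[str]:
    """List which compared fields differ between the base and head probe results."""
    differences = []
    for field in ("status", "exitCode", "stderr"):
        if base.get(field) != head.get(field):
            differences.append(field)
    # Comparing parsed JSON ignores key order and whitespace differences.
    if parse_stdout(base.get("stdout", "")) != parse_stdout(head.get("stdout", "")):
        differences.append("stdout")
    return differences
```

Comparing parsed JSON rather than raw text means a probe that only reorders keys or changes whitespace is not reported as a behavior change in this sketch.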

## Config

Differential checks use probes from the current repo config:

```yaml
version: 1

commands: {}

probes:
  - name: users api
    command: node scripts/check-users-api.js
    timeoutMs: 5000

safety:
  commandTimeoutMs: 120000
  allowCommands: true
```

Only `probes` are used by `codedecay differential`. Test, build, and start
commands are handled by `codedecay execute`.

## Safety Model

- Probes must come from CodeDecay config.
- `safety.allowCommands` must be true or probes are skipped.
- Probes run in temporary git worktrees, not by mutating the current checkout.
- Worktrees are removed after the run.
- CodeDecay does not run commands from LLMs, memory files, MCP clients, or
  remote services.
- No telemetry, API keys, cloud services, LLMs, or model calls are required.

---

# Test Proof Audit

Source page: https://SubmuxHQ.github.io/CodeDecay/test-audit
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/test-audit.md

CodeDecay summarizes deterministic test signals into a test proof audit.

The audit asks:

```text
Are the changed tests actually proving the changed behavior will not break?
```

The first implementation is deterministic and uses existing analyzer findings.
It does not run mutation testing, execute commands, call models, or use cloud
services.

## Statuses

- `missing`: changed source behavior does not have nearby changed test proof.
- `weak`: changed tests exist, but deterministic rules found weak proof
  signals.
- `present`: changed tests are present and no deterministic weak-test signals
  were found.
- `not_applicable`: no changed source or test files require a test proof audit.

## Current Signals

The audit consumes existing analyzer findings, including:

- `missing-nearby-tests`
- `test-without-assertions`
- `snapshot-only-test`
- `mocked-changed-source`
- `unrelated-test-change`
- `copied-implementation-in-test`
- `happy-path-only-test`
- `heavy-mocking`
- `test-bloat`

## Future OSS Adapters

Future adapters such as StrykerJS can add stronger mutation-testing evidence to
this audit. They should remain explicit, local-first, and opt-in.

---

# First PR Safety Efficacy Benchmark

Source page: https://SubmuxHQ.github.io/CodeDecay/evals/first-efficacy-report
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/evals/first-efficacy-report.md

This benchmark is a small, deterministic proof that CodeDecay can catch seeded PR risks that ordinary passing tests miss.

It is not a claim that CodeDecay makes every PR safe. It is a regression harness for the product promise: find what a coding agent may have missed before merge.

## How to run

```bash
pnpm eval:pr-safety -- --run-id local-pr-safety-eval
```

Artifacts are written under `.codedecay/local/evals/<run-id>/`.

## Current benchmark result

- Status: passed
- Scenarios: 2
- Issues: 0

## Scenarios

### API/auth regression hidden by copied implementation tests

A coding agent can add tests that mirror the changed implementation while missing the real API authorization regression.

| Signal | Result |
| --- | --- |
| Scenario status | passed |
| Baseline tests | exit 0 |
| Baseline behavior probe | exit 0 |
| Risky weak tests | exit 0 |
| Risky behavior probe | exit 1 |
| CodeDecay risk | high (100/100 merge, 0/100 decay) |
| Test proof status | weak |
| Weak-test findings | 2 |
| Missing-test findings | 0 |

Expected evidence:

- Pass: baseline tests pass
- Pass: baseline behavior probe passes
- Pass: risky weak tests still pass
- Pass: risky behavior probe catches regression
- Pass: CodeDecay reports high risk
- Pass: CodeDecay reports expected impacted areas
- Pass: CodeDecay reports expected finding rules
- Pass: Redteam report classifies test proof correctly
- Pass: Redteam report contains expected weak-test evidence
- Pass: Redteam report contains expected missing-test evidence
- Pass: Redteam report suggests edge cases
- Pass: Redteam edge cases are actionable
- Pass: Redteam report creates fix tasks
- Pass: Redteam fix tasks are actionable

### Config/database runtime regression missed by normal tests

A PR can pass a narrow unit test while changing runtime defaults and database semantics that affect production behavior.

| Signal | Result |
| --- | --- |
| Scenario status | passed |
| Baseline tests | exit 0 |
| Baseline behavior probe | exit 0 |
| Risky weak tests | exit 0 |
| Risky behavior probe | exit 1 |
| CodeDecay risk | high (76/100 merge, 0/100 decay) |
| Test proof status | missing |
| Weak-test findings | 0 |
| Missing-test findings | 1 |

Expected evidence:

- Pass: baseline tests pass
- Pass: baseline behavior probe passes
- Pass: risky weak tests still pass
- Pass: risky behavior probe catches regression
- Pass: CodeDecay reports high risk
- Pass: CodeDecay reports expected impacted areas
- Pass: CodeDecay reports expected finding rules
- Pass: Redteam report classifies test proof correctly
- Pass: Redteam report contains expected weak-test evidence
- Pass: Redteam report contains expected missing-test evidence
- Pass: Redteam report suggests edge cases
- Pass: Redteam edge cases are actionable
- Pass: Redteam report creates fix tasks
- Pass: Redteam fix tasks are actionable
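
Both scenario tables share one exit-code pattern: the baseline is healthy, the weak tests still pass on the risky change, and only the behavior probe fails. A minimal sketch of that check follows. The `ScenarioSignals` shape and the `scenarioDemonstratesGap` name are hypothetical and are not part of the benchmark's actual code.

```ts
// Hypothetical shape for the signals listed in the scenario tables above.
interface ScenarioSignals {
  baselineTests: number; // exit code
  baselineProbe: number; // exit code
  riskyWeakTests: number; // exit code
  riskyProbe: number; // exit code
  riskLevel: "low" | "medium" | "high";
}

// True when the scenario shows a regression that passing tests miss
// but the behavior probe and the risk report both flag.
function scenarioDemonstratesGap(signals: ScenarioSignals): boolean {
  const baselineHealthy = signals.baselineTests === 0 && signals.baselineProbe === 0;
  const testsMissRegression = signals.riskyWeakTests === 0;
  const probeCatchesRegression = signals.riskyProbe !== 0;
  return (
    baselineHealthy &&
    testsMissRegression &&
    probeCatchesRegression &&
    signals.riskLevel === "high"
  );
}
```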

## Safety boundaries

- No telemetry.
- No cloud dependency.
- No API keys.
- No LLM/model calls.
- Fixtures run inside local temporary git repositories.

The benchmark uses deterministic CodeDecay reports plus explicit behavior probes. AI or agent suggestions should be evaluated separately from this tool evidence.

---

# Tool Adapters

Source page: https://SubmuxHQ.github.io/CodeDecay/tool-adapters
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/tool-adapters.md

# Tool Adapters

CodeDecay should use existing open-source tools instead of rebuilding their
capabilities. Tool adapters normalize local tool execution into CodeDecay
harness evidence.

The first adapters are:

- Playwright for browser/user-flow checks.
- StrykerJS for mutation-testing evidence.
- Schemathesis for OpenAPI/GraphQL API fuzzing evidence.
- Pact for contract-testing evidence.

## Configuring Adapters

Adapters are configured in CodeDecay config. `codedecay redteam` lists adapter
plans but does not run them.

```yaml
version: 1

toolAdapters:
  playwright: true
  stryker:
    command: pnpm exec stryker run
  schemathesis:
    schema: docs/openapi.yaml
    baseUrl: http://127.0.0.1:3000
  pact:
    command: pnpm run test:pact

safety:
  allowCommands: false
```

Set `safety.allowCommands: true` only for explicit execution commands. Redteam
reports remain report-only even when adapter plans are configured.

## Playwright Harness

The Playwright harness is a private internal package API for now:

```ts
createPlaywrightHarness({
  command: "pnpm exec playwright test",
  allowCommands: true
});
```

Safety defaults:

- command execution is disabled unless `allowCommands: true` is provided,
- commands go through `@submuxhq/codedecay-execution`,
- unsafe commands are blocked by the shared safety policy,
- Playwright is not installed by CodeDecay,
- browsers are not installed by CodeDecay,
- no telemetry, LLM calls, API keys, or CodeDecayCloud dependency are used.

The default command is:

```bash
pnpm exec playwright test
```

Projects can override the command when they already have their own Playwright
script, shard, config file, or browser setup.

## StrykerJS Harness

The StrykerJS harness is also a private internal package API for now:

```ts
createStrykerHarness({
  command: "pnpm exec stryker run",
  allowCommands: true
});
```

Safety defaults:

- command execution is disabled unless `allowCommands: true` is provided,
- commands go through `@submuxhq/codedecay-execution`,
- unsafe commands are blocked by the shared safety policy,
- StrykerJS is not installed by CodeDecay,
- no telemetry, LLM calls, API keys, or CodeDecayCloud dependency are used.

The default command is:

```bash
pnpm exec stryker run
```

Projects can override the command when they already have their own Stryker
script, mutation score threshold, or package manager setup.

## Schemathesis Harness

The Schemathesis harness is also a private internal package API for now:

```ts
createSchemathesisHarness({
  schema: "openapi.yaml",
  baseUrl: "http://127.0.0.1:3000",
  allowCommands: true
});
```

Safety defaults:

- command execution is disabled unless `allowCommands: true` is provided,
- commands go through `@submuxhq/codedecay-execution`,
- unsafe commands are blocked by the shared safety policy,
- Schemathesis is not installed by CodeDecay,
- API servers are not started by CodeDecay,
- no telemetry, LLM calls, API keys, or CodeDecayCloud dependency are used.

The default command is:

```bash
st run openapi.yaml --url http://127.0.0.1:3000
```

Projects can override the full command when they already use a different
Schemathesis entry point, package manager, schema location, base URL, or
service startup flow:

```ts
createSchemathesisHarness({
  command: "uvx schemathesis run docs/openapi.yaml --url http://127.0.0.1:4000",
  allowCommands: true
});
```
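
The relationship between `schema`, `baseUrl`, and a full `command` override can be sketched as below. This is an illustration only: `resolveSchemathesisCommand` and the `SchemathesisOptions` shape are hypothetical names, and the sketch assumes the values need no shell quoting.

```ts
// Hypothetical options shape mirroring the fields used in the examples above.
interface SchemathesisOptions {
  command?: string;
  schema?: string;
  baseUrl?: string;
}

// A full command override wins; otherwise build the documented default form.
function resolveSchemathesisCommand(options: SchemathesisOptions): string {
  if (options.command) {
    return options.command;
  }
  const schema = options.schema ?? "openapi.yaml";
  const baseUrl = options.baseUrl ?? "http://127.0.0.1:3000";
  return `st run ${schema} --url ${baseUrl}`;
}
```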

## Pact Harness

The Pact harness is also a private internal package API for now:

```ts
createPactHarness({
  command: "pnpm run test:pact",
  allowCommands: true
});
```

Safety defaults:

- command execution is disabled unless `allowCommands: true` is provided,
- commands go through `@submuxhq/codedecay-execution`,
- unsafe commands are blocked by the shared safety policy,
- Pact is not installed by CodeDecay,
- neither Pact Broker nor PactFlow is required by CodeDecay,
- no telemetry, LLM calls, API keys, or CodeDecayCloud dependency are used.

The default command is:

```bash
pnpm run test:pact
```

Projects can override the command when they already have their own Pact
consumer/provider test script, local pact file setup, or broker-backed CI flow.

## Future Adapters

The same package can add adapters for coverage tools and test runners. Each
adapter should use safe configured execution and return evidence rather than
bypassing CodeDecay safety rules.

---

# Agent Skills

Source page: https://SubmuxHQ.github.io/CodeDecay/skills
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/skills.md

# Agent Skills

CodeDecay can load repo-local agent skills from:

```text
.agents/skills/*/SKILL.md
```

Skills are portable review instructions for the developer or their own agent.
They can help Codex, Claude Code, Cursor, MCP clients, desktop agents, or
internal company agents ask better PR-safety questions.

CodeDecay treats skill files as local, untrusted context:

- it does not execute skill content,
- it does not follow arbitrary links from skill files,
- it does not fetch external skills,
- it does not call an LLM,
- it does not send telemetry.

## Example

```text
.agents/skills/pr-red-team/SKILL.md
.agents/skills/test-quality-review/SKILL.md
```

Each skill should start with a Markdown title and a short first paragraph:

```markdown
# PR Red-Team Skill

Find what a coding agent may have missed before merge.
```

`codedecay redteam` includes a compact `Agent Skills` section with the skill
title, path, and summary. Full skill content stays in the repo-local skill file
for the user's agent to read when needed.
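
As an illustration of how a title and summary could be derived from a skill file, here is a minimal sketch. It assumes the summary is the first paragraph after the first `#` heading. The `summarizeSkill` function and `SkillSummary` type are hypothetical and do not describe CodeDecay's actual loader.

```ts
// Hypothetical result shape: the three fields the report section lists.
interface SkillSummary {
  title: string;
  path: string;
  summary: string;
}

// Reads text only. Skill content is never executed and links are not followed.
function summarizeSkill(path: string, markdown: string): SkillSummary {
  const lines = markdown.split(/\r?\n/);
  const titleIndex = lines.findIndex((line) => /^#\s+/.test(line));
  const title = titleIndex >= 0 ? (lines[titleIndex] ?? "").replace(/^#\s+/, "").trim() : "";

  // Collect the first run of non-empty, non-heading lines after the title.
  const paragraph: string[] = [];
  for (const line of lines.slice(titleIndex + 1)) {
    const text = line.trim();
    if (text === "") {
      if (paragraph.length > 0) break;
      continue;
    }
    if (text.startsWith("#")) break;
    paragraph.push(text);
  }
  return { title, path, summary: paragraph.join(" ") };
}
```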

## Current Scope

The first loader only reads `.agents/skills/*/SKILL.md` from the analyzed repo.
Future adapters can map the same concept to other local or user-owned skill
systems, but the OSS default remains local-first.

---

# Local Repo Memory

Source page: https://SubmuxHQ.github.io/CodeDecay/memory
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/memory.md

# Local Repo Memory

CodeDecay can read repo-local memory from `.codedecay/memory.json` and use it
to enrich PR risk reports with project-specific flows, commands, invariants,
architecture notes, and past regressions.

Memory is optional. If no memory file exists, CodeDecay uses empty defaults.
The memory file is local to the repository, is never uploaded by CodeDecay, and
does not require telemetry, API keys, LLMs, model calls, or a hosted service.

## Inspect Memory

```bash
npx codedecay memory --format markdown
npx codedecay memory --cwd ../my-repo --format json
```

`codedecay analyze` automatically applies memory when `.codedecay/memory.json`
exists in the analyzed repository.

## File Format

```json
{
  "version": 1,
  "flows": [
    {
      "name": "Checkout",
      "description": "Customer checkout from cart to payment confirmation.",
      "areas": ["api", "ui"],
      "checks": [
        "failed card retry",
        "missing shipping address",
        "duplicate webhook delivery"
      ]
    }
  ],
  "commands": [
    {
      "name": "Checkout smoke tests",
      "command": "pnpm test checkout",
      "areas": ["api", "ui"]
    }
  ],
  "invariants": [
    {
      "name": "Auth fails closed",
      "description": "Missing or invalid users must not become admins.",
      "areas": ["auth"],
      "severity": "high"
    }
  ],
  "architecture": [
    {
      "title": "Session boundary",
      "note": "Session parsing feeds all API routes.",
      "files": ["src/auth/*"]
    }
  ],
  "regressions": [
    {
      "title": "Anonymous admin fallback",
      "description": "A previous fallback user path granted admin access.",
      "areas": ["auth"],
      "check": "request protected routes without a token",
      "severity": "high"
    }
  ]
}
```

All top-level arrays are optional. Unknown fields are ignored by v1.
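
A minimal sketch of that tolerance, assuming a loader that fills in empty defaults and drops unknown top-level fields. `normalizeMemory` and the `RepoMemory` type are hypothetical names, and entries are left as `unknown` because per-entry validation is out of scope here.

```ts
// Hypothetical normalized shape. Entry types are left loose on purpose.
interface RepoMemory {
  version: 1;
  flows: unknown[];
  commands: unknown[];
  invariants: unknown[];
  architecture: unknown[];
  regressions: unknown[];
}

// Missing or malformed arrays become empty; unknown fields are not copied.
function normalizeMemory(raw: unknown): RepoMemory {
  const source =
    typeof raw === "object" && raw !== null ? (raw as Record<string, unknown>) : {};
  const list = (key: string): unknown[] => {
    const value = source[key];
    return Array.isArray(value) ? value : [];
  };
  return {
    version: 1,
    flows: list("flows"),
    commands: list("commands"),
    invariants: list("invariants"),
    architecture: list("architecture"),
    regressions: list("regressions"),
  };
}
```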

## Matchers

Memory entries can match changed code by impacted area, file path, or both.

Supported `areas` values:

- `api`
- `ui`
- `database`
- `auth`
- `config`
- `test`
- `source`
- `docs`

Supported `files` values are simple path patterns:

- exact path: `src/auth/session.ts`
- contains match: `auth`
- wildcard match: `src/auth/*`
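
The three pattern forms can be sketched as one matcher. This is an illustration under stated assumptions: `*` is taken to match any run of characters, including `/`, and `matchesFilePattern` is a hypothetical name rather than CodeDecay's real function.

```ts
// Matches a changed file path against one memory `files` pattern.
function matchesFilePattern(pattern: string, filePath: string): boolean {
  if (pattern.includes("*")) {
    // Wildcard match: escape regex metacharacters, then turn each `*` into `.*`.
    const escaped = pattern
      .split("*")
      .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
      .join(".*");
    return new RegExp(`^${escaped}$`).test(filePath);
  }
  // Exact path or contains match. An exact match is also a contains match,
  // but both checks are kept to mirror the documented forms.
  return filePath === pattern || filePath.includes(pattern);
}
```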

## Report Behavior

When memory matches a PR, CodeDecay may add:

- findings for impacted invariants
- findings for past regression areas
- findings for matching architecture notes
- recommended checks for flows
- recommended commands from the memory file

CodeDecay does not run memory commands automatically. They are reported as
project-specific checks for the user or future execution adapters.

## Future Adapters

The v1 memory provider is the local `.codedecay/memory.json` file. CodeDecay
formalizes this behind a `MemoryProvider` interface so future adapters can map
the same provider shape to open-source or user-owned memory systems such as
Mem0 or Supermemory, while preserving the local-first default.

Any future hosted or external memory adapter should be opt-in, never required
for `codedecay analyze`, and must not change deterministic baseline scoring.

The built-in provider is:

```text
id: local
name: Local .codedecay memory
kind: local
```

External providers are not enabled by default. They must not add telemetry,
hidden network calls, API key requirements, LLM calls, or CodeDecayCloud
dependencies to the OSS workflow.

---

# LLM Providers

Source page: https://SubmuxHQ.github.io/CodeDecay/llm-providers
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/llm-providers.md

# LLM Providers

CodeDecay is deterministic by default. The default configuration does not call
an LLM, does not require API keys, and does not use a hosted CodeDecay model.

Future or opt-in red-team commands can use user-owned providers for edge-case
reasoning. Model output must be treated as untrusted suggestions, not commands
to execute.

## Disabled By Default

```yaml
llm:
  provider: disabled
  timeoutMs: 30000
```

This is the default when no config file exists.

## Local Ollama

Ollama support is designed for local models running on the user's machine.

```yaml
llm:
  provider: ollama
  model: qwen2.5-coder
  endpoint: http://127.0.0.1:11434
  timeoutMs: 30000
```

CodeDecay should only call this provider from commands that explicitly opt into
LLM assistance. The current deterministic `codedecay analyze` command does not
call an LLM.

## LiteLLM / OpenAI-Compatible BYOK

CodeDecay can construct a LiteLLM/OpenAI-compatible provider for local or BYOK
setups. It does not default to a hosted endpoint; you must provide the endpoint
and model explicitly.

```yaml
llm:
  provider: litellm
  model: gpt-4.1-mini
  endpoint: http://127.0.0.1:4000/v1
  apiKeyEnv: LITELLM_API_KEY
  timeoutMs: 30000
```

`apiKeyEnv` is the name of an environment variable. Do not put literal API keys
in config files.

The provider uses an OpenAI-compatible `/chat/completions` request. Responses
are parsed into untrusted suggestions when possible. CodeDecay must not execute
commands from model output.
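
A minimal sketch of how such a request could be assembled from the config above, with the key looked up by environment variable name. `buildChatRequest` and its types are hypothetical, the single `user` message is an assumption, and the function only builds the request; it does not send it.

```ts
// Hypothetical config shape mirroring the YAML fields above.
interface LlmConfig {
  provider: "litellm";
  model: string;
  endpoint: string;
  apiKeyEnv?: string;
  timeoutMs: number;
}

interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// `env` is passed in so the key is read by variable name, never from config.
function buildChatRequest(
  config: LlmConfig,
  prompt: string,
  env: Record<string, string | undefined>
): ChatRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  const apiKey = config.apiKeyEnv ? env[config.apiKeyEnv] : undefined;
  if (apiKey) {
    headers["Authorization"] = `Bearer ${apiKey}`;
  }
  return {
    url: `${config.endpoint.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers,
      body: JSON.stringify({
        model: config.model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

In Node, `process.env` can be passed as `env`, and the returned `url` and `init` can be handed to `fetch`. Whatever the model returns should still be treated as an untrusted suggestion.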

## Future Providers

The provider interface leaves room for additional adapters later. Those adapters
should remain optional and must not change the default local-first behavior.

---

# Sample CodeDecay Reports

Source page: https://SubmuxHQ.github.io/CodeDecay/sample-reports/
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/sample-reports/index.md

# Sample CodeDecay Reports

These reports show the output CodeDecay produces for a realistic JavaScript and
TypeScript pull request.

The sample diff changes:

- a UI route: `app/dashboard/page.tsx`
- an API route: `src/api/users.ts`
- an auth/session file: `src/auth/session.ts`
- a Prisma schema file: `prisma/schema.prisma`
- a build/runtime config file: `vite.config.ts`

The reports were generated with the local CodeDecay CLI from a temporary git
repository containing those changes.

## What To Read First

Start with the Markdown report:

- [sample-report.md](/sample-reports/sample-report)

Look at these sections first:

- **Overall risk**: the high-level merge risk and decay scores.
- **Likely Impacted Areas**: the app surfaces CodeDecay thinks may be affected.
- **Likely Impacted Routes And APIs**: concrete user/API paths to verify when
  framework-aware route evidence is available.
- **High Risk Findings**: the findings most likely to need reviewer attention.
- **Recommended Checks**: tests or manual checks to run before merge. Prefer
  checks that exercise the real route, API, UI, database, or downstream path
  instead of only helper-level behavior.

For automation and integrations:

- <a href="./sample-report.json">sample-report.json</a> is the stable machine-readable report.
- <a href="./sample-report.sarif">sample-report.sarif</a> is the code-scanning-oriented report.

CodeDecay generated these reports locally without telemetry, API keys, LLMs, or
model calls.

---

# Sample CodeDecay Markdown Report

Source page: https://SubmuxHQ.github.io/CodeDecay/sample-reports/sample-report
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/sample-reports/sample-report.md

## CodeDecay Report

**Overall risk:** High

| Score | Value |
| --- | ---: |
| Merge risk | 100/100 |
| Decay risk | 62/100 |

| Findings | Count |
| --- | ---: |
| High | 5 |
| Medium | 4 |
| Low | 0 |

### Changed Files

- `app/dashboard/page.tsx` modified (+1/-1)
- `prisma/schema.prisma` modified (+3/-1)
- `src/api/users.ts` modified (+5/-1)
- `src/auth/session.ts` modified (+6/-1)
- `vite.config.ts` modified (+4/-1)

### Likely Impacted Areas

- High **API surface** (api): `src/api/users.ts`
- High **Authentication and authorization** (auth): `src/auth/session.ts`
- High **Database and schema** (database): `prisma/schema.prisma`
- Medium **Build and runtime configuration** (config): `vite.config.ts`
- Medium **UI route** (ui): `app/dashboard/page.tsx`

### Likely Impacted Routes And APIs

- Medium `/dashboard` (Next.js UI route): `app/dashboard/page.tsx`

### High Risk Findings

- **Risky source changes without changed tests** (`app/dashboard/page.tsx:2`): This PR changes risky source areas but does not change any obvious test files.
- **Api area changed** (`src/api/users.ts:1`): src/api/users.ts touches a api area and should be reviewed for regression impact.
- **Auth area changed** (`src/auth/session.ts:2`): src/auth/session.ts touches a auth area and should be reviewed for regression impact.
- **Database area changed** (`prisma/schema.prisma:2`): prisma/schema.prisma touches a database area and should be reviewed for regression impact.
- **Potential silent failure path** (`src/auth/session.ts:5`): src/auth/session.ts adds code that can hide type, lint, or runtime failures.

### Medium Risk Findings

- **Broad unrelated change set**: This PR changes 5 files across 4 top-level areas and 5 risk categories.
- **Config area changed** (`vite.config.ts:1`): vite.config.ts touches a config area and should be reviewed for regression impact.
- **Ui area changed** (`app/dashboard/page.tsx:2`): app/dashboard/page.tsx touches a ui area and should be reviewed for regression impact.
- **New unchecked TypeScript escape hatch** (`src/api/users.ts:1`): src/api/users.ts adds code that can hide type, lint, or runtime failures.

### Recommended Checks

- `Add or run tests covering app/dashboard/page.tsx`
- `Add or run tests covering src/api/users.ts`
- `Add or run tests covering src/auth/session.ts`
- `Add or run tests covering vite.config.ts`

### Notes

CodeDecay is deterministic and local-first. This report was generated without telemetry, API keys, LLMs, or model calls.

---

# Scoring Model

Source page: https://SubmuxHQ.github.io/CodeDecay/scoring
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/scoring.md

# Scoring Model

CodeDecay produces two scores from 0 to 100.

## Merge Risk

Merge risk estimates how likely the PR is to break behavior that reviewers or CI
should care about before merge.

Signals include:

- API route changes
- auth/session/security changes
- database/schema changes
- config/build/deployment changes
- risky source changes without nearby test changes
- heavy mocking that may weaken regression confidence

## Decay Score

Decay score estimates how much the PR makes the codebase harder to maintain.

Signals include:

- duplicated added logic
- large changed functions
- high function complexity
- compiler or linter suppressions
- unchecked TypeScript escape hatches
- broad unrelated change sets
- large test changes weakly connected to source changes

## Thresholds

- `0-39`: low
- `40-69`: medium
- `70-100`: high

Scores are capped by the highest relevant finding severity. A report with only
low-severity merge-risk findings stays low, even if many low findings are
present. A report with only medium-severity merge-risk findings stays at most
medium. High risk requires high-severity evidence.
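
The thresholds and the severity cap can be sketched as two small functions. This is an illustration, not the actual scoring code: `levelForScore` and `capScore` are hypothetical names, the cap values are taken from the top of each threshold band, and returning 0 when there are no findings is an assumption.

```ts
type Severity = "low" | "medium" | "high";

// Highest score each severity can reach, taken from the threshold bands.
const CAPS: Record<Severity, number> = { low: 39, medium: 69, high: 100 };
const RANK: Record<Severity, number> = { low: 0, medium: 1, high: 2 };

// Maps a 0-100 score onto the documented bands.
function levelForScore(score: number): Severity {
  if (score >= 70) return "high";
  if (score >= 40) return "medium";
  return "low";
}

// Caps a raw score by the highest severity among the relevant findings.
function capScore(rawScore: number, severities: Severity[]): number {
  if (severities.length === 0) return 0; // assumption: no findings, no risk
  const highest = severities.reduce((a, b) => (RANK[b] > RANK[a] ? b : a));
  const clamped = Math.max(0, Math.min(100, rawScore));
  return Math.min(clamped, CAPS[highest]);
}
```

With these definitions, a raw score of 85 built only from low-severity findings is capped at 39 and stays in the low band.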

The v1 scoring model is deterministic. The same diff should produce the same
score.

## No LLM Required

CodeDecay does not call a model to decide risk. It uses git diff data,
path-based impact detection, local JS/TS source analysis, and deterministic
rules.

---

# Research Basis

Source page: https://SubmuxHQ.github.io/CodeDecay/research
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/research.md

# Research Basis

CodeDecay is motivated by research on software evolution, pull request impact,
and AI-era code quality risks.

## SlopCodeBench

SlopCodeBench studies how coding agents degrade over long-horizon iterative
tasks. It tracks verbosity and structural erosion, including duplicated code and
complexity concentration in high-complexity functions.

Reference:
https://arxiv.org/html/2603.24755v1

## Does Code Decay?

“Does Code Decay? Assessing the Evidence from Change Management Data” connects
software evolution data to code decay and maintenance risk.

References:

- https://www.niss.org/sites/default/files/technicalreports/tr81.pdf
- https://dl.acm.org/doi/10.1109/32.895984

## Pull Request Change Impact

Pull request change impact research supports using code structure and changed
artifact relationships to improve review focus and risk awareness.

Reference:
https://link.springer.com/article/10.1007/s10664-024-10600-2

## AI Code Quality, Churn, And Duplication

AI-assisted development can increase code volume, churn, and duplicated code
when teams optimize only for immediate output. CodeDecay turns those concerns
into local PR checks.

Reference:
https://www.gitclear.com/ai_assistant_code_quality_2025_research

---

# Releasing

Source page: https://SubmuxHQ.github.io/CodeDecay/releasing
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/releasing.md

# Releasing

CodeDecay publishes one npm package for v1:

```text
@submuxhq/codedecay
```

The package source is `packages/cli`, and the installed binary remains
`codedecay`.

CodeDecay also publishes an optional GitHub Packages npm mirror with the same
package name. GitHub Packages scopes packages by GitHub user or organization
owner, and this repository is owned by `SubmuxHQ`, so every release from v0.2.0
onward uses `@submuxhq/codedecay` for both npmjs and the GitHub Packages mirror.

npmjs is the default public install path for users:

```bash
npm install -D @submuxhq/codedecay
```

GitHub Packages is an authenticated mirror for GitHub-based workflows and
requires registry authentication for installs.

## Patch Release Checklist

Before opening the release PR, bump the published version in:

- `packages/cli/package.json`
- `packages/core/src/index.ts`

After the release PR is merged, release only from a clean `main` branch at the
commit that will be tagged and published. Do not publish npm contents from a
different commit than the Git tag.

Run:

```bash
pnpm install
pnpm run lint
pnpm typecheck
pnpm test
pnpm build
pnpm --filter @submuxhq/codedecay pack --dry-run
```

Inspect the tarball before publishing:

```bash
pnpm --filter @submuxhq/codedecay pack
tar -tzf submuxhq-codedecay-<version>.tgz
```

The tarball must include:

```text
package/LICENSE
package/README.md
package/package.json
package/dist/index.js
package/dist/index.d.ts
```

Before publishing, run the installed-package smoke test against the packed tarball:

```bash
pnpm demo:published-package --tarball ./submuxhq-codedecay-<version>.tgz --run-id v<version>-tarball-smoke
```

This creates a fresh temp install, materializes the Next.js and Node API demo
repos, runs the installed `codedecay` binary, verifies JSON/Markdown/SARIF
outputs, and writes logs under:

```text
.codedecay/local/published-package-demo/<run-id>/run.json
.codedecay/local/published-package-demo/<run-id>/summary.md
```

Publish the scoped package with public access:

```bash
pnpm --filter @submuxhq/codedecay publish --access public
```

If npm requires a one-time password in a non-interactive shell, publish from the
package directory:

```bash
cd packages/cli
npm publish --access public --otp <otp>
```

After publishing, verify the public install path:

```bash
tmpdir=$(mktemp -d)
cd "$tmpdir"
npm install @submuxhq/codedecay@<version>
node_modules/.bin/codedecay --help
```

After publishing, run the same smoke test against the registry package:

```bash
pnpm demo:published-package --package @submuxhq/codedecay@<version> --run-id v<version>-published-smoke
```

Create the GitHub release for the same tag and verify the release surfaces stay
in sync:

```bash
git show --no-patch --decorate --oneline v<version>
gh release view v<version>
npm view @submuxhq/codedecay version dist-tags --json
```

The package version, npm `latest` dist-tag, Git tag, and GitHub release should
all refer to the same released version before the release is considered done.

## GitHub Packages Mirror

The GitHub Packages mirror is published from the same built CLI package. It uses
the same package name, `@submuxhq/codedecay`, and sets the registry to
`https://npm.pkg.github.com`.

Use npmjs for public end-user installs. Use GitHub Packages only when a workflow
or organization policy specifically needs a GitHub-hosted package mirror.

Prepare the mirror package locally after `pnpm build:packages`:

```bash
pnpm package:github --out /tmp/codedecay-ghpkg
cd /tmp/codedecay-ghpkg
npm pack --dry-run
```

The dry run should include:

```text
package/LICENSE
package/README.md
package/package.json
package/dist/index.js
package/dist/index.d.ts
```

Publish through the `Publish GitHub Packages` workflow. It uses the repository
`GITHUB_TOKEN` with `packages: write` permission and skips publishing if the
same mirror version already exists.

Use the exact release tag as the workflow `ref`. Do not publish from `main`
after unreleased commits have landed, because that can create a GitHub Packages
version whose contents differ from the npmjs package with the same version.

Manual dispatch:

```bash
gh workflow run publish-github-packages.yml -f ref=v<version>
```

Install from GitHub Packages by adding the GitHub owner scope and an
authenticated token to `.npmrc`:

```text
@submuxhq:registry=https://npm.pkg.github.com
//npm.pkg.github.com/:_authToken=<classic-pat-with-read:packages>
```

Then install:

```bash
npm install -D @submuxhq/codedecay@<version>
node_modules/.bin/codedecay version
```

GitHub Packages requires a personal access token (classic) with `read:packages`
for local installs. GitHub Actions can use `GITHUB_TOKEN` when the package is
associated with this repository and the workflow has package access.

If local verification fails with `403 permission_denied`, check the token scope
before changing package metadata. The default public npmjs install path should
still work without GitHub authentication:

```bash
npm install -D @submuxhq/codedecay
```

---

# Framework-Aware Route/API Impact Map

Source page: https://SubmuxHQ.github.io/CodeDecay/proposals/framework-aware-impact-map
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/proposals/framework-aware-impact-map.md

# Framework-Aware Route/API Impact Map

Status: implementation started

Proposal issue: [#35](https://github.com/SubmuxHQ/CodeDecay/issues/35)
Implementation issue: [#144](https://github.com/SubmuxHQ/CodeDecay/issues/144)

## Goal

CodeDecay should make regression risk more actionable by translating changed
JavaScript and TypeScript files into affected routes, API endpoints, and request
surfaces where that can be done deterministically.

The first implementation should focus on Next.js and Node API projects because
they are common adoption paths for CodeDecay and are already represented by the
example projects.

## Non-Goals

- No LLM, model, cloud, telemetry, or API-key dependency.
- No runtime tracing or server startup.
- No attempt to prove whether a change was AI-generated.
- No broad generic code review comments.
- No scoring change until the extracted impact map is covered by fixtures and
  report tests.

## Proposed Report Field

Add an optional top-level `impactedRoutes` field to the JSON report. Each entry has this shape:

```ts
interface ImpactedRoute {
  framework: "nextjs" | "express" | "fastify" | "node";
  kind: "ui-route" | "api-route" | "middleware" | "route-handler";
  route: string;
  methods: string[];
  files: string[];
  risk: "low" | "medium" | "high";
  reasons: string[];
  recommendedTests: string[];
}
```

Example:

```json
{
  "impactedRoutes": [
    {
      "framework": "nextjs",
      "kind": "api-route",
      "route": "/api/users",
      "methods": ["GET"],
      "files": ["src/app/api/users/route.ts"],
      "risk": "high",
      "reasons": ["API route changed", "No nearby test changed"],
      "recommendedTests": ["Add or run tests covering src/app/api/users/route.ts"]
    }
  ]
}
```

Markdown reports should add a compact section after `Likely Impacted Areas`:

```markdown
### Likely Impacted Routes And APIs

- High `GET /api/users` (Next.js API route): `src/app/api/users/route.ts`
- Medium `/dashboard` (Next.js UI route): `src/app/dashboard/page.tsx`
```
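
One way to render a single `ImpactedRoute` into that bullet format is sketched below. The labels for `nextjs`, `api-route`, and `ui-route` come from the example above. The other labels, the `|` separator for multiple methods, and the `renderImpactedRoute` name are assumptions.

```ts
interface ImpactedRoute {
  framework: "nextjs" | "express" | "fastify" | "node";
  kind: "ui-route" | "api-route" | "middleware" | "route-handler";
  route: string;
  methods: string[];
  files: string[];
  risk: "low" | "medium" | "high";
  reasons: string[];
  recommendedTests: string[];
}

// Only the Next.js label is confirmed by the example; the rest are guesses.
const FRAMEWORK_LABELS: Record<ImpactedRoute["framework"], string> = {
  nextjs: "Next.js",
  express: "Express",
  fastify: "Fastify",
  node: "Node",
};

// "UI route" and "API route" are confirmed; the other two are guesses.
const KIND_LABELS: Record<ImpactedRoute["kind"], string> = {
  "ui-route": "UI route",
  "api-route": "API route",
  middleware: "middleware",
  "route-handler": "route handler",
};

function renderImpactedRoute(route: ImpactedRoute): string {
  const risk = route.risk.charAt(0).toUpperCase() + route.risk.slice(1);
  // UI routes carry no methods, so only the path is shown.
  const target =
    route.methods.length > 0 ? `${route.methods.join("|")} ${route.route}` : route.route;
  const files = route.files.map((file) => `\`${file}\``).join(", ");
  const label = `${FRAMEWORK_LABELS[route.framework]} ${KIND_LABELS[route.kind]}`;
  return `- ${risk} \`${target}\` (${label}): ${files}`;
}
```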

SARIF should stay minimal for now. It can continue emitting file/line findings;
route data can be added later through SARIF `properties` only if GitHub code
scanning handles it cleanly.

## Supported Patterns

### Next.js

Supported first:

- `app/**/page.{js,jsx,ts,tsx}` -> UI route
- `src/app/**/page.{js,jsx,ts,tsx}` -> UI route
- `app/api/**/route.{js,ts}` -> API route
- `src/app/api/**/route.{js,ts}` -> API route
- `pages/api/**/*.{js,ts}` -> API route
- `src/pages/api/**/*.{js,ts}` -> API route
- `middleware.{js,ts}` and `src/middleware.{js,ts}` -> middleware

Route normalization:

- Remove `src/`, `app/`, and `pages/` prefixes.
- Remove `page`, `route`, and file extensions.
- Convert route groups like `(admin)` to no path segment.
- Preserve dynamic segments such as `[id]` and `[...slug]`.
- Convert `index` pages to the parent route.
- For `app/api/users/route.ts`, report `/api/users`.
- For `app/dashboard/page.tsx`, report `/dashboard`.
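
The normalization rules above can be sketched as a single function. This is a minimal illustration for page, route, and Pages Router files only; it does not handle `middleware` files, and `normalizeNextRoute` is a hypothetical name.

```ts
// Turns a Next.js file path into the route it serves, per the rules above.
function normalizeNextRoute(filePath: string): string {
  // Remove `src/`, then the `app/` or `pages/` prefix.
  const withoutPrefix = filePath.replace(/^src\//, "").replace(/^(app|pages)\//, "");
  // Remove the file extension.
  const withoutExtension = withoutPrefix.replace(/\.(js|jsx|ts|tsx)$/, "");

  const segments = withoutExtension.split("/");
  // Drop the trailing `page`, `route`, or `index` file name.
  const last = segments[segments.length - 1];
  if (last === "page" || last === "route" || last === "index") {
    segments.pop();
  }
  // Route groups like `(admin)` contribute no path segment.
  // Dynamic segments such as `[id]` and `[...slug]` are preserved as-is.
  const visible = segments.filter((segment) => !/^\(.*\)$/.test(segment));
  return "/" + visible.join("/");
}
```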

HTTP methods:

- For Next.js route handlers, detect exported functions named `GET`, `POST`,
  `PUT`, `PATCH`, `DELETE`, `HEAD`, or `OPTIONS`.
- If no method is found, use `["*"]`.
- UI routes should use an empty method list.
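
A rough sketch of the method detection for route handlers follows. It uses regular expressions instead of an AST for brevity, so it can be fooled by comments or strings. Treating `export const GET = ...` as a match is an assumption beyond the "exported functions" wording, and `detectRouteMethods` is a hypothetical name.

```ts
const HTTP_METHODS = ["GET", "POST", "PUT", "PATCH", "DELETE", "HEAD", "OPTIONS"] as const;

// Returns the exported HTTP method handlers found in a route file,
// or ["*"] when none can be detected.
function detectRouteMethods(source: string): string[] {
  const found = HTTP_METHODS.filter((method) => {
    const pattern = new RegExp(
      `export\\s+(?:async\\s+)?function\\s+${method}\\b|export\\s+const\\s+${method}\\b`
    );
    return pattern.test(source);
  });
  return found.length > 0 ? [...found] : ["*"];
}
```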

### Express

Supported first:

- `app.get("/path", ...)`
- `app.post("/path", ...)`
- `router.get("/path", ...)`
- `router.post("/path", ...)`
- equivalent `put`, `patch`, `delete`, `head`, and `options`

File patterns to inspect:

- `src/routes/**/*.{js,ts}`
- `src/api/**/*.{js,ts}`
- `src/controllers/**/*.{js,ts}`
- `routes/**/*.{js,ts}`
- `api/**/*.{js,ts}`
- `server.{js,ts}`
- `app.{js,ts}`

The extractor should use AST parsing where practical and fall back to simple
literal-string matching only for route call expressions. It should not execute
application code.
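
The literal-string fallback for Express route calls could look like the sketch below. It is not the AST path, so it also matches calls inside comments. It requires the path to start with `/` so that settings lookups such as `app.get("port")` are skipped; that rule is an assumption. `extractExpressRoutes` is a hypothetical name.

```ts
interface ExtractedRoute {
  method: string;
  route: string;
}

// Scans source text for `app.<method>("/path"` and `router.<method>("/path"`
// without executing any application code.
function extractExpressRoutes(source: string): ExtractedRoute[] {
  const pattern =
    /\b(?:app|router)\.(get|post|put|patch|delete|head|options)\(\s*(["'])(\/.*?)\2/g;
  const routes: ExtractedRoute[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    routes.push({
      method: (match[1] ?? "").toUpperCase(),
      route: match[3] ?? "",
    });
  }
  return routes;
}
```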

### Fastify

Supported first:

- `fastify.get("/path", ...)`
- `fastify.post("/path", ...)`
- `server.get("/path", ...)`
- `server.route({ method: "GET", url: "/path" })`
- array methods in route objects, for example `method: ["GET", "POST"]`

Use the same file patterns as Express.

## Risk Mapping

Route/API impact risk should derive from existing deterministic signals:

- API route changed -> high
- auth/session/security file changed and route imports or lives near auth code
  -> high
- database/schema file changed and route imports DB/model code -> high
- UI route changed -> medium
- middleware changed -> high
- route changed with no nearby tests -> add reason, do not duplicate the
  existing `missing-nearby-tests` finding

The first implementation should not add new score weights. It should make the
report more specific while preserving the current scoring behavior.
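
The mapping above can be sketched as a pure function that returns a risk level and reasons without touching any score. This is an illustration: the `RouteSignals` shape and `assessRoute` name are hypothetical, the `route-handler` kind is left out because the list does not cover it, and only the "API route changed" and "No nearby test changed" reason strings come from the example report.

```ts
type Risk = "low" | "medium" | "high";

// Hypothetical inputs, assumed to be computed by earlier deterministic steps.
interface RouteSignals {
  kind: "ui-route" | "api-route" | "middleware";
  touchesAuth: boolean; // route imports or lives near changed auth code
  touchesDatabase: boolean; // route imports changed DB/model code
  nearbyTestChanged: boolean;
}

function assessRoute(signals: RouteSignals): { risk: Risk; reasons: string[] } {
  const reasons: string[] = [];
  let risk: Risk = signals.kind === "ui-route" ? "medium" : "high";

  if (signals.kind === "api-route") reasons.push("API route changed");
  if (signals.kind === "middleware") reasons.push("Middleware changed");
  if (signals.kind === "ui-route") reasons.push("UI route changed");

  if (signals.touchesAuth) {
    risk = "high";
    reasons.push("Related auth code changed");
  }
  if (signals.touchesDatabase) {
    risk = "high";
    reasons.push("Related database code changed");
  }
  // Adds a reason only; the existing missing-nearby-tests finding is not duplicated.
  if (!signals.nearbyTestChanged) reasons.push("No nearby test changed");

  return { risk, reasons };
}
```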

## Required Tests

Analyzer fixtures:

- Next.js App Router UI route: `src/app/dashboard/page.tsx`
- Next.js App Router API route with exported `GET`
- Next.js dynamic route: `src/app/users/[id]/page.tsx`
- Next.js route group: `src/app/(admin)/dashboard/page.tsx`
- Next.js Pages API route: `src/pages/api/users.ts`
- Express router method calls for `GET` and `POST`
- Fastify shorthand calls and `server.route({ method, url })`
- No false route for non-route utility files

Report tests:

- JSON includes `impactedRoutes` when present.
- Markdown renders the route/API impact section.
- SARIF remains valid when route impact data exists.

CLI tests:

- Existing CLI output remains backward compatible.
- Snapshot or assertion covers a fixture PR with route impact data.

Example fixtures:

- Extend `examples/nextjs-risk-demo` expected summary once implemented.
- Extend `examples/node-api-risk-demo` expected summary once implemented.

## Implementation Plan

1. Add optional `impactedRoutes` types to `packages/core`.
2. Render `impactedRoutes` in JSON and Markdown reports.
3. Add Next.js deterministic route extraction in `packages/analyzer-js`.
4. Add Express and Fastify route extraction in `packages/analyzer-js`.
5. Add fixtures and tests before changing any scoring behavior.
6. Update sample reports and example README summaries.

## Open Questions

- Should dynamic routes stay as framework-native paths like `/users/[id]`, or
  should CodeDecay normalize them to `/users/:id`?
- Should route impact eventually influence scoring, or remain explanatory only?
- Should imports be analyzed deeply enough to connect DB/auth changes to routes,
  or should v1 keep that relationship path-based?
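
To make the first question concrete, here is a minimal sketch of both options for Next.js App Router paths. The function name `appRouterFileToRoute` and the `RouteStyle` type are hypothetical; the sketch handles only `page` and `route` files, drops route groups such as `(admin)`, and leaves catch-all segments like `[...slug]` untouched.

```ts
type RouteStyle = "native" | "normalized";

// Maps an App Router file path to a route path, or null for non-route files.
function appRouterFileToRoute(filePath: string, style: RouteStyle): string | null {
  const match = /^(?:src\/)?app\/(.*)$/.exec(filePath);
  if (!match) return null;

  const parts = (match[1] ?? "").split("/");
  const file = parts.pop();
  if (file === undefined || !/^(page|route)\.(tsx?|jsx?)$/.test(file)) return null;

  const segments = parts
    // Route groups like "(admin)" organize files but do not appear in the URL.
    .filter((segment) => !/^\(.+\)$/.test(segment))
    // Optionally rewrite "[id]" to ":id"; other bracket forms are left as-is.
    .map((segment) =>
      style === "normalized" ? segment.replace(/^\[([^\].]+)\]$/, ":$1") : segment,
    );

  return "/" + segments.join("/");
}
```

With the fixtures listed under Required Tests, `src/app/users/[id]/page.tsx` yields `/users/[id]` in `native` style and `/users/:id` in `normalized` style, while `src/app/(admin)/dashboard/page.tsx` yields `/dashboard` in both.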

---

# RFC 0001: Agent-Agnostic Redteam Harness Architecture

Source page: https://SubmuxHQ.github.io/CodeDecay/rfcs/0001-agent-agnostic-redteam-harness
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/rfcs/0001-agent-agnostic-redteam-harness.md

# RFC 0001: Agent-Agnostic Redteam Harness Architecture

Status: proposed

## Summary

CodeDecay should evolve from deterministic PR risk analysis into an
agent-agnostic PR safety harness for AI-assisted development.

CodeDecay should not replace Codex, Claude Code, Cursor, Pi, OpenCode, desktop
agents, or internal company agents. It should give those agents better evidence,
skills, memory, execution results, weak-test findings, and merge-safety reports.

Positioning:

```text
CodeDecay is an open-source PR safety harness that works with your existing AI
agents and open-source tools.

Use Codex, Claude Code, Cursor, Pi, OpenCode, desktop agents, or any
MCP-compatible workflow. CodeDecay orchestrates evidence, memory, skills, test
audits, and runtime checks so your agent can find what it missed before merge.
```

Core question:

```text
What could this PR break, and are the tests actually proving it will not?
```

## Product Boundaries

CodeDecay owns:

- PR orchestration
- normalized evidence
- safety policy
- impact mapping
- test-quality audit
- tool-adapter coordination
- merge-safety reporting
- fix-task generation for coding agents

CodeDecay does not own:

- the user's main coding agent
- hosted model inference
- mandatory cloud memory
- every test runner or fuzzing engine
- every language parser

The default OSS experience must remain:

- local-first
- deterministic baseline
- no telemetry
- no required API keys
- no required LLM/model calls
- no required CodeDecayCloud

## Architecture

```text
User agent / IDE / harness
  Codex | Claude Code | Cursor | Pi | OpenCode | Desktop app | Custom MCP client
        |
        v
CodeDecay CLI / MCP / GitHub Action / GitHub App
        |
        v
redteam orchestrator
        |
        +--> git diff and current deterministic analyzer
        +--> impact map
        +--> memory providers
        +--> skill selection
        +--> agent/harness adapters
        +--> safe execution
        +--> OSS tool adapters
        +--> test audit
        +--> base/head differential checks
        |
        v
merge-safety report + evidence bundle + fix tasks
        |
        v
back to the user's agent for implementation
```

## `codedecay redteam` Flow

1. Analyze the PR diff with the existing deterministic engine.
2. Build an impact map for routes, APIs, UI flows, jobs, database/schema, auth,
   and config.
3. Load repo memory from `.codedecay/memory.json` and optional user-owned memory
   providers.
4. Select relevant review skills.
5. Ask an optional user-owned agent or harness for missed risks and edge cases.
6. Run selected open-source tools through safe adapters.
7. Audit changed and nearby tests for weak or fake confidence.
8. Generate an edge-case checklist grounded in impacted areas.
9. Compare base/head behavior through configured probes when explicitly enabled.
10. Produce a merge-safety report that separates tool evidence from AI
    suggestions.
11. Generate fix tasks for Codex, Claude Code, Cursor, Pi, OpenCode, or any
    MCP-compatible agent.

## Module Plan

### `packages/redteam`

Purpose: owns the end-to-end `codedecay redteam` orchestration.

Inputs:

- cwd
- base/head refs
- CodeDecay config
- selected adapters
- optional agent/harness provider

Outputs:

- merge-safety report
- normalized evidence bundle
- fix tasks

Public API:

```ts
interface RedteamInput {
  cwd: string;
  base?: string;
  head?: string;
  mode: "deterministic" | "assisted";
}

interface RedteamResult {
  report: MergeSafetyReport;
  evidence: Evidence[];
  fixTasks: FixTask[];
}
```

Safety:

- deterministic mode must work without agents or models
- no command execution unless explicitly configured
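
These two rules imply that some steps of the `codedecay redteam` flow are skipped unless the user opts in. A rough sketch of that gating follows; the step names and the `StepGates` flags are assumptions made for illustration, since the RFC does not define config names for them.

```ts
type RedteamMode = "deterministic" | "assisted";

// Hypothetical opt-in flags; the RFC does not name these settings.
interface StepGates {
  mode: RedteamMode;
  toolCommandsConfigured: boolean;
  differentialProbesEnabled: boolean;
}

type RedteamStep =
  | "analyze-diff"
  | "impact-map"
  | "load-memory"
  | "select-skills"
  | "ask-agent"
  | "run-tools"
  | "audit-tests"
  | "edge-case-checklist"
  | "differential-probes"
  | "report"
  | "fix-tasks";

function planSteps(gates: StepGates): RedteamStep[] {
  const steps: RedteamStep[] = ["analyze-diff", "impact-map", "load-memory", "select-skills"];

  // Agents and models are consulted only in assisted mode.
  if (gates.mode === "assisted") steps.push("ask-agent");
  // Commands run only when the user configured them explicitly.
  if (gates.toolCommandsConfigured) steps.push("run-tools");

  steps.push("audit-tests", "edge-case-checklist");

  if (gates.differentialProbesEnabled) steps.push("differential-probes");

  steps.push("report", "fix-tasks");
  return steps;
}
```

In deterministic mode with nothing configured, the plan contains only static steps, which matches the requirement that the baseline works without agents, models, or command execution.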

MVP:

- compose existing analyze, memory, config, and Markdown report data into a
  redteam-shaped report

### `packages/harness`

Purpose: registry and interface for agent and tool harnesses.

Inputs:

- redteam task
- repo context
- selected skills
- evidence collected so far

Outputs:

- harness plan
- harness run result
- normalized evidence
- summary

Public API:

```ts
interface CodeDecayHarness {
  name: string;
  capabilities: HarnessCapability[];
  requiredConfig: ConfigRequirement[];

  plan(input: HarnessPlanInput): Promise<HarnessPlan>;
  run(plan: HarnessPlan, context: HarnessRunContext): Promise<HarnessRunResult>;
  collectEvidence(result: HarnessRunResult): Promise<Evidence[]>;
  summarize(evidence: Evidence[]): Promise<HarnessSummary>;
}
```

Safety:

- adapters must declare whether they can execute commands, call models, or use
  network access
- all failures become structured failure modes

MVP:

- generic process harness adapter
- in-memory registry
- evidence schema
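
An in-memory registry could be as small as a map keyed by harness name, with the declared permissions used to filter what a given run may use. This is a sketch under assumptions: `HarnessPermissions`, `RegisteredHarness`, and `HarnessRegistry` are hypothetical names, and the entry shape is a trimmed-down stand-in for `CodeDecayHarness`.

```ts
// What a harness declares it may do, per the safety rules above.
interface HarnessPermissions {
  executesCommands: boolean;
  callsModels: boolean;
  usesNetwork: boolean;
}

// Trimmed-down stand-in for a full harness implementation.
interface RegisteredHarness {
  name: string;
  permissions: HarnessPermissions;
}

class HarnessRegistry {
  private readonly harnesses = new Map<string, RegisteredHarness>();

  register(harness: RegisteredHarness): void {
    if (this.harnesses.has(harness.name)) {
      throw new Error(`harness already registered: ${harness.name}`);
    }
    this.harnesses.set(harness.name, harness);
  }

  get(name: string): RegisteredHarness | undefined {
    return this.harnesses.get(name);
  }

  // Returns harnesses whose declared needs are all granted by the policy.
  allowedBy(policy: HarnessPermissions): RegisteredHarness[] {
    const keys: Array<keyof HarnessPermissions> = [
      "executesCommands",
      "callsModels",
      "usesNetwork",
    ];
    return Array.from(this.harnesses.values()).filter((harness) =>
      keys.every((key) => !harness.permissions[key] || policy[key]),
    );
  }
}
```

Filtering by declared permissions keeps the decision in one place, so a run with everything denied can still use purely static harnesses.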

### `packages/agent`

Purpose: optional user-owned AI provider interface.

Inputs:

- prompt/task
- selected skills
- evidence bundle
- model/provider config

Outputs:

- AI suggestions
- missing edge cases
- fix tasks

Public API:

```ts
interface AgentProvider {
  name: string;
  availability(): Promise<ProviderAvailability>;
  complete(input: AgentCompletionInput): Promise<AgentCompletionResult>;
}
```

Safety:

- provider is disabled by default
- no hidden model calls
- cloud providers require explicit user config
- AI output is suggestions, never proof

OSS integrations:

- Ollama local models
- LiteLLM/OpenAI-compatible BYOK endpoints
- Codex/Claude/Pi/OpenCode via prompt/task adapters where possible

MVP:

- disabled provider
- OpenAI-compatible provider shape for later Ollama/LiteLLM support
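
A disabled provider might look like the sketch below. The `ProviderAvailability`, `AgentCompletionInput`, and `AgentCompletionResult` shapes are assumptions, since the RFC names these types without defining them, and `suggestionsOrNone` is a hypothetical helper.

```ts
// Assumed minimal shapes; the RFC does not define these types.
interface ProviderAvailability {
  available: boolean;
  reason?: string;
}
interface AgentCompletionInput {
  prompt: string;
}
interface AgentCompletionResult {
  suggestions: string[];
}

interface AgentProvider {
  name: string;
  availability(): Promise<ProviderAvailability>;
  complete(input: AgentCompletionInput): Promise<AgentCompletionResult>;
}

const disabledProvider: AgentProvider = {
  name: "disabled",
  async availability() {
    return { available: false, reason: "no agent provider configured" };
  },
  async complete() {
    // Fail loudly rather than pretend a model was consulted.
    throw new Error("agent provider is disabled");
  },
};

// Callers check availability first, so the default path never calls a model.
async function suggestionsOrNone(provider: AgentProvider, prompt: string): Promise<string[]> {
  const status = await provider.availability();
  if (!status.available) return [];
  const result = await provider.complete({ prompt });
  return result.suggestions;
}
```

Making `complete` reject instead of returning an empty result means an accidental call surfaces as an error, which is in line with the "no hidden model calls" rule.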

### `packages/mcp`

Purpose: expose CodeDecay as tools to MCP-compatible agents.

Inputs:

- MCP tool calls
- cwd/base/head/options

Outputs:

- redteam plans
- analyze results
- impact maps
- test audit findings
- memory context
- evidence summaries

Safety:

- tool descriptions must say when commands may execute
- command-running tools require explicit config

MVP:

- extend the current MCP package instead of creating a parallel server package
- expose `redteam_plan`, `redteam_run`, `impact_map`, and `test_audit`

### `packages/skills`

Purpose: portable review instructions for agents and harnesses.

Inputs:

- impacted areas
- project language/framework
- user-selected skill pack

Outputs:

- selected skills
- prompts/checklists
- fix-task templates

Safety:

- skills are instructions, not executable authority
- skills cannot override command safety

MVP:

- filesystem skill loader for `.agents/skills`
- built-in skills for API, auth, database, frontend flow, test quality, and
  GitHub App review
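
Skill selection can stay a plain data lookup, which keeps skills as instructions rather than executable authority. In this sketch the `ImpactArea` values, the `Skill` shape, and `selectSkills` are hypothetical names chosen to mirror the built-in skill list above.

```ts
type ImpactArea = "api" | "auth" | "database" | "frontend-flow" | "tests" | "github-app";

// A skill is data only: a name, where it applies, and checklist text.
interface Skill {
  name: string;
  appliesTo: ImpactArea[];
  checklist: string[];
}

function selectSkills(impacted: ImpactArea[], available: Skill[]): Skill[] {
  const wanted = new Set<ImpactArea>(impacted);
  return available.filter((skill) => skill.appliesTo.some((area) => wanted.has(area)));
}
```

Since the result is just a filtered list of checklists, nothing in a skill can grant itself permission to run commands.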

### `packages/memory`

Purpose: project-specific context provider.

Inputs:

- local `.codedecay/memory.json`
- optional external memory config
- changed files and impacted areas

Outputs:

- relevant flows
- invariants
- past regressions
- architecture notes
- recommended checks

Safety:

- memory is untrusted context
- local file remains default
- external providers must be opt-in

OSS integrations:

- local `.codedecay/memory.json`
- Mem0
- Supermemory

MVP:

- formalize provider interface around the existing local memory package

### `packages/execution`

Purpose: safe command and probe execution.

Inputs:

- explicit configured command
- cwd/temp worktree
- timeout
- redaction rules

Outputs:

- stdout/stderr
- exit code
- duration
- structured failure mode

Safety:

- no guessed commands
- no destructive commands
- no production deploys
- no migrations
- no secret printing
- timeout required

MVP:

- extract and harden the current configured command runner
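
The safety rules above could be enforced by a check that runs before any process is spawned. The sketch below is an assumption-heavy illustration: `checkCommand`, `CommandRequest`, and the deny patterns are hypothetical, and the failure names are borrowed from the Harness Failure Modes section of this RFC.

```ts
type CommandFailure = "missing-config" | "command-denied" | "unsafe-command";

type CommandDecision =
  | { allowed: true }
  | { allowed: false; failure: CommandFailure; reason: string };

interface CommandRequest {
  command: string;
  timeoutMs?: number;
}

// Illustrative deny patterns only; a real policy would need far more care.
const UNSAFE_PATTERNS: RegExp[] = [/\brm\s+-/, /\bdeploy\b/, /\bmigrat(e|ion)/];

function checkCommand(request: CommandRequest, allowlist: readonly string[]): CommandDecision {
  // Timeout required.
  if (request.timeoutMs === undefined || request.timeoutMs <= 0) {
    return { allowed: false, failure: "missing-config", reason: "a positive timeout is required" };
  }
  // No guessed commands: only exact, explicitly configured strings pass.
  if (allowlist.indexOf(request.command) === -1) {
    return { allowed: false, failure: "command-denied", reason: "command is not explicitly configured" };
  }
  // Even configured commands are refused if they look destructive.
  if (UNSAFE_PATTERNS.some((pattern) => pattern.test(request.command))) {
    return { allowed: false, failure: "unsafe-command", reason: "command looks destructive" };
  }
  return { allowed: true };
}
```

Returning a structured decision instead of throwing lets the caller record the refusal as a failure mode in the report.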

### `packages/test-audit`

Purpose: detect tests that look reassuring but do not prove real behavior.

Inputs:

- changed tests
- nearby tests
- changed source files
- optional coverage/mutation evidence

Outputs:

- weak-test findings
- missing-test recommendations
- evidence explaining why confidence is weak

Safety:

- static audit is allowed by default
- dynamic test execution requires explicit config

MVP:

- no assertions
- snapshot-only
- excessive mocks
- tests unrelated to changed source
- copied implementation logic heuristics
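
The first two heuristics can be approximated with a static text scan. This is a deliberately rough sketch: `auditTestSource` and `WeakTestFinding` are hypothetical names, the regular expressions assume `expect(...)` or `assert(...)` style assertions, and a real audit would more likely work on a parsed syntax tree.

```ts
type WeakTestFinding = "no-assertions" | "snapshot-only";

// Counts assertion-looking calls in test source text without executing it.
function auditTestSource(source: string): WeakTestFinding[] {
  const expectCalls = source.match(/\bexpect\s*\(/g) ?? [];
  const assertCalls = source.match(/\bassert(\.\w+)?\s*\(/g) ?? [];
  const snapshotCalls = source.match(/\.toMatch(Inline)?Snapshot\s*\(/g) ?? [];

  if (expectCalls.length + assertCalls.length === 0) {
    return ["no-assertions"];
  }
  // Every expect() ends in a snapshot matcher and there is no other assertion.
  if (assertCalls.length === 0 && snapshotCalls.length === expectCalls.length) {
    return ["snapshot-only"];
  }
  return [];
}
```

Because this only reads text, it fits the rule that static audit is allowed by default while dynamic execution needs explicit config.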

### `packages/impact-map`

Purpose: map PR changes to product/system areas.

Inputs:

- changed files
- AST/code map
- config/memory

Outputs:

- impacted routes
- API endpoints
- UI flows
- jobs
- database/schema areas
- auth/config boundaries

Safety:

- static mapping only by default
- no code execution

OSS integrations:

- TypeScript compiler API for JS/TS first
- Tree-sitter later for multi-language parsing

MVP:

- framework-aware JS/TS route/API/config map

### `packages/tool-adapters`

Purpose: normalize OSS tool execution and output.

Inputs:

- tool config
- cwd/base/head
- selected impacted areas

Outputs:

- evidence records
- artifact paths
- failure modes

Safety:

- adapters must use `packages/execution`
- adapters cannot bypass command allowlists

OSS integrations:

- Playwright
- StrykerJS
- Schemathesis
- Pact
- Vitest/Jest/Pytest/Bun
- c8/nyc/Istanbul

MVP:

- adapter interface
- Playwright adapter plan
- StrykerJS adapter plan
- Schemathesis adapter plan
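
One possible shape for the adapter interface is to have adapters describe a command and parse its outcome, while a runner supplied by `packages/execution` does the actual work. Everything named here (`ToolAdapter`, `PlannedCommand`, `SafeRunner`, `runAdapter`) is hypothetical, and the runner is synchronous only to keep the sketch short.

```ts
interface PlannedCommand {
  command: string;
  timeoutMs: number;
}
interface RunOutcome {
  exitCode: number;
  stdout: string;
}
interface EvidenceRecord {
  kind: string;
  summary: string;
  trusted: boolean;
}

// Stand-in for `packages/execution`; adapters never spawn processes themselves.
type SafeRunner = (planned: PlannedCommand) => RunOutcome;

interface ToolAdapter {
  name: string;
  plan(): PlannedCommand;
  toEvidence(outcome: RunOutcome): EvidenceRecord[];
}

function runAdapter(adapter: ToolAdapter, run: SafeRunner): EvidenceRecord[] {
  return adapter.toEvidence(run(adapter.plan()));
}

// Example adapter wrapping a user-configured test command.
const exampleAdapter: ToolAdapter = {
  name: "example-tests",
  plan: () => ({ command: "npm test", timeoutMs: 120000 }),
  toEvidence: (outcome) => [
    {
      kind: "test",
      summary: outcome.exitCode === 0 ? "tests passed" : `tests failed (exit ${outcome.exitCode})`,
      trusted: true,
    },
  ],
};
```

Since the adapter only returns a plan, it has no way to bypass the allowlist: the runner decides whether the command executes at all.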

## Evidence Model

Evidence should be the common currency between deterministic analysis, tools,
agents, and reports.

```ts
interface Evidence {
  id: string;
  source: EvidenceSource;
  kind:
    | "diff"
    | "impact"
    | "test"
    | "coverage"
    | "mutation"
    | "api-fuzz"
    | "contract"
    | "browser-flow"
    | "memory"
    | "agent-suggestion"
    | "execution";
  severity: "info" | "low" | "medium" | "high";
  summary: string;
  file?: string;
  line?: number;
  command?: string;
  artifactPath?: string;
  trusted: boolean;
}
```

Rules:

- tool evidence and AI suggestions must be rendered separately
- memory and agent output are untrusted by default
- evidence should reference files/lines/artifacts when available
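
The first two rules can be sketched as a default-trust function plus a partition step that a report renderer would call. `EvidenceLike` is a trimmed-down stand-in for `Evidence`, and `defaultTrust`, `splitForReport`, and the section names are hypothetical; grouping deterministic analyzer output under `toolEvidence` is also an assumption.

```ts
type EvidenceKind =
  | "diff"
  | "impact"
  | "test"
  | "coverage"
  | "mutation"
  | "api-fuzz"
  | "contract"
  | "browser-flow"
  | "memory"
  | "agent-suggestion"
  | "execution";

// Trimmed-down stand-in for the Evidence interface above.
interface EvidenceLike {
  id: string;
  kind: EvidenceKind;
  summary: string;
  trusted: boolean;
}

interface ReportSections {
  toolEvidence: EvidenceLike[];
  aiSuggestions: EvidenceLike[];
  memoryContext: EvidenceLike[];
}

// Memory and agent output start out untrusted; everything else is trusted.
function defaultTrust(kind: EvidenceKind): boolean {
  return kind !== "memory" && kind !== "agent-suggestion";
}

function splitForReport(evidence: EvidenceLike[]): ReportSections {
  const sections: ReportSections = { toolEvidence: [], aiSuggestions: [], memoryContext: [] };
  for (const item of evidence) {
    if (item.kind === "agent-suggestion") sections.aiSuggestions.push(item);
    else if (item.kind === "memory") sections.memoryContext.push(item);
    else sections.toolEvidence.push(item);
  }
  return sections;
}
```

Partitioning before rendering makes it hard for an AI suggestion to appear under a heading that implies tool-verified evidence.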

## Harness Failure Modes

```ts
type HarnessFailureMode =
  | "missing-tool"
  | "missing-config"
  | "command-denied"
  | "timeout"
  | "nonzero-exit"
  | "network-required"
  | "unsafe-command"
  | "model-unavailable"
  | "no-evidence";
```

Failures should not disappear into generic logs. They should be visible in the
merge-safety report as missing evidence or blocked checks.
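
A small lookup table is one way to guarantee that every failure mode reaches the report. In the sketch below, which bucket each mode falls into is an assumption for illustration, and `ReportBucket` and `describeFailure` are hypothetical names; the `Record` type makes the compiler reject the table if a mode is ever added without a bucket.

```ts
type HarnessFailureMode =
  | "missing-tool"
  | "missing-config"
  | "command-denied"
  | "timeout"
  | "nonzero-exit"
  | "network-required"
  | "unsafe-command"
  | "model-unavailable"
  | "no-evidence";

type ReportBucket = "missing-evidence" | "blocked-check";

// Assumed mapping; every mode must appear or the file will not type-check.
const BUCKETS: Record<HarnessFailureMode, ReportBucket> = {
  "missing-tool": "missing-evidence",
  "missing-config": "missing-evidence",
  "command-denied": "blocked-check",
  timeout: "missing-evidence",
  "nonzero-exit": "missing-evidence",
  "network-required": "blocked-check",
  "unsafe-command": "blocked-check",
  "model-unavailable": "missing-evidence",
  "no-evidence": "missing-evidence",
};

// Produces one report line per failed harness run.
function describeFailure(harness: string, mode: HarnessFailureMode): string {
  return `[${BUCKETS[mode]}] ${harness}: ${mode}`;
}
```

For example, `describeFailure("playwright", "timeout")` returns `[missing-evidence] playwright: timeout`.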

## OSS Integration Sequence

1. MCP server tools for any compatible agent.
2. Generic process harness for local commands and CLI-based agents.
3. Portable CodeDecay skills for Codex, Claude Code, Cursor, Pi, OpenCode, and
   internal agents.
4. Local memory provider, then optional Mem0/Supermemory providers.
5. Ollama and LiteLLM/OpenAI-compatible provider interfaces.
6. Playwright adapter for browser/user-flow evidence.
7. StrykerJS adapter for mutation-testing evidence.
8. Schemathesis adapter for API fuzzing evidence.
9. Pact adapter for contract-testing evidence.
10. Coverage adapters for c8/nyc/Istanbul.
11. TypeScript compiler API impact map, then Tree-sitter for multi-language.

## Implementation Issues

1. `docs(rfc): define agent-agnostic redteam harness architecture`
2. `feat(harness): add harness registry and evidence schema`
3. `feat(execution): add safe command runner`
4. `feat(mcp): expose redteam tools for MCP-compatible agents`
5. `feat(skills): add portable redteam skill loader`
6. `feat(memory): formalize local memory provider interface`
7. `feat(test-audit): detect weak and fake-looking tests`
8. `feat(tool-adapter): add Playwright harness`
9. `feat(tool-adapter): add StrykerJS harness`
10. `feat(tool-adapter): add Schemathesis harness`
11. `feat(agent): add optional Ollama provider`
12. `feat(agent): add optional LiteLLM provider`
13. `feat(harness): add Pi convenience adapter`

## First Three PRs

1. RFC PR: this document only.
2. Harness PR: add `packages/harness` with registry, evidence schema, and tests.
3. Execution PR: extract and harden the safe configured command runner.

## Open Questions

- Should `codedecay redteam` default to deterministic-only mode and require
  `--assist` for any agent/model call?
- Should external memory providers live behind one provider interface or
  separate adapter packages?
- Should GitHub App redteam execution remain disabled until sandboxed workers
  exist?
- Should adapter artifacts be stored only in `.codedecay/artifacts/` or in a
  user-specified output directory?
- Which first external agent workflow should get convenience docs: Codex,
  Claude Code, Cursor, Pi, or OpenCode?

---

# CodeDecay v0.1.1 Launch Post

Source page: https://SubmuxHQ.github.io/CodeDecay/launch-post
Raw markdown: https://SubmuxHQ.github.io/CodeDecay/markdown/launch-post.md

# CodeDecay v0.1.1 Launch Post

We released CodeDecay v0.1.1: an open-source, local-first CLI/GitHub Action
for detecting PR regression risk and maintainability decay. No API keys, no LLM
calls, no telemetry.

Install:

```bash
npm install -D @submuxhq/codedecay
```