foundry: Claude Code Plugin
OSS Claude Code config: 10 specialist agents, 9 skills, event-driven hooks, and a self-improvement loop for professional AI-assisted development.
For OSS workflows, also install the `oss` plugin (`/oss:review`, `/oss:release`, ...). For development workflows, install `develop` (`/develop:feature`, `/develop:fix`, ...). For ML research, install `research` (`/research:run`, `/research:topic`, ...).
Contents
- [What is foundry?](#what-is-foundry)
- [Why foundry?](#why-foundry)
- [Install](#install)
- [Quick start](#quick-start)
- [Skills reference](#skills-reference)
  - [`/foundry:init`](#foundryinit)
  - [`/foundry:audit`](#foundryaudit)
  - [`/foundry:calibrate`](#foundrycalibrate)
  - [`/foundry:manage`](#foundrymanage)
  - [`/foundry:brainstorm`](#foundrybrainstorm)
  - [`/foundry:investigate`](#foundryinvestigate)
  - [`/foundry:distill`](#foundrydistill)
  - [`/foundry:session`](#foundrysession)
  - [`/foundry:create`](#foundrycreate)
- [Agents reference](#agents-reference)
  - [foundry:sw-engineer](#foundrysw-engineer)
  - [foundry:solution-architect](#foundrysolution-architect)
  - [foundry:qa-specialist](#foundryqa-specialist)
  - [foundry:linting-expert](#foundrylinting-expert)
  - [foundry:perf-optimizer](#foundryperf-optimizer)
  - [foundry:doc-scribe](#foundrydoc-scribe)
  - [foundry:web-explorer](#foundryweb-explorer)
  - [foundry:curator](#foundrycurator)
  - [foundry:challenger](#foundrychallenger)
  - [foundry:creator](#foundrycreator)
- [Agent relationships](#agent-relationships)
- [Rules installed](#rules-installed)
- [Configuration](#configuration)
- [Troubleshooting](#troubleshooting)
- [Plugin structure](#plugin-structure)
- [Upgrade](#upgrade)
- [Uninstall](#uninstall)
- [Contributing / feedback](#contributing--feedback)

What is foundry?
foundry is the base infrastructure plugin for Claude Code on Python and ML OSS projects. It gives Claude Code a team of ten non-overlapping specialist agents, each with deep, calibrated domain knowledge, paired with skills for managing their lifecycle, benchmarking their accuracy, and feeding corrections back into their instructions.
Without foundry, Claude Code is a generalist. It helps with code but does not know your release conventions, does not enforce routing to the right specialist, and has no mechanism to measure or improve its own accuracy over time. foundry packages all of that infrastructure in a single installable plugin.
Why foundry?
Without it: one model handles architecture, implementation, documentation, linting, testing, and performance, with no boundary enforcement between them. Corrections made in one session evaporate. There is no way to know whether agent accuracy has drifted.
With it:
- `/foundry:audit` catches config drift before it becomes a debugging session
- `/foundry:calibrate` measures recall versus stated confidence so you know exactly where agents fall short
- `/foundry:manage` creates, renames, and deletes agents with full cross-reference propagation in a single command
- `/foundry:brainstorm` turns a vague idea into an approved spec before a single line of code is written
- `/foundry:distill` converts accumulated corrections into durable rules and agent instruction updates
- Hooks keep lint, task tracking, and teammate quality gates running on every file save

The self-improvement loop (/foundry:audit catches structural drift; /foundry:calibrate catches behavioral drift; /foundry:distill surfaces patterns from your corrections) closes the feedback loop automatically.
Install
Prerequisites: Claude Code with plugin support; jq on PATH; node on PATH (required by hooks).
# Run from the directory that CONTAINS your Borda-AI-Rig clone
claude plugin marketplace add ./Borda-AI-Rig
claude plugin install foundry@borda-ai-rig
Install companion plugins if you need the full workflow suite:
claude plugin install oss@borda-ai-rig
claude plugin install develop@borda-ai-rig
claude plugin install research@borda-ai-rig
One-time setup: run /foundry:init inside Claude Code after installing.
This merges statusLine, permissions.allow, permissions.deny, and enabledPlugins into ~/.claude/settings.json, and symlinks all rule files and TEAM_PROTOCOL.md into ~/.claude/. It is idempotent, so it is safe to re-run.
After any plugin upgrade, re-run /foundry:init to refresh symlinks pointing to the new cache path.
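As a rough illustration of what an idempotent settings merge could look like, the Python sketch below unions list entries and recurses into nested keys, so a second run changes nothing. It is not foundry's actual implementation: `merge_settings` and the `Bash(git status:*)` entry are hypothetical, and the `True` value under `enabledPlugins` is an assumption.

```python
import copy
import json


def merge_settings(existing: dict, additions: dict) -> dict:
    """Merge plugin-provided keys into a settings dict without duplicating entries."""
    merged = copy.deepcopy(existing)
    for key, value in additions.items():
        current = merged.get(key)
        if isinstance(value, dict) and isinstance(current, dict):
            # Recurse so nested keys such as permissions.allow merge instead of being overwritten.
            merged[key] = merge_settings(current, value)
        elif isinstance(value, list) and isinstance(current, list):
            # Append only entries that are not already present, preserving order.
            merged[key] = current + [item for item in value if item not in current]
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical values for illustration only; the real allow-list ships with the plugin.
existing = {"permissions": {"allow": ["Bash(git status:*)"]}}
additions = {
    "permissions": {"allow": ["Bash(git status:*)", "Bash(jq:*)"]},
    "enabledPlugins": {"codex@openai-codex": True},  # assumed value shape
}

once = merge_settings(existing, additions)
twice = merge_settings(once, additions)
print(json.dumps(once, indent=2))
print("idempotent:", once == twice)
```

The list union is what makes a re-run safe in this sketch: entries already present are skipped instead of appended again.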
Quick start
The one command that confirms everything is working is /foundry:audit setup.
Expected output: a structured report of system configuration checks (hooks, settings.json, plugin integration, symlinks). Zero critical findings means you are ready.
Follow up with /foundry:calibrate routing --fast.
This runs a quick routing accuracy benchmark that measures whether Claude Code dispatches tasks to the right agent. You should see routing accuracy at or above 90%.
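To make the 90% figure concrete, here is a minimal Python sketch of how routing accuracy could be computed from benchmark results. The function name and the sample data are hypothetical; the plugin's own scoring may differ.

```python
ROUTING_THRESHOLD = 0.90  # from the README: routing accuracy should be at or above 90%


def routing_accuracy(results: list[tuple[str, str]]) -> float:
    """Return the fraction of prompts dispatched to the expected agent."""
    if not results:
        raise ValueError("need at least one routing result")
    correct = sum(1 for expected, actual in results if expected == actual)
    return correct / len(results)


# Hypothetical benchmark outcome: (expected agent, agent actually chosen).
sample = [
    ("foundry:sw-engineer", "foundry:sw-engineer"),
    ("foundry:doc-scribe", "foundry:doc-scribe"),
    ("foundry:qa-specialist", "foundry:qa-specialist"),
    ("foundry:perf-optimizer", "foundry:sw-engineer"),  # misrouted
]

accuracy = routing_accuracy(sample)
print(f"accuracy={accuracy:.2f} pass={accuracy >= ROUTING_THRESHOLD}")
```

With one of four prompts misrouted, this sample falls below the threshold, which is the kind of result that would point to overlapping agent descriptions.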
Skills reference
/foundry:init
Post-install setup. Merges settings and creates symlinks. Run once after install, and again after any upgrade.
What it does:
- Backs up `~/.claude/settings.json` before touching it
- Merges `statusLine`, `permissions.allow`, `permissions.deny`, `enabledPlugins`
- Copies `permissions-guide.md` to `.claude/` (only if absent; preserves project-local edits)
- Symlinks all `plugins/foundry/rules/*.md` and `TEAM_PROTOCOL.md` into `~/.claude/`
- Removes stale `hooks` block from settings if present (hooks now register via plugin manifest)
Hooks (hooks.json) register automatically when the plugin is enabled; /foundry:init does not touch them directly.
/foundry:audit
Full-sweep quality audit of .claude/ configuration and all plugins/*/ agent and skill files. Catches broken cross-references, inventory drift, model-tier mismatches, description overlap, and documentation staleness. Reports findings by severity; after the report, an always-fire follow-up gate lets you choose the fix level. Adversarial mode challenges every claim using foundry:challenger + Codex.
/foundry:audit # full sweep, report only; the gate offers fix options
/foundry:audit --upgrade # fetch latest Claude Code docs, apply improvements with A/B testing
/foundry:audit --adversarial # adversarial review with foundry:challenger + Codex
# Tier 1: group scopes
/foundry:audit agents # all agents
/foundry:audit skills # all skills
/foundry:audit rules # all rules
/foundry:audit communication # communication governance files
/foundry:audit setup # system config: settings.json, hooks, plugin integration
/foundry:audit plugin # foundry plugin integration checks only
/foundry:audit plugins # deep audit of all installed plugins
# Tier 2: plugin name (shorthand for 'plugins <name>')
/foundry:audit oss # oss plugin agents + skills only
/foundry:audit foundry # foundry plugin only (same as 'plugins foundry')
/foundry:audit oss research # oss + research plugins
# Tier 3: specific agent or skill name
/foundry:audit shepherd # single agent
/foundry:audit curator challenger # two agents
/foundry:audit review resolve # two skills
# Combine scope + flags
/foundry:audit oss --adversarial # oss plugin, adversarial review
/foundry:audit agents --adversarial # all agents, adversarial review
fix and upgrade are mutually exclusive; never combine them.
What the sweep checks (30 checks):
- Inventory drift: MEMORY.md roster vs files on disk
- Broken cross-references between agents and skills
- Hardcoded absolute user paths
- Model-tier appropriateness (reasoning vs execution roles)
- Agent description routing overlap (40%+ consecutive step overlap flagged)
- settings.json permissions vs Bash calls in skills
- Hook event names vs documented schema
- Claude Code docs freshness (spawns `foundry:web-explorer` to fetch live docs)
- Plugin integration correctness (codex plugin, foundry plugin)
- File length, heading hierarchy, LLM context minimality
- Config token overhead: total always-loaded config >100 KB, single rules file >10 KB (rules/ loads at session start; agents/skills are lazy-loaded)
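The config token overhead check in the list above can be pictured with a short Python sketch. The function names are hypothetical, and treating 1 KB as 1024 bytes is an assumption, since the README does not specify the unit.

```python
from pathlib import Path

KB = 1024  # assumption: the README does not say whether KB means 1000 or 1024 bytes
TOTAL_LIMIT = 100 * KB     # total always-loaded config
RULE_FILE_LIMIT = 10 * KB  # any single rules file


def rule_file_sizes(rules_dir: Path) -> dict[str, int]:
    """Map each *.md file in rules_dir to its size in bytes."""
    return {p.name: p.stat().st_size for p in sorted(rules_dir.glob("*.md"))}


def check_overhead(rule_sizes: dict[str, int], other_always_loaded: int = 0) -> list[str]:
    """Return one finding per exceeded limit; an empty list means within budget."""
    findings = []
    for name, size in sorted(rule_sizes.items()):
        if size > RULE_FILE_LIMIT:
            findings.append(f"{name}: {size} bytes exceeds the {RULE_FILE_LIMIT}-byte single-file limit")
    total = sum(rule_sizes.values()) + other_always_loaded
    if total > TOTAL_LIMIT:
        findings.append(f"total: {total} bytes exceeds the {TOTAL_LIMIT}-byte always-loaded limit")
    return findings


# Hypothetical sizes; with real files, pass rule_file_sizes(Path.home() / ".claude" / "rules").
print(check_overhead({"testing.md": 4 * KB, "python-code.md": 12 * KB}))
```

Only rule files count toward the per-file limit here because, per the list above, rules load at session start while agents and skills are lazy-loaded.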
Outputs a structured report. With a fix level: delegates fixes to sub-agents (never edits inline), then re-audits modified files to confirm fixes held. Convergence loop runs up to 5 passes.
/foundry:calibrate
Benchmarks agents and skills against synthetic problems with defined ground truth. Primary signal is calibration bias: the gap between self-reported confidence and actual recall. A well-calibrated agent reports 0.9 when it finds roughly 90% of issues.
/foundry:calibrate all --fast # quick benchmark across all modes (3 problems each)
/foundry:calibrate all --full # thorough benchmark (10 problems each)
/foundry:calibrate routing --fast # routing accuracy only; run after any agent description change
/foundry:calibrate agents --full --ab-test # agents + general-purpose baseline comparison
/foundry:calibrate all --fast --apply # benchmark then immediately apply improvement proposals
/foundry:calibrate --apply # apply proposals from the most recent past run
/foundry:calibrate foundry:sw-engineer --fast # single agent (tier 3 by full name)
# Tier 2: plugin name
/foundry:calibrate oss --fast # all oss plugin agents + calibratable skills
/foundry:calibrate oss research --fast # oss + research plugins
# Tier 3: specific agent or skill (bare name or plugin-prefixed)
/foundry:calibrate curator --fast # single agent by bare name
/foundry:calibrate curator shepherd # two agents (default --fast)
# Multiple targets
/foundry:calibrate agents skills --fast # agents + skills in one run
Thresholds:
- Routing accuracy: 90% (hard-problem accuracy: 80%)
- Recall per agent: 0.70 (below this, instruction improvement is needed)
- Calibration bias: within +/-0.15 (beyond this, confidence is decoupled from quality)
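The recall and calibration-bias thresholds above can be expressed in a few lines of Python. This is a sketch under stated assumptions: bias is taken as confidence minus recall (the README only calls it "the gap"), and `evaluate` and `CalibrationResult` are hypothetical names.

```python
from dataclasses import dataclass

RECALL_THRESHOLD = 0.70  # below this, instruction improvement is needed
BIAS_TOLERANCE = 0.15    # beyond this, confidence is decoupled from quality


@dataclass
class CalibrationResult:
    recall: float
    bias: float
    needs_instruction_work: bool
    confidence_decoupled: bool


def evaluate(found: int, total: int, confidence: float) -> CalibrationResult:
    """Score one agent run against ground truth with `total` known issues."""
    if total <= 0:
        raise ValueError("total must be positive")
    recall = found / total
    # Assumed sign convention: positive bias means the agent is overconfident.
    # Rounding avoids float noise at the tolerance boundary.
    bias = round(confidence - recall, 6)
    return CalibrationResult(
        recall=recall,
        bias=bias,
        needs_instruction_work=recall < RECALL_THRESHOLD,
        confidence_decoupled=abs(bias) > BIAS_TOLERANCE,
    )


well_calibrated = evaluate(found=9, total=10, confidence=0.9)
overconfident = evaluate(found=6, total=10, confidence=0.9)
print(well_calibrated)
print(overconfident)
```

The second run shows the case the thresholds are meant to catch: recall of 0.6 with a stated confidence of 0.9 fails both checks.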
Modes:
- `agents`: all specialist agents
- `skills`: `/foundry:audit` and `/oss:review`
- `routing`: measures orchestrator dispatch accuracy for synthetic task prompts
- `communication`: team protocol compliance, file-handoff protocol violations
- `rules`: rule adherence across global and path-scoped rule files
- `plugins` / `<plugin-name>`: all agents + calibratable skills for one or all plugins
- `<agent-name>` / `<skill-name>`: single target by bare or plugin-prefixed name
- `all`: all of the above
Results saved to .reports/calibrate/<timestamp>/<target>/. Improvement proposals written to proposal.md in each target directory and applied with --apply.
Agents and skills modes use dual-source evaluation: Claude and Codex generate problems and score responses independently, with Claude as 51% tiebreaker.
/foundry:manage
Create, update, or delete agents, skills, rules, and hooks with full cross-reference propagation. Keeps MEMORY.md, README, and settings.json in sync automatically.
/foundry:manage create agent security-auditor "Vulnerability scanning specialist for OWASP Top 10 and supply chain threats"
/foundry:manage create skill benchmark "Benchmark orchestrator for measuring performance across commits"
/foundry:manage create rule torch-patterns "PyTorch coding patterns: compile, AMP, distributed"
/foundry:manage update my-agent "add a section on error handling patterns"
/foundry:manage update my-agent new-agent-name # rename
/foundry:manage update my-agent docs/spec.md # apply spec file as change directive
/foundry:manage delete old-agent-name
/foundry:manage add perm "Bash(jq:*)" "Parse and filter JSON" "Extract fields from REST API responses"
/foundry:manage remove perm "Bash(jq:*)"
Create: fetches the latest Claude Code agent/skill frontmatter schema, picks an unused color, assigns a model tier based on role complexity, then delegates content generation to foundry:curator. Checks for overlap with existing agents before creating.
Update: auto-detects type from disk. Rename is atomic (write-before-delete). Content edits are delegated to foundry:curator (agents/skills) or foundry:sw-engineer (hooks). Propagates description changes to cross-references when more than 3 files are affected.
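The write-before-delete ordering mentioned for renames can be sketched in Python as follows. This shows the general pattern only, not foundry's code; `rename_write_before_delete` and the file names are hypothetical.

```python
import tempfile
from pathlib import Path


def rename_write_before_delete(old: Path, new: Path) -> None:
    """Write the new file and verify it before removing the old one."""
    if new.exists():
        raise FileExistsError(new)
    content = old.read_bytes()
    new.write_bytes(content)
    if new.read_bytes() != content:
        # Roll back so a failed write never leaves a corrupt copy behind.
        new.unlink()
        raise OSError(f"verification failed for {new}")
    # The old file is removed only after the new one is confirmed on disk.
    old.unlink()


with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "my-agent.md"
    new = Path(tmp) / "new-agent-name.md"
    old.write_text("# agent instructions\n")
    rename_write_before_delete(old, new)
    renamed_ok = new.exists() and not old.exists()

print(renamed_ok)
```

If the process is interrupted between the write and the delete, both files exist, which is recoverable; deleting first would risk losing the agent file entirely.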
Delete: removes the file, cleans up broken references across .claude/, updates MEMORY.md and README.
Permissions: add perm and remove perm update both settings.json and permissions-guide.md atomically, never one without the other.
After any create or update, follow up with /foundry:calibrate routing --fast to confirm routing accuracy is unaffected.
/foundry:brainstorm
Turns a fuzzy idea into an approved exploration tree, then into a spec, then into an ordered action plan. Nothing is implemented until the user approves a design.
/foundry:brainstorm "add caching layer to the data pipeline"
/foundry:brainstorm "add caching layer to the data pipeline" --tight # fewer questions and operations
/foundry:brainstorm "add caching layer to the data pipeline" --deep # more exploration
/foundry:brainstorm "add caching layer to the data pipeline" --type workflow
/foundry:brainstorm breakdown .plans/blueprint/2026-04-01-caching-layer.md # tree -> spec
/foundry:brainstorm breakdown .plans/blueprint/2026-04-01-caching-layer-spec.md # spec -> action plan
Idea mode (default):
- Scans codebase for relevant existing code and constraints
- Asks up to 10 clarifying questions (5 with `--tight`, 15 with `--deep`), one at a time
- Presents 3-5 initial branches with core idea, tension resolved, and what it trades away
- Interactive operations loop: deepen, reject, resolve, merge, add (up to 10 rounds)
- Saves tree to `.plans/blueprint/YYYY-MM-DD-<slug>.md` with `Status: tree`
- Live tree viewer available at the URL printed during Step 1 (serve project root with `python3 -m http.server 8000`)
Breakdown mode (breakdown <file>):
- `Status: tree` file: distillation questions then section-by-section spec, saved with `Status: draft`
- `Status: draft` file: resolves blocking open questions, then produces ordered action plan with tagged invocations
--type hint (application, workflow, utility, config, research) shapes question framing and codebase scan patterns in idea mode.
/foundry:investigate
Systematic diagnosis for unknown failures. Gathers signals, ranks hypotheses, probes the top candidates, and reports a confirmed root cause with a recommended next action.
/foundry:investigate "hooks not firing on Save"
/foundry:investigate "CI fails but passes locally"
/foundry:investigate "codex agent exits 127 on this machine"
/foundry:investigate "/calibrate times out every run"
What it covers: broken local setup, environment mismatches, tool misconfigurations, hook misbehavior, CI vs local divergence, permission errors, runtime anomalies.
Not for: known Python test failures with a traceback (use /develop:debug); .claude/ config quality sweep (use /foundry:audit).
Workflow: parse symptom -> gather signals in parallel (tool versions, PATH, recent git changes, config state, logs) -> rank hypotheses -> optional Codex adversarial review for ambiguous cases -> probe top hypotheses -> report root cause and recommended next skill.
Output always includes: confirmed root cause (or narrowed suspects), key evidence, what was ruled out, and a single recommended next action.
/foundry:distill
Extracts patterns from work history and corrections, then distills them into durable improvements: new agent or skill suggestions, roster quality review, memory pruning, promoting lessons into rules, or analysing external plugins and agentic resources for adoption.
/foundry:distill # analyze project patterns, suggest new agents/skills
/foundry:distill review # review existing roster for quality and gaps (no new suggestions)
/foundry:distill prune # trim stale/redundant entries from project MEMORY.md
/foundry:distill lessons # promote patterns from .notes/lessons.md into rules/agents/skills
/foundry:distill "external https://..." # analyse external plugin/skill/agent resource, produce adoption proposal
/foundry:distill "external ./path/to/plugin" # same, for a local path or directory
/foundry:distill "I keep doing X manually" # use description as context for suggestions
lessons mode is the primary post-correction consolidation path. It reads .notes/lessons.md and feedback_*.md memory files, clusters them by domain, classifies each entry as rule, agent update, skill update, already covered, or too narrow, then generates proposals. Before applying, it runs a conflict pre-check: it greps each target file for the section the delta would land in and flags cross-proposal collisions. Confirmed changes are applied and followed by a git diff gate so you can inspect or revert before committing.
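One way to picture the conflict pre-check is to group proposals by the file and section they would touch and flag any group with more than one member. The Python sketch below does that. `Proposal`, `find_collisions`, and the sample entries are hypothetical, and the real skill works by grepping target files, which this sketch does not reproduce.

```python
from collections import defaultdict
from typing import NamedTuple


class Proposal(NamedTuple):
    source: str       # the lesson or feedback entry the proposal came from
    target_file: str  # file the change would land in
    section: str      # section heading inside that file


def find_collisions(proposals: list[Proposal]) -> dict[tuple[str, str], list[str]]:
    """Return (file, section) targets that more than one proposal wants to edit."""
    by_target: dict[tuple[str, str], list[str]] = defaultdict(list)
    for proposal in proposals:
        by_target[(proposal.target_file, proposal.section)].append(proposal.source)
    return {target: sources for target, sources in by_target.items() if len(sources) > 1}


# Hypothetical proposals for illustration.
proposals = [
    Proposal("lesson-1", "rules/testing.md", "Fixtures"),
    Proposal("lesson-2", "rules/testing.md", "Fixtures"),
    Proposal("lesson-3", "rules/git-commit.md", "Message format"),
]

collisions = find_collisions(proposals)
print(collisions)
```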
external mode does a fast + slow read of the source (URL, file, or directory), extracts the mental model and standout implementation details, compares against the live local setup, then splits candidates into two groups: Align + improve (maps cleanly onto existing agents/skills/rules) and Differentiated highlights (novel, structurally different; interesting but larger work). Each candidate is scored and assigned to an adoption lane: adopt-as-is / tweak / discuss / skip. When the Align + improve group is thin or cumulative edit effort is large, it recommends installing the source as a standalone plugin with justification, rather than cherry-picking. Nothing is written until you confirm.
After applying: run /foundry:init to propagate new rule files to ~/.claude/.
Run monthly or after any burst of corrections.
/foundry:session
Parking lot for open-loop ideas and unanswered questions that arise mid-session. Parks items automatically as they arise; three on-demand commands manage them.
/foundry:session resume # list all pending parked items for this project
/foundry:session archive <text> # fuzzy-match and close a parked item
/foundry:session summary # session digest: completed tasks, parked items, recent commits
Items are stored in project-scoped memory (~/.claude/projects/<slug>/memory/session-open-*.md). Items older than 14 days are marked stale; items older than 30 days are deleted silently on resume.
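The aging policy above maps naturally onto a small classifier. A minimal Python sketch, assuming "older than" means strictly more than the stated number of days; the function name and return labels are hypothetical.

```python
from datetime import date

STALE_AFTER_DAYS = 14   # older than this: marked stale
DELETE_AFTER_DAYS = 30  # older than this: deleted silently on resume


def classify_parked_item(parked_on: date, today: date) -> str:
    """Return 'pending', 'stale', or 'delete' based on the item's age in days."""
    age_days = (today - parked_on).days
    if age_days > DELETE_AFTER_DAYS:
        return "delete"
    if age_days > STALE_AFTER_DAYS:
        return "stale"
    return "pending"


parked = date(2026, 4, 1)
for today in (date(2026, 4, 10), date(2026, 4, 20), date(2026, 5, 5)):
    print(today.isoformat(), classify_parked_item(parked, today))
```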
Automatic parking (no command needed): when you send a new top-level request before answering Claude's prior clarifying question, or defer something with "let's come back to that", Claude parks the open item automatically so it is not lost to context compaction.
/foundry:create
Interactive outline co-creation for developer advocacy content. Collects format, audience profile, four-beat arc, and voice/tone through structured questions; detects out-of-scope requests; surfaces editorial conflicts; writes approved outline for foundry:creator to execute.
/foundry:create "tracing Python microservices with OpenTelemetry"
/foundry:create "why your CI pipeline is lying to you"
/foundry:create # no topic; skill asks interactively
Supported formats: blog post, Marp slide deck (conference/meetup talk), social thread (Twitter/LinkedIn), talk abstract (CFP submission), lightning talk (5-10 min).
Out-of-scope detection: refuses FAQs, comparison tables, and reference docs at Step 1, redirecting to foundry:doc-scribe.
Editorial conflict detection: if the brief implies an expert-level topic for a beginner audience (or vice versa), the skill surfaces the mismatch explicitly before writing.
Writes .plans/content/<slug>-outline.md. Hand off to foundry:creator after approval.
Max 5 AskUserQuestion interactions for a well-specified brief (format, audience, arc, voice). Skips interactive steps if all choices are provided in the initial brief.
Agents reference
All ten agents are available by their full plugin-prefixed name. In spawn directives and subagent_type values, always use the full prefix (foundry:sw-engineer, not sw-engineer).
foundry:sw-engineer
Role: senior software engineer for writing and refactoring Python code.
Use for: implementing features, fixing bugs, TDD/test-first development, SOLID principles, type safety, production-quality Python for OSS libraries.
Model: opus
Not for: docstrings (use foundry:doc-scribe), configuring ruff/mypy (use foundry:linting-expert), system design decisions (use foundry:solution-architect), test quality analysis (use foundry:qa-specialist), performance profiling (use foundry:perf-optimizer), ML paper implementations (use research:scientist), editing .claude/ config files (use foundry:curator).
Runs in an isolated worktree by default to keep changes sandboxed until review.
foundry:solution-architect
Role: system design specialist for ADRs, API surface design, interface specs, migration plans, and coupling analysis.
Use for: evaluating architectural trade-offs, designing public API contracts, planning deprecation strategies, assessing architectural feasibility of AI-generated hypotheses against codebase constraints.
Model: opusplan (plan-gated Opus)
Not for: writing implementation code (use foundry:sw-engineer), release management (use oss:shepherd), performance profiling or DataLoader throughput tuning (use foundry:perf-optimizer).
Produces documentation (ADRs, interface contracts, migration plans, component diagrams), not production code. Hands off to foundry:sw-engineer for execution.
foundry:qa-specialist
Role: QA specialist for writing, reviewing, and fixing tests. Rigorous black-box end-user tester: focuses exclusively on the public API surface, derives expectations from docs and type hints rather than the implementation, and writes tests that represent realistic user workflows.
Use for: writing new pytest tests, analyzing public-API coverage gaps, building edge-case matrices, fixing failing tests, integration test design. Automatically includes OWASP Top 10 security perspective when used in agent teams.
Model: opus
Not for: linting, type checking, or annotation fixes (use foundry:linting-expert), production implementation (use foundry:sw-engineer), slow test suite profiling or optimizing test execution speed (use foundry:perf-optimizer), testing private/internal methods or mocking internals.
Writes deterministic, parametrized, behavior-focused tests. Systematic progression: happy path -> edge cases -> error cases -> boundary values -> adversarial inputs. Applies a public-API coverage checklist before marking done.
foundry:linting-expert
Role: static analysis and tooling specialist for Python.
Use for: configuring ruff rules, mypy strictness, pre-commit hooks, fixing lint/type violations, adding missing type annotations, defining the lint/type content of quality gates. Handles final code sanitization before handover.
Model: haiku (high-frequency, lightweight diagnostics)
Not for: CI pipeline structure or runner strategy (use oss:cicd-steward), writing test logic (use foundry:qa-specialist), implementation fixes beyond annotation/style (use foundry:sw-engineer), inline docstrings or API reference writing (use foundry:doc-scribe).
Always downstream of foundry:sw-engineer; never lints code that has not yet been implemented.
foundry:perf-optimizer
Role: performance engineer for profiling and optimizing CPU, GPU, memory, and I/O bottlenecks.
Use for: profiling Python/ML workloads, identifying DataLoader bottlenecks, applying mixed precision, vectorizing loops, tuning PyTorch throughput.
Model: opus
Not for: general code refactoring (use foundry:sw-engineer), architectural redesign (use foundry:solution-architect).
Strictly profile-first: measures before changing, changes one thing, measures again. Optimization order: algorithm -> data structure -> I/O -> memory -> concurrency -> vectorization -> compute -> caching. Never jumps to GPU tuning before checking I/O.
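The optimization order above amounts to a fixed priority list. As an illustration, this Python sketch picks which of several profiled bottleneck categories to address first. `next_optimization` is a hypothetical helper, and mapping GPU tuning to the "compute" stage is an assumption.

```python
# Order taken from the README; earlier entries are addressed first.
OPTIMIZATION_ORDER = [
    "algorithm",
    "data structure",
    "I/O",
    "memory",
    "concurrency",
    "vectorization",
    "compute",  # assumption: GPU tuning falls under this stage
    "caching",
]


def next_optimization(candidates: list[str]) -> str:
    """Pick the candidate that comes earliest in the documented order."""
    if not candidates:
        raise ValueError("no candidate optimizations given")
    rank = {name: index for index, name in enumerate(OPTIMIZATION_ORDER)}
    unknown = [name for name in candidates if name not in rank]
    if unknown:
        raise ValueError(f"unknown optimization categories: {unknown}")
    return min(candidates, key=rank.__getitem__)


# A profile that shows both a slow DataLoader and low GPU utilization:
print(next_optimization(["compute", "I/O", "caching"]))  # I/O
```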
foundry:doc-scribe
Role: documentation specialist for docstrings, API references, and README files.
Use for: auditing missing docstrings, writing Google-style (Napoleon) docstrings from code, creating or updating README content, finding doc/code inconsistencies.
Model: sonnet
Not for: CHANGELOG entries or release notes (use oss:shepherd for lifecycle/format decisions, /oss:release for automated generation), linting code examples (use foundry:linting-expert), implementation code (use foundry:sw-engineer), outward-facing narrative artifacts like blog posts, talk slides, or social threads (use foundry:creator).
Always downstream: documents finalized code, never shapes design. After foundry:doc-scribe produces content, follow with foundry:linting-expert to sanitize code examples in the output.
foundry:web-explorer
Role: web fetch and content extraction specialist.
Use for: fetching live library docs, API references, changelogs, migration guides, package version lookups, GitHub release extraction. Used internally by /foundry:audit --upgrade and /foundry:manage create.
Model: sonnet
Not for: code analysis or implementation (use foundry:sw-engineer), ML paper analysis (use research:scientist), writing docstrings (use foundry:doc-scribe), dependency upgrade lifecycle decisions (use oss:shepherd).
Feeds research:scientist: fetches current docs and papers; scientist interprets.
foundry:curator
Role: quality guardian of Claude config markdown files β agents, skills, and rules.
Use for: auditing .claude/ config files for verbosity creep, cross-agent duplication, broken cross-references, structural violations, outdated content, and roster overlap. Used internally by /foundry:audit and /foundry:manage.
Model: opusplan
Not for: hook files (*.js); those belong to foundry:sw-engineer. Not for creating or scaffolding new agents or skills (use /foundry:manage create). Not for routing new tasks to other agents.
You will generally not invoke this agent directly. /foundry:audit spawns it in batches across all config files; /foundry:manage delegates content generation and editing to it.
foundry:challenger
Role: adversarial reviewer for implementation plans, architecture proposals, and significant code reviews.
Use for: red-teaming a plan before committing to it, challenging architectural decisions before they ship, adversarial code review on security-sensitive or irreversible operations. Treats every claim as unproven until backed by evidence. Attacks across 6 dimensions (Assumptions, Missing Cases, Security Risks, Architectural Concerns, Complexity Creep, Root Cause) and drills to bedrock for every standing challenge (keeps asking "why?" until the root cause is found, not just the surface symptom). Applies a mandatory refutation step to stay objective: accepts refutation when evidence warrants.
When the codex@openai-codex plugin is installed, challenger automatically launches a parallel Codex adversarial review track (same target, --scope auto) and aggregates the results; findings from both tracks are reported together with convergence callouts where both flagged the same area. Pass --no-codex in the prompt to skip. If Codex is installed but the parallel run fails for any reason, the failure is surfaced in the report; results are never silently dropped to Claude-only.
Model: opus
Not for: designing plans or ADRs (use foundry:solution-architect), writing tests or test coverage review (use foundry:qa-specialist), config file quality review (use foundry:curator).
Read-only: never writes or edits files. Runs by default in all /develop:* skills and /oss:review; skip with --no-challenge.
foundry:creator
Role: developer advocacy content specialist for outward-facing narrative artifacts.
Use for: generating complete blog posts, Marp slide decks, social threads, talk abstracts, and lightning talk outlines in one autonomous pass. Imagines the ideal reader experience first, then works backwards to structure and form: questions status-quo conventions before accepting them and pushes for genuinely fresh angles. Reads an approved outline file (.plans/content/<slug>-outline.md) produced by /foundry:create. Applies a four-beat story arc (Problem -> Journey -> Insight -> Action) calibrated to the target audience level.
Model: opus
Not for: in-code documentation, docstrings, or API references (use foundry:doc-scribe), release notes or changelogs (use oss:shepherd), structured reference content such as FAQs or comparison tables (redirect to foundry:doc-scribe).
Always downstream of /foundry:create: reads the approved outline file and generates the full artifact. The two-phase system: /foundry:create (interactive intake -> outline) then foundry:creator (autonomous generation -> artifact).
Agent relationships
Agents form a directed pipeline, not a flat pool:
- `foundry:linting-expert` is always downstream of `foundry:sw-engineer`; it never lints code that has not been implemented
- `foundry:doc-scribe` is always downstream; it documents finalized code, never shapes design
- `foundry:qa-specialist` runs parallel to `foundry:sw-engineer` during review, or downstream after implementation
- `foundry:challenger` is pre-implementation; it challenges plans and proposals before any code is written; use before `foundry:sw-engineer`
- `foundry:curator` is orthogonal; it audits `.claude/` config files, not user code
- `foundry:web-explorer` feeds `research:scientist`; it fetches current docs and papers; scientist interprets
- `foundry:creator` is always downstream of `/foundry:create`; it reads the approved outline file and never generates content without a prior outline
Model tiering: reasoning agents (foundry:sw-engineer, foundry:qa-specialist, foundry:perf-optimizer, foundry:challenger, foundry:creator) use opus; plan-gated roles (foundry:solution-architect, foundry:curator) use opusplan; execution agents (foundry:doc-scribe, foundry:web-explorer) use sonnet; high-frequency diagnostics (foundry:linting-expert) use haiku.
Rules installed
/foundry:init symlinks all rule files from plugins/foundry/rules/ into ~/.claude/rules/. These govern Claude's behavior globally across all sessions after install.
| Rule file | Applies to | What it governs |
|---|---|---|
| `communication.md` | all | Re: anchor format, progress narration, tone, output routing, breaking-findings format, terminal colors |
| `quality-gates.md` | all | Confidence block format, Internal Quality Loop, link verification, output routing (long output to file) |
| `git-commit.md` | all | Commit message format, diff-gathering before writing, co-author trailers, branch and push safety |
| `claude-config.md` | all | Bash timeouts (3x P90), directory navigation rules, no hardcoded absolute paths |
| `artifact-lifecycle.md` | all | Canonical artifact layout (`.plans/`, `.reports/`, `.temp/`), run directory naming, TTL policy |
| `external-data.md` | all | Pagination rules for GitHub CLI, REST APIs, GraphQL, Cloud APIs; never work on a partial result set |
| `foundry-config.md` | `.claude/**` | Plan-mode gate before any `.claude/` edit, post-edit checklist, XML tag conventions, distribution rules |
| `python-code.md` | `**/*.py` | Google-style docstrings (no exceptions), deprecation version check before generating deprecation code |
| `testing.md` | `tests/**/*.py`, `**/test_*.py` | pytest design: TDD process, fixture conventions, parametrization, what to test in priority order |
| `public-github.md` | all | Read-only policy on public GitHub: permitted reads vs permanently forbidden write operations |
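Because these rule files are symlinks into the plugin cache, they can go stale when the cache path changes after an upgrade. The Python sketch below shows one way to spot broken links in a directory. It uses a temporary directory with hypothetical cache names, not the real `~/.claude/rules/`, and `/foundry:init` remains the supported way to repair them.

```python
import tempfile
from pathlib import Path


def find_stale_symlinks(directory: Path) -> list[Path]:
    """Return symlinks in `directory` whose targets no longer exist."""
    # Path.exists() follows symlinks, so it is False for a dangling link.
    return sorted(p for p in directory.iterdir() if p.is_symlink() and not p.exists())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    rules = root / "rules"
    old_cache = root / "cache-old"  # hypothetical pre-upgrade cache path
    new_cache = root / "cache-new"  # hypothetical post-upgrade cache path
    for folder in (rules, old_cache, new_cache):
        folder.mkdir()

    (old_cache / "testing.md").write_text("old")
    (new_cache / "git-commit.md").write_text("new")
    (rules / "testing.md").symlink_to(old_cache / "testing.md")
    (rules / "git-commit.md").symlink_to(new_cache / "git-commit.md")

    # Simulate the upgrade removing the old cache copy.
    (old_cache / "testing.md").unlink()

    stale_names = [p.name for p in find_stale_symlinks(rules)]

print(stale_names)  # ['testing.md']
```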
Configuration
settings.json keys merged by /foundry:init
| Key | What it does |
|---|---|
| `statusLine.command` | Runs `statusline.js` to display active agent count in the Claude Code status bar |
| `permissions.allow` | Adds pre-approved Bash commands, git operations, and WebFetch domains |
| `permissions.deny` | Adds permanently denied write operations (public GitHub mutations, destructive git) |
| `enabledPlugins["codex@openai-codex"]` | Enables Codex plugin for adversarial review in /foundry:calibrate and /foundry:audit |
Optional flags and knobs
--approve on /foundry:init: skips all interactive prompts and auto-accepts recommended choices. Use for scripted or CI setups.
--skip-audit on /foundry:manage: skips the trailing /foundry:audit validation step. Use inside audit-initiated fix sessions to avoid recursion.
Calibration pace: --fast (3 problems per target, default) vs --full (10 problems per target). Use --fast for routine checks after agent edits; use --full for thorough benchmarks before releases or after major instruction changes.
Brainstorm ceremony: --tight (5/5/1 caps for well-scoped ideas), default (10/10/2), --deep (15/15/3 for genuinely ambiguous problems).
Environment
No environment variables required. foundry reads from ~/.claude/settings.json and the plugin's installed cache path, both resolved automatically by /foundry:init.
## Troubleshooting

**`/foundry:audit` reports broken symlinks (Check I3)**
Symlinks in `~/.claude/rules/` point to the previous plugin cache path after an upgrade. Re-run `/foundry:init`; Step 9 detects stale symlinks as conflicts and offers to replace them.

**Hooks not firing**
Run `/foundry:investigate "hooks not firing on Save"`. Most common cause: a `hooks` block is still present in `~/.claude/settings.json` from a pre-plugin-migration install (hooks now register via plugin manifest, not the `hooks` key). `/foundry:init` Step 3 detects and removes the stale block.

**`/foundry:calibrate` times out**
Each pipeline subagent has a 10-minute hard cutoff (15 minutes when Codex is active). If a target consistently times out, run it in isolation: `/foundry:calibrate foundry:sw-engineer --fast`. For persistent issues: `/foundry:investigate "/calibrate times out every run"`.

**`/foundry:manage create` picks wrong model tier**
Model tier is chosen by role complexity at creation time: `opusplan` for plan-gated roles, `opus` for complex implementation, `sonnet` for focused execution, `haiku` for high-frequency diagnostics. To fix after creation, use `/foundry:manage update`.
## Plugin structure
plugins/foundry/
├── .claude-plugin/
│   ├── plugin.json              version + metadata
│   ├── permissions-allow.json   allow-list merged by /foundry:init
│   └── permissions-deny.json    deny-list merged by /foundry:init
├── agents/                      10 specialist agent files
├── skills/                      9 skill directories (audit, brainstorm, calibrate, create, distill,
│                                init, investigate, manage, session)
├── rules/                       10 rule files symlinked to ~/.claude/rules/ by /foundry:init
├── CLAUDE.md                    workflow rules distributed via /foundry:init
├── TEAM_PROTOCOL.md             AgentSpeak v2 inter-agent protocol
├── permissions-guide.md         annotated allow/deny reference (copied to .claude/ by init)
└── hooks/
    ├── hooks.json               hook registrations (${CLAUDE_PLUGIN_ROOT} paths)
    ├── task-log.js              SubagentStart/Stop tracking to /tmp/claude-state-<session>/
    ├── statusline.js            status bar agent counts
    ├── teammate-quality.js      TaskCompleted/TeammateIdle teammate output quality gate
    ├── lint-on-save.js          runs pre-commit after every Write/Edit; async + cross-session lock; 15s timeout
    ├── rtk-rewrite.js           transparently rewrites CLI calls for token compression
    ├── commit-guard.js          PreToolUse Bash guard that blocks git commit unless authorized by a skill sentinel
    └── md-compress.js           compresses large markdown files before they enter context