No action executes without consensus.
DSS-AI-Consensus drives code changes through a 5-role consensus process — Architect, Developer, Tester, Reviewer, and Builder — with up to 15 iterations per run. Every decision requires quorum approval before execution, with explainable reasoning, confidence scoring, pattern memory across 6 categories, and full rollback authority.
Autonomous agents without guardrails create unaccountable risk.
AI-driven automation is powerful — until a single agent makes a destructive decision with no oversight, no explainability, and no way to undo the damage.
Single-agent failure
One model hallucination, one confidence miscalibration, one edge case — and a production system acts on bad judgment with no second opinion.
No explainability
When something goes wrong, there’s no reasoning trail. Why was this action taken? Who approved it? What was the alternative? Silence.
Irreversible actions
Scaling infrastructure, deploying hotfixes, modifying data — these actions have consequences. Without quorum, there’s no brake pedal.
Consensus turns AI from a clever demo into an operating model.
Most AI coding setups either run one-shot agents or bury the real work in transcripts no one can govern. Consensus is built for teams that need repeatable execution, evidence, and control.
Multi-provider without operational drift
Use Claude, Codex, OpenCode, Ollama, or local models under one orchestration surface instead of creating separate workflows and different trust models for each provider.
Memory and code intelligence compound over time
Runs inherit pattern memory, artifacts, and code intelligence context so the system gets sharper instead of repeating the same mistakes every day.
Execution is gated, not improvised
Consensus is valuable because it can stop, explain, retry, and recover. The goal is governed delivery pressure, not autonomous chaos.
Decisions require consensus. Execution requires proof.
Quorum voting
Multiple specialized agents evaluate each proposed action. Execution proceeds only when a configurable quorum threshold is met.
Confidence scoring
Each agent vote carries a confidence percentage. Low-confidence approvals carry less weight than high-confidence ones, so a stack of hesitant yes-votes can't rubber-stamp a risky action.
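As a sketch, confidence-weighted quorum voting could look like the following (the `Vote` shape and `quorumApproves` helper are illustrative, not the product's actual API):

```typescript
// Illustrative model of confidence-weighted quorum voting.
interface Vote {
  agent: string;
  approve: boolean;
  confidence: number; // 0..1, the agent's self-reported confidence
}

// An action passes only when the confidence-weighted share of approvals
// meets the configured quorum threshold, so one strong objection can
// outweigh several half-hearted approvals.
function quorumApproves(votes: Vote[], threshold: number): boolean {
  const totalWeight = votes.reduce((sum, v) => sum + v.confidence, 0);
  if (totalWeight === 0) return false;
  const approveWeight = votes
    .filter((v) => v.approve)
    .reduce((sum, v) => sum + v.confidence, 0);
  return approveWeight / totalWeight >= threshold;
}
```

Weighting by confidence rather than counting heads is what prevents rubber-stamping: a 0.3-confidence approval moves the needle far less than a 0.9-confidence rejection.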
Role-based agents
Architect, Developer, Tester, Reviewer, and Builder — each agent evaluates the proposed change from its specialized perspective: design soundness, implementation, correctness, conventions and security, and build integrity.
Decision audit trail
Every decision records who voted, their reasoning, confidence levels, and the final outcome. Full explainability for every autonomous action.
Accountable autonomy. Auditable by design.
Every decision is logged with full reasoning chains. Rollback authority ensures no action is truly irreversible. Compliance teams get explainable AI governance out of the box.
Every change earns its way to production.
Five specialized AI roles evaluate each code change through structured plan and execute phases. No single agent can bypass the process.
Plan phase
Architect, Developer, Tester, and Reviewer each analyze the proposed change from their perspective. All four must PASS before any code is written. One FAIL triggers a new iteration with targeted feedback.
Execute phase
Developer implements changes. Tester validates. Reviewer audits. Builder runs final quality gates — lint, typecheck, test suite, and Docker build. Every gate must pass.
Quality gates
Automated pipeline: lint, typecheck, test, Docker build, commit, push. A loop breaker stops the run after three consecutive failures on the same stage — no wasted iterations.
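The gate-plus-loop-breaker behavior can be sketched as follows (stage names mirror the text; the function shape is an assumption, not the shipped implementation):

```typescript
// Illustrative gated pipeline with a consecutive-failure loop breaker.
type Stage = "lint" | "typecheck" | "test" | "docker-build";

// Runs the stages in order each iteration. If the same stage fails three
// consecutive times, the run stops and surfaces that stage as the blocker
// instead of retrying blindly.
function runWithLoopBreaker(
  stages: Stage[],
  runStage: (stage: Stage, iteration: number) => boolean,
  maxIterations = 15,
): { ok: boolean; blocker?: Stage } {
  let lastFailed: Stage | undefined;
  let streak = 0;
  for (let i = 1; i <= maxIterations; i++) {
    const failed = stages.find((s) => !runStage(s, i));
    if (!failed) return { ok: true }; // every gate passed
    streak = failed === lastFailed ? streak + 1 : 1;
    lastFailed = failed;
    if (streak >= 3) return { ok: false, blocker: failed }; // loop breaker
  }
  return { ok: false, blocker: lastFailed };
}
```

The key design choice is that the breaker tracks consecutive failures on the same stage: alternating failures still count as progress, but a stuck gate halts the run early.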
Five AI roles. Configurable model routing.
Each role in the 5-role process can run on a different model. Three presets control the cost-quality tradeoff across the entire pipeline.
Architect
Analyzes project structure, dependency graphs, and pattern memory to produce a scoped implementation plan. Uses the strongest available model by default — Claude Opus or GPT-5.4 — because architectural mistakes are the most expensive to fix.
Developer
Implements code changes according to the architect plan. Receives targeted file context from the code intelligence engine. Runs on the balanced model tier unless the task complexity triggers an automatic upgrade.
Tester
Validates the implementation against quality gates and test expectations. Reviews test coverage gaps and edge cases. Can run on a faster, cheaper model — correctness checking is less token-intensive than generation.
Reviewer
Audits the combined plan and implementation for soundness, security, and adherence to project conventions. Acts as the final human-equivalent gate before the builder phase.
Builder
Executes the automated pipeline — lint, typecheck, test suite, Docker build, commit, and push. No AI judgment at this stage; the builder enforces deterministic quality gates. Failures feed back into the next iteration.
Model routing presets
Three presets — balanced, fast, and strong — control which model each role uses. Balanced optimizes cost-quality across the pipeline. Fast uses cheaper models for all roles. Strong routes every role to the premium tier for maximum accuracy.
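A preset table might look roughly like this. The role and model names follow the text, but the specific assignments are illustrative, not the shipped defaults (the Builder is excluded because it applies deterministic gates with no AI judgment):

```typescript
// Hypothetical routing table: each preset maps every AI role to a model.
type Role = "architect" | "developer" | "tester" | "reviewer";
type Preset = "balanced" | "fast" | "strong";

const routing: Record<Preset, Record<Role, string>> = {
  // balanced: strongest model where mistakes are costliest, cheaper elsewhere
  balanced: { architect: "claude-opus", developer: "claude-sonnet", tester: "claude-haiku", reviewer: "claude-sonnet" },
  // fast: cheaper models for every role
  fast: { architect: "claude-sonnet", developer: "claude-haiku", tester: "claude-haiku", reviewer: "claude-haiku" },
  // strong: premium tier everywhere, maximum accuracy
  strong: { architect: "claude-opus", developer: "claude-opus", tester: "claude-opus", reviewer: "claude-opus" },
};

function modelFor(preset: Preset, role: Role): string {
  return routing[preset][role];
}
```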
Structured convergence, not open-ended loops.
The orchestrator drives toward a working solution through bounded iterations with built-in circuit breakers.
Up to 15 iterations
Each run allows up to 15 plan/execute cycles. Code intelligence injects affected symbols and blast radius into the architect prompt at every iteration. Every cycle narrows the solution based on structured feedback from all four plan-phase roles. No unbounded loops — every iteration has a purpose.
Loop breaker
If the same quality gate fails 3 consecutive times, the run stops automatically. No wasted compute on a stuck problem — the system surfaces the blocker instead of retrying blindly.
Repeat runs (1–10)
Queue up to 10 consecutive runs on the same project. Each repeat starts fresh but inherits pattern memory from prior runs — compounding the system’s understanding over time.
Runs compound. Knowledge persists.
Pattern memory bridges individual runs into a continuous learning system. Each success and failure shapes the next orchestration cycle.
6 pattern categories
Memories are classified into six categories: code-change, plan-gate, test, build, docker, and no-op. Each entry stores a lesson summary and prompt hint. Role-specific memory profiles control loading: the architect sees 3 successes + 3 failures (repo and cross-project), while the tester sees 1 success + 2 failures.
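The role-specific loading profiles above can be modeled like this (the type names, `memoryProfiles` table, and `loadMemories` helper are illustrative, not the product's API):

```typescript
// Illustrative model of category-tagged pattern memory with per-role
// loading profiles: the architect loads 3 successes + 3 failures, the
// tester 1 success + 2 failures, per the profiles described above.
type PatternCategory = "code-change" | "plan-gate" | "test" | "build" | "docker" | "no-op";

interface PatternEntry {
  category: PatternCategory;
  success: boolean;
  lesson: string;     // one-line lesson summary
  promptHint: string; // hint injected into the role's prompt
  rank: number;       // relevance score; higher-ranked entries load first
}

const memoryProfiles = {
  architect: { successes: 3, failures: 3 },
  tester: { successes: 1, failures: 2 },
} as const;

function loadMemories(role: keyof typeof memoryProfiles, store: PatternEntry[]): PatternEntry[] {
  const { successes, failures } = memoryProfiles[role];
  const byRank = [...store].sort((a, b) => b.rank - a.rank);
  return [
    ...byRank.filter((p) => p.success).slice(0, successes),
    ...byRank.filter((p) => !p.success).slice(0, failures),
  ];
}
```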
Cross-run inheritance
When a repeat run starts, it loads the top-ranked patterns from all previous runs on the same project. The architect receives lessons learned before writing the first plan — so mistakes from run 1 are avoided in run 2.
Memory ranking & decay
Patterns are ranked by relevance and recency. Stale patterns decay in priority over time. High-value patterns — those that prevented repeated failures — are boosted and persist longer across runs.
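One plausible scoring scheme, assuming exponential recency decay (the half-life and boost factor are invented knobs for illustration, not documented defaults):

```typescript
// Illustrative pattern scoring: relevance decays with age, and patterns
// that prevented repeated failures are boosted so they persist longer.
interface RankedPattern {
  relevance: number;         // base relevance to the current task, 0..1
  ageInRuns: number;         // how many runs ago the pattern was recorded
  preventedFailures: number; // times this lesson stopped a repeat failure
}

function patternScore(p: RankedPattern, halfLifeRuns = 5): number {
  // Stale patterns decay: the score halves every `halfLifeRuns` runs.
  const decay = Math.pow(0.5, p.ageInRuns / halfLifeRuns);
  // High-value patterns are boosted so they outlive their decay.
  const boost = 1 + 0.5 * p.preventedFailures;
  return p.relevance * decay * boost;
}
```

Under this scheme a fresh pattern outranks a stale one of equal relevance, but a stale pattern that repeatedly prevented failures can climb back above it.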
The orchestrator understands your codebase.
Pure-JS analysis
Regex-based parsing across 9 languages (JS, TS, Python, Java, C#, Go, Rust, C, C++). Import resolution, call graphs, community detection, and symbol indexing — no native dependencies, no tree-sitter. The same engine powers MSPStudio’s code intelligence.
Pattern Memory
Every success and failure is recorded across 6 pattern categories with full context. The next run loads relevant lessons into the architect prompt — so the system learns from past mistakes instead of repeating them. Memory persists across runs and projects.
Knowledge Graph
Interactive canvas visualization of your codebase: symbols, relationships, and execution flows. Zoom, pan, and click through the dependency graph. Available in Mission Control and inherited by MSPStudio.
Full operational visibility. One interface.
11-panel GUI
Project Registry, Code Intelligence, Pattern Memory, Run Ledger, Agent Topology, Active Operations, AI Roles & Communication, Model Routing, and more — every aspect of the orchestration in one view.
Multi-provider routing
Run with Claude (Opus/Sonnet/Haiku) or Codex (GPT-5.4/GPT-5.4-mini). Automatic model fallback on usage limits. Cross-provider fallback when configured. Three presets — balanced, fast, and strong — control cost vs. quality tradeoffs per role.
Docker deployment
Isolated container with credential sync from host. Read-only auth mount, per-provider volumes, cross-platform support (Mac, Linux, WSL). One docker compose up and the system is running.
Choose your AI engine. Switch without reconfiguring.
Consensus supports multiple AI providers with automatic fallback, cross-provider routing, and per-role model assignment.
Claude (Anthropic)
OAuth authentication via host credential sync. Claude Opus for architect-tier reasoning, Sonnet for balanced execution, Haiku for fast validation. Credentials bind-mounted read-only from the host — no secrets in container images.
Codex (OpenAI)
GPT-5.4 for premium-tier decisions, GPT-5.4-mini for cost-optimized roles. API key management through the same host credential sync. Automatic model fallback on rate limits or quota exhaustion.
Cross-provider fallback
When one provider hits usage limits, the system falls back automatically: Claude cascades Opus → Sonnet → Haiku, Codex cascades GPT-5.4 → GPT-5.4-mini → GPT-5.3-codex. Cross-provider fallback (Claude ↔ Codex) activates when configured. Pattern memory and run state persist across provider switches — no lost context.
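The cascade order described above can be sketched as a simple lookup (the `pickModel` helper and its `available` predicate are hypothetical stand-ins for the actual provider calls):

```typescript
// Illustrative fallback cascade: walk the primary provider's models in
// order; if all are exhausted and cross-provider fallback is enabled,
// continue with the other provider's cascade.
const cascades: Record<string, string[]> = {
  claude: ["claude-opus", "claude-sonnet", "claude-haiku"],
  codex: ["gpt-5.4", "gpt-5.4-mini", "gpt-5.3-codex"],
};

function pickModel(
  primary: "claude" | "codex",
  available: (model: string) => boolean, // e.g. false when rate-limited
  crossProvider = false,
): string | undefined {
  const order = crossProvider
    ? [...cascades[primary], ...cascades[primary === "claude" ? "codex" : "claude"]]
    : cascades[primary];
  return order.find(available);
}
```

Because pattern memory and run state live outside the provider layer, a switch mid-run only changes which model answers the next prompt, not what context it sees.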
Know what every run costs. See every decision.
Per-run cost tracking
Token usage and estimated cost are tracked per run, per role, and per model. The Run Ledger shows cumulative spend across all orchestration runs — no surprise bills.
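A minimal sketch of per-run cost aggregation, assuming per-million-token pricing (the model names and rates below are placeholders, not real provider prices):

```typescript
// Illustrative per-run cost aggregation across roles and models.
interface Usage {
  role: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

// Placeholder price table: dollars per million tokens, input vs. output.
const pricePerMTok: Record<string, { input: number; output: number }> = {
  "claude-sonnet": { input: 3, output: 15 },
  "claude-haiku": { input: 1, output: 5 },
};

function runCost(usages: Usage[]): number {
  return usages.reduce((sum, u) => {
    const p = pricePerMTok[u.model];
    return sum + (u.inputTokens * p.input + u.outputTokens * p.output) / 1_000_000;
  }, 0);
}
```

Summing the same records grouped by role or by model yields the per-role and per-model views the Run Ledger surfaces.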
Model routing presets
Three presets control the cost/quality tradeoff: balanced (default), fast (cheaper models for routine tasks), and strong (premium models for complex decisions). Switch presets per run without reconfiguring.
Cross-platform support
Runs on Mac, Linux, and WSL. Docker deployment with credential sync ensures consistent behavior across development and CI environments. No platform-specific lock-in.
Eleven panels. Every orchestration dimension.
Mission Control gives you a single interface for the entire orchestration lifecycle — from project selection to cost analysis.
Project Registry & Run Ledger
Browse all registered projects, view run history, and drill into per-run cost breakdowns. The ledger tracks token usage, model selection, and outcome status for every orchestration run.
Agent Topology & Active Ops
Visualize which agents are active, their current role assignments, and real-time communication flow. See exactly which agent is evaluating which file at any moment.
Code Intelligence & Knowledge Graph
Interactive dependency visualization across your codebase. Symbol resolution, import chains, and community clustering — the same graph the architect uses to scope implementation plans.
Pattern Memory Browser
Browse, search, and rank stored patterns across all 6 categories: code-change, plan-gate, test, build, docker, and no-op. See which lessons shaped each run.
AI Roles & Communication
Inspect each role’s prompt context, token budget, and inter-role message flow. See exactly what the architect told the developer, what the tester flagged, and how feedback propagated through iterations.
Model Routing & Cost Dashboard
Configure which model serves each role. View per-run token usage, estimated cost, and cumulative spend across projects. Switch between balanced, fast, and strong presets with immediate effect on the next iteration.
Production-ready from first docker compose up.
DSS-AI-Consensus runs as an isolated Docker container with built-in recovery and credential management.
Read-only auth mount
API keys for Claude, Codex, and Ollama are mounted read-only from the host. No secrets baked into images. Rotate credentials externally and the container picks them up on next restart.
Persistent run state
Pattern memory, run history, and project configuration persist across container restarts via mounted volumes. Crash recovery resumes from the last consistent checkpoint — no lost progress.
Cross-platform parity
Identical behavior on Mac, Linux, and WSL. Docker Compose handles volume mapping and network configuration. No platform-specific workarounds or conditional logic.
Add governance to your autonomous operations.
Start an evaluation or explore the interactive demo to see multi-agent consensus voting in action.
