# Execution Backlog: Acceptance, Ownership, Sales Readiness

**Date:** 2026-04-14
**Status:** Draft execution backlog
**Purpose:** Translate the implementation blueprint into a sequenced delivery backlog with acceptance gates, ownership guidance, and sales-readiness criteria.

## 1. Purpose

This document defines the executable work program for the colony orchestration layer.

## 2. Delivery Principles

The delivery program should follow six principles.

### 2.1 Sellable Before Comprehensive

The first release must be narrow enough to be dependable. A small system that works under real operator pressure is more valuable than a broad colony that only works in demos.

### 2.2 Domain Missions Before General Intelligence

Do not begin with free-form, do-everything agents. Start with typed missions:

- Oracle advisory
- CRM lead intelligence
- Catalyst strategy brief

### 2.3 Governance Before Autonomy

Any action that reads sensitive data, reaches the public web, or proposes writeback must be governed before being automated.

### 2.4 Artifacts Before UX Polish

The internal artifacts, schemas, and traces must be correct before product polish begins. Without them, debugging and improvement loops stay weak.

### 2.5 Root Compatibility Is Mandatory

Any colony feature that fights the current root architecture is wrong for Sprint 1.

### 2.6 Review Once, Review Well

Use one explicit review pass by default. Do not create endless recursive reviewer loops.

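The one-pass default in 2.6 can be made structural rather than conventional. The sketch below is illustrative only: `ReviewPacket`, `run_review`, `MAX_REVIEW_PASSES`, and the toy `strict_reviewer` are hypothetical names, not part of the blueprint. It shows a review step with a hard pass limit and no recursion.

```python
from dataclasses import dataclass
from typing import Callable

MAX_REVIEW_PASSES = 1  # assumed default: a single explicit review pass


@dataclass(frozen=True)
class ReviewPacket:
    """Hypothetical review result; field names are illustrative."""
    approved: bool
    notes: str
    passes_used: int


def run_review(draft: str, review: Callable[[str], tuple[bool, str]],
               max_passes: int = MAX_REVIEW_PASSES) -> ReviewPacket:
    """Run at most `max_passes` review passes, then stop unconditionally."""
    if max_passes < 1:
        raise ValueError("at least one review pass is required")
    approved, notes = False, ""
    passes = 0
    for passes in range(1, max_passes + 1):
        approved, notes = review(draft)
        if approved:
            break
    # No recursion: an unapproved draft is surfaced with its notes, not re-reviewed.
    return ReviewPacket(approved=approved, notes=notes, passes_used=passes)


def strict_reviewer(draft: str) -> tuple[bool, str]:
    """Toy reviewer used only for this example."""
    ok = "citation" in draft
    return ok, "ok" if ok else "missing citation"


packet = run_review("advisory text without sources", strict_reviewer)
```

Here the draft is rejected once and returned with the reviewer's notes; raising the limit is an explicit, visible decision rather than an emergent loop.
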
## 3. Workstream Backlog

The backlog below is ordered. Teams should not skip ahead unless the predecessor workstream is stable enough to build on.

### Workstream A: Colony Service Bootstrap

Objective:

Stand up `services/colony-orchestrator/` as a buildable TypeScript service.

Tasks:

- create package scaffold
- define environment config contract
- import or adapt Open Multi Agent execution concepts
- expose service bootstrap and health endpoint

Definition of done:

- service builds locally
- service starts locally
- service exposes `/health`

### Workstream B: Contract Layer

Objective:

Make mission and artifact exchange explicit and versioned.

Tasks:

- define mission envelope schema
- define task graph schema
- define prompt package schema
- define librarian pass schema
- define research artifact schema
- define worker result schema
- define aggregation packet schema
- define review packet schema
- define policy decision schema

Definition of done:

- every colony stage reads and writes only typed contracts
- invalid payloads fail fast

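As an illustration of a typed, versioned contract that fails fast, here is a minimal sketch of a mission envelope. Everything in it is an assumption made for the example: the field names, the `SCHEMA_VERSION` value, the mission type identifiers (derived from the three typed missions in 2.2), and the choice of Python dataclasses as the validation mechanism.

```python
from dataclasses import dataclass

SCHEMA_VERSION = "1.0"  # hypothetical version tag
# Hypothetical identifiers for the three typed missions named in section 2.2.
MISSION_TYPES = {"oracle_advisory", "crm_lead_intelligence", "catalyst_strategy_brief"}


class ContractError(ValueError):
    """Raised as soon as a payload violates the contract."""


@dataclass(frozen=True)
class MissionEnvelope:
    mission_id: str
    mission_type: str
    goal: str
    schema_version: str = SCHEMA_VERSION

    def __post_init__(self) -> None:
        if not isinstance(self.mission_id, str) or not self.mission_id:
            raise ContractError("mission_id must be a non-empty string")
        if self.mission_type not in MISSION_TYPES:
            raise ContractError(f"unknown mission_type: {self.mission_type!r}")
        if not isinstance(self.goal, str) or not self.goal.strip():
            raise ContractError("goal must be a non-empty string")
        if self.schema_version != SCHEMA_VERSION:
            raise ContractError(f"unsupported schema_version: {self.schema_version!r}")


def parse_envelope(payload: dict) -> MissionEnvelope:
    """Build an envelope from an untyped payload, rejecting unknown or missing keys."""
    allowed = {"mission_id", "mission_type", "goal", "schema_version"}
    unknown = set(payload) - allowed
    if unknown:
        raise ContractError(f"unexpected fields: {sorted(unknown)}")
    try:
        return MissionEnvelope(**payload)
    except TypeError as exc:  # a required field is missing
        raise ContractError(str(exc)) from exc


envelope = parse_envelope(
    {"mission_id": "m-1", "mission_type": "oracle_advisory", "goal": "Summarize supplier risk"}
)
```

Validation runs in the constructor, so an invalid envelope can never exist as an object; downstream stages only ever see payloads that already passed the contract.
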
### Workstream C: Root Persistence Layer

Objective:

Add mission persistence to the current PostgreSQL root.

Tasks:

- create `backend/db/schema_colony.sql`
- create root repository helper module
- add mission storage methods
- add artifact storage methods

Definition of done:

- missions and artifacts can be inserted, updated, and queried from root Python

### Workstream D: Root Gateway and Routes

Objective:

Expose a root-owned API for colony mission execution.

Tasks:

- implement `backend/services/colony_gateway.py`
- implement `backend/api/routes_colony.py`
- mount colony router in `backend/main.py`

Definition of done:

- root can create and fetch missions through `/api/colony/*`

### Workstream E: Planner and Prompt Master

Objective:

Create the first meaningful colony reasoning layer.

Tasks:

- mission normalization
- task DAG planning
- prompt package generation
- prompt template registry
- prompt exemplar registry

Definition of done:

- one mission can be decomposed into a persisted task graph and prompt package set

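To make "task DAG planning" concrete, the sketch below validates a small task graph and derives an execution order with the standard-library `graphlib` module. The `Task` shape, the role names, and the example plan are assumptions for illustration; the blueprint's task graph schema may look different.

```python
from dataclasses import dataclass
from graphlib import CycleError, TopologicalSorter


@dataclass(frozen=True)
class Task:
    """Hypothetical task node: an ID, a role assignment, and its dependencies."""
    task_id: str
    role: str
    depends_on: tuple[str, ...] = ()


def execution_order(tasks: list[Task]) -> list[str]:
    """Return task IDs in a dependency-respecting order, or raise on a bad graph."""
    by_id = {t.task_id: t for t in tasks}
    if len(by_id) != len(tasks):
        raise ValueError("duplicate task_id in plan")
    for t in tasks:
        missing = [d for d in t.depends_on if d not in by_id]
        if missing:
            raise ValueError(f"{t.task_id} depends on unknown tasks: {missing}")
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    sorter = TopologicalSorter({t.task_id: set(t.depends_on) for t in tasks})
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        raise ValueError(f"task graph contains a cycle: {exc.args[1]}") from exc


plan = [
    Task("review", "reviewer", ("aggregate",)),
    Task("aggregate", "aggregator", ("research", "context")),
    Task("research", "researcher"),
    Task("context", "librarian"),
]
order = execution_order(plan)
```

The relative order of `research` and `context` is not fixed because neither depends on the other; only the dependency constraints are guaranteed.
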
### Workstream F: Librarian and Researcher

Objective:

Make context routing and evidence gathering real.

Tasks:

- internal source catalog
- route card generation
- cache index design
- search provider abstraction
- browser provider abstraction
- citation normalization

Definition of done:

- workers can receive internal passes and external evidence packets

### Workstream G: Workers, Aggregator, Reviewer

Objective:

Complete the minimum viable colony loop.

Tasks:

- worker factory
- worker execution runner
- result normalizer
- aggregator packet builder
- contradiction detector
- reviewer packet builder

Definition of done:

- a mission can execute from planning through reviewed final output

### Workstream H: Governance Bridge

Objective:

Connect colony execution to root policy and Nemoclaw-derived controls.

Tasks:

- risk classifier
- tool policy
- model routing policy
- writeback policy
- root policy bridge

Definition of done:

- at least one disallowed tool and one disallowed writeback are blocked and logged

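The definition of done above can be exercised with a very small default-deny policy check. The sketch below is a stand-in, not the Nemoclaw-derived controls themselves: the allowlists, tool names, writeback targets, logger name, and `PolicyDecision` fields are all hypothetical.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("colony.policy")  # hypothetical logger name

# Hypothetical allowlists; a real deployment would load these from root policy.
ALLOWED_TOOLS = {"internal_search", "crm_read"}
ALLOWED_WRITEBACK_TARGETS = {"crm_note_draft"}


@dataclass(frozen=True)
class PolicyDecision:
    mission_id: str
    action: str    # "tool" or "writeback"
    subject: str   # tool name or writeback target
    allowed: bool
    reason: str


def decide(mission_id: str, action: str, subject: str) -> PolicyDecision:
    """Default-deny: anything not explicitly allowlisted is blocked and logged."""
    allowlist = {"tool": ALLOWED_TOOLS, "writeback": ALLOWED_WRITEBACK_TARGETS}.get(action)
    if allowlist is None:
        allowed, reason = False, f"unknown action type {action!r}"
    elif subject in allowlist:
        allowed, reason = True, "allowlisted"
    else:
        allowed, reason = False, f"{subject!r} is not allowlisted for {action}"
    decision = PolicyDecision(mission_id, action, subject, allowed, reason)
    log = logger.info if allowed else logger.warning
    log("policy_decision mission=%s action=%s subject=%s allowed=%s reason=%s",
        mission_id, action, subject, allowed, reason)
    return decision


blocked_tool = decide("m-1", "tool", "public_browser")
blocked_write = decide("m-1", "writeback", "crm_contact_update")
```

Returning a decision object instead of raising keeps every outcome, allowed or denied, available for persistence and audit.
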
### Workstream I: Oracle Adapter

Objective:

Make Oracle the first colony consumer.

Tasks:

- mission mapping for Oracle prompt submit
- Oracle context enrichment
- reviewed result return contract
- writeback proposal return contract

Definition of done:

- one Oracle advisory mission works end to end through the colony

### Workstream J: CRM Adapter

Objective:

Make CRM the second colony consumer.

Tasks:

- mission mapping for lead intelligence
- lead and chat context enrichment
- message suggestion output contract
- next-step recommendation contract

Definition of done:

- one CRM lead intelligence mission works end to end through the colony

### Workstream K: Catalyst Adapter

Objective:

Prepare the next monetizable mission surface.

Tasks:

- define strategy brief mission type
- connect to campaign and analytics state
- produce planning-grade artifact only

Definition of done:

- a Catalyst mission can generate a reviewed strategy brief in test mode

### Workstream L: Observability and Ops

Objective:

Make the system operable under real internal use.

Tasks:

- mission trace IDs
- structured logs
- duration metrics
- token usage metrics
- failure taxonomy
- replay and inspection support

Definition of done:

- every mission can be inspected after execution

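Several of the tasks above (trace IDs, structured logs, duration metrics, a failure taxonomy) can share one small mechanism. The sketch below assumes a hypothetical `traced_stage` context manager and an in-memory `records` list standing in for a real log sink; the field names are illustrative.

```python
import json
import time
import uuid
from contextlib import contextmanager

records: list[dict] = []  # stand-in sink; a real service would ship these to a log pipeline


def new_trace_id() -> str:
    """Generate one trace ID per mission (hypothetical format: UUID4 hex)."""
    return uuid.uuid4().hex


@contextmanager
def traced_stage(trace_id: str, stage: str):
    """Record one structured log entry per stage, with duration and outcome."""
    start = time.perf_counter()
    entry = {"trace_id": trace_id, "stage": stage, "status": "ok", "failure_class": None}
    try:
        yield entry
    except Exception as exc:
        entry["status"] = "failed"
        entry["failure_class"] = type(exc).__name__  # crude failure taxonomy
        raise
    finally:
        entry["duration_ms"] = round((time.perf_counter() - start) * 1000, 3)
        records.append(entry)
        print(json.dumps(entry))


trace_id = new_trace_id()
with traced_stage(trace_id, "planner") as entry:
    entry["tokens_used"] = 0  # placeholder; real counts would come from the model client
try:
    with traced_stage(trace_id, "worker"):
        raise TimeoutError("worker exceeded budget")
except TimeoutError:
    pass
```

Because every stage writes an entry even when it fails, a mission can be inspected afterwards by filtering records on its trace ID.
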
## 4. Acceptance Gates

No stage should advance to broader rollout until these gates pass.

### Gate 1: Service Integrity

Pass conditions:

- colony service builds
- root backend still builds and starts
- no route collisions
- no regression in Oracle, CRM, Sentinel, or Catalyst route mounting

### Gate 2: Contract Integrity

Pass conditions:

- all mission and artifact schemas validate
- malformed payloads fail deterministically
- persisted records match schema versions

### Gate 3: Planner Integrity

Pass conditions:

- mission produces a task DAG
- tasks include dependencies and role assignments
- prompt packages are generated per task

### Gate 4: Context Integrity

Pass conditions:

- librarian passes are scoped
- research artifacts include provenance
- workers do not receive unrestricted context blobs

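One way to read "scoped" in Gate 4 is that a worker's context is built from an explicit pass and nothing else. The sketch below assumes a hypothetical `LibrarianPass` that lists allowed source names and a toy `SOURCE_CATALOG`; neither is taken from the blueprint.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping

# Hypothetical internal source catalog keyed by source name.
SOURCE_CATALOG = {
    "crm.lead_profile": {"lead_id": "L-17", "stage": "qualified"},
    "crm.chat_history": ["hello", "pricing?"],
    "finance.ledger": {"balance": 1200},
}


@dataclass(frozen=True)
class LibrarianPass:
    """Hypothetical pass: the only sources a given task may read."""
    task_id: str
    allowed_sources: frozenset[str]


def build_context(lp: LibrarianPass, requested: list[str]) -> Mapping[str, object]:
    """Return only the requested sources that the pass covers; refuse the rest."""
    denied = [s for s in requested if s not in lp.allowed_sources]
    if denied:
        raise PermissionError(f"task {lp.task_id} is not scoped for: {denied}")
    unknown = [s for s in requested if s not in SOURCE_CATALOG]
    if unknown:
        raise KeyError(f"unknown sources: {unknown}")
    # Read-only mapping so a worker cannot add sources to its own context.
    return MappingProxyType({s: SOURCE_CATALOG[s] for s in requested})


lead_pass = LibrarianPass("task-3", frozenset({"crm.lead_profile", "crm.chat_history"}))
context = build_context(lead_pass, ["crm.lead_profile"])
```

The worker receives exactly the sources it asked for and was granted; an out-of-scope request fails loudly instead of silently returning a wider context blob.
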
### Gate 5: Execution Integrity

Pass conditions:

- worker tasks execute
- aggregation packet is produced
- one-pass review packet is produced
- final output is returned with traceability

### Gate 6: Governance Integrity

Pass conditions:

- blocked tool requests are denied
- blocked writeback proposals are denied
- model routing class is recorded
- policy decisions are auditable

### Gate 7: Oracle Mission Readiness

Pass conditions:

- Oracle can submit a mission
- mission returns a reviewed response
- optional writeback proposal is structured
- Oracle v1 remains intact

### Gate 8: CRM Mission Readiness

Pass conditions:

- CRM can submit a mission
- lead context is used
- result includes actionable insights
- no regression in existing CRM routes

### Gate 9: Internal Sales Readiness

Pass conditions:

- one internal operator can use the flow with low supervision
- outputs are stable enough for internal client-facing preparation
- failures are diagnosable without reading raw model output

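The gates can be checked mechanically. The sketch below assumes the gates are evaluated strictly in numbered order and that evaluation stops at the first failure; that sequencing is an interpretation of this section, and the `first_failing_gate` helper and toy checks are hypothetical.

```python
from typing import Callable, Optional

Gate = tuple[str, Callable[[], bool]]


def first_failing_gate(gates: list[Gate]) -> Optional[str]:
    """Evaluate gates in order and stop at the first one that does not pass."""
    for name, check in gates:
        if not check():
            return name  # later gates are not evaluated
    return None


evaluated: list[str] = []


def make_check(name: str, result: bool) -> Callable[[], bool]:
    """Toy check that records that it ran and returns a fixed result."""
    def check() -> bool:
        evaluated.append(name)
        return result
    return check


gates: list[Gate] = [
    ("Gate 1: Service Integrity", make_check("gate1", True)),
    ("Gate 2: Contract Integrity", make_check("gate2", False)),
    ("Gate 3: Planner Integrity", make_check("gate3", True)),
]
blocked_at = first_failing_gate(gates)
```

In a real setup each check would wrap a test suite or smoke test; the point is only that rollout status reduces to the name of the first gate that fails, or `None` when all pass.
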
## 5. Ownership Model

Ownership should be explicit even if one person temporarily covers multiple roles.

### Orchestration Kernel Owner

Owns:

- colony runtime
- task graph execution
- worker provisioning
- review loop mechanics

### Root Integration Owner

Owns:

- Python gateway
- FastAPI route integration
- PostgreSQL persistence layer
- adapter surfaces in root

### Prompt Systems Owner

Owns:

- prompt template registry
- prompt exemplar registry
- prompt package quality
- prompt evaluation traces

### Policy and Safety Owner

Owns:

- risk classification
- tool scope policy
- model routing policy
- writeback release policy

### Domain Adapter Owner

Owns:

- Oracle mission mapping
- CRM mission mapping
- Catalyst mission mapping
- Sentinel evidence attachment path

### QA and Observability Owner

Owns:

- scenario validation
- failure matrix
- trace verification
- release criteria evidence

## 6. Sales Readiness Criteria

This system is not sellable when it is merely architecturally elegant. It is sellable when it can be demonstrated reliably against buyer-relevant workflows.

### 6.1 Minimum Sellable Capability

The minimum sellable capability is:

- one Oracle mission that produces a reviewed advisory result
- one CRM mission that produces a reviewed lead intelligence result
- clear auditability and explainability
- stable failure behavior

### 6.2 Buyer-Relevant Proofs

To support sales, the team should be able to demonstrate:

- the system decomposes a goal transparently
- the system uses internal business data in a scoped way
- the system can enrich with external evidence
- the system reviews its own result before surfacing it
- the system does not perform uncontrolled autonomous actions

### 6.3 Demo Safety Criteria

No sales demo should depend on brittle hidden state.

A demo-ready system must have:

- known mission templates
- seeded demo data
- stable prompt packages
- deterministic fallback behavior for missing data
- visible mission state and trace output

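"Deterministic fallback behavior for missing data" can be sketched as a lookup that always returns the same labeled default and never improvises. The field names, fallback values, and `FieldResult` type below are hypothetical demo fixtures, not part of the blueprint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldResult:
    value: str
    source: str  # "data" or "fallback", so the trace shows which path was taken


# Hypothetical fixed fallbacks: the same missing field always yields the same text.
FALLBACKS = {"company_size": "unknown", "last_contact": "no contact on record"}


def resolve(record: dict, field_name: str) -> FieldResult:
    """Return the stored value, or a fixed labeled fallback when it is missing."""
    value = record.get(field_name)
    if value not in (None, ""):
        return FieldResult(str(value), "data")
    if field_name not in FALLBACKS:
        # No silent guessing: a field without a declared fallback is an error.
        raise KeyError(f"no fallback defined for {field_name!r}")
    return FieldResult(FALLBACKS[field_name], "fallback")


seeded_lead = {"company_size": "50-200", "last_contact": None}
size = resolve(seeded_lead, "company_size")
contact = resolve(seeded_lead, "last_contact")
```

Tagging each value with its source keeps the fallback visible in mission state, which supports the trace-output criterion in the same list.
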
### 6.4 Internal Readiness Checklist

Before using the colony in commercial conversations, confirm:

- mission classes are frozen for demo use
- one-click seed or fixture data exists
- failure cases produce intelligible explanations
- operator instructions fit on one short runbook page
- every demo claim can be tied back to a working code path

## 7. Risks and Kill Criteria

### Risk 1: Too Much System, Too Little Product

If the team spends weeks building colony internals without Oracle and CRM mission proofs, the work is drifting.

### Risk 2: Governance Is Deferred

If tool scope and writeback gating are postponed, the colony will accumulate unsafe assumptions and later refactors will be painful.

### Risk 3: Prompt Master Is Hand-Waved

If prompt packaging remains a generic coordinator side effect, one of the system's biggest intended advantages is lost.

### Risk 4: Library Routing Is Faked

If all workers still get all context, the system is not implementing the actual biomimetic model and cost will grow quickly.

### Kill Criteria

Pause or redesign if any of the following happen:

- colony introduces route or data ownership conflicts with root
- missions cannot be replayed or diagnosed
- Oracle and CRM adapters remain unproven after kernel work
- policy model is too weak to gate sensitive actions

## 8. Immediate Next Actions

The immediate next actions should be executed in this order:

1. Freeze the current design docs in the biomimetic folder as the architecture source set.
2. Create `services/colony-orchestrator/` skeleton.
3. Create `backend/db/schema_colony.sql`.
4. Create `backend/services/colony_gateway.py`.
5. Create `backend/api/routes_colony.py`.
6. Implement mission envelope and artifact schemas.
7. Implement planner and prompt package generation.
8. Wire Oracle advisory mission.
9. Wire CRM lead intelligence mission.
10. Add governance and acceptance tests.

## 9. Bottom Line

The delivery program should be judged by one question:

Can this system become a stable, demonstrable revenue-supporting capability for Project Velocity quickly enough to matter?

If the team follows this backlog in order, the answer can become yes without turning the architecture into another speculative side project.