# Project Velocity - Biomimetic Agentic Orchestration Layer (First Principles)
**Date:** 2026-04-14
**Status:** Draft for design alignment
**Working Name:** `Project Velocity Colony Orchestration Layer`
**Purpose:** Define the product and architecture from first principles before implementation, using current Nemoclaw realities and the public Open Multi Agent framework as the two main input layers.
## Executive Summary
Project Velocity needs an orchestration layer that can turn an ambiguous high-level goal into reliable, auditable, domain-aware work across research, CRM, Oracle, Catalyst, Sentinel, and eventually Dream Weaver and iOS-assisted flows.
The missing piece is not another chatbot. The missing piece is a colony runtime:
- one system that decomposes goals into work
- routes the right context to the right specialist
- controls tool use and data access
- aggregates and reviews results before they touch the user
- stays explainable and governable under production constraints
The best public runtime reference for the orchestration kernel is Open Multi Agent:
- dynamic goal-to-task decomposition
- dependency-aware task DAG execution
- shared memory and agent messaging
- multi-model teams
- MCP connectivity
- built-in tool registry and execution loop
The best reference for safety and operational control is the current Nemoclaw layer already present in Project Velocity:
- policy and workflow boundary thinking
- inference routing
- auditability
- model isolation as a design value
- root-owned Oracle, MCP, and workflow helpers already adapted into Python
This document proposes a biomimetic orchestration system inspired by ant colonies and bee hives, but expressed as concrete software architecture rather than metaphor alone.
## Assumptions and Constraints
### Assumptions
Open Multi Agent is a legitimate public upstream and can be studied, forked, and modified under its MIT license. The public repository and its source files are treated as a primary technical reference.
The current Project Velocity root remains authoritative for:
- FastAPI
- PostgreSQL
- Oracle v1 canvas and prompt orchestration
- Sentinel backend ownership
- Catalyst backend ownership
- CRM canonical storage
The orchestration layer is therefore an internal subsystem, not a second product root.
The system will be used first inside Project Velocity, but should be designed so it can later become a reusable platform across Oracle, Catalyst, Sentinel, CRM, and future agentic modules.
### Constraints
The architecture must not rely on leaked or provenance-unclear code. Public ideas can inspire design. Public licensed code can be forked. Leaked Anthropic artifacts should not be incorporated into product code or product docs as implementation source material.
The system must support production governance:
- explainable task decomposition
- traceable tool usage
- bounded model context
- scoped data access
- safe egress and write controls
The system must work with both cloud and local models, because Project Velocity already spans NVIDIA-hosted inference, local model ambitions, and domain-specific privacy needs.
The system must remain modular. If Open Multi Agent becomes too constraining later, Project Velocity should be able to replace the execution kernel without rewriting all domain contracts.
## Reference Sources and Rationale
### Local Sources
Primary local sources:
- `.Agent Context/Tech/Tools Understanding.md`
- `.Agent Context/Tech/README-Open Multi-Agent.md`
- `.Agent Context/Sprint 1/nemoclaw_setup_truth.md`
- current root code under `backend/oracle`, `backend/services`, `backend/api`, and Sentinel routes
These sources describe the current Project Velocity truth and the internal understanding of Nemoclaw and Open Multi Agent.
### Upstream Public Source
Primary upstream source:
- `https://github.com/JackChen-me/open-multi-agent`
The inspected codebase shows a small TypeScript framework centered around:
- `src/orchestrator/orchestrator.ts`
- `src/task/queue.ts`
- `src/orchestrator/scheduler.ts`
- `src/agent/runner.ts`
- `src/agent/pool.ts`
- `src/team/team.ts`
- `src/memory/shared.ts`
- `src/tool/mcp.ts`
The key takeaway is that Open Multi Agent is intentionally narrow:
- strong at dynamic decomposition and short-lived multi-agent execution
- intentionally weak at persistence, long-lived workflow recovery, and policy infrastructure
That is useful. It gives Project Velocity a sharp orchestration kernel, but it still needs a colony control layer around it.
## System Vision
The target system is a colony-style orchestration plane for Project Velocity.
It should behave like this:
1. A user gives a goal to the system.
2. The system converts that goal into a structured mission.
3. The mission is decomposed into a task graph.
4. A prompt-design specialist prepares role-specific task prompts.
5. Internal and external context is routed to specialists based on need.
6. Specialists execute in parallel where possible.
7. Outputs are aggregated, reviewed once, and normalized.
8. The final result returns to the user or back into a Project Velocity subsystem.
This is not one monolithic agent. It is a managed ecology of small, bounded agents with shared purpose.
## Biomimicry Metaphor in Architecture
The biomimicry metaphor should not remain branding. It should shape architectural behavior.
### Queen
The queen is not the worker. In software terms, the queen is the authoritative interface and coordination identity.
Responsibilities:
- receive intent
- maintain mission identity
- preserve user continuity
- own the final delivery boundary
The queen should not personally execute every step. That would collapse the colony into a single overloaded agent.
### Planner Ant / Foreman Ant
This is the decomposition specialist. It turns a mission into a DAG of work.
In Open Multi Agent terms, this aligns with the coordinator agent. In Project Velocity, this should become a first-class mission planner with stronger contracts than a generic coordinator prompt.
### Prompt Master Ant
This role creates worker-specific prompts using:
- prompt templates
- task type libraries
- past prompt exemplars
- runtime constraints
- tool and policy scope
This is a critical improvement over vanilla orchestration. Open Multi Agent assumes task prompts are mostly generated inline. Project Velocity should separate task decomposition from prompt engineering for execution.
### Researcher Ant
This role handles external evidence gathering:
- search
- browsing
- citation capture
- GUI browser where needed
- headless browser where sufficient
This ant should produce structured evidence artifacts, not prose alone.
### Librarian Ant
This role handles internal memory, repositories, CRM, and domain stores.
Its job is not just retrieval. Its job is routing. It creates scoped access passes for workers so they know:
- which sources are relevant
- which data is allowed
- which indexes to search
- which cached snippets already exist
This is the colony equivalent of pheromone trails plus route cards.
### Worker Ants
These are ephemeral specialists, spawned with:
- task objective
- prompt package
- tool package
- librarian pass
- research evidence where needed
They should be disposable and reproducible.
### Assistant Ants
These are lightweight helper processes that fetch:
- cached summaries
- thumbnails
- source previews
- snippet citations
They reduce bandwidth and keep workers from redoing simple retrieval.
### Aggregator Ant
This role composes the result packet:
- original goal
- enhanced mission
- task outputs
- evidence
- contradictions
- unresolved risks
### Reviewer Ant
This role performs one bounded review pass against:
- completeness
- internal consistency
- policy compliance
- citation and provenance quality
- user-goal alignment
One review loop is a good default. Each additional loop adds cost and delay, so reserve extra loops for missions whose risk level justifies them.
### Pheromone Layer
In software, pheromones correspond to lightweight reinforced signals:
- prompt-template success rates
- tool effectiveness by task type
- source usefulness
- failure hotspots
- cache affinity
This should not be a giant opaque memory store. It should be compact telemetry that influences planning and routing.
### Waggle Dance
The bee waggle dance is a good design metaphor for compressed broadcasting. Not every ant needs the full artifact. The system should support:
- short mission summaries
- confidence and urgency signals
- route hints
- evidence pointers
That means the colony needs both full artifact channels and compressed signal channels.
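As a rough illustration of the compressed signal channel, the sketch below reduces a full artifact to a short broadcast. The `WaggleSignal` type, the `to_signal` helper, and the dict shape of the input artifact are all hypothetical names introduced for this example, not part of Open Multi Agent or the current Project Velocity code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WaggleSignal:
    """Compressed broadcast: hints and pointers, never full artifact bodies."""
    mission_id: str
    summary: str
    confidence: float
    urgency: str
    route_hints: tuple
    evidence_pointers: tuple


def to_signal(artifact: dict, max_summary_chars: int = 160) -> WaggleSignal:
    """Reduce a full artifact (assumed dict shape) to a short signal."""
    summary = artifact["summary"]
    if len(summary) > max_summary_chars:
        # Truncate and mark the cut so receivers know the summary is partial.
        summary = summary[: max_summary_chars - 3] + "..."
    return WaggleSignal(
        mission_id=artifact["mission_id"],
        summary=summary,
        confidence=artifact.get("confidence", 0.0),
        urgency=artifact.get("urgency", "normal"),
        route_hints=tuple(artifact.get("route_hints", ())),
        # Carry only artifact IDs; receivers fetch full evidence on demand.
        evidence_pointers=tuple(
            e["artifact_id"] for e in artifact.get("evidence", ())
        ),
    )
```

The full artifact stays in the artifact channel; only the signal is broadcast, which keeps per-ant context small.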
## First-Principles Architecture
The orchestration layer should be built from the following software principles.
### Principle 1: Separate intent, planning, execution, and governance
If one agent owns everything, the system becomes hard to reason about and easy to break. The colony should split:
- intent ingestion
- task planning
- prompt preparation
- execution
- review
- policy enforcement
### Principle 2: Scope context aggressively
A common failure mode in agent systems is uncontrolled context flooding. Workers should receive:
- only the data needed for the task
- compressed mission context
- explicitly granted tool scope
- explicit evidence packets
This matches the librarian-pass model.
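A minimal sketch of context scoping, assuming context is assembled by a control-plane function before a worker is spawned. `WorkerContext`, `scope_context`, and the 280-character summary cap are illustrative assumptions, not existing APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerContext:
    mission_summary: str
    data: dict
    allowed_tools: frozenset
    evidence: tuple


def scope_context(mission_summary, available_data, granted_scopes,
                  granted_tools, evidence, max_summary_chars=280):
    """Build the only context a worker will see for one task."""
    # Drop every data scope that was not explicitly granted.
    scoped = {k: v for k, v in available_data.items() if k in granted_scopes}
    return WorkerContext(
        mission_summary=mission_summary[:max_summary_chars],
        data=scoped,
        allowed_tools=frozenset(granted_tools),
        evidence=tuple(evidence),
    )
```

The worker never receives the unfiltered `available_data`, so a prompt cannot talk its way into scopes that were not granted.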
### Principle 3: Prefer artifacts over free-form chat
Each stage should emit structured artifacts, not only prose. Examples:
- mission spec
- task graph
- prompt package
- research packet
- library pass
- worker result
- aggregation packet
- review findings
This keeps the system composable and testable.
### Principle 4: Keep the colony stateful, keep workers mostly stateless
The colony needs memory. Individual workers should be cheap to create and destroy. State should live in:
- mission store
- artifact store
- pheromone store
- audit log
- policy and prompt registries
### Principle 5: Governance must be external to worker prompts
Prompt-only safety is not enough. This is where Nemoclaw matters. The worker should not be trusted to self-police. Instead:
- tool permissions must be granted externally
- network egress must be bounded externally
- data writebacks must be validated externally
- model routing should be policy-driven
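One way to picture external governance is a gate that sits between workers and tools. The sketch below is an assumption-laden illustration: `ToolGate`, `PolicyViolation`, and the grant table are hypothetical, and it does not represent Nemoclaw's actual interfaces.

```python
class PolicyViolation(Exception):
    """Raised when a worker requests a tool it was not granted."""


class ToolGate:
    """Checks grants outside the worker, so the prompt cannot override them."""

    def __init__(self, tools, grants):
        self._tools = tools      # tool name -> callable
        self._grants = grants    # task_id -> set of granted tool names
        self.audit_log = []      # (task_id, tool_name, allowed) tuples

    def call(self, task_id, tool_name, *args, **kwargs):
        allowed = (
            tool_name in self._grants.get(task_id, set())
            and tool_name in self._tools
        )
        # Record every attempt, including denied ones, for auditability.
        self.audit_log.append((task_id, tool_name, allowed))
        if not allowed:
            raise PolicyViolation(
                f"task {task_id!r} may not call {tool_name!r}"
            )
        return self._tools[tool_name](*args, **kwargs)
```

Because the check and the audit entry both happen in the gate, a compromised or confused worker can only fail closed.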
### Principle 6: Parallelism should be earned, not assumed
The framework can run tasks in parallel. The product should decide when:
- tasks are independent
- tasks share too much mutable state
- cost or latency budgets require serialization
- review or quorum is needed before downstream work
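The decision about when to parallelize can be sketched with the standard-library `graphlib.TopologicalSorter` (Python 3.9+). The `execution_waves` helper and the `max_parallel` budget knob are assumptions made for this example; Open Multi Agent's own scheduler may work differently.

```python
from graphlib import TopologicalSorter


def execution_waves(depends_on, max_parallel=None):
    """Group tasks into waves; tasks inside one wave have no mutual dependencies.

    depends_on maps task_id -> iterable of task_ids it must wait for.
    max_parallel caps wave size, e.g. to respect a cost or latency budget.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        size = max_parallel or len(ready)
        # Split the ready set into chunks no larger than the budget allows.
        for start in range(0, len(ready), size):
            waves.append(ready[start:start + size])
        sorter.done(*ready)
    return waves


graph = {
    "plan": [],
    "research": ["plan"],
    "library": ["plan"],
    "write": ["research", "library"],
}
print(execution_waves(graph))
print(execution_waves(graph, max_parallel=1))
```

Setting `max_parallel=1` serializes the same graph without changing its dependency order, which is the kind of product-level decision the kernel should not make on its own.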
### Principle 7: Review should target risk, not everything equally
Not every mission needs the same rigor. The system should escalate review by:
- financial risk
- user-facing impact
- external actuation
- writeback side effects
- low-confidence evidence
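Risk-targeted review could be expressed as a small routing function. The tier names, the escalation rule, and the `confidence_floor` default below are illustrative assumptions and would need tuning against real missions.

```python
def review_tier(financial_risk, user_facing, external_actuation,
                writes_back, min_evidence_confidence,
                confidence_floor=0.6):
    """Return 'light', 'standard', or 'escalated' for one mission.

    The thresholds are placeholders, not calibrated values.
    """
    # External actions, or writebacks that carry financial risk, always escalate.
    if external_actuation or (writes_back and financial_risk):
        return "escalated"
    flags = sum([
        bool(financial_risk),
        bool(user_facing),
        bool(writes_back),
        min_evidence_confidence < confidence_floor,
    ])
    return "standard" if flags >= 1 else "light"
```

An internal, read-only mission with strong evidence gets a light pass, while anything that acts on the outside world goes through the heavier path.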
## Proposed Functional Architecture and Key Roles
```mermaid
flowchart TD
U[User or Product Surface] --> Q[Queen Interface]
Q --> M[Mission Normalizer]
M --> P[Planner Ant]
P --> PM[Prompt Master Ant]
P --> R[Researcher Ant]
P --> L[Librarian Ant]
PM --> W1[Worker Ants]
R --> W1
L --> W1
W1 --> A[Aggregator Ant]
A --> V[Reviewer Ant]
V --> Q
Q --> O[User or Calling Product]
G[Governance and Nemoclaw Guard Plane] --> P
G --> PM
G --> W1
G --> V
S[Pheromone and Artifact Stores] --> P
S --> PM
S --> L
S --> A
```
### Layer 1: Colony Kernel
This is the forked Open Multi Agent core, extended but still recognizable:
- mission coordinator
- task DAG compiler
- scheduler
- agent pool
- tool dispatcher
- short-lived memory and message primitives
### Layer 2: Colony Control Plane
This is the new Project Velocity layer that Open Multi Agent does not provide:
- mission registry
- artifact schemas
- prompt registry
- policy engine
- librarian and research services
- pheromone scoring
- review gates
- observability
### Layer 3: Nemoclaw Guard Plane
This layer governs:
- model routing
- network permissions
- tool access policies
- filesystem access
- human approvals for sensitive actions
- audit logs
### Layer 4: Velocity Domain Adapters
Domain adapters map colony artifacts to actual product surfaces:
- Oracle adapter
- CRM adapter
- Catalyst adapter
- Sentinel adapter
- Dream Weaver adapter
## Data Model and Interfaces
The system should be artifact-native. Core objects should include:
### Mission Envelope
Fields:
- `mission_id`
- `origin_surface`
- `user_goal`
- `normalized_goal`
- `risk_level`
- `time_budget_ms`
- `token_budget`
- `sensitivity_class`
- `requested_outputs`
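A possible typed form of the mission envelope, shown as a sketch. The field names come from the list above; the field types, the `RISK_LEVELS` vocabulary, and the validation rules are assumptions.

```python
import json
from dataclasses import asdict, dataclass, field

RISK_LEVELS = ("low", "medium", "high")  # assumed vocabulary


@dataclass
class MissionEnvelope:
    mission_id: str
    origin_surface: str
    user_goal: str
    normalized_goal: str
    risk_level: str
    time_budget_ms: int
    token_budget: int
    sensitivity_class: str
    requested_outputs: list = field(default_factory=list)

    def __post_init__(self):
        # Reject malformed envelopes at the boundary, before planning starts.
        if self.risk_level not in RISK_LEVELS:
            raise ValueError(f"unknown risk_level: {self.risk_level!r}")
        if self.time_budget_ms <= 0 or self.token_budget <= 0:
            raise ValueError("budgets must be positive")

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "MissionEnvelope":
        return cls(**json.loads(raw))
```

Round-tripping through JSON keeps the envelope usable across a service boundary such as one between a TypeScript kernel and the FastAPI root.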
### Task Graph
Fields:
- `task_id`
- `mission_id`
- `role_type`
- `objective`
- `depends_on`
- `required_capabilities`
- `allowed_tools`
- `required_data_scopes`
- `success_criteria`
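The task graph fields lend themselves to a structural check before scheduling. In this sketch the `TaskNode` field types and the `validate_task_graph` helper are assumptions; it checks references only and leaves cycle detection to the scheduler.

```python
from dataclasses import dataclass, field


@dataclass
class TaskNode:
    task_id: str
    mission_id: str
    role_type: str
    objective: str
    depends_on: list = field(default_factory=list)
    required_capabilities: list = field(default_factory=list)
    allowed_tools: list = field(default_factory=list)
    required_data_scopes: list = field(default_factory=list)
    success_criteria: list = field(default_factory=list)


def validate_task_graph(tasks):
    """Return a list of problems; an empty list means references are sound."""
    problems = []
    ids = [t.task_id for t in tasks]
    known = set(ids)
    if len(known) != len(ids):
        problems.append("duplicate task_id")
    if len({t.mission_id for t in tasks}) > 1:
        problems.append("tasks span multiple missions")
    for t in tasks:
        for dep in t.depends_on:
            if dep == t.task_id:
                problems.append(f"{t.task_id} depends on itself")
            elif dep not in known:
                problems.append(f"{t.task_id} depends on unknown task {dep}")
    return problems
```

Returning a list of problems instead of raising on the first one gives the planner a complete picture to repair in a single pass.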
### Prompt Package
Fields:
- `prompt_package_id`
- `task_id`
- `role_prompt`
- `template_id`
- `template_version`
- `examples_used`
- `constraints`
- `tool_instructions`
### Research Artifact
Fields:
- `artifact_id`
- `task_id`
- `source_url`
- `source_type`
- `retrieved_at`
- `summary`
- `snippet`
- `citation_confidence`
### Librarian Pass
Fields:
- `pass_id`
- `worker_task_id`
- `allowed_indexes`
- `allowed_entities`
- `recommended_paths`
- `cached_previews`
- `expiry`
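A librarian pass can be treated as a short-lived capability. The sketch below assumes UTC timestamps and set-based scopes; the `permits` method and the `issue_pass` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class LibrarianPass:
    pass_id: str
    worker_task_id: str
    allowed_indexes: frozenset
    allowed_entities: frozenset
    expiry: datetime
    recommended_paths: tuple = ()
    cached_previews: tuple = ()

    def permits(self, task_id, index, entity, now=None):
        """True only for the named task, inside scope, before expiry."""
        now = now or datetime.now(timezone.utc)
        return (
            task_id == self.worker_task_id
            and now < self.expiry
            and index in self.allowed_indexes
            and entity in self.allowed_entities
        )


def issue_pass(pass_id, task_id, indexes, entities, ttl_minutes=10, now=None):
    """Create a pass that expires ttl_minutes after issue time."""
    now = now or datetime.now(timezone.utc)
    return LibrarianPass(
        pass_id=pass_id,
        worker_task_id=task_id,
        allowed_indexes=frozenset(indexes),
        allowed_entities=frozenset(entities),
        expiry=now + timedelta(minutes=ttl_minutes),
    )
```

Binding the pass to one `worker_task_id` and a short expiry means a leaked pass is of little use to any other worker.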
### Worker Result
Fields:
- `result_id`
- `task_id`
- `output`
- `structured_output`
- `citations`
- `confidence`
- `tool_trace`
- `cost`
- `duration_ms`
### Aggregation Packet
Fields:
- `mission_id`
- `summary`
- `evidence_matrix`
- `contradictions`
- `coverage_gaps`
- `recommended_answer`
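Two of the aggregation fields, `coverage_gaps` and `contradictions`, can be derived mechanically from worker results. This sketch assumes each result is a dict with `task_id` and a `structured_output` dict; it fills only those two fields and leaves `summary`, `evidence_matrix`, and `recommended_answer` to the aggregator itself.

```python
def build_aggregation_packet(mission_id, expected_task_ids, results):
    """Derive coverage gaps and key-level contradictions from worker results."""
    seen = {r["task_id"] for r in results}
    coverage_gaps = sorted(set(expected_task_ids) - seen)

    # For each structured key, record which tasks claimed which value.
    # repr() is used so unhashable values (lists, dicts) can still be grouped.
    claims = {}
    for r in results:
        for key, value in r.get("structured_output", {}).items():
            claims.setdefault(key, {}).setdefault(repr(value), []).append(
                r["task_id"]
            )

    contradictions = [
        {"key": key, "claims": by_value}
        for key, by_value in sorted(claims.items())
        if len(by_value) > 1  # more than one distinct value for the same key
    ]
    return {
        "mission_id": mission_id,
        "coverage_gaps": coverage_gaps,
        "contradictions": contradictions,
    }
```

Exact-match comparison is deliberately naive; a real aggregator would likely need tolerance rules for numeric and free-text fields.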
### Review Packet
Fields:
- `mission_id`
- `review_status`
- `issues`
- `required_edits`
- `approved_output`
## Interaction and Workflow Description
### Standard Mission Flow
```mermaid
flowchart LR
A[Goal Submitted] --> B[Mission Normalization]
B --> C[Task Graph Planning]
C --> D[Prompt Package Generation]
D --> E[Research and Librarian Routing]
E --> F[Worker Execution]
F --> G[Aggregation]
G --> H[Single Review Pass]
H --> I[Final Delivery]
```
### Internal Coordination Model
The planner creates the mission map. The prompt master transforms tasks into execution-ready prompt packages. The librarian prepares scoped passes. The researcher gathers external evidence. Workers consume these inputs and produce worker results. The aggregator converts worker results into one coherent answer. The reviewer performs one bounded audit pass. The queen delivers the result.
This flow is clearer and safer than a generic "one coordinator, many workers, final answer" pipeline because it prevents prompt design, evidence gathering, data routing, and synthesis from collapsing into one role.
## Improvements and Recommendations
### 1. Do not make the queen a single execution bottleneck
Your instinct is correct, but the queen should be the mission owner, not the main reasoner. Use the queen as interface and final sign-off identity. Use planner plus prompt master beneath it.
### 2. Split external research from internal retrieval
Researcher and librarian should remain separate. External evidence and internal knowledge have different trust models, latencies, and security constraints.
### 3. Introduce an immune system role
The metaphor needs one more biological component: an immune role. This should be the policy and anomaly layer that spots:
- tool misuse
- prompt injection
- suspicious source behavior
- uncontrolled recursion
- invalid writeback attempts
This can be implemented partly through Nemoclaw and partly through colony review logic.
### 4. Use pheromone decay
Do not keep every success signal forever. Prompt and route heuristics should decay over time unless reinforced by fresh outcomes. Otherwise the system will overfit stale patterns.
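Pheromone decay can be modeled as exponential decay with a half-life, where reinforcement adds to whatever strength remains. The `PheromoneTrail` class and the half-life parameter are assumptions for illustration; timestamps are plain seconds so the caller controls the clock.

```python
class PheromoneTrail:
    """Signal strengths that halve every half_life_seconds unless reinforced."""

    def __init__(self, half_life_seconds):
        if half_life_seconds <= 0:
            raise ValueError("half_life_seconds must be positive")
        self.half_life = half_life_seconds
        self._state = {}  # key -> (strength at stamp time, stamp time)

    def strength(self, key, now):
        """Current strength after decay; unknown keys have strength 0."""
        if key not in self._state:
            return 0.0
        value, stamped_at = self._state[key]
        return value * 0.5 ** ((now - stamped_at) / self.half_life)

    def reinforce(self, key, amount, now):
        """Add a fresh outcome on top of the decayed remainder."""
        self._state[key] = (self.strength(key, now) + amount, now)
```

With a one-hour half-life, a template that succeeded once and was never used again drops to half strength after an hour and a quarter after two, so stale routes fade without manual cleanup.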
### 5. Add quorum mode for sensitive operations
For external actions or high-value CRM writebacks, require either:
- reviewer approval
- human approval
- dual-agent agreement
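The quorum rule reduces to a small predicate. In this sketch, "dual-agent agreement" is interpreted as at least two independent agents voting, all of them approving; that interpretation and the `quorum_met` name are assumptions.

```python
def quorum_met(reviewer_approved=False, human_approved=False, agent_votes=None):
    """Return True if any one of the three approval paths is satisfied.

    agent_votes maps agent_id -> bool (True means approve).
    """
    votes = agent_votes or {}
    # Assumed rule: two or more agents voted and none of them rejected.
    dual_agent = len(votes) >= 2 and all(votes.values())
    return bool(reviewer_approved or human_approved or dual_agent)
```

A single dissenting agent blocks the dual-agent path, which pushes the decision to a reviewer or a human instead of letting a majority override it.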
### 6. Keep a strict artifact schema
The more the colony relies on free-form prose between agents, the less governable it becomes. Use JSON artifacts internally and render prose only at the delivery layer.
### 7. Make Project Velocity domains first-class specializations
The colony should not remain generic for too long. It should gain specialized ant classes for:
- Oracle composition and visualization
- CRM and lead operations
- Catalyst marketing planning
- Sentinel session interpretation
- Dream Weaver asset workflow dispatch
### 8. Refuse to give all context to every ant
Giving every ant the full context feels powerful but is structurally wrong. It increases cost, hallucination risk, leakage risk, and reasoning confusion. The librarian-pass model is the correct design.
### 9. Treat prompt engineering as a subsystem, not a single prompt field
The prompt master should have:
- prompt templates
- prompt exemplars
- evaluation traces
- role-specific quality metrics
This becomes a product asset, not just a string builder.
### 10. Keep legal provenance clean
Use public code, public docs, and internal original work only. Do not build from leaked Claude framework material.
## Migration and Fork Strategy
### What to fork from Open Multi Agent
Fork and extend:
- orchestrator model
- task queue
- scheduler
- agent pool
- tool runner
- MCP connectivity
### What not to inherit blindly
Do not inherit its current limitations as product assumptions:
- no persistence
- no checkpointing
- no colony-level review layer
- no domain adapters
- no policy-first execution boundaries
### What to absorb from Nemoclaw
Absorb conceptually and through current root-owned code:
- governance outside worker prompts
- policy-controlled egress
- model routing by sensitivity
- auditability
- writeback controls
### Proposed Runtime Boundary
Recommended shape:
- TypeScript colony service forks Open Multi Agent and becomes the colony kernel
- FastAPI remains source of truth for product data and domain APIs
- Python Nemoclaw-root helpers remain available for Oracle, MCP, and workflow planning
- the colony talks to FastAPI through internal service contracts, not by bypassing the root
## What This Means for Project Velocity
This colony layer is valuable because Project Velocity already has the right pieces but not yet the right orchestration coherence.
Today the product has:
- Oracle reasoning surfaces
- CRM data and writeback paths
- Sentinel scoring and live sessions
- Catalyst execution surfaces
- MCP and Nemoclaw append layers
What it lacks is a disciplined colony runtime that can coordinate these without turning every subsystem into a bespoke prompt chain.
The correct next move is to define the system formally, then build the minimal colony kernel that can support a small number of real mission classes first.
## Recommended Initial Mission Classes
The first implementation should focus on three mission types:
1. **Oracle advisory missions.** These already align with prompt orchestration, structured output, and writeback planning.
2. **CRM intelligence missions.** Lead research, prioritization, messaging strategy, and action recommendation are ideal for multi-agent decomposition.
3. **Catalyst marketing missions.** Campaign research, audience strategy, creative ideation, and review are naturally multi-role.
Sentinel should integrate as an evidence provider before becoming a fully autonomous mission class.
## Bottom Line
Yes, the idea makes sense.
The strongest version of it is not "Jarvis with many ants". The strongest version is:
- a mission-oriented colony kernel
- a prompt-master subsystem
- a librarian routing subsystem
- a research subsystem
- bounded worker execution
- an aggregation and review pair
- a Nemoclaw-governed policy shell
- clean artifact schemas and domain adapters for Project Velocity
That is the right abstraction to pursue.