Files

sagnik 6cdc366718 feat: Oracle Canvas, Revision History and Canvas Sharing (#33 )

Co-authored-by: Sagnik <sagnik7896@gmail.com>
Reviewed-on: #33

2026-04-23 01:20:21 +05:30

23 KiB

Raw Permalink Blame History

Project Velocity — Truthbook

What this is: The single source of truth for Project Velocity. If it's written down here, it's how the system works — not how someone hoped it would work.

What Is Project Velocity
Quick Start
Architecture Overview
Runtime Truth
Team Setup
GPU & Model Runtime
Infrastructure
Runbooks
API Reference
Contributing

What Is Project Velocity

Project Velocity is a multi-agent AI development platform. It orchestrates intelligent agents (powered by Qwen 3.6 35B A3B and other models) to collaborate on software engineering tasks — code generation, review, testing, deployment — as a coordinated team rather than isolated tools.

Why it exists: Single-agent coding tools hit a ceiling. They lack context persistence, cross-task coordination, and operational reliability. Velocity solves this by:

Multi-agent collaboration — Agents communicate via WebSocket channels and shared memory
Persistent state — PostgreSQL backs user data, CRM records, and agent memory
GPU-accelerated inference — Local Ollama runtime on NVIDIA GPU hardware
Role-based access control — Admin and standard user tiers with avatar support
Live event broadcasting — Real-time campaign and catalyst events via WebSocket

Core stack:

Layer	Technology
Backend API	Python / FastAPI
Database	PostgreSQL (via `databases` library with connection pooling)
Frontend	React 19 + TypeScript + Vite + Tailwind CSS + Framer Motion
Inference	Ollama (Qwen 3.6 35B A3B primary model)
Real-time	WebSocket (Catalyst channel, CRM channel)
Deployment	systemd services on Linux with NVIDIA GPU

Quick Start

Prerequisites

GPU Machine: NVIDIA GPU with sufficient VRAM (≥16GB recommended for Qwen 3.6 35B A3B)
NVMe Storage: For model weights and cache
Linux OS: Ubuntu 22.04+ or equivalent
Python 3.11+: Backend runtime
Node.js 18+: Frontend build
Ollama: Latest stable with Qwen 3.6 35B A3B model pulled
PostgreSQL 15+: Database backend

One-Line Bootstrap

bash bootstrap/setup.sh

This script handles:

GPU driver verification
Ollama installation and model pull
PostgreSQL setup
Backend dependency installation
Frontend dependency installation
systemd service creation

Manual Setup

1. GPU & Ollama

# Verify GPU
nvidia-smi

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull the primary model
ollama pull qwen3.6:35b-a3b

# Verify model is loaded
curl http://localhost:11434/api/tags | jq '.models[] | select(.name == "qwen3.6:35b-a3b")'

2. Database

# Start PostgreSQL
sudo systemctl start postgresql

# Create database and user
psql -U postgres -c "CREATE DATABASE velocity;"
psql -U postgres -c "CREATE USER velocity WITH PASSWORD 'secure_password';"
psql -U postgres -c "GRANT ALL PRIVILEGES ON DATABASE velocity TO velocity;"

3. Backend

cd Project_Velocity/backend

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your database credentials and secrets

# Run migrations
python migrate.py

# Start server
uvicorn main:app --host 0.0.0.0 --port 8000

4. Frontend

cd Project_Velocity/app

# Install dependencies
npm install

# Start dev server
npm run dev

Frontend is now available at http://localhost:5173.

5. Verify Everything

# Backend health
curl http://localhost:8000/health

# Model availability
curl http://localhost:11434/api/tags

# Frontend
open http://localhost:5173

Architecture Overview

System Diagram

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│   React UI  │────▶│  FastAPI     │────▶│  PostgreSQL │
│  (Port 5173)│◀────│  (Port 8000) │◀────│  (Port 5432)│
└─────────────┘     └──────┬───────┘     └─────────────┘
                           │
                           ▼
                    ┌──────────────┐
                    │   Ollama     │
                    │ (Port 11434) │
                    │ Qwen 3.6 35B │
                    └──────────────┘
                           │
                           ▼
                    ┌──────────────┐
                    │  NVIDIA GPU  │
                    └──────────────┘

Component Breakdown

Backend (`backend/`)

main.py — FastAPI application with:

Auth system — Login, profile lookup, user listing, avatar upload
WebSocket managers — _CatalystManager() and _CRMManager() for real-time event broadcasting
Connection pooling — PostgreSQL via databases library with async context management
Lifespan hooks — lifespan() initializes and cleans up resources

Key endpoints:

Endpoint	Method	Purpose
`/api/auth/login`	POST	Authenticate user
`/api/auth/me`	GET	Get current user profile
`/api/auth/users`	GET	List all users (admin)
`/api/auth/profile/avatar`	POST	Upload profile avatar
`/ws/catalyst`	WS	Catalyst event channel
`/ws/crm`	WS	CRM event channel
`/health`	GET	Health check

Frontend (`app/`)

App.tsx — React application with:

Protected routes — ProtectedRoute() wraps authenticated paths
Route module sync — RouteModuleSync() handles dynamic route loading
Main layout — MainLayout() provides chrome (header, sidebar, content area)
Role rendering — formatRoleLabel() converts role codes to display labels
Auth state management — Dual useEffect hooks handle token persistence and user fetch

Agent Context (`.Agent Context/`)

Documents that define how agents operate within Velocity:

Qwen 3.6 35B A3B Ollama Access, Recovery, and Team Setup.md — Model runtime, recovery policies, team onboarding
README.md — This file

Infrastructure (`.Infrastructure/`)

Deployment and operational documentation:

systemd unit files for backend, frontend, Ollama services
Network configuration and ingress rules
Monitoring and alerting setup

Runtime Truth

What "Works" Means in Velocity

Velocity has three runtime layers, each with different failure modes:

Layer A: Fast Runtime Recovery

If the API crashes or restarts:

PostgreSQL connection pool rebuilds automatically via lifespan()
WebSocket managers reinitialize and accept new connections
No data loss — all state is in PostgreSQL

Layer B: Model Rehydration Recovery

If Ollama loses the Qwen model:

Watchdog systemd unit detects absence via /api/tags
Auto-registers model from NVMe cache or S3 artifact storage
Production requirement: Same-run auto-hydration logic must complete before any agent request

Layer C: Full System Recovery

If everything goes down:

PostgreSQL recovers WAL logs
Ollama watchdog restores model
Backend systemd unit restarts API
Frontend rebuilds if artifacts are corrupted

Critical Contracts

Auth contract:

Client → POST /api/auth/login {email, password}
       → 200 OK {token, user}
       
Client → GET /api/auth/me (Authorization: Bearer <token>)
       → 200 OK {id, email, role, avatar_url}
       → 401 Unauthorized

WebSocket contract:

Client → WS /ws/catalyst
       → Accepts live events: {event_type, campaign_name, value, timestamp}

Client → WS /ws/crm
       → Accepts CRM events: {type, payload, timestamp}

Model contract:

Ollama → GET /api/tags returns qwen3.6:35b-a3b
       → Context window: 131072 tokens
       → Provider: OpenAI-compatible interface at http://localhost:11434/v1

Team Setup

Developer Onboarding

1. Clone & Bootstrap

git clone <repo-url>
cd Project_Velocity
bash bootstrap/setup.sh

2. VS Code / Roo Code Configuration

Edit .vscode/settings.json:

{
  "roo-cline.provider": "openai-compatible",
  "roo-cline.baseUrl": "http://localhost:11434/v1",
  "roo-cline.modelId": "qwen3.6:35b-a3b",
  "roo-cline.contextWindow": 131072,
  "roo-cline.temperature": 0.7
}

3. Verify Team Access

# Backend health
curl http://localhost:8000/health
# Expected: {"status": "ok"}

# Model loaded
curl http://localhost:11434/api/tags | jq -r '.models[].name'
# Expected: qwen3.6:35b-a3b

# Frontend
open http://localhost:5173
# Expected: Login screen

Role Definitions

Role	Access Level	Can Do
`admin`	Full	User management, system config, agent orchestration
`developer`	Standard	Code generation, review, testing
`viewer`	Read-only	Dashboard, campaign monitoring

Performance Expectations

Scenario	Tokens/sec	Latency
Single-stream (local GPU)	~80-120 tok/s	~200ms first token
Two concurrent requests	~60-90 tok/s each	~300ms first token
Four-way batch	~40-60 tok/s each	~500ms first token

Numbers vary by GPU hardware. Measure your setup.

GPU & Model Runtime

Hardware Requirements

Component	Minimum	Recommended
GPU VRAM	16GB	24GB+
GPU Compute	Turing architecture	Ada Lovelace / Hopper
NVMe Storage	50GB free	100GB+ NVMe Gen4
RAM	32GB	64GB+

Ollama Watchdog

The watchdog is a systemd-managed service that ensures the Qwen model stays loaded:

Location: .Infrastructure/systemd/ollama-watchdog.service

Behavior:

Every 60 seconds, queries http://localhost:11434/api/tags
If qwen3.6:35b-a3b is absent, triggers rehydration
Rehydration priority: NVMe cache → S3 artifact → remote pull
Logs all actions to journalctl

Manual watchdog check:

sudo systemctl status ollama-watchdog
journalctl -u ollama-watchdog --since "1 hour ago"

Model Hydration Strategies

Strategy	Speed	Use Case
NVMe local registration	~2 seconds	Primary recovery path
Local manifest `ollama create`	~5 seconds	Fresh hydration from extracted weights
S3 cold hydrate	~60-300 seconds	No local cache available

Critical: What Watchdog Must NOT Do

❌ Delete model layers during recovery
❌ Modify GPU memory directly
❌ Block agent requests during hydration (graceful degradation only)
❌ Restart Ollama process unless absolutely necessary

Infrastructure

Deployment Topology

┌─────────────────────────────────────────────────┐
│                  Production Host                 │
│                                                  │
│  ┌──────────┐  ┌──────────┐  ┌──────────────┐  │
│  │ Backend  │  │ Frontend │  │   Ollama     │  │
│  │ :8000    │  │ :5173    │  │  :11434      │  │
│  │ systemd  │  │ nginx    │  │  systemd     │  │
│  └────┬─────┘  └────┬─────┘  └──────┬───────┘  │
│       │             │               │           │
│       └─────────────┴───────────────┘           │
│                         │                        │
│                  ┌──────▼───────┐               │
│                  │  PostgreSQL  │               │
│                  │   :5432      │               │
│                  │  systemd     │               │
│                  └──────────────┘               │
│                                                  │
│  ┌──────────────────────────────────────────┐    │
│  │        NVIDIA GPU (CUDA + TensorRT)      │    │
│  └──────────────────────────────────────────┘    │
└─────────────────────────────────────────────────┘

systemd Services

Service	File	Restart Policy
Backend API	`velocity-backend.service`	always
Frontend (nginx)	`velocity-frontend.service`	always
Ollama	`ollama.service`	on-failure
Watchdog	`ollama-watchdog.service`	always
PostgreSQL	`postgresql.service`	on-failure

Network Rules

Port	Protocol	Service	External Access
80	HTTP	nginx → frontend	Yes (public)
443	HTTPS	nginx → frontend	Yes (public)
8000	TCP	FastAPI backend	No (internal only)
5173	TCP	Vite dev server	No (dev only)
5432	TCP	PostgreSQL	No (internal only)
11434	TCP	Ollama API	No (internal only)

Monitoring

# All service health
systemctl status velocity-backend ollama postgresql

# GPU utilization
nvidia-smi -l 1

# Model inference logs
journalctl -u ollama -f

# API error rate
curl -s http://localhost:8000/health | jq .

Runbooks

Runbook: Backend Crashes at 2 AM

Symptom: Frontend shows 500 errors on API calls.

Steps:

# 1. Check backend status
sudo systemctl status velocity-backend
# Expected: active (running)

# 2. If stopped, restart
sudo systemctl restart velocity-backend

# 3. Check logs for root cause
sudo journalctl -u velocity-backend --since "30 minutes ago" --no-pager

# 4. Verify recovery
curl http://localhost:8000/health
# Expected: {"status": "ok"}

# 5. If crash repeats, check database connectivity
psql -U velocity -d velocity -c "SELECT 1;"
# Expected: 1

If still broken:

Check disk space: df -h /
Check memory: free -h
Check PostgreSQL: sudo systemctl status postgresql
Escalate with logs from step 3

Runbook: Ollama Model Disappeared

Symptom: Agents return empty responses or errors.

Steps:

# 1. Check if Ollama is running
sudo systemctl status ollama
# Expected: active (running)

# 2. Check loaded models
curl http://localhost:11434/api/tags | jq '.models[].name'
# Expected: qwen3.6:35b-a3b

# 3. If model is missing, check watchdog
sudo systemctl status ollama-watchdog
journalctl -u ollama-watchdog --since "1 hour ago" --no-pager

# 4. Manual recovery if watchdog failed
ollama pull qwen3.6:35b-a3b

# 5. Verify model is usable
curl http://localhost:11434/api/generate -d '{
  "model": "qwen3.6:35b-a3b",
  "prompt": "Hello",
  "stream": false
}' | jq .done
# Expected: true

Runbook: Database Connection Failures

Symptom: Backend logs show connection refused or pool exhausted.

Steps:

# 1. Check PostgreSQL status
sudo systemctl status postgresql
# Expected: active (running)

# 2. Check connection count
psql -U postgres -c "SELECT count(*) FROM pg_stat_activity;"
# Should be < max_connections (default 100)

# 3. Check disk space for WAL files
df -h /var/lib/postgresql

# 4. Restart if hung
sudo systemctl restart postgresql

# 5. Verify backend reconnects
sudo journalctl -u velocity-backend --since "1 minute ago" | grep -i "connected\|error"

Runbook: GPU Memory Exhaustion

Symptom: Ollama returns out of memory errors.

Steps:

# 1. Check current GPU usage
nvidia-smi
# Note: PID, memory usage, temperature

# 2. Kill non-essential GPU processes if needed
nvidia-smi --id=0 --query-compute-apps=pid,name,used_memory --format=csv
kill <PID>

# 3. Check Ollama memory allocation
ollama show qwen3.6:35b-a3b | grep -i "layer\|memory"

# 4. If still exhausted, reduce model quantization
ollama pull qwen3.6:35b-a3b-q4_0

# 5. Monitor recovery
watch -n 1 nvidia-smi

API Reference

Auth Endpoints

`POST /api/auth/login`

Authenticate a user and receive a JWT token.

Request:

{
  "email": "user@example.com",
  "password": "secure_password"
}

Response (200 OK):

{
  "token": "eyJhbGciOiJIUzI1NiIs...",
  "user": {
    "id": "uuid-here",
    "email": "user@example.com",
    "role": "developer",
    "avatar_url": null
  }
}

Errors:

Status	Meaning
401	Invalid credentials
422	Malformed request body

`GET /api/auth/me`

Get the current authenticated user's profile.

Headers:

Authorization: Bearer <token>

Response (200 OK):

{
  "id": "uuid-here",
  "email": "user@example.com",
  "role": "developer",
  "avatar_url": "https://cdn.example.com/avatars/user.png"
}

Errors:

Status	Meaning
401	Token missing or invalid
403	Token expired

`GET /api/auth/users`

List all users in the system. Admin only.

Headers:

Authorization: Bearer <admin_token>

Response (200 OK):

[
  {
    "id": "uuid-1",
    "email": "admin@example.com",
    "role": "admin",
    "avatar_url": null
  },
  {
    "id": "uuid-2",
    "email": "dev@example.com",
    "role": "developer",
    "avatar_url": "https://cdn.example.com/avatars/dev.png"
  }
]

Errors:

Status	Meaning
403	User is not admin

`POST /api/auth/profile/avatar`

Upload a profile avatar image.

Headers:

Authorization: Bearer <token>
Content-Type: multipart/form-data

Form Data:

Field	Type	Required
avatar	file (image/jpeg, image/png)	Yes

Response (200 OK):

{
  "avatar_url": "https://cdn.example.com/avatars/new-avatar.png"
}

Errors:

Status	Meaning
401	Not authenticated
422	Invalid file type or size > 5MB

WebSocket Endpoints

`WS /ws/catalyst`

Real-time channel for Catalyst events (agent coordination, task updates).

Connection:

const ws = new WebSocket('ws://localhost:8000/ws/catalyst');
ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  console.log(data.event_type, data.campaign_name, data.value);
};

Event Format:

{
  "event_type": "task_complete",
  "campaign_name": "codegen-sprint-42",
  "value": 0.97,
  "timestamp": "2026-04-21T16:00:00Z"
}

`WS /ws/crm`

Real-time channel for CRM events (customer interactions, lead updates).

Connection:

const ws = new WebSocket('ws://localhost:8000/ws/crm');
ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  console.log(data.type, data.payload);
};

Event Format:

{
  "type": "lead_created",
  "payload": {
    "id": "crm-uuid",
    "name": "Acme Corp",
    "status": "new"
  },
  "timestamp": "2026-04-21T16:00:00Z"
}

Health Check

`GET /health`

Verify system health.

Response (200 OK):

{
  "status": "ok",
  "database": "connected",
  "ollama": "available",
  "gpu": "present"
}

Contributing

Code Structure

Project_Velocity/
├── .Agent Context/          # Agent documentation, model specs
├── .Infrastructure/         # Deployment configs, systemd units
├── backend/                 # FastAPI backend
│   ├── main.py              # Application entry point
│   ├── requirements.txt     # Python dependencies
│   └── migrate.py           # Database migrations
├── app/                     # React frontend
│   ├── src/
│   │   ├── App.tsx          # Root component
│   │   └── ...              # Components, routes, utils
│   ├── package.json         # Node dependencies
│   └── vite.config.ts       # Build config
├── bootstrap/               # Setup scripts
│   └── setup.sh             # One-line bootstrap
└── README.md                # This file

Making a Contribution

Fork and branch

git checkout -b feature/your-feature-name

Make changes
- Backend: Follow FastAPI conventions, add type hints
- Frontend: Follow React + TypeScript patterns, use existing components
- Docs: Update this README if behavior changes

Test locally

# Backend tests
cd backend && pytest

# Frontend checks
cd app && npm run build

Submit PR
- Title: Clear, action-oriented
- Description: What + Why + How to test
- Link any related issues

Documentation Standards

Every endpoint: Document inputs, outputs, errors
Every component: JSDoc for public APIs
Every runbook: Write as if for on-call at 2am
Every decision: Record in DECISIONS.md with rationale

Appendix

A. Environment Variables

Variable	Required	Description
`DATABASE_URL`	Yes	PostgreSQL connection string
`SECRET_KEY`	Yes	JWT signing key
`OLLAMA_BASE_URL`	No	Ollama API URL (default: `http://localhost:11434`)
`GPU_ENABLED`	No	Enable GPU path (default: `true`)
`LOG_LEVEL`	No	Logging level (default: `INFO`)

B. Troubleshooting Matrix

Symptom	Likely Cause	Fix
Frontend blank screen	Backend down	`curl http://localhost:8000/health`
401 on all calls	Token expired	Re-login
Agent returns empty	Model unloaded	`ollama pull qwen3.6:35b-a3b`
Slow responses	GPU not used	Check `nvidia-smi`, verify CUDA
Database errors	Pool exhausted	Check `max_connections`, restart backend
WebSocket disconnects	Network issue	Check firewall, reverse proxy config

C. Useful Commands Cheat Sheet

# Full system status
systemctl status velocity-backend ollama postgresql ollama-watchdog

# GPU实时监控
watch -n 1 nvidia-smi

# Model check
curl http://localhost:11434/api/tags | jq '.models[].name'

# API health
curl -s http://localhost:8000/health | jq .

# Database connection test
psql -U velocity -d velocity -c "SELECT version();"

# Frontend rebuild
cd app && npm run build && cp -r dist/* ../nginx/html/

# Restart everything (nuclear option)
sudo systemctl restart velocity-backend ollama postgresql

Last verified: 2026-04-21 Maintained by: Velocity Team If this doc is wrong, the system is broken. Fix the doc first.

23 KiB Raw Permalink Blame History

Project Velocity — Truthbook

Table of Contents

What Is Project Velocity

Quick Start

Prerequisites

One-Line Bootstrap

Manual Setup

1. GPU & Ollama

2. Database

3. Backend

4. Frontend

5. Verify Everything

Architecture Overview

System Diagram

Component Breakdown

Backend (backend/)

Frontend (app/)

Agent Context (.Agent Context/)

Infrastructure (.Infrastructure/)

Runtime Truth

What "Works" Means in Velocity

Layer A: Fast Runtime Recovery

Layer B: Model Rehydration Recovery

Layer C: Full System Recovery

Critical Contracts

Team Setup

Developer Onboarding

1. Clone & Bootstrap

2. VS Code / Roo Code Configuration

3. Verify Team Access

Role Definitions

Performance Expectations

GPU & Model Runtime

Hardware Requirements

Ollama Watchdog

Model Hydration Strategies

Critical: What Watchdog Must NOT Do

Infrastructure

Deployment Topology

systemd Services

Network Rules

Monitoring

Runbooks

Runbook: Backend Crashes at 2 AM

Runbook: Ollama Model Disappeared

Runbook: Database Connection Failures

Runbook: GPU Memory Exhaustion

API Reference

Auth Endpoints

POST /api/auth/login

GET /api/auth/me

GET /api/auth/users

POST /api/auth/profile/avatar

WebSocket Endpoints

WS /ws/catalyst

WS /ws/crm

Health Check

GET /health

Contributing

Code Structure

Making a Contribution

Documentation Standards

Appendix

A. Environment Variables

B. Troubleshooting Matrix

C. Useful Commands Cheat Sheet

23 KiB

Raw Permalink Blame History

Backend (`backend/`)

Frontend (`app/`)

Agent Context (`.Agent Context/`)

Infrastructure (`.Infrastructure/`)

`POST /api/auth/login`

`GET /api/auth/me`

`GET /api/auth/users`

`POST /api/auth/profile/avatar`

`WS /ws/catalyst`

`WS /ws/crm`

`GET /health`