feat: Oracle Canvas, Revision History and Canvas Sharing

2026-04-23 01:19:03 +05:30
parent e519339cc9
commit 527b10cd41
58 changed files with 3187 additions and 705 deletions
--- a/Context/README.md
+++ b/Context/README.md
@@ -0,0 +1,891 @@
+# Project Velocity — Truthbook
+
+> **What this is:** The single source of truth for Project Velocity. If it's written down here, it's how the system works — not how someone hoped it would work.
+
+---
+
+## Table of Contents
+
+1. [What Is Project Velocity](#what-is-project-velocity)
+2. [Quick Start](#quick-start)
+3. [Architecture Overview](#architecture-overview)
+4. [Runtime Truth](#runtime-truth)
+5. [Team Setup](#team-setup)
+6. [GPU & Model Runtime](#gpu--model-runtime)
+7. [Infrastructure](#infrastructure)
+8. [Runbooks](#runbooks)
+9. [API Reference](#api-reference)
+10. [Contributing](#contributing)
+
+---
+
+## What Is Project Velocity
+
+Project Velocity is a multi-agent AI development platform. It orchestrates intelligent agents (powered by Qwen 3.6 35B A3B and other models) to collaborate on software engineering tasks — code generation, review, testing, deployment — as a coordinated team rather than isolated tools.
+
+**Why it exists:** Single-agent coding tools hit a ceiling. They lack context persistence, cross-task coordination, and operational reliability. Velocity solves this by:
+
+- **Multi-agent collaboration** — Agents communicate via WebSocket channels and shared memory
+- **Persistent state** — PostgreSQL backs user data, CRM records, and agent memory
+- **GPU-accelerated inference** — Local Ollama runtime on NVIDIA GPU hardware
+- **Role-based access control** — Admin and standard user tiers with avatar support
+- **Live event broadcasting** — Real-time campaign and catalyst events via WebSocket
+
+**Core stack:**
+
+| Layer | Technology |
+|-------|-----------|
+| Backend API | Python / FastAPI |
+| Database | PostgreSQL (via `databases` library with connection pooling) |
+| Frontend | React 19 + TypeScript + Vite + Tailwind CSS + Framer Motion |
+| Inference | Ollama (Qwen 3.6 35B A3B primary model) |
+| Real-time | WebSocket (Catalyst channel, CRM channel) |
+| Deployment | systemd services on Linux with NVIDIA GPU |
+
+---
+
+## Quick Start
+
+### Prerequisites
+
+- **GPU Machine:** NVIDIA GPU with sufficient VRAM (16GB minimum, 24GB+ recommended for Qwen 3.6 35B A3B)
+- **NVMe Storage:** For model weights and cache
+- **Linux OS:** Ubuntu 22.04+ or equivalent
+- **Python 3.11+:** Backend runtime
+- **Node.js 18+:** Frontend build
+- **Ollama:** Latest stable with Qwen 3.6 35B A3B model pulled
+- **PostgreSQL 15+:** Database backend
+
+### One-Line Bootstrap
+
+```bash
+bash bootstrap/setup.sh
+```
+
+This script handles:
+1. GPU driver verification
+2. Ollama installation and model pull
+3. PostgreSQL setup
+4. Backend dependency installation
+5. Frontend dependency installation
+6. systemd service creation
+
+### Manual Setup
+
+#### 1. GPU & Ollama
+
+```bash
+# Verify GPU
+nvidia-smi
+
+# Install Ollama
+curl -fsSL https://ollama.ai/install.sh | sh
+
+# Pull the primary model
+ollama pull qwen3.6:35b-a3b
+
+# Verify the model is registered locally (/api/tags lists pulled models, not models resident in GPU memory)
+curl http://localhost:11434/api/tags | jq '.models[] | select(.name == "qwen3.6:35b-a3b")'
+```
+
+#### 2. Database
+
+```bash
+# Start PostgreSQL
+sudo systemctl start postgresql
+
+# Create user and database (owning the database grants CREATE on schema public under PostgreSQL 15+)
+psql -U postgres -c "CREATE USER velocity WITH PASSWORD 'secure_password';"
+psql -U postgres -c "CREATE DATABASE velocity OWNER velocity;"
+psql -U postgres -c "GRANT ALL PRIVILEGES ON DATABASE velocity TO velocity;"
+```
+
+#### 3. Backend
+
+```bash
+cd Project_Velocity/backend
+
+# Install dependencies
+pip install -r requirements.txt
+
+# Configure environment
+cp .env.example .env
+# Edit .env with your database credentials and secrets
+
+# Run migrations
+python migrate.py
+
+# Start server
+uvicorn main:app --host 0.0.0.0 --port 8000
+```
+
+#### 4. Frontend
+
+```bash
+cd Project_Velocity/app
+
+# Install dependencies
+npm install
+
+# Start dev server
+npm run dev
+```
+
+Frontend is now available at `http://localhost:5173`.
+
+#### 5. Verify Everything
+
+```bash
+# Backend health
+curl http://localhost:8000/health
+
+# Model availability
+curl http://localhost:11434/api/tags
+
+# Frontend
+xdg-open http://localhost:5173
+```
+
+---
+
+## Architecture Overview
+
+### System Diagram
+
+```
+┌─────────────┐     ┌──────────────┐     ┌─────────────┐
+│   React UI  │────▶│  FastAPI     │────▶│  PostgreSQL │
+│  (Port 5173)│◀────│  (Port 8000) │◀────│  (Port 5432)│
+└─────────────┘     └──────┬───────┘     └─────────────┘
+                           │
+                           ▼
+                    ┌──────────────┐
+                    │   Ollama     │
+                    │ (Port 11434) │
+                    │ Qwen 3.6 35B │
+                    └──────────────┘
+                           │
+                           ▼
+                    ┌──────────────┐
+                    │  NVIDIA GPU  │
+                    └──────────────┘
+```
+
+### Component Breakdown
+
+#### Backend (`backend/`)
+
+[`main.py`](Project_Velocity/backend/main.py) — FastAPI application with:
+
+- **Auth system** — Login, profile lookup, user listing, avatar upload
+- **WebSocket managers** — [`_CatalystManager()`](Project_Velocity/backend/main.py:296) and [`_CRMManager()`](Project_Velocity/backend/main.py:320) for real-time event broadcasting
+- **Connection pooling** — PostgreSQL via `databases` library with async context management
+- **Lifespan hooks** — [`lifespan()`](Project_Velocity/backend/main.py:83) initializes and cleans up resources
+
+Key endpoints:
+
+| Endpoint | Method | Purpose |
+|----------|--------|---------|
+| `/api/auth/login` | POST | Authenticate user |
+| `/api/auth/me` | GET | Get current user profile |
+| `/api/auth/users` | GET | List all users (admin) |
+| `/api/auth/profile/avatar` | POST | Upload profile avatar |
+| `/ws/catalyst` | WS | Catalyst event channel |
+| `/ws/crm` | WS | CRM event channel |
+| `/health` | GET | Health check |
+
+#### Frontend (`app/`)
+
+[`App.tsx`](Project_Velocity/app/src/App.tsx) — React application with:
+
+- **Protected routes** — [`ProtectedRoute()`](Project_Velocity/app/src/App.tsx:66) wraps authenticated paths
+- **Route module sync** — [`RouteModuleSync()`](Project_Velocity/app/src/App.tsx:90) handles dynamic route loading
+- **Main layout** — [`MainLayout()`](Project_Velocity/app/src/App.tsx:90) provides chrome (header, sidebar, content area)
+- **Role rendering** — [`formatRoleLabel()`](Project_Velocity/app/src/App.tsx:379) converts role codes to display labels
+- **Auth state management** — Dual `useEffect` hooks handle token persistence and user fetch
+
+#### Agent Context (`.Agent Context/`)
+
+Documents that define how agents operate within Velocity:
+
+- [`Qwen 3.6 35B A3B Ollama Access, Recovery, and Team Setup.md`](Project_Velocity/.Agent%20Context/Qwen%203.6%2035B%20A3B%20Ollama%20Access,%20Recovery,%20and%20Team%20Setup.md) — Model runtime, recovery policies, team onboarding
+- `README.md` — This file
+
+#### Infrastructure (`.Infrastructure/`)
+
+Deployment and operational documentation:
+
+- systemd unit files for backend, frontend, Ollama services
+- Network configuration and ingress rules
+- Monitoring and alerting setup
+
+---
+
+## Runtime Truth
+
+### What "Works" Means in Velocity
+
+Velocity has three runtime layers, each with different failure modes:
+
+#### Layer A: Fast Runtime Recovery
+
+If the API crashes or restarts:
+- PostgreSQL connection pool rebuilds automatically via [`lifespan()`](Project_Velocity/backend/main.py:83)
+- WebSocket managers reinitialize and accept new connections
+- No data loss — all state is in PostgreSQL
+
+#### Layer B: Model Rehydration Recovery
+
+If Ollama loses the Qwen model:
+- Watchdog systemd unit detects absence via `/api/tags`
+- Auto-registers model from NVMe cache or S3 artifact storage
+- **Production requirement:** Auto-hydration must complete within the same recovery run, before any agent request is routed to the model
+
+#### Layer C: Full System Recovery
+
+If everything goes down:
+1. PostgreSQL replays its write-ahead log (WAL)
+2. Ollama watchdog restores model
+3. Backend systemd unit restarts API
+4. Frontend rebuilds if artifacts are corrupted
+
+### Critical Contracts
+
+**Auth contract:**
+```
+Client → POST /api/auth/login {email, password}
+       → 200 OK {token, user}
+       
+Client → GET /api/auth/me (Authorization: Bearer <token>)
+       → 200 OK {id, email, role, avatar_url}
+       → 401 Unauthorized
+```
+
+**WebSocket contract:**
+```
+Client → WS /ws/catalyst
+       → Server pushes live events: {event_type, campaign_name, value, timestamp}
+
+Client → WS /ws/crm
+       → Server pushes CRM events: {type, payload, timestamp}
+```
+
+**Model contract:**
+```
+Ollama → GET /api/tags returns qwen3.6:35b-a3b
+       → Context window: 131072 tokens
+       → Provider: OpenAI-compatible interface at http://localhost:11434/v1
+```
+
+---
+
+## Team Setup
+
+### Developer Onboarding
+
+#### 1. Clone & Bootstrap
+
+```bash
+git clone <repo-url>
+cd Project_Velocity
+bash bootstrap/setup.sh
+```
+
+#### 2. VS Code / Roo Code Configuration
+
+Edit `.vscode/settings.json`:
+
+```json
+{
+  "roo-cline.provider": "openai-compatible",
+  "roo-cline.baseUrl": "http://localhost:11434/v1",
+  "roo-cline.modelId": "qwen3.6:35b-a3b",
+  "roo-cline.contextWindow": 131072,
+  "roo-cline.temperature": 0.7
+}
+```
+
+#### 3. Verify Team Access
+
+```bash
+# Backend health
+curl http://localhost:8000/health
+# Expected: {"status": "ok"}
+
+# Model registered with Ollama
+curl http://localhost:11434/api/tags | jq -r '.models[].name'
+# Expected: qwen3.6:35b-a3b
+
+# Frontend
+xdg-open http://localhost:5173
+# Expected: Login screen
+```
+
+### Role Definitions
+
+| Role | Access Level | Can Do |
+|------|-------------|--------|
+| `admin` | Full | User management, system config, agent orchestration |
+| `developer` | Standard | Code generation, review, testing |
+| `viewer` | Read-only | Dashboard, campaign monitoring |
+
+### Performance Expectations
+
+| Scenario | Tokens/sec | Latency |
+|----------|-----------|---------|
+| Single-stream (local GPU) | ~80-120 tok/s | ~200ms first token |
+| Two concurrent requests | ~60-90 tok/s each | ~300ms first token |
+| Four-way batch | ~40-60 tok/s each | ~500ms first token |
+
+*Numbers vary by GPU hardware. Measure your setup.*
+
+---
+
+## GPU & Model Runtime
+
+### Hardware Requirements
+
+| Component | Minimum | Recommended |
+|-----------|---------|-------------|
+| GPU VRAM | 16GB | 24GB+ |
+| GPU Compute | Turing architecture | Ada Lovelace / Hopper |
+| NVMe Storage | 50GB free | 100GB+ NVMe Gen4 |
+| RAM | 32GB | 64GB+ |
+
+### Ollama Watchdog
+
+The watchdog is a systemd-managed service that ensures the Qwen model stays loaded:
+
+**Location:** `.Infrastructure/systemd/ollama-watchdog.service`
+
+**Behavior:**
+1. Every 60 seconds, queries `http://localhost:11434/api/tags`
+2. If `qwen3.6:35b-a3b` is absent, triggers rehydration
+3. Rehydration priority: NVMe cache → S3 artifact → remote pull
+4. Logs all actions to journalctl
+
+**Manual watchdog check:**
+```bash
+sudo systemctl status ollama-watchdog
+journalctl -u ollama-watchdog --since "1 hour ago"
+```
+
+### Model Hydration Strategies
+
+| Strategy | Speed | Use Case |
+|----------|-------|----------|
+| NVMe local registration | ~2 seconds | Primary recovery path |
+| Local manifest `ollama create` | ~5 seconds | Fresh hydration from extracted weights |
+| S3 cold hydrate | ~60-300 seconds | No local cache available |
+
+### Critical: What Watchdog Must NOT Do
+
+- ❌ Delete model layers during recovery
+- ❌ Modify GPU memory directly
+- ❌ Block agent requests during hydration (graceful degradation only)
+- ❌ Restart Ollama process unless absolutely necessary
+
+---
+
+## Infrastructure
+
+### Deployment Topology
+
+```
+┌─────────────────────────────────────────────────┐
+│                  Production Host                 │
+│                                                  │
+│  ┌──────────┐  ┌──────────┐  ┌──────────────┐  │
+│  │ Backend  │  │ Frontend │  │   Ollama     │  │
+│  │ :8000    │  │ :5173    │  │  :11434      │  │
+│  │ systemd  │  │ nginx    │  │  systemd     │  │
+│  └────┬─────┘  └────┬─────┘  └──────┬───────┘  │
+│       │             │               │           │
+│       └─────────────┴───────────────┘           │
+│                         │                        │
+│                  ┌──────▼───────┐               │
+│                  │  PostgreSQL  │               │
+│                  │   :5432      │               │
+│                  │  systemd     │               │
+│                  └──────────────┘               │
+│                                                  │
+│  ┌──────────────────────────────────────────┐    │
+│  │        NVIDIA GPU (CUDA + TensorRT)      │    │
+│  └──────────────────────────────────────────┘    │
+└─────────────────────────────────────────────────┘
+```
+
+### systemd Services
+
+| Service | File | Restart Policy |
+|---------|------|---------------|
+| Backend API | `velocity-backend.service` | always |
+| Frontend (nginx) | `velocity-frontend.service` | always |
+| Ollama | `ollama.service` | on-failure |
+| Watchdog | `ollama-watchdog.service` | always |
+| PostgreSQL | `postgresql.service` | on-failure |
+
+### Network Rules
+
+| Port | Protocol | Service | External Access |
+|------|----------|---------|-----------------|
+| 80 | HTTP | nginx → frontend | Yes (public) |
+| 443 | HTTPS | nginx → frontend | Yes (public) |
+| 8000 | TCP | FastAPI backend | No (internal only) |
+| 5173 | TCP | Vite dev server | No (dev only) |
+| 5432 | TCP | PostgreSQL | No (internal only) |
+| 11434 | TCP | Ollama API | No (internal only) |
+
+### Monitoring
+
+```bash
+# All service health
+systemctl status velocity-backend ollama postgresql
+
+# GPU utilization
+nvidia-smi -l 1
+
+# Model inference logs
+journalctl -u ollama -f
+
+# API health (reports component status, not an error rate)
+curl -s http://localhost:8000/health | jq .
+```
+
+---
+
+## Runbooks
+
+### Runbook: Backend Crashes at 2 AM
+
+**Symptom:** Frontend shows 500 errors on API calls.
+
+**Steps:**
+
+```bash
+# 1. Check backend status
+sudo systemctl status velocity-backend
+# Expected: active (running)
+
+# 2. If stopped, restart
+sudo systemctl restart velocity-backend
+
+# 3. Check logs for root cause
+sudo journalctl -u velocity-backend --since "30 minutes ago" --no-pager
+
+# 4. Verify recovery
+curl http://localhost:8000/health
+# Expected: {"status": "ok"}
+
+# 5. If crash repeats, check database connectivity
+psql -U velocity -d velocity -c "SELECT 1;"
+# Expected: 1
+```
+
+**If still broken:**
+1. Check disk space: `df -h /`
+2. Check memory: `free -h`
+3. Check PostgreSQL: `sudo systemctl status postgresql`
+4. Escalate with logs from step 3
+
+---
+
+### Runbook: Ollama Model Disappeared
+
+**Symptom:** Agents return empty responses or errors.
+
+**Steps:**
+
+```bash
+# 1. Check if Ollama is running
+sudo systemctl status ollama
+# Expected: active (running)
+
+# 2. Check which models are registered with Ollama
+curl http://localhost:11434/api/tags | jq '.models[].name'
+# Expected: qwen3.6:35b-a3b
+
+# 3. If model is missing, check watchdog
+sudo systemctl status ollama-watchdog
+journalctl -u ollama-watchdog --since "1 hour ago" --no-pager
+
+# 4. Manual recovery if watchdog failed
+ollama pull qwen3.6:35b-a3b
+
+# 5. Verify model is usable
+curl http://localhost:11434/api/generate -d '{
+  "model": "qwen3.6:35b-a3b",
+  "prompt": "Hello",
+  "stream": false
+}' | jq .done
+# Expected: true
+```
+
+---
+
+### Runbook: Database Connection Failures
+
+**Symptom:** Backend logs show `connection refused` or `pool exhausted`.
+
+**Steps:**
+
+```bash
+# 1. Check PostgreSQL status
+sudo systemctl status postgresql
+# Expected: active (running)
+
+# 2. Check connection count
+psql -U postgres -c "SELECT count(*) FROM pg_stat_activity;"
+# Should be < max_connections (default 100)
+
+# 3. Check disk space for WAL files
+df -h /var/lib/postgresql
+
+# 4. Restart if hung
+sudo systemctl restart postgresql
+
+# 5. Verify backend reconnects
+sudo journalctl -u velocity-backend --since "1 minute ago" | grep -i "connected\|error"
+```
+
+---
+
+### Runbook: GPU Memory Exhaustion
+
+**Symptom:** Ollama returns `out of memory` errors.
+
+**Steps:**
+
+```bash
+# 1. Check current GPU usage
+nvidia-smi
+# Note: PID, memory usage, temperature
+
+# 2. Kill non-essential GPU processes if needed
+nvidia-smi --id=0 --query-compute-apps=pid,name,used_memory --format=csv
+kill <PID>
+
+# 3. Check Ollama memory allocation
+ollama show qwen3.6:35b-a3b | grep -i "layer\|memory"
+
+# 4. If still exhausted, switch to a more heavily quantized variant (smaller memory footprint)
+ollama pull qwen3.6:35b-a3b-q4_0
+
+# 5. Monitor recovery
+watch -n 1 nvidia-smi
+```
+
+---
+
+## API Reference
+
+### Auth Endpoints
+
+#### `POST /api/auth/login`
+
+Authenticate a user and receive a JWT token.
+
+**Request:**
+```json
+{
+  "email": "user@example.com",
+  "password": "secure_password"
+}
+```
+
+**Response (200 OK):**
+```json
+{
+  "token": "eyJhbGciOiJIUzI1NiIs...",
+  "user": {
+    "id": "uuid-here",
+    "email": "user@example.com",
+    "role": "developer",
+    "avatar_url": null
+  }
+}
+```
+
+**Errors:**
+| Status | Meaning |
+|--------|---------|
+| 401 | Invalid credentials |
+| 422 | Malformed request body |
+
+---
+
+#### `GET /api/auth/me`
+
+Get the current authenticated user's profile.
+
+**Headers:**
+```
+Authorization: Bearer <token>
+```
+
+**Response (200 OK):**
+```json
+{
+  "id": "uuid-here",
+  "email": "user@example.com",
+  "role": "developer",
+  "avatar_url": "https://cdn.example.com/avatars/user.png"
+}
+```
+
+**Errors:**
+| Status | Meaning |
+|--------|---------|
+| 401 | Token missing or invalid |
+| 403 | Token expired |
+
+---
+
+#### `GET /api/auth/users`
+
+List all users in the system. Admin only.
+
+**Headers:**
+```
+Authorization: Bearer <admin_token>
+```
+
+**Response (200 OK):**
+```json
+[
+  {
+    "id": "uuid-1",
+    "email": "admin@example.com",
+    "role": "admin",
+    "avatar_url": null
+  },
+  {
+    "id": "uuid-2",
+    "email": "dev@example.com",
+    "role": "developer",
+    "avatar_url": "https://cdn.example.com/avatars/dev.png"
+  }
+]
+```
+
+**Errors:**
+| Status | Meaning |
+|--------|---------|
+| 403 | User is not admin |
+
+---
+
+#### `POST /api/auth/profile/avatar`
+
+Upload a profile avatar image.
+
+**Headers:**
+```
+Authorization: Bearer <token>
+Content-Type: multipart/form-data
+```
+
+**Form Data:**
+| Field | Type | Required |
+|-------|------|----------|
+| avatar | file (image/jpeg, image/png) | Yes |
+
+**Response (200 OK):**
+```json
+{
+  "avatar_url": "https://cdn.example.com/avatars/new-avatar.png"
+}
+```
+
+**Errors:**
+| Status | Meaning |
+|--------|---------|
+| 401 | Not authenticated |
+| 422 | Invalid file type or size > 5MB |
+
+---
+
+### WebSocket Endpoints
+
+#### `WS /ws/catalyst`
+
+Real-time channel for Catalyst events (agent coordination, task updates).
+
+**Connection:**
+```javascript
+const ws = new WebSocket('ws://localhost:8000/ws/catalyst');
+ws.onmessage = (event) => {
+  const data = JSON.parse(event.data);
+  console.log(data.event_type, data.campaign_name, data.value);
+};
+```
+
+**Event Format:**
+```json
+{
+  "event_type": "task_complete",
+  "campaign_name": "codegen-sprint-42",
+  "value": 0.97,
+  "timestamp": "2026-04-21T16:00:00Z"
+}
+```
+
+---
+
+#### `WS /ws/crm`
+
+Real-time channel for CRM events (customer interactions, lead updates).
+
+**Connection:**
+```javascript
+const ws = new WebSocket('ws://localhost:8000/ws/crm');
+ws.onmessage = (event) => {
+  const data = JSON.parse(event.data);
+  console.log(data.type, data.payload);
+};
+```
+
+**Event Format:**
+```json
+{
+  "type": "lead_created",
+  "payload": {
+    "id": "crm-uuid",
+    "name": "Acme Corp",
+    "status": "new"
+  },
+  "timestamp": "2026-04-21T16:00:00Z"
+}
+```
+
+---
+
+### Health Check
+
+#### `GET /health`
+
+Verify system health.
+
+**Response (200 OK):**
+```json
+{
+  "status": "ok",
+  "database": "connected",
+  "ollama": "available",
+  "gpu": "present"
+}
+```
+
+---
+
+## Contributing
+
+### Code Structure
+
+```
+Project_Velocity/
+├── .Agent Context/          # Agent documentation, model specs
+├── .Infrastructure/         # Deployment configs, systemd units
+├── backend/                 # FastAPI backend
+│   ├── main.py              # Application entry point
+│   ├── requirements.txt     # Python dependencies
+│   └── migrate.py           # Database migrations
+├── app/                     # React frontend
+│   ├── src/
+│   │   ├── App.tsx          # Root component
+│   │   └── ...              # Components, routes, utils
+│   ├── package.json         # Node dependencies
+│   └── vite.config.ts       # Build config
+├── bootstrap/               # Setup scripts
+│   └── setup.sh             # One-line bootstrap
+└── README.md                # This file
+```
+
+### Making a Contribution
+
+1. **Fork and branch**
+   ```bash
+   git checkout -b feature/your-feature-name
+   ```
+
+2. **Make changes**
+   - Backend: Follow FastAPI conventions, add type hints
+   - Frontend: Follow React + TypeScript patterns, use existing components
+   - Docs: Update this README if behavior changes
+
+3. **Test locally**
+   ```bash
+   # Backend tests
+   cd backend && pytest
+   
+   # Frontend checks
+   cd app && npm run build
+   ```
+
+4. **Submit PR**
+   - Title: Clear, action-oriented
+   - Description: What + Why + How to test
+   - Link any related issues
+
+### Documentation Standards
+
+- **Every endpoint:** Document inputs, outputs, errors
+- **Every component:** JSDoc for public APIs
+- **Every runbook:** Write as if for on-call at 2 AM
+- **Every decision:** Record in `DECISIONS.md` with rationale
+
+---
+
+## Appendix
+
+### A. Environment Variables
+
+| Variable | Required | Description |
+|----------|----------|-------------|
+| `DATABASE_URL` | Yes | PostgreSQL connection string |
+| `SECRET_KEY` | Yes | JWT signing key |
+| `OLLAMA_BASE_URL` | No | Ollama API URL (default: `http://localhost:11434`) |
+| `GPU_ENABLED` | No | Enable GPU path (default: `true`) |
+| `LOG_LEVEL` | No | Logging level (default: `INFO`) |
+
+### B. Troubleshooting Matrix
+
+| Symptom | Likely Cause | Fix |
+|---------|-------------|-----|
+| Frontend blank screen | Backend down | `curl http://localhost:8000/health` |
+| 401 or 403 on all calls | Token invalid (401) or expired (403) | Re-login |
+| Agent returns empty | Model unloaded | `ollama pull qwen3.6:35b-a3b` |
+| Slow responses | GPU not used | Check `nvidia-smi`, verify CUDA |
+| Database errors | Pool exhausted | Check `max_connections`, restart backend |
+| WebSocket disconnects | Network issue | Check firewall, reverse proxy config |
+
+### C. Useful Commands Cheat Sheet
+
+```bash
+# Full system status
+systemctl status velocity-backend ollama postgresql ollama-watchdog
+
+# Real-time GPU monitoring
+watch -n 1 nvidia-smi
+
+# Model check
+curl http://localhost:11434/api/tags | jq '.models[].name'
+
+# API health
+curl -s http://localhost:8000/health | jq .
+
+# Database connection test
+psql -U velocity -d velocity -c "SELECT version();"
+
+# Frontend rebuild
+cd app && npm run build && cp -r dist/* ../nginx/html/
+
+# Restart everything (nuclear option)
+sudo systemctl restart velocity-backend ollama postgresql
+```
+
+---
+
+> **Last verified:** 2026-04-21
+> **Maintained by:** Velocity Team
+> **If this doc is wrong, the system is broken. Fix the doc first.**