feat: Oracle Canvas Component Schema and Qwen 3.6 integration (#31)

Co-authored-by: Sagnik <sagnik7896@gmail.com>
Reviewed-on: #31
2026-04-20 01:43:39 +05:30
parent 57144e1bd3
commit e519339cc9
129 changed files with 625213 additions and 262 deletions
--- a/Architecture.md
+++ b/Architecture.md
@@ -0,0 +1,382 @@
+# Oracle Canvas Runtime and Ollama Batch Architecture
+
+Date: 2026-04-19  
+Repo: `Project_Velocity`
+
+## Purpose
+
+This document defines the current production Oracle Canvas runtime path, the intended Ollama/Nemoclaw model-routing strategy, and the batch-processing API contract the team can use if Velocity exposes Oracle or coding-agent capabilities through the local model stack.
+
+This is the operator and engineering artifact. It exists to remove ambiguity.
+
+## Runtime Topology
+
+### Linux origin box
+
+Role:
+
+- hosts Velocity frontend
+- hosts FastAPI backend
+- hosts PostgreSQL and application services
+- terminates app-origin requests under the public site path
+
+Primary concerns:
+
+- application routing
+- auth/session enforcement
+- Oracle API execution
+- CRM/intelligence/inventory data access
+
+### GPU box
+
+Role:
+
+- hosts ComfyUI
+- hosts heavy model runtime
+- hosts Ollama / Nemoclaw execution plane
+- stores runtime/model payloads on NVMe only
+
+Primary concerns:
+
+- inference
+- media generation
+- model serving
+- agent runtime workloads
+
+### Ingress
+
+Role:
+
+- stable public entry for GPU-backed services
+- hides raw GPU host details from application code
+
+Non-negotiable rule:
+
+- never wire Oracle or frontend code to a raw GPU public IP
+
+## Oracle Canvas Current Execution Path
+
+The production-safe Oracle path is now:
+
+1. User submits prompt from Oracle Canvas frontend.
+2. Frontend calls:
+   - `/api/oracle/v1/canvas-pages/{page_id}/prompts`
+3. FastAPI Oracle orchestrator:
+   - loads user context
+   - retrieves best codebook matches
+   - builds a safe retrieval plan
+   - queries approved datasets from PostgreSQL
+   - produces JSON Canvas components
+   - commits a page revision
+4. Frontend reloads/reconciles the canvas state and renders the new blocks.
+
+## Current Oracle Backend Families
+
+### Live today
+
+- `/api/oracle/v1/me`
+- `/api/oracle/v1/canvas-pages/{page_id}`
+- `/api/oracle/v1/canvas-pages/{page_id}/prompts`
+- `/api/oracle/v1/canvas-pages/{page_id}/forks`
+- `/api/oracle/v1/canvas-pages/{page_id}/rollback`
+- `/api/oracle/v1/canvas-pages/{page_id}/revisions`
+- `/api/oracle/v1/component-templates`
+- `/api/oracle/v1/component-templates/synthesize`
+- `/api/oracle/v1/merge-requests`
+- `/api/oracle/v1/merge-requests/{mr_id}/review`
+- `/ws/oracle/canvas/{page_id}`
+
+### Template taxonomy routes
+
+- `/api/oracle/template-chapters`
+- `/api/oracle/template-subchapters`
+- `/api/oracle/component-templates`
+- `/api/oracle/component-templates/{id}`
+- `/api/oracle/component-templates/{id}/seed`
+- `/api/oracle/component-templates/synthetic-jobs`
+
+## Prompt Analysis Path
+
+Oracle should not rely on one monolithic LLM call.
+
+The correct production split is:
+
+1. codebook retrieval
+2. safe dataset selection
+3. optional LLM planning
+4. live DB fetch
+5. JSON Canvas synthesis
+6. revision commit
+
+### Why this split is correct
+
+- It reduces hallucination in UI structure.
+- It keeps DB access whitelisted and auditable.
+- It allows Oracle to keep working even when the LLM runtime is degraded.
+- It keeps the Oracle Canvas deterministic enough for operational use.
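The six-step split above can be sketched as a small pipeline. This is a minimal sketch, not the orchestrator's actual code: every callable name (`retrieve`, `select_datasets`, `plan`, `fetch`, `synthesize`, `commit`) is hypothetical and stands in for whatever the FastAPI service really uses. It only illustrates that the LLM planner is optional and that a planner failure does not stop the request.

```python
from typing import Callable, Optional


def run_prompt_pipeline(
    prompt: str,
    retrieve: Callable[[str], list],            # 1. codebook retrieval
    select_datasets: Callable[[list], list],    # 2. safe dataset selection
    fetch: Callable[[list], dict],              # 4. live DB fetch
    synthesize: Callable[[dict, dict], list],   # 5. JSON Canvas synthesis
    commit: Callable[[list], int],              # 6. revision commit
    plan: Optional[Callable[[str, list], dict]] = None,  # 3. optional LLM planning
) -> dict:
    """Run the split in order; all stage callables are hypothetical stand-ins."""
    matches = retrieve(prompt)
    datasets = select_datasets(matches)

    # Default layout comes from the codebook, so the planner is never required.
    layout = {"source": "codebook"}
    if plan is not None:
        try:
            layout = plan(prompt, matches)
        except Exception:
            # Degraded LLM runtime: fall back instead of failing the request.
            layout = {"source": "deterministic_fallback"}

    rows = fetch(datasets)
    components = synthesize(rows, layout)
    return {"revision": commit(components), "layout": layout, "components": components}
```

Passing the stages as callables keeps the sketch independent of any real module layout; only the ordering and the fallback behaviour are taken from the document.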
+
+## Current Model Routing Truth
+
+### Present reality
+
+The current Oracle backend has these runtime modes:
+
+- `codebook_retrieval`
+  - preferred when the prompt clearly matches the Oracle template corpus
+- `nemoclaw_hosted`
+  - used when `NEMOCLAW_API_URL` and `NEMOCLAW_API_KEY` are configured and reachable
+- `deterministic_fallback`
+  - used when the LLM planner is unavailable
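One way to read the three runtime modes above is as a small selection function. This is a hedged sketch: the function name and its boolean inputs are assumptions, and only the mode strings and the two environment variable names come from the document.

```python
import os
from typing import Mapping


def select_runtime_mode(
    codebook_match: bool,
    planner_reachable: bool,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Pick a runtime mode; inputs are hypothetical signals from the orchestrator."""
    if codebook_match:
        # Preferred when the prompt clearly matches the template corpus.
        return "codebook_retrieval"

    configured = bool(env.get("NEMOCLAW_API_URL")) and bool(env.get("NEMOCLAW_API_KEY"))
    if configured and planner_reachable:
        return "nemoclaw_hosted"

    # LLM planner unavailable or not configured.
    return "deterministic_fallback"
```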
+
+### What Nemoclaw currently means in code
+
+Current dispatch abstraction:
+
+- `backend/services/nemoclaw_runtime.py`
+
+This file is still a light dispatch envelope, not a fully featured provider router.
+
+### Recommended production provider stack
+
+Provider order:
+
+1. codebook retrieval layer
+2. Nemoclaw planner endpoint
+3. local Ollama fallback
+4. deterministic fallback
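The provider order above amounts to a first-success chain. The sketch below is illustrative only: `first_successful` and the `(name, callable)` pairing are assumptions, not the interface of `backend/services/nemoclaw_runtime.py`. A provider that raises or returns `None` is skipped and the next one is tried.

```python
from typing import Callable, Optional, Sequence, Tuple

# (provider name, callable that returns a plan dict or None); hypothetical shape.
Provider = Tuple[str, Callable[[str], Optional[dict]]]


def first_successful(prompt: str, providers: Sequence[Provider]) -> dict:
    """Try providers in the given order and return the first usable result."""
    skipped = {}
    for name, call in providers:
        try:
            result = call(prompt)
        except Exception as exc:
            skipped[name] = repr(exc)
            continue
        if result is None:
            skipped[name] = "no result"
            continue
        return {"provider": name, "result": result, "skipped": skipped}
    raise RuntimeError(f"all providers failed: {skipped}")
```

Placing a deterministic fallback last in `providers` means the chain only raises if that fallback itself fails.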
+
+## Recommended Ollama Model Policy
+
+### Default planning / Oracle analysis model
+
+Use a local reasoning-capable model behind Ollama when Nemoclaw is not available or when the team wants private execution under its own control.
+
+Recommended candidate:
+
+- `qwen3.6:35b-a3b`
+
+Reason:
+
+- strong agentic coding and structured reasoning profile
+- local execution path through Ollama
+- realistic fit for GPU-box-hosted inference
+
+### Deployment command
+
+Example (`ollama run` pulls the model if it is not present locally, then opens an interactive session):
+
+```bash
+ollama run qwen3.6:35b-a3b
+```
+
+### Routing rule
+
+- Oracle prompt planning:
+  - small to medium prompts: local Ollama `qwen3.6:35b-a3b`
+  - larger multi-step analytical plans: Nemoclaw planner if available
+- Coding-agent batch workloads:
+  - Ollama first for local/private jobs
+  - Nemoclaw for heavier orchestration when the runtime is healthy
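The routing rule above could be expressed roughly as follows. This is a sketch under stated assumptions: the document does not define what counts as a "larger" prompt or a "heavier" job, so the `large_prompt_chars` threshold and the `heavy` flag are hypothetical knobs, as are the workload kind strings.

```python
OLLAMA_MODEL = "qwen3.6:35b-a3b"  # candidate model named in this document


def route(
    kind: str,
    *,
    prompt_chars: int = 0,
    heavy: bool = False,
    nemoclaw_healthy: bool = False,
    large_prompt_chars: int = 4000,  # hypothetical cut-off, not from the source
) -> dict:
    """Return the provider for a workload; Ollama is the default in every case."""
    if kind == "oracle_planning":
        wants_nemoclaw = prompt_chars >= large_prompt_chars
    elif kind == "coding_batch":
        wants_nemoclaw = heavy
    else:
        raise ValueError(f"unknown workload kind: {kind}")

    # Nemoclaw is only chosen when it is both wanted and healthy.
    if wants_nemoclaw and nemoclaw_healthy:
        return {"provider": "nemoclaw", "model": None}
    return {"provider": "ollama", "model": OLLAMA_MODEL}
```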
+
+## Runtime LLM API
+
+The backend now exposes a first-class runtime LLM route family:
+
+- `GET /api/runtime/llm/providers`
+- `POST /api/runtime/llm/chat`
+- `POST /api/runtime/llm/batch`
+- `GET /api/runtime/llm/jobs/{job_id}`
+- `GET /api/runtime/llm/jobs/{job_id}/results`
+
+This router is defined in:
+
+- `backend/api/routes_runtime_llm.py`
+
+The current persistence path uses the existing canonical table:
+
+- `workflow_agent_runs`
+
+That means batch jobs are now persisted against the live Velocity schema without requiring a new table family before the first production rollout.
+
+## Implemented Batch Processing API
+
+This is no longer only a proposal. The following contract family exists now and can be used by Oracle or future coding-agent surfaces.
+
+### Single request inference
+
+- `POST /api/runtime/llm/chat`
+
+Payload:
+
+```json
+{
+  "provider": "ollama",
+  "model": "qwen3.6:35b-a3b",
+  "system_prompt": "You are Oracle Planner.",
+  "messages": [
+    { "role": "user", "content": "Build a CRM pipeline view for high-intent NRI buyers." }
+  ],
+  "temperature": 0.2,
+  "response_format": "json"
+}
+```
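For the `ollama` provider, a payload like the one above has to be translated into a request against Ollama's `/api/chat` endpoint. The sketch below shows one plausible mapping; how the Velocity runtime service actually performs this translation is not stated in the document, so the function name and the field mapping are assumptions. The request is built but not sent, and the default `base_url` is Ollama's local default, intended for local testing only (production code should go through the ingress, per the rule above).

```python
import json
import urllib.request


def to_ollama_request(
    payload: dict, base_url: str = "http://localhost:11434"
) -> urllib.request.Request:
    """Build (but do not send) an Ollama /api/chat request from a runtime chat payload."""
    messages = list(payload["messages"])
    if payload.get("system_prompt"):
        # Ollama takes the system prompt as a leading message with role "system".
        messages.insert(0, {"role": "system", "content": payload["system_prompt"]})

    body = {"model": payload["model"], "messages": messages, "stream": False}
    if payload.get("response_format") == "json":
        body["format"] = "json"
    if "temperature" in payload:
        body["options"] = {"temperature": payload["temperature"]}

    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it would be a matter of `urllib.request.urlopen(req)` and decoding the JSON response.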
+
+### Batch submission
+
+- `POST /api/runtime/llm/batch`
+
+Payload:
+
+```json
+{
+  "provider": "ollama",
+  "model": "qwen3.6:35b-a3b",
+  "job_type": "oracle_canvas_planning",
+  "items": [
+    {
+      "request_id": "req_001",
+      "messages": [
+        { "role": "user", "content": "Show overdue high-QD follow-ups." }
+      ],
+      "response_format": "json"
+    },
+    {
+      "request_id": "req_002",
+      "messages": [
+        { "role": "user", "content": "Build a Kolkata luxury inventory comparison block." }
+      ],
+      "response_format": "json"
+    }
+  ]
+}
+```
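Before a batch like the one above is accepted, it is worth checking that every item carries a unique `request_id`, since results are addressed by it. The validator below is a minimal sketch; the function name and the exact set of required fields are assumptions inferred from the example payload, not the backend's real schema.

```python
from typing import List


def validate_batch(payload: dict) -> List[str]:
    """Return a list of problems; an empty list means the batch looks well-formed."""
    errors = []
    for field in ("provider", "model", "job_type"):
        if not payload.get(field):
            errors.append(f"missing {field}")

    items = payload.get("items") or []
    if not items:
        errors.append("items must be non-empty")

    seen = set()
    for index, item in enumerate(items):
        request_id = item.get("request_id")
        if not request_id:
            errors.append(f"items[{index}] missing request_id")
        elif request_id in seen:
            errors.append(f"duplicate request_id {request_id}")
        else:
            seen.add(request_id)
        if not item.get("messages"):
            errors.append(f"items[{index}] has no messages")
    return errors
```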
+
+### Batch status
+
+- `GET /api/runtime/llm/jobs/{job_id}`
+
+Response:
+
+```json
+{
+  "job_id": "job_123",
+  "status": "running",
+  "provider": "ollama",
+  "model": "qwen3.6:35b-a3b",
+  "submitted_count": 2,
+  "completed_count": 1,
+  "failed_count": 0
+}
+```
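A client polling this endpoint can derive progress from the three counters in the response. The helper below is a sketch with a hypothetical name; it deliberately relies only on the counts, because the document shows `"running"` as a status value but does not list the other possible values.

```python
def summarize_status(status: dict) -> dict:
    """Derive progress from the counters in a job status response."""
    submitted = status["submitted_count"]
    done = status["completed_count"] + status["failed_count"]
    return {
        "pending": submitted - done,
        "finished": done >= submitted,
        # Treat an empty job as fully processed to avoid dividing by zero.
        "progress": done / submitted if submitted else 1.0,
    }
```

For the example response above this yields one pending item, `finished` false, and a progress of 0.5.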
+
+### Batch results
+
+- `GET /api/runtime/llm/jobs/{job_id}/results`
+
+### Providers inventory
+
+- `GET /api/runtime/llm/providers`
+
+Example response:
+
+```json
+{
+  "providers": [
+    {
+      "id": "nemoclaw",
+      "status": "online",
+      "models": ["nemotron", "remote_default"]
+    },
+    {
+      "id": "ollama",
+      "status": "online",
+      "models": ["qwen3.6:35b-a3b"]
+    }
+  ]
+}
+```
+
+## Batch Processing Design Rules
+
+1. Batch jobs must be persisted.
+2. Batch items must be individually addressable by `request_id`.
+3. Every batch job must record:
+   - provider
+   - model
+   - submitted payload hash
+   - start/end timestamps
+   - failure reason
+4. Oracle must not block the main request thread for large batches.
+5. Any DB writeback generated from a batch must go through approval tables, not direct execution.
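Rule 3 can be illustrated with a small record type. This is a sketch, not the `workflow_agent_runs` schema: the class name, field names, and the choice of SHA-256 over canonical JSON for the payload hash are all assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def payload_hash(payload: dict) -> str:
    """Hash a canonical JSON form so key order does not change the digest."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class BatchJobRecord:
    """Hypothetical in-memory shape of the fields rule 3 requires."""
    provider: str
    model: str
    payload_sha256: str
    started_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    ended_at: Optional[datetime] = None
    failure_reason: Optional[str] = None

    def finish(self, failure_reason: Optional[str] = None) -> None:
        # Record the end timestamp and, for failed jobs, why the job failed.
        self.ended_at = datetime.now(timezone.utc)
        self.failure_reason = failure_reason
```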
+
+## Oracle-Specific Runtime Policy
+
+For Oracle Canvas, the LLM is not the source of truth for data.
+
+The source of truth order is:
+
+1. canonical DB tables
+2. approved dataset projections
+3. codebook template corpus
+4. model planner
+
+The model is only allowed to:
+
+- classify intent
+- choose likely component families
+- propose layout direction
+- summarize findings
+
+The model is not allowed to:
+
+- invent database facts
+- bypass dataset allowlists
+- emit arbitrary executable code into production rendering paths
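The allowed and disallowed behaviours above can be enforced by checking planner output before it reaches rendering or data access. The sketch below assumes a hypothetical planner output shape (`intent`, `component_families`, `layout`, `summary`, `datasets`) and hypothetical allowlists; the real component families and approved datasets are defined elsewhere in Velocity and are not listed in this document.

```python
from typing import List

# Hypothetical allowlists, for illustration only.
ALLOWED_COMPONENT_FAMILIES = {"table", "kpi", "chart"}
ALLOWED_DATASETS = {"crm_leads", "inventory_units"}
ALLOWED_KEYS = {"intent", "component_families", "layout", "summary", "datasets"}


def check_planner_output(plan: dict) -> List[str]:
    """Return policy violations; an empty list means the plan may proceed."""
    violations = []

    # Anything outside the known keys (for example raw SQL or script) is rejected.
    extra = sorted(set(plan) - ALLOWED_KEYS)
    if extra:
        violations.append(f"unexpected keys: {extra}")

    for family in plan.get("component_families", []):
        if family not in ALLOWED_COMPONENT_FAMILIES:
            violations.append(f"component family not allowed: {family}")

    for dataset in plan.get("datasets", []):
        if dataset not in ALLOWED_DATASETS:
            violations.append(f"dataset not on allowlist: {dataset}")

    return violations
```

A plan with violations would be dropped in favour of the codebook or deterministic path rather than partially applied.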
+
+## Current Production Readiness Assessment
+
+### Ready now
+
+- Oracle Canvas frontend-to-backend v1 route family
+- codebook-backed template retrieval path
+- safe DB execution gateway
+- merge/fork/revision path
+- deterministic fallback path
+- runtime LLM provider inventory
+- runtime single-chat execution
+- runtime persisted batch execution through `workflow_agent_runs`
+- Oracle planner fallback through the shared runtime LLM service
+
+### Still needs explicit implementation if the team approves
+
+- per-model selection UI in Catalyst or Oracle controls
+- dedicated `runtime_llm_jobs` / `runtime_llm_job_items` tables if the team wants stronger audit/query ergonomics than `workflow_agent_runs`
+- explicit Nemoclaw vs Ollama operator switch in a production admin surface
+- richer provider health telemetry beyond simple reachability
+
+## Recommended Next Build Steps
+
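+1. Harden the existing runtime router rather than adding a new one:
+   - `backend/api/routes_runtime_llm.py`
+2. If the team approves, add dedicated DB tables instead of relying only on `workflow_agent_runs`: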
+   - `runtime_llm_jobs`
+   - `runtime_llm_job_items`
+   - `runtime_llm_job_results`
+3. Implement provider adapters:
+   - Nemoclaw adapter
+   - Ollama adapter
+4. Expose provider status to Catalyst/Oracle settings surfaces.
+5. Keep Oracle Canvas on the current codebook-first path even after LLM batching exists.
+
+## Bottom Line
+
+Oracle Canvas should be treated as a codebook-guided analytical surface with optional LLM planning, not as a raw chat-to-SQL toy.
+
+The production-safe architecture is:
+
+- Linux origin runs the application and DB access
+- GPU box runs ComfyUI and model inference
+- Oracle retrieves from the merged codebook first
+- DB access stays whitelisted
+- Nemoclaw and Ollama sit behind a documented provider interface
+- batch processing is a separate runtime service contract, not an implicit side effect of the canvas endpoint