feat: Oracle Canvas Component Schema and Qwen 3.6 integration (#31)
Co-authored-by: Sagnik <sagnik7896@gmail.com>
Reviewed-on: #31
This commit was merged in pull request #31.
@@ -0,0 +1,382 @@
# Oracle Canvas Runtime and Ollama Batch Architecture

Date: 2026-04-19
Repo: `Project_Velocity`

## Purpose

This document defines the current production Oracle Canvas runtime path, the intended Ollama/Nemoclaw model-routing strategy, and the target batch-processing API shape the team can use if Velocity exposes Oracle or coding-agent capabilities through the local model stack.

This is the operator and engineering artifact. It exists to remove ambiguity.

## Runtime Topology

### Linux origin box

Role:

- hosts Velocity frontend
- hosts FastAPI backend
- hosts PostgreSQL and application services
- terminates app-origin requests under the public site path

Primary concern:

- application routing
- auth/session enforcement
- Oracle API execution
- CRM/intelligence/inventory data access

### GPU box

Role:

- hosts ComfyUI
- hosts heavy model runtime
- hosts Ollama / Nemoclaw execution plane
- stores runtime/model payloads on NVMe only

Primary concern:

- inference
- media generation
- model serving
- agent runtime workloads

### Ingress

Role:

- stable public entry for GPU-backed services
- hides raw GPU host details from application code

Non-negotiable rule:

- never wire Oracle or frontend code to a raw GPU public IP
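
One way to enforce the rule above at configuration time is to reject any service URL whose host is a literal IP address. The following is a minimal sketch, not existing Velocity code: the function names `is_raw_ip_url` and `require_ingress_url` and the example hostnames are hypothetical.

```python
import ipaddress
from urllib.parse import urlparse


def is_raw_ip_url(url: str) -> bool:
    """Return True when the URL's host is a literal IPv4 or IPv6 address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name, for example the stable ingress hostname
    return True


def require_ingress_url(url: str) -> str:
    """Reject raw-IP endpoints so application code only sees the ingress name."""
    if is_raw_ip_url(url):
        raise ValueError(f"raw IP endpoint is not allowed: {url}")
    return url
```

A check like this could run once at startup against every configured GPU-backed endpoint, so a misconfigured environment variable fails fast instead of leaking a raw host into application code.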

## Oracle Canvas Current Execution Path

The production-safe Oracle path is now:

1. User submits prompt from Oracle Canvas frontend.
2. Frontend calls:
   - `/api/oracle/v1/canvas-pages/{page_id}/prompts`
3. FastAPI Oracle orchestrator:
   - loads user context
   - retrieves best codebook matches
   - builds a safe retrieval plan
   - queries approved datasets from PostgreSQL
   - produces JSON Canvas components
   - commits a page revision
4. Frontend reloads/reconciles the canvas state and renders the new blocks.

## Current Oracle Backend Families

### Live today

- `/api/oracle/v1/me`
- `/api/oracle/v1/canvas-pages/{page_id}`
- `/api/oracle/v1/canvas-pages/{page_id}/prompts`
- `/api/oracle/v1/canvas-pages/{page_id}/forks`
- `/api/oracle/v1/canvas-pages/{page_id}/rollback`
- `/api/oracle/v1/canvas-pages/{page_id}/revisions`
- `/api/oracle/v1/component-templates`
- `/api/oracle/v1/component-templates/synthesize`
- `/api/oracle/v1/merge-requests`
- `/api/oracle/v1/merge-requests/{mr_id}/review`
- `/ws/oracle/canvas/{page_id}`

### Template taxonomy routes

- `/api/oracle/template-chapters`
- `/api/oracle/template-subchapters`
- `/api/oracle/component-templates`
- `/api/oracle/component-templates/{id}`
- `/api/oracle/component-templates/{id}/seed`
- `/api/oracle/component-templates/synthetic-jobs`

## Prompt Analysis Path

Oracle should not rely on one monolithic LLM call.

The correct production split is:

1. codebook retrieval
2. safe dataset selection
3. optional LLM planning
4. live DB fetch
5. JSON Canvas synthesis
6. revision commit

### Why this split is correct

- It reduces hallucination in UI structure.
- It keeps DB access whitelisted and auditable.
- It allows Oracle to keep working even when the LLM runtime is degraded.
- It keeps the Oracle Canvas deterministic enough for operational use.
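
The six stages above can be sketched as one function in which the LLM planner is an optional, failure-tolerant step. This is an illustrative sketch under stated assumptions, not the actual orchestrator: `run_prompt`, `Revision`, the `APPROVED_DATASETS` names, and the `"stacked"` default layout are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical allowlist; the real approved datasets are defined by the backend.
APPROVED_DATASETS = {"crm_leads", "inventory_units"}


@dataclass
class Revision:
    components: List[dict]
    planner_used: bool


def run_prompt(
    prompt: str,
    retrieve: Callable[[str], List[str]],
    fetch: Callable[[str], List[dict]],
    planner: Optional[Callable[[str, List[str]], str]] = None,
) -> Revision:
    # 1. Codebook retrieval: candidate dataset names for this prompt.
    candidates = retrieve(prompt)
    # 2. Safe dataset selection: drop anything outside the allowlist.
    datasets = [name for name in candidates if name in APPROVED_DATASETS]
    # 3. Optional LLM planning: may only suggest a layout, never datasets.
    layout, planner_used = "stacked", False
    if planner is not None:
        try:
            layout = planner(prompt, datasets)
            planner_used = True
        except Exception:
            pass  # degraded LLM runtime: keep the deterministic layout
    # 4. Live DB fetch, restricted to the approved datasets.
    rows: Dict[str, List[dict]] = {name: fetch(name) for name in datasets}
    # 5. JSON Canvas synthesis from fetched rows only.
    components = [
        {"type": "table", "dataset": name, "layout": layout, "rows": data}
        for name, data in rows.items()
    ]
    # 6. Revision commit (returned here; a real service would persist it).
    return Revision(components=components, planner_used=planner_used)
```

Because the planner only influences layout and every dataset passes through the allowlist first, a planner outage changes how the canvas looks, not what data it is allowed to read.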

## Current Model Routing Truth

### Present reality

The current Oracle backend has these runtime modes:

- `codebook_retrieval`
  - preferred when the prompt clearly matches the Oracle template corpus
- `nemoclaw_hosted`
  - used when `NEMOCLAW_API_URL` and `NEMOCLAW_API_KEY` are configured and reachable
- `deterministic_fallback`
  - used when the LLM planner is unavailable

### What Nemoclaw currently means in code

Current dispatch abstraction:

- `backend/services/nemoclaw_runtime.py`

This file is still a light dispatch envelope, not a fully featured provider router.

### Recommended production provider stack

Provider order:

1. codebook retrieval layer
2. Nemoclaw planner endpoint
3. local Ollama fallback
4. deterministic fallback
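
The provider order above amounts to a fallback chain: try each layer in turn and take the first one that yields a plan. A minimal sketch, assuming each provider is a callable that returns a plan, returns `None` when it has no match, or raises when unreachable. The function name `plan_with_fallback` and the mode label `ollama_local` are hypothetical; the other mode names come from the runtime modes listed earlier.

```python
from typing import Callable, Optional, Sequence, Tuple

Planner = Callable[[str], Optional[dict]]


def plan_with_fallback(
    prompt: str, providers: Sequence[Tuple[str, Planner]]
) -> Tuple[str, dict]:
    """Return (mode_name, plan) from the first provider that succeeds."""
    for name, planner in providers:
        try:
            plan = planner(prompt)
        except Exception:
            continue  # provider unreachable or misconfigured: try the next one
        if plan is not None:
            return name, plan
    raise RuntimeError("no provider produced a plan")
```

Placing the deterministic fallback last means the chain only raises if that final layer is itself broken, which matches the intent that Oracle keeps working when the LLM runtime is degraded.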

## Recommended Ollama Model Policy

### Default planning / Oracle analysis model

Use a local reasoning-capable model behind Ollama when Nemoclaw is not available or when the team wants deterministic private execution.

Recommended candidate:

- `qwen3.6:35b-a3b`

Reason:

- strong agentic coding and structured reasoning profile
- local execution path through Ollama
- realistic fit for GPU-box-hosted inference

### Deployment command

Example:

```bash
# Run on the GPU box: pulls the model on first use, then opens an interactive session.
ollama run qwen3.6:35b-a3b
```

### Routing rule

- Oracle prompt planning:
  - small to medium prompts: local Ollama `qwen3.6:35b-a3b`
  - larger multi-step analytical plans: Nemoclaw planner if available
- Coding-agent batch workloads:
  - Ollama first for local/private jobs
  - Nemoclaw for heavier orchestration when the runtime is healthy
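
The routing rule can be expressed as a small pure function. This is a sketch under assumptions: the document does not define what counts as a "larger" prompt, so the `LARGE_PROMPT_CHARS` cutoff is a hypothetical placeholder, as are the workload labels and the function names.

```python
LARGE_PROMPT_CHARS = 2000  # hypothetical cutoff; the routing rule gives no number

WORKLOADS = {"oracle_planning", "coding_batch"}  # hypothetical labels


def is_large(prompt: str) -> bool:
    """Crude size heuristic standing in for 'larger multi-step analytical plan'."""
    return len(prompt) > LARGE_PROMPT_CHARS


def choose_provider(workload: str, heavy: bool, nemoclaw_healthy: bool) -> str:
    """Ollama by default; Nemoclaw only for heavy work while it is healthy."""
    if workload not in WORKLOADS:
        raise ValueError(f"unknown workload: {workload}")
    if heavy and nemoclaw_healthy:
        return "nemoclaw"
    return "ollama"
```

For Oracle prompt planning, `heavy` could be derived from `is_large(prompt)`; for coding-agent batches the caller would set it explicitly.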

## Runtime LLM API

The backend now exposes a first-class runtime LLM family:

- `GET /api/runtime/llm/providers`
- `POST /api/runtime/llm/chat`
- `POST /api/runtime/llm/batch`
- `GET /api/runtime/llm/jobs/{job_id}`
- `GET /api/runtime/llm/jobs/{job_id}/results`

This router is mounted in:

- `backend/api/routes_runtime_llm.py`

The current persistence path uses the existing canonical table:

- `workflow_agent_runs`

That means batch jobs are now persisted against the live Velocity schema without requiring a new table family before the first production rollout.

## Implemented Batch Processing API

This is no longer only a proposal. The following contract family exists now and can be used by Oracle or future coding-agent surfaces.

### Single request inference

- `POST /api/runtime/llm/chat`

Payload:

```json
{
  "provider": "ollama",
  "model": "qwen3.6:35b-a3b",
  "system_prompt": "You are Oracle Planner.",
  "messages": [
    { "role": "user", "content": "Build a CRM pipeline view for high-intent NRI buyers." }
  ],
  "temperature": 0.2,
  "response_format": "json"
}
```
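
A client could assemble this request with the standard library alone. The sketch below only builds the request object and does not send it; `BASE_URL` is a hypothetical origin, and auth/session handling (which the origin box enforces) is deliberately left out.

```python
import json
import urllib.request

BASE_URL = "https://velocity.example"  # hypothetical origin, not a real host


def build_chat_request(user_prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for the single-inference endpoint."""
    payload = {
        "provider": "ollama",
        "model": "qwen3.6:35b-a3b",
        "system_prompt": "You are Oracle Planner.",
        "messages": [{"role": "user", "content": user_prompt}],
        "temperature": 0.2,
        "response_format": "json",
    }
    return urllib.request.Request(
        url=f"{base_url}/api/runtime/llm/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the response schema is not specified here, so the sketch stops at request construction.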

### Batch submission

- `POST /api/runtime/llm/batch`

Payload:

```json
{
  "provider": "ollama",
  "model": "qwen3.6:35b-a3b",
  "job_type": "oracle_canvas_planning",
  "items": [
    {
      "request_id": "req_001",
      "messages": [
        { "role": "user", "content": "Show overdue high-QD follow-ups." }
      ],
      "response_format": "json"
    },
    {
      "request_id": "req_002",
      "messages": [
        { "role": "user", "content": "Build a Kolkata luxury inventory comparison block." }
      ],
      "response_format": "json"
    }
  ]
}
```
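
Since every item must be addressable by `request_id`, a client can generate the ids itself and verify uniqueness before submitting. A minimal sketch; the helper names `build_batch` and `index_by_request_id` and the `req_NNN` numbering scheme are assumptions modelled on the example payload.

```python
from typing import Dict, List


def build_batch(
    prompts: List[str],
    job_type: str = "oracle_canvas_planning",
    provider: str = "ollama",
    model: str = "qwen3.6:35b-a3b",
) -> dict:
    """Wrap each prompt in a batch item with a sequential request_id."""
    items = [
        {
            "request_id": f"req_{index:03d}",
            "messages": [{"role": "user", "content": prompt}],
            "response_format": "json",
        }
        for index, prompt in enumerate(prompts, start=1)
    ]
    return {"provider": provider, "model": model, "job_type": job_type, "items": items}


def index_by_request_id(batch: dict) -> Dict[str, dict]:
    """Map request_id to item, rejecting duplicate ids."""
    indexed: Dict[str, dict] = {}
    for item in batch["items"]:
        request_id = item["request_id"]
        if request_id in indexed:
            raise ValueError(f"duplicate request_id: {request_id}")
        indexed[request_id] = item
    return indexed
```

The same index is useful on the way back, for matching each result to the prompt that produced it.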

### Batch status

- `GET /api/runtime/llm/jobs/{job_id}`

Response:

```json
{
  "job_id": "job_123",
  "status": "running",
  "provider": "ollama",
  "model": "qwen3.6:35b-a3b",
  "submitted_count": 2,
  "completed_count": 1,
  "failed_count": 0
}
```
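
A caller can poll this endpoint until every submitted item is accounted for. The sketch below decides completion from the three counters rather than from the `status` string, because `"running"` is the only status value shown above. `wait_for_job`, its parameters, and the injected `fetch_status` callable are hypothetical.

```python
import time
from typing import Callable


def progress(status: dict) -> float:
    """Fraction of submitted items that have completed or failed."""
    total = status["submitted_count"]
    done = status["completed_count"] + status["failed_count"]
    return done / total if total else 1.0


def wait_for_job(
    fetch_status: Callable[[str], dict],
    job_id: str,
    interval_s: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until all items are accounted for, or raise after max_polls."""
    for attempt in range(max_polls):
        status = fetch_status(job_id)
        if progress(status) >= 1.0:
            return status
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

For the example response above, `progress` is 0.5: one of two items is done and none have failed.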

### Batch results

- `GET /api/runtime/llm/jobs/{job_id}/results`

### Providers inventory

- `GET /api/runtime/llm/providers`

Example response:

```json
{
  "providers": [
    {
      "id": "nemoclaw",
      "status": "online",
      "models": ["nemotron", "remote_default"]
    },
    {
      "id": "ollama",
      "status": "online",
      "models": ["qwen3.6:35b-a3b"]
    }
  ]
}
```
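
A consumer of this inventory might pick the first online provider that serves a requested model. A minimal sketch, assuming the response shape shown above; the function name `pick_provider` and the default preference order are hypothetical.

```python
from typing import Optional, Sequence


def pick_provider(
    inventory: dict,
    model: str,
    preference: Sequence[str] = ("nemoclaw", "ollama"),
) -> Optional[str]:
    """Return the id of the first preferred, online provider listing the model."""
    online = {
        entry["id"]: entry
        for entry in inventory.get("providers", [])
        if entry.get("status") == "online"
    }
    for provider_id in preference:
        entry = online.get(provider_id)
        if entry is not None and model in entry.get("models", []):
            return provider_id
    return None
```

Returning `None` rather than raising lets the caller drop to the deterministic fallback when no provider can serve the model.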

## Batch Processing Design Rules

1. Batch jobs must be persisted.
2. Batch items must be individually addressable by `request_id`.
3. Every batch job must record:
   - provider
   - model
   - submitted payload hash
   - start/end timestamps
   - failure reason
4. Oracle must not block the main request thread for large batches.
5. Any DB writeback generated from a batch must go through approval tables, not direct execution.
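
The fields required by rule 3 could be captured in a record like the one below. This is a sketch, not the persisted schema: the class and field names are hypothetical, the mapping onto `workflow_agent_runs` columns is not shown, and SHA-256 over canonical JSON is one reasonable choice of payload hash rather than a documented one.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


def payload_hash(payload: dict) -> str:
    """SHA-256 of the payload serialized with sorted keys and no extra spaces."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class BatchJobRecord:
    job_id: str
    provider: str
    model: str
    submitted_payload_hash: str
    started_at: datetime
    ended_at: Optional[datetime] = None
    failure_reason: Optional[str] = None


def open_record(job_id: str, payload: dict) -> BatchJobRecord:
    """Create the audit record at submission time, before any item runs."""
    return BatchJobRecord(
        job_id=job_id,
        provider=payload["provider"],
        model=payload["model"],
        submitted_payload_hash=payload_hash(payload),
        started_at=datetime.now(timezone.utc),
    )
```

Sorting keys before hashing makes the hash independent of key order, so two logically identical submissions produce the same value.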

## Oracle-Specific Runtime Policy

For Oracle Canvas, the LLM is not the source of truth for data.

The source of truth order is:

1. canonical DB tables
2. approved dataset projections
3. codebook template corpus
4. model planner

The model is only allowed to:

- classify intent
- choose likely component families
- propose layout direction
- summarize findings

The model is not allowed to:

- invent database facts
- bypass dataset allowlists
- emit arbitrary executable code into production rendering paths
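
One way to hold the planner to these limits is to validate its output against a strict allowlist before anything downstream reads it. The sketch below is illustrative only: the plan key names and the component family names are hypothetical, chosen to mirror the four permitted activities listed above.

```python
# Hypothetical plan fields, one per permitted planner activity.
ALLOWED_PLAN_KEYS = {"intent", "component_families", "layout", "summary"}

# Hypothetical component families; the real set comes from the template corpus.
KNOWN_FAMILIES = {"table", "kpi_card", "pipeline_board"}


def validate_plan(plan: dict) -> dict:
    """Reject planner output that strays outside the permitted fields."""
    extra = set(plan) - ALLOWED_PLAN_KEYS
    if extra:
        # Fields such as raw rows, SQL, or code never reach the renderer.
        raise ValueError(f"planner emitted disallowed fields: {sorted(extra)}")
    unknown = set(plan.get("component_families", [])) - KNOWN_FAMILIES
    if unknown:
        raise ValueError(f"unknown component families: {sorted(unknown)}")
    return plan
```

Rejecting unknown fields outright, instead of silently dropping them, makes policy violations visible in logs and keeps the data path limited to the canonical tables and approved projections.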

## Current Production Readiness Assessment

### Ready now

- Oracle Canvas frontend-to-backend v1 route family
- codebook-backed template retrieval path
- safe DB execution gateway
- merge/fork/revision path
- deterministic fallback path
- runtime LLM provider inventory
- runtime single-chat execution
- runtime persisted batch execution through `workflow_agent_runs`
- Oracle planner fallback through the shared runtime LLM service

### Still needs explicit implementation if the team approves

- per-model selection UI in Catalyst or Oracle controls
- dedicated `runtime_llm_jobs` / `runtime_llm_job_items` tables if the team wants stronger audit/query ergonomics than `workflow_agent_runs`
- explicit Nemoclaw vs Ollama operator switch in a production admin surface
- richer provider health telemetry beyond simple reachability

## Recommended Next Build Steps

1. Keep the dedicated runtime router as the single entry point for LLM traffic (already in place per the Runtime LLM API section):
   - `backend/api/routes_runtime_llm.py`
2. Add DB tables if the team wants stronger audit/query ergonomics than `workflow_agent_runs`:
   - `runtime_llm_jobs`
   - `runtime_llm_job_items`
   - `runtime_llm_job_results`
3. Implement provider adapters:
   - Nemoclaw adapter
   - Ollama adapter
4. Expose provider status to Catalyst/Oracle settings surfaces.
5. Keep Oracle Canvas on the current codebook-first path even after LLM batching exists.

## Bottom Line

Oracle Canvas should be treated as a codebook-guided analytical surface with optional LLM planning, not as a raw chat-to-SQL toy.

The production-safe architecture is:

- Linux origin runs the application and DB access
- GPU box runs ComfyUI and model inference
- Oracle retrieves from the merged codebook first
- DB access stays whitelisted
- Nemoclaw and Ollama sit behind a documented provider interface
- batch processing is a separate runtime service contract, not an implicit side effect of the canvas endpoint