sagnik e519339cc9 feat: Oracle Canvas Component Schema and Qwen 3.6 integration (#31 )

Co-authored-by: Sagnik <sagnik7896@gmail.com>
Reviewed-on: sagnik/Project_Velocity#31

2026-04-20 01:43:39 +05:30

Oracle Canvas Runtime and Ollama Batch Architecture

Date: 2026-04-19
Repo: Project_Velocity

Purpose

This document defines the current production Oracle Canvas runtime path, the intended Ollama/Nemoclaw model-routing strategy, and the target batch-processing API shape the team can use if Velocity exposes Oracle or coding-agent capabilities through the local model stack.

This is the operator and engineering artifact. It exists to remove ambiguity.

Runtime Topology

Linux origin box

Role:

hosts Velocity frontend
hosts FastAPI backend
hosts PostgreSQL and application services
terminates app-origin requests under the public site path

Primary concern:

application routing
auth/session enforcement
Oracle API execution
CRM/intelligence/inventory data access

GPU box

Role:

hosts ComfyUI
hosts heavy model runtime
hosts Ollama / Nemoclaw execution plane
stores runtime/model payloads on NVMe only

Primary concern:

inference
media generation
model serving
agent runtime workloads

Ingress

Role:

stable public entry for GPU-backed services
hides raw GPU host details from application code

Non-negotiable rule:

never wire Oracle or frontend code to a raw GPU public IP
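One way to enforce this rule at configuration-load time is a small guard that rejects any endpoint whose host is a literal IP address. This is a minimal sketch, not existing Velocity code: the function names and example hostnames are hypothetical. It is deliberately stricter than the rule, since it rejects private IPs as well as public ones.

```python
import ipaddress
from urllib.parse import urlparse


def is_raw_ip_url(url: str) -> bool:
    """Return True when the URL's host is a literal IPv4/IPv6 address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False  # host is a DNS name, e.g. the ingress hostname
    return True


def assert_ingress_url(url: str) -> str:
    """Hypothetical config guard: only ingress hostnames may be wired in."""
    if is_raw_ip_url(url):
        raise ValueError(f"raw IP endpoints are not allowed: {url}")
    return url
```

Calling assert_ingress_url on every GPU-facing setting at startup would make a misconfigured raw IP fail fast instead of leaking into application code.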

Oracle Canvas Current Execution Path

The production-safe Oracle path is now:

1. User submits prompt from Oracle Canvas frontend.
2. Frontend calls:
   - /api/oracle/v1/canvas-pages/{page_id}/prompts
3. FastAPI Oracle orchestrator:
   - loads user context
   - retrieves best codebook matches
   - builds a safe retrieval plan
   - queries approved datasets from PostgreSQL
   - produces JSON Canvas components
   - commits a page revision
4. Frontend reloads/reconciles the canvas state and renders the new blocks.

Current Oracle Backend Families

Live today

/api/oracle/v1/me
/api/oracle/v1/canvas-pages/{page_id}
/api/oracle/v1/canvas-pages/{page_id}/prompts
/api/oracle/v1/canvas-pages/{page_id}/forks
/api/oracle/v1/canvas-pages/{page_id}/rollback
/api/oracle/v1/canvas-pages/{page_id}/revisions
/api/oracle/v1/component-templates
/api/oracle/v1/component-templates/synthesize
/api/oracle/v1/merge-requests
/api/oracle/v1/merge-requests/{mr_id}/review
/ws/oracle/canvas/{page_id}

Template taxonomy routes

/api/oracle/template-chapters
/api/oracle/template-subchapters
/api/oracle/component-templates
/api/oracle/component-templates/{id}
/api/oracle/component-templates/{id}/seed
/api/oracle/component-templates/synthetic-jobs

Prompt Analysis Path

Oracle should not rely on one monolithic LLM call.

The correct production split is:

1. codebook retrieval
2. safe dataset selection
3. optional LLM planning
4. live DB fetch
5. JSON Canvas synthesis
6. revision commit
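The six stages above can be sketched as one function in which the LLM planner is an optional, failure-tolerant step. This is an illustrative sketch only: the dataset names, the CanvasRevision type, and the callable signatures are assumptions, not the real orchestrator's interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical allowlist; the real approved dataset names are not listed here.
APPROVED_DATASETS = frozenset({"crm_followups", "inventory_units"})


@dataclass
class CanvasRevision:
    datasets: list
    components: list
    used_llm: bool


def run_prompt_pipeline(
    prompt: str,
    retrieve: Callable[[str], list],
    fetch: Callable[[str], list],
    plan_with_llm: Optional[Callable[[str, list], list]] = None,
) -> CanvasRevision:
    # 1. codebook retrieval: candidate datasets for this prompt
    candidates = retrieve(prompt)
    # 2. safe dataset selection: drop anything that is not allowlisted
    datasets = [d for d in candidates if d in APPROVED_DATASETS]
    # 3. optional LLM planning: may only narrow or reorder the safe selection
    used_llm = False
    if plan_with_llm is not None:
        try:
            planned = plan_with_llm(prompt, datasets)
            datasets = [d for d in planned if d in datasets]
            used_llm = True
        except Exception:
            pass  # degraded LLM runtime: keep the deterministic selection
    # 4. live DB fetch and 5. JSON Canvas synthesis (one table block per dataset)
    components = [{"type": "table", "dataset": d, "rows": fetch(d)} for d in datasets]
    # 6. revision commit is stubbed: the caller would persist this object
    return CanvasRevision(datasets, components, used_llm)
```

Because the planner can only choose among datasets that already passed the allowlist, a failing or misbehaving model cannot widen what gets queried.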

Why this split is correct

It reduces hallucination in UI structure.
It keeps DB access whitelisted and auditable.
It allows Oracle to keep working even when the LLM runtime is degraded.
It keeps the Oracle Canvas deterministic enough for operational use.

Current Model Routing Truth

Present reality

The current Oracle backend has these runtime modes:

codebook_retrieval
- preferred when the prompt clearly matches the Oracle template corpus
nemoclaw_hosted
- used when NEMOCLAW_API_URL and NEMOCLAW_API_KEY are configured and reachable
deterministic_fallback
- used when the LLM planner is unavailable
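The selection among these three modes could look roughly like the sketch below. The function name and its boolean inputs are assumptions made for illustration; only the mode names and the two environment variable names come from this document.

```python
import os
from typing import Mapping, Optional


def select_runtime_mode(
    codebook_match: bool,
    planner_reachable: bool,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick a runtime mode; `env` defaults to the process environment."""
    env = os.environ if env is None else env
    if codebook_match:
        return "codebook_retrieval"
    # Both variables must be set (non-empty) for the hosted planner to count.
    configured = bool(env.get("NEMOCLAW_API_URL")) and bool(env.get("NEMOCLAW_API_KEY"))
    if configured and planner_reachable:
        return "nemoclaw_hosted"
    return "deterministic_fallback"
```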

What Nemoclaw currently means in code

Current dispatch abstraction:

backend/services/nemoclaw_runtime.py

This file is still a light dispatch envelope, not a fully featured provider router.

Recommended production provider stack

Provider order:

1. codebook retrieval layer
2. Nemoclaw planner endpoint
3. local Ollama fallback
4. deterministic fallback
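A provider order like this is commonly implemented as a first-success chain. The sketch below assumes each provider is a callable that returns a plan dict, returns None when it has nothing to offer, or raises when it is unavailable. That interface is hypothetical and is not what backend/services/nemoclaw_runtime.py exposes today.

```python
from typing import Callable, Optional

# Hypothetical provider interface: prompt in, plan dict (or None) out.
Provider = Callable[[str], Optional[dict]]


def plan_with_fallback(prompt: str, providers: list) -> tuple:
    """Try (name, provider) pairs in order; return (name, plan) from the first success."""
    for name, call in providers:
        try:
            plan = call(prompt)
        except Exception:
            continue  # provider down or erroring: fall through to the next one
        if plan is not None:
            return name, plan
    raise RuntimeError("no provider produced a plan")
```

Keeping the deterministic fallback as the last entry, and making it always return a plan, means the RuntimeError should only be reachable through misconfiguration.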

Recommended Ollama Model Policy

Default planning / Oracle analysis model

Use a local reasoning-capable model behind Ollama when Nemoclaw is not available or when the team wants deterministic private execution.

Recommended candidate:

qwen3.6:35b-a3b

Reason:

strong agentic coding and structured reasoning profile
local execution path through Ollama
realistic fit for GPU-box-hosted inference

Deployment command

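Example (ollama run downloads the model on first use and then opens an interactive session; ollama pull only downloads it):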

ollama run qwen3.6:35b-a3b

Routing rule

Oracle prompt planning:
- small to medium prompts: local Ollama qwen3.6:35b-a3b
- larger multi-step analytical plans: Nemoclaw planner if available
Coding-agent batch workloads:
- Ollama first for local/private jobs
- Nemoclaw for heavier orchestration when the runtime is healthy

Runtime LLM API

The backend now exposes a first-class runtime LLM family:

GET /api/runtime/llm/providers
POST /api/runtime/llm/chat
POST /api/runtime/llm/batch
GET /api/runtime/llm/jobs/{job_id}
GET /api/runtime/llm/jobs/{job_id}/results

This router is defined in:

backend/api/routes_runtime_llm.py

The current persistence path uses the existing canonical table:

workflow_agent_runs

That means batch jobs are now persisted against the live Velocity schema without requiring a new table family before the first production rollout.

Implemented Batch Processing API

This is no longer only a proposal. The following contract family exists now and can be used by Oracle or future coding-agent surfaces.

Single request inference

POST /api/runtime/llm/chat

Payload:

{
  "provider": "ollama",
  "model": "qwen3.6:35b-a3b",
  "system_prompt": "You are Oracle Planner.",
  "messages": [
    { "role": "user", "content": "Build a CRM pipeline view for high-intent NRI buyers." }
  ],
  "temperature": 0.2,
  "response_format": "json"
}
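A client could assemble this request with the standard library as sketched below. The base URL is a placeholder, the helper name is hypothetical, and authentication headers are omitted because this document does not specify them. The response shape is not documented here either, so the sketch stops at building the request.

```python
import json
import urllib.request


def build_chat_request(
    base_url: str,
    user_content: str,
    model: str = "qwen3.6:35b-a3b",
    provider: str = "ollama",
    system_prompt: str = "You are Oracle Planner.",
    temperature: float = 0.2,
) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/runtime/llm/chat."""
    payload = {
        "provider": provider,
        "model": model,
        "system_prompt": system_prompt,
        "messages": [{"role": "user", "content": user_content}],
        "temperature": temperature,
        "response_format": "json",
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/runtime/llm/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. A real caller would add whatever session or token header the backend's auth layer expects.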

Batch submission

POST /api/runtime/llm/batch

Payload:

{
  "provider": "ollama",
  "model": "qwen3.6:35b-a3b",
  "job_type": "oracle_canvas_planning",
  "items": [
    {
      "request_id": "req_001",
      "messages": [
        { "role": "user", "content": "Show overdue high-QD follow-ups." }
      ],
      "response_format": "json"
    },
    {
      "request_id": "req_002",
      "messages": [
        { "role": "user", "content": "Build a Kolkata luxury inventory comparison block." }
      ],
      "response_format": "json"
    }
  ]
}
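Since batch items must be addressable by request_id, it helps to reject duplicate IDs before submitting. The builder below is a hedged sketch: the function name and the (request_id, content) input shape are assumptions, while the emitted keys mirror the payload above.

```python
def build_batch_payload(
    items: list,
    job_type: str = "oracle_canvas_planning",
    provider: str = "ollama",
    model: str = "qwen3.6:35b-a3b",
) -> dict:
    """Turn (request_id, user_content) pairs into a /api/runtime/llm/batch body."""
    seen = set()
    batch_items = []
    for request_id, content in items:
        if request_id in seen:
            raise ValueError(f"duplicate request_id: {request_id}")
        seen.add(request_id)
        batch_items.append(
            {
                "request_id": request_id,
                "messages": [{"role": "user", "content": content}],
                "response_format": "json",
            }
        )
    return {
        "provider": provider,
        "model": model,
        "job_type": job_type,
        "items": batch_items,
    }
```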

Batch status

GET /api/runtime/llm/jobs/{job_id}

Response:

{
  "job_id": "job_123",
  "status": "running",
  "provider": "ollama",
  "model": "qwen3.6:35b-a3b",
  "submitted_count": 2,
  "completed_count": 1,
  "failed_count": 0
}
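A poller can derive progress from the three counters without depending on status strings, since only "running" is shown here and other status values are not documented. The helper below is an illustrative sketch with a hypothetical name and output shape.

```python
def summarize_job(status: dict) -> dict:
    """Derive pending count and progress from a batch status response."""
    submitted = status["submitted_count"]
    settled = status["completed_count"] + status["failed_count"]
    return {
        "pending": max(submitted - settled, 0),
        "all_items_settled": settled >= submitted,
        # An empty batch is treated as fully processed.
        "progress": settled / submitted if submitted else 1.0,
    }
```

For the response above this yields one pending item and a progress of 0.5.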

Batch results

GET /api/runtime/llm/jobs/{job_id}/results

Providers inventory

GET /api/runtime/llm/providers

Example response:

{
  "providers": [
    {
      "id": "nemoclaw",
      "status": "online",
      "models": ["nemotron", "remote_default"]
    },
    {
      "id": "ollama",
      "status": "online",
      "models": ["qwen3.6:35b-a3b"]
    }
  ]
}
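A caller can use this inventory to check which online providers serve a given model before dispatching. The sketch below assumes the response shape shown above. The function name is hypothetical, and "online" is the only status value this document shows.

```python
def providers_for_model(inventory: dict, model: str) -> list:
    """Return IDs of providers that are online and list the requested model."""
    return [
        provider["id"]
        for provider in inventory.get("providers", [])
        if provider.get("status") == "online" and model in provider.get("models", [])
    ]
```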

Batch Processing Design Rules

Batch jobs must be persisted.
Batch items must be individually addressable by request_id.
Every batch job must record:
- provider
- model
- submitted payload hash
- start/end timestamps
- failure reason
Oracle must not block the main request thread for large batches.
Any DB writeback generated from a batch must go through approval tables, not direct execution.
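The per-job fields listed above could be captured in a record like the one sketched below. The class and field names are hypothetical and do not reflect the workflow_agent_runs schema. Hashing a canonical JSON encoding (sorted keys, fixed separators) is one reasonable way to make the payload hash stable across key order.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


def payload_hash(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class BatchJobRecord:
    provider: str
    model: str
    submitted_payload_hash: str
    started_at: datetime
    ended_at: Optional[datetime] = None
    failure_reason: Optional[str] = None


def open_job_record(payload: dict) -> BatchJobRecord:
    """Create the record at submission time; end fields are filled in later."""
    return BatchJobRecord(
        provider=payload["provider"],
        model=payload["model"],
        submitted_payload_hash=payload_hash(payload),
        started_at=datetime.now(timezone.utc),
    )
```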

Oracle-Specific Runtime Policy

For Oracle Canvas, the LLM is not the source of truth for data.

The source of truth order is:

1. canonical DB tables
2. approved dataset projections
3. codebook template corpus
4. model planner

The model is only allowed to:

classify intent
choose likely component families
propose layout direction
summarize findings

The model is not allowed to:

invent database facts
bypass dataset allowlists
emit arbitrary executable code into production rendering paths
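These allow and deny lists can be enforced by validating the planner's output before anything downstream consumes it. The sketch below is illustrative only: the plan keys and dataset names are invented for the example, and a real validator would follow the actual JSON Canvas component schema.

```python
# Hypothetical plan fields matching the four permitted model activities,
# plus a dataset list that is checked against the allowlist.
ALLOWED_PLAN_KEYS = frozenset({"intent", "component_families", "layout", "summary", "datasets"})
# Hypothetical allowlist; real dataset names are not given in this document.
ALLOWED_DATASETS = frozenset({"crm_followups", "inventory_units"})


def validate_plan(plan: dict) -> list:
    """Return a list of policy violations; an empty list means the plan passes."""
    errors = []
    unexpected = sorted(set(plan) - ALLOWED_PLAN_KEYS)
    if unexpected:
        # Unknown keys (e.g. "sql" or "script") are rejected rather than ignored.
        errors.append(f"unexpected keys: {unexpected}")
    for dataset in plan.get("datasets", []):
        if dataset not in ALLOWED_DATASETS:
            errors.append(f"dataset not allowlisted: {dataset}")
    return errors
```

Rejecting unknown keys outright, rather than silently dropping them, keeps any attempt to smuggle SQL or script content visible in logs.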

Current Production Readiness Assessment

Ready now

Oracle Canvas frontend-to-backend v1 route family
codebook-backed template retrieval path
safe DB execution gateway
merge/fork/revision path
deterministic fallback path
runtime LLM provider inventory
runtime single-chat execution
runtime persisted batch execution through workflow_agent_runs
Oracle planner fallback through the shared runtime LLM service

Still needs explicit implementation if the team approves

per-model selection UI in Catalyst or Oracle controls
dedicated runtime_llm_jobs / runtime_llm_job_items tables if the team wants stronger audit/query ergonomics than workflow_agent_runs
explicit Nemoclaw vs Ollama operator switch in a production admin surface
richer provider health telemetry beyond simple reachability

Recommended Next Build Steps

Extend the existing dedicated runtime router:
- backend/api/routes_runtime_llm.py
Add dedicated DB tables if the team wants to move batch persistence off workflow_agent_runs:
- runtime_llm_jobs
- runtime_llm_job_items
- runtime_llm_job_results
Implement provider adapters:
- Nemoclaw adapter
- Ollama adapter
Expose provider status to Catalyst/Oracle settings surfaces.
Keep Oracle Canvas on the current codebook-first path even after LLM batching exists.

Bottom Line

Oracle Canvas should be treated as a codebook-guided analytical surface with optional LLM planning, not as a raw chat-to-SQL toy.

The production-safe architecture is:

Linux origin runs the application and DB access
GPU box runs ComfyUI and model inference
Oracle retrieves from the merged codebook first
DB access stays whitelisted
Nemoclaw and Ollama sit behind a documented provider interface
batch processing is a separate runtime service contract, not an implicit side effect of the canvas endpoint
