feat: Oracle Canvas Component Schema and Qwen 3.6 integration (#31)

Co-authored-by: Sagnik <sagnik7896@gmail.com>
Reviewed-on: sagnik/Project_Velocity#31
2026-04-20 01:43:39 +05:30
parent 57144e1bd3
commit e519339cc9
129 changed files with 625213 additions and 262 deletions
--- a/Context/Oracle
+++ b/Context/Oracle
@@ -0,0 +1,209 @@
+# Oracle Canvas Codebook Production Truth
+
+Date: 2026-04-19  
+Repo: `Project_Velocity`
+
+## Purpose
+
+This document freezes the current production truth for the Oracle Canvas template/codebook system, the expanded GPT and Claude corpora, the runtime merge policy, and the current rendering limits that matter for delivery.
+
+This is not a concept note. It is the implementation-facing truth for the Oracle template layer as it exists now.
+
+## Current Source Of Truth
+
+The Oracle template book is split across four layers:
+
+1. Structural database schema
+   - `backend/oracle/schema_extension_v2.sql`
+   - Defines:
+     - `oracle_template_chapters`
+     - `oracle_template_subchapters`
+     - `oracle_template_seed_examples`
+     - chapter/subchapter linkage on `oracle_component_templates`
+     - `oracle_synthetic_generation_jobs`
+
+2. Runtime seed DB
+   - `backend/oracle/oracle_template_seed_db.json`
+   - This is the lightweight fallback DB shipped with the runtime.
+   - It is structurally correct but incomplete relative to the intended corpus.
+
+3. Expanded authoring corpora
+   - GPT pack:
+     - `Project_Velocity/.Agent Context/Sprint 1/Sayan Multi-Surface and Oracle Delivery Pack/Sample JSON Schema/GPT 5.4/oracle_canvas_json_expansion_pack/db/oracle_template_seed_db_expanded_v1.pretty.json`
+   - Claude pack:
+     - `Project_Velocity/.Agent Context/Sprint 1/Sayan Multi-Surface and Oracle Delivery Pack/Sample JSON Schema/Claude Sonnet 4.6/oracle_template_expansion/oracle_template_seed_db_expanded.json`
+
+4. Frozen runtime merge artifact
+   - `backend/oracle/oracle_runtime_codebook_merged.json`
+   - This is the deploy-safe merged corpus generated from the GPT and Claude packs.
+   - Production should prefer this file over the authoring packs whenever it is present.
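
The preference for the frozen artifact over the authoring packs could be expressed roughly as follows. This is a minimal sketch, not the code in `codebook_service.py`: the function name `resolve_codebook_sources` and the idea of passing the authoring-pack paths in as a list are assumptions made for illustration. Only the two file names come from this document.

```python
from pathlib import Path
from typing import List

# File names as given in this document; everything else is illustrative.
MERGED_ARTIFACT = "oracle_runtime_codebook_merged.json"
FALLBACK_SEED_DB = "oracle_template_seed_db.json"


def resolve_codebook_sources(oracle_dir: Path, authoring_packs: List[Path]) -> List[Path]:
    """Return the files to load, preferring the frozen merged artifact.

    Hypothetical helper: the real service may resolve sources differently.
    """
    merged = oracle_dir / MERGED_ARTIFACT
    if merged.is_file():
        # Deploy-safe path: no authoring-pack lookups are needed.
        return [merged]

    # Otherwise load whichever authoring packs exist, then the fallback seed DB.
    sources = [pack for pack in authoring_packs if pack.is_file()]
    fallback = oracle_dir / FALLBACK_SEED_DB
    if fallback.is_file():
        sources.append(fallback)
    return sources
```

On a Linux box that ships only `backend/oracle/`, this returns the single merged file and never touches `.Agent Context`.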
+
+## Corpus Status
+
+The expanded corpora are materially useful and production-relevant.
+
+### GPT 5.4 pack
+
+- Chapters: `6`
+- Subchapters: `24`
+- Seed examples: `1200`
+- Shape: already close to runtime needs
+- Key field for examples: `seed_examples`
+
+### Claude Sonnet 4.6 pack
+
+- Chapters: `6`
+- Subchapters: `24`
+- Examples: `1200`
+- Key field for examples: `examples`
+- Shape: close, but requires normalization into runtime form
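
The two packs store their examples under different keys (`seed_examples` for GPT, `examples` for Claude), so a loader has to accept either. The sketch below assumes the list sits at the top level of each pack's JSON object, which this document does not confirm; `extract_examples` is a hypothetical name.

```python
from typing import Any, Dict, List


def extract_examples(pack: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the example records from a pack, whichever key it uses.

    Assumes the examples are a top-level list; adjust if the packs nest them.
    """
    # GPT pack: "seed_examples". Claude pack: "examples".
    for key in ("seed_examples", "examples"):
        value = pack.get(key)
        if isinstance(value, list):
            return value
    return []
```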
+
+### Runtime fallback pack
+
+- Chapters: `6`
+- Subchapters: `24`
+- Seed examples declared in metadata: `36`
+- Seed examples physically present: fewer than the metadata declares
+- Useful only as a fallback, not as the primary production corpus
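
A gap between declared and actual counts like the one above is cheap to detect at load time. The following is a sketch only: the `metadata` / `seed_example_count` key names are hypothetical, since this document does not show the fallback DB's metadata layout.

```python
from typing import Any, Dict, Optional


def seed_count_gap(db: Dict[str, Any]) -> Optional[int]:
    """Return declared minus actual seed-example count, or None if undeclared."""
    # "metadata" and "seed_example_count" are assumed key names.
    declared = db.get("metadata", {}).get("seed_example_count")
    if declared is None:
        return None
    actual = len(db.get("seed_examples", []))
    return declared - actual
```

A positive result means the file declares more examples than it contains, which is the situation described for the fallback pack.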
+
+## Super Codebook Policy
+
+The runtime now treats the codebook as a merged corpus rather than a single-file static DB.
+
+The merge policy is:
+
+1. Load GPT pack first.
+2. Load Claude pack second.
+3. Load runtime fallback pack last.
+4. Normalize all example records to one runtime contract.
+5. Deduplicate by:
+   - `subchapter_id`
+   - `template_name`
+   - `title`
+6. Prefer in this order:
+   - GPT 5.4 examples
+   - canonical examples
+   - fallback records only when no richer example exists
+
+This behavior is implemented in:
+
+- `backend/oracle/codebook_service.py`
+- `backend/scripts/build_oracle_runtime_codebook.py`
+
+`backend/oracle/codebook_service.py` is now the effective runtime “super codebook” layer.
+
+The generated runtime artifact currently contains the merged deployable corpus and is suitable for Linux-box deployment without requiring `.Agent Context` lookups at request time.
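
The dedup-and-prefer policy in this section can be sketched as below. It assumes each normalized record carries a `source` tag with one of the values `gpt`, `canonical`, or `fallback`; that tag and the function names are illustrative assumptions, not the actual contract in `codebook_service.py`.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Lower rank wins. The "source" tag values are assumptions of this sketch.
SOURCE_PRIORITY = {"gpt": 0, "canonical": 1, "fallback": 2}

Key = Tuple[str, str, str]


def _rank(record: Dict[str, Any]) -> int:
    # Unknown sources sort after every known one.
    return SOURCE_PRIORITY.get(record.get("source"), len(SOURCE_PRIORITY))


def dedup_key(record: Dict[str, Any]) -> Key:
    """Build the (subchapter_id, template_name, title) key from the policy."""
    return (
        record.get("subchapter_id", ""),
        record.get("template_name", ""),
        record.get("title", ""),
    )


def merge_examples(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep one record per dedup key, preferring the highest-priority source."""
    best: Dict[Key, Dict[str, Any]] = {}
    for record in records:
        key = dedup_key(record)
        current = best.get(key)
        if current is None or _rank(record) < _rank(current):
            best[key] = record
    return list(best.values())
```

Because the winner is chosen by rank rather than by arrival order, the result does not depend on which pack happens to be loaded first.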
+
+## What The Runtime Actually Uses
+
+The runtime no longer needs to rely on hardcoded template lists in the Oracle v1 router.
+
+The codebook service now provides:
+
+- merged corpus loading
+- search over both corpora
+- normalized template listing
+- best-match template synthesis from a user prompt
+
+Primary runtime functions:
+
+- `codebook_service.stats()`
+- `codebook_service.list_templates(...)`
+- `codebook_service.search_examples(prompt, limit=...)`
+- `codebook_service.synthesize_template(prompt, data_shapes=...)`
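
To make the retrieval step concrete, here is a deliberately naive stand-in for prompt-to-example search. It is not the implementation behind `codebook_service.search_examples`; the token-overlap scoring, the function name `naive_search_examples`, and the choice to match on `title` and `template_name` are all assumptions for illustration.

```python
import re
from typing import Any, Dict, List, Set


def _tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def naive_search_examples(
    examples: List[Dict[str, Any]], prompt: str, limit: int = 5
) -> List[Dict[str, Any]]:
    """Rank examples by how many prompt tokens appear in title/template_name."""
    wanted = _tokens(prompt)
    scored = []
    for example in examples:
        haystack = _tokens(f"{example.get('title', '')} {example.get('template_name', '')}")
        score = len(wanted & haystack)
        if score:
            scored.append((score, example))
    # list.sort is stable, so ties keep their corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [example for _, example in scored[:limit]]
```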
+
+## Current Supported Runtime Output Families
+
+The expanded corpora include more component types than the current frontend renderer supports directly.
+
+The current production-safe strategy is:
+
+1. keep the full codebook corpus
+2. map high-variety codebook component families into a smaller supported runtime renderer set
+3. let Oracle render reliably today instead of failing on unsupported component types
+
+### Supported runtime renderers today
+
+- `textCanvas`
+- `kpiTile`
+- `barChart`
+- `lineChart`
+- `geoMap`
+- `table`
+- `pipelineBoard`
+- `timeline`
+- `activityStream`
+- `errorNotice`
+
+### Codebook-to-runtime normalization policy
+
+Examples:
+
+- `summary_card`, `summary_strip`, `metric_card_group`, `gauge_stack`
+  - mapped to `kpiTile`
+- `lead_profile_card`, `property_card`, `data_table`, `leaderboard_table`, `matrix_grid`
+  - mapped to `table`
+- `interaction_timeline`, `message_thread_summary`
+  - mapped to `activityStream`
+- `heatmap`
+  - mapped to `geoMap`
+
+This is deliberate. It keeps the UI stable while preserving the larger design vocabulary inside the template book.
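
The projection listed above amounts to a lookup table with a safe default. In this sketch the mapping entries and the renderer set are taken from this document; the default of `textCanvas` for unlisted families is an assumption, as is the function name `to_runtime_type`.

```python
from typing import Dict

SUPPORTED_RENDERERS = frozenset({
    "textCanvas", "kpiTile", "barChart", "lineChart", "geoMap",
    "table", "pipelineBoard", "timeline", "activityStream", "errorNotice",
})

FAMILY_TO_RENDERER: Dict[str, str] = {
    "summary_card": "kpiTile",
    "summary_strip": "kpiTile",
    "metric_card_group": "kpiTile",
    "gauge_stack": "kpiTile",
    "lead_profile_card": "table",
    "property_card": "table",
    "data_table": "table",
    "leaderboard_table": "table",
    "matrix_grid": "table",
    "interaction_timeline": "activityStream",
    "message_thread_summary": "activityStream",
    "heatmap": "geoMap",
}

# Assumption: families with no explicit mapping degrade to a text block.
DEFAULT_RENDERER = "textCanvas"


def to_runtime_type(component_type: str) -> str:
    """Project a codebook component family onto a supported runtime renderer."""
    if component_type in SUPPORTED_RENDERERS:
        return component_type
    return FAMILY_TO_RENDERER.get(component_type, DEFAULT_RENDERER)
```

Every value returned is a member of `SUPPORTED_RENDERERS`, so the frontend never receives a component type it cannot draw.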
+
+## What Is Production-Ready Now
+
+- Oracle template DB schema exists.
+- Oracle template taxonomy APIs exist.
+- Expanded GPT and Claude corpora are available locally in the repo.
+- Runtime codebook merge and retrieval are implemented in `codebook_service.py`.
+- A frozen merged runtime codebook now exists at `backend/oracle/oracle_runtime_codebook_merged.json`.
+- Oracle v1 template listing/synthesis is being moved to the codebook-backed path.
+- Oracle backend can now emit `textCanvas` planning blocks and the frontend has a renderer for them.
+
+## What Is Still Constrained
+
+- The runtime is not yet rendering all 47+ component families natively.
+- The current system uses safe projection into supported runtime renderers.
+- The template taxonomy routes existed, but were incorrectly using `user.role` as `tenant_id`; that has been corrected toward a fixed Oracle tenant policy.
+- The lightweight fallback JSON DB remains incomplete and should not be treated as the main corpus.
+
+## What Nemoclaw / Oracle Should Use For Retrieval
+
+The correct order for Oracle prompt handling is:
+
+1. Parse prompt.
+2. Retrieve matching codebook examples from the merged corpus.
+3. Build a safe retrieval plan against allowed DB datasets.
+4. Query live CRM/intelligence/inventory datasets.
+5. Build Oracle Canvas JSON with supported runtime component types.
+6. Append to the existing canvas.
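
The six steps above can be read as a small pipeline. The sketch below wires them together with injected callables so the ordering is explicit; `handle_prompt` and its `retrieve`, `plan`, `query`, and `build` parameters are hypothetical names, and the prompt "parsing" is reduced to whitespace trimming for brevity.

```python
from typing import Any, Callable, Dict, List

Component = Dict[str, Any]


def handle_prompt(
    prompt: str,
    canvas: List[Component],
    *,
    retrieve: Callable[[str], List[Dict[str, Any]]],
    plan: Callable[[str, List[Dict[str, Any]]], Dict[str, Any]],
    query: Callable[[Dict[str, Any]], List[Dict[str, Any]]],
    build: Callable[[List[Dict[str, Any]], List[Dict[str, Any]]], List[Component]],
) -> List[Component]:
    """Run the retrieval-first flow and return the extended canvas."""
    intent = prompt.strip()                    # 1. parse prompt (trivial here)
    examples = retrieve(intent)                # 2. codebook examples from merged corpus
    retrieval_plan = plan(intent, examples)    # 3. safe plan against allowed datasets
    rows = query(retrieval_plan)               # 4. live CRM/intelligence/inventory data
    components = build(examples, rows)         # 5. supported runtime component types
    return canvas + components                 # 6. append without mutating the input
```

Keeping the codebook lookup ahead of the live query means the examples can shape which datasets the plan is allowed to touch.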
+
+The codebook is not the final UI payload by itself.
+
+It is the reference layer that guides:
+
+- component family selection
+- chapter/subchapter intent
+- layout direction
+- data-shape expectations
+- policy hints
+- backend contract hints
+
+## Recommended Near-Term Hardening
+
+1. Materialize a generated runtime codebook file if Linux deployment should not depend on `.Agent Context`.
+2. Add explicit metadata versioning to the merged corpus.
+3. Add a small admin endpoint for codebook stats and source summary.
+4. Expand renderer coverage incrementally rather than trying to support all component families at once.
+5. Add a batch offline export path if the team wants a frozen deploy artifact.
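
For the metadata-versioning item, one possible shape is a small header computed when the merged corpus is built. All field names here (`codebook_version`, `example_count`, `sources`, `content_sha256`) are hypothetical; the sketch only shows that a content hash over a canonical JSON dump gives a stable fingerprint for the corpus.

```python
import hashlib
import json
from typing import Any, Dict, List


def build_codebook_metadata(
    examples: List[Dict[str, Any]], sources: List[str], version: str
) -> Dict[str, Any]:
    """Return a version header for a merged corpus (field names are assumed)."""
    # sort_keys makes the dump independent of dict key order inside each record.
    payload = json.dumps(examples, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return {
        "codebook_version": version,
        "example_count": len(examples),
        "sources": sorted(sources),
        "content_sha256": hashlib.sha256(payload).hexdigest(),
    }
```

The same header could back the proposed admin stats endpoint, since it already carries the count and source summary.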
+
+## Operator Bottom Line
+
+The Oracle “book with chapters and JSON schema examples” is real and already useful.
+
+The correct production interpretation is:
+
+- DB schema and APIs are already present
+- GPT and Claude expansion packs are the real high-value corpus
+- `backend/oracle/codebook_service.py` is the runtime super-codebook layer
+- Oracle should retrieve from this merged corpus first, then query live DB data, then render supported JSON Canvas components