Files

sagnik 4645ff737b feat: Complete code integration of modules (#18 )

The complete code integration is done.

Co-authored-by: Sagnik <sagnik7896@gmail.com>
Reviewed-on: #18

2026-04-12 19:20:14 +05:30

12 KiB

Raw Blame History

NemoClaw Setup Truth

Updated: April 12, 2026

1. Purpose

This document records the actual NemoClaw-related deployment state for Project Velocity. It explains what exists, where it exists, why it exists, which ports are involved, and how the reasoning path works today.

This is not the original intended architecture. This is the current operational truth.

2. High-Level Summary

Project Velocity uses the term "NemoClaw" for the reasoning and prompt layer attached to the Sentinel QD Engine. In practice, this is now split into three different concerns:

Prompted reasoning used by the FastAPI backend
OpenShell / gateway infrastructure that remains installed on the AWS node
Python-native append layers used by Oracle planning, MCP-style tool registration, and workflow dispatch preview

The active FastAPI inference path is NVIDIA-hosted OpenAI-compatible chat completions.

The OpenShell gateway and Ollama are still installed and running as adjacent infrastructure, but they are not the active primary scoring path used by backend/services/nemoclaw_client.py.

The root codebase now also includes Python-native compatibility layers inspired by Sourik's Go runtime:

backend/services/nemoclaw_runtime.py
backend/services/mcp_registry.py
backend/oracle/persona_service.py

These append the current root without replacing the active NVIDIA-hosted inference path.

3. Node and Network Truth

AWS region: us-east-1 Current public IP: 54.152.236.10 SSH user: ubuntu

Port Map

22 SSH access to the AWS node.

443 nginx TLS reverse proxy. Public entry point for the backend.

127.0.0.1:8001 FastAPI/Uvicorn backend. Not directly public.

127.0.0.1:5432 PostgreSQL. Local-only.

8080 OpenShell/NemoClaw gateway target. Internal service path for gateway bootstrap and sandbox-related flows.

11434 Local Ollama runtime. Installed and reachable on the node, but not the current primary backend scoring path.

/api/videos/marketing Backend catalog endpoint for Sentinel live-session marketing videos.

4. File and Directory Layout

NVMe-backed runtime directories

/opt/dlami/nvme/velocity/current Active backend code.

/opt/dlami/nvme/velocity/env Environment file used by velocity-backend.service.

/opt/dlami/nvme/velocity/venv Python virtual environment for the backend.

/opt/dlami/nvme/velocity/tls TLS cert and key used by nginx.

/opt/dlami/nvme/nemoclaw/prompts Prompt files used by the backend reasoning client.

/opt/dlami/nvme/assets/videos Runtime marketing-video directory served by FastAPI static assets.

/opt/dlami/nvme/assets/videos/catalog.json Optional checked catalog that controls video ordering, labels, and display metadata for the live-session picker.

/opt/dlami/nvme/pgdata/14/velocity PostgreSQL 14 data directory.

Repo paths

backend/services/nemoclaw_client.py Primary reasoning client used by the FastAPI backend.

backend/services/nemoclaw_runtime.py Python-native append layer for workflow dispatch planning, webhook verification, and claim-style helper behavior.

backend/services/mcp_registry.py Python-native MCP/search tool registry append layer used by Oracle helper surfaces.

backend/oracle/persona_service.py Subordinate Oracle persona planning layer that recommends component templates, renders prompt assets, and augments Oracle v1.

backend/api/routes_crm.py Root PostgreSQL-first CRM append layer for leads, chat_logs, kanban, and analytics routes.

backend/api/routes_oracle.py Root Oracle helper append layer for workflow preview and MCP tool discovery.

backend/oracle/router_v1.py Mounted Oracle v1 API surface for canvas, prompts, persona helpers, and collaboration.

backend/routers/videos.py Marketing-video catalog endpoint for the Sentinel live-session picker.

backend/config/marketing_videos.catalog.json Checked source catalog for the four current property walkthrough videos.

backend/nemoclaw_prompts/qd_calculator.md QD scoring prompt.

backend/nemoclaw_prompts/lead_tagger.md Lead enrichment prompt.

backend/nemoclaw_prompts/cctv_profiler.md CCTV vehicle and plate profiling prompt.

backend/scripts/nemoclaw_deploy.sh Historical deployment/bootstrap script for OpenShell/Ollama-style setup. Useful as reference, but no longer fully aligned with the active NVIDIA-primary truth.

5. Services

`velocity-backend.service`

Purpose: Runs FastAPI/Uvicorn from the NVMe release tree.

Why it exists: Provides the production API and WebSocket layer for Sentinel, Vault, Scenes, CCTV, and Auth.

Key behavior:

Reads /opt/dlami/nvme/velocity/env
Starts uvicorn backend.main:app --host 127.0.0.1 --port 8001

`nemoclaw-velocity.service`

Purpose: Bootstraps the OpenShell/NemoClaw gateway state.

Why it exists: Keeps the local gateway selection and related tooling available on the node even though FastAPI currently scores against NVIDIA directly.

Current truth:

Implemented as a non-blocking oneshot systemd unit
Leaves the service in active (exited) when successful

`nginx`

Purpose: TLS reverse proxy for the backend.

Why it exists: Exposes the backend on 443, terminates TLS, and forwards both HTTP and WebSocket traffic to Uvicorn.

`postgresql@14-velocity.service`

Purpose: Owns the NVMe-backed PostgreSQL cluster.

Why it exists: The Sentinel and Vault flows persist state in PostgreSQL, not Supabase.

6. Environment Variables

Active variables relevant to NemoClaw reasoning:

NVIDIA_API_KEY Used by the backend to authenticate against NVIDIA hosted completions.

NVIDIA_BASE_URL Set to https://integrate.api.nvidia.com/v1

NVIDIA_MODEL Set to nvidia/nemotron-3-super-120b-a12b

NVIDIA_FALLBACK_MODEL Set to nvidia/llama-3.3-nemotron-super-49b-v1

ALLOW_LOCAL_FALLBACK Currently false

NEMOCLAW_PROMPT_DIR Set to /opt/dlami/nvme/nemoclaw/prompts

Historical-but-not-primary variables:

OLLAMA_BASE_URL Still relevant if local fallback is re-enabled.

NEMOCLAW_BASE_URL No longer the primary path for backend scoring.

7. Inference Flow

Current backend inference flow

Frontend emits biometric packet over /api/sentinel/ws/perception
backend/routers/sentinel.py receives the packet
Scene context is resolved from video_scene_maps if video_asset_id and video_ts_ms are present
backend/services/nemoclaw_client.py builds an OpenAI-compatible messages payload
The backend calls NVIDIA hosted completions using nvidia/nemotron-3-super-120b-a12b
The result updates QD score state and is broadcast back over WebSocket

Current Oracle canvas planning append flow

Frontend can call /api/oracle/v1/canvas-pages/{pageId}/prompts
backend/oracle/prompt_orchestrator.py builds a retrieval plan
backend/oracle/persona_service.py recommends reusable component templates and emits a planning note block
backend/services/nemoclaw_runtime.py produces a workflow dispatch preview for ComfyUI-backed execution
backend/oracle/data_access_gateway.py runs only whitelisted PostgreSQL queries
Oracle commits the resulting components into the active canvas revision

Current CRM and analytics append flow

Root FastAPI mounts backend/api/routes_crm.py
Canonical root endpoints now exist for:
- /api/leads
- /api/leads/demographics
- /api/chat-logs
- /api/kanban/board
- /api/kanban/move
- /api/analytics/sentiment-scatter
These routes use the root asyncpg pool and PostgreSQL-first storage contract
CRM WebSocket sync is still intentionally deferred

Current lead-tagging flow

Broker or system calls /api/sentinel/tag-lead
tag_lead() uses the NVIDIA path
Lead tags are updated in leads_intelligence
LEAD_TAGGED is broadcast to notifications

Current CCTV flow

OCR/bridge posts to /api/cctv/event
profile_cctv_visitor() uses the NVIDIA path
cctv_events row is written
Session evidence is updated
Session can later be finalized through auto-mode matching

Current live-session video flow

Frontend calls GET /api/videos/marketing
Backend reads /opt/dlami/nvme/assets/videos/catalog.json if present
Backend falls back to scanning /opt/dlami/nvme/assets/videos recursively for playable files if the catalog is missing or incomplete
FastAPI serves the MP4 files through /assets/videos/...
SentinelLiveSession.tsx renders smaller preview cards that autoplay in 3-second bursts on hover and advance 10 seconds between bursts
PerceptionPlayer.tsx plays the selected asset through the same /assets/videos/... path

8. OpenShell and Ollama Truth

OpenShell and Ollama still matter, but in a narrower way than originally planned.

Ollama

Location: Runs locally on port 11434

Why it still exists:

Historical deployment compatibility
Potential local fallback if NVIDIA is disabled
OpenShell-related infrastructure expectations

OpenShell gateway

Location: Gateway target on port 8080

Why it still exists:

NemoClaw sandbox bootstrap
Local gateway control path
Operational continuity for the previously onboarded sandbox

What it is not:

It is not the current primary inference path for backend scoring
It is not the root source of truth for Oracle or CRM orchestration

8.5 Python-Native Append Responsibilities

These are now part of root truth:

Oracle persona prompt loading and render helpers live in Python, not Go
MCP/search registration lives in Python, not Go
Workflow dispatch planning for Oracle-to-Comfy orchestration lives in Python, not Go
Claim-style helper behavior is appended in Python as a compatibility layer, not as a second backend center

What remains deferred:

Full production webhook runtime parity with Sourik's Go stack
Full external search provider execution inside the MCP layer
Autonomous posting and non-root agent/webhook services

9. Prompts

Prompt source-of-truth in repo:

backend/nemoclaw_prompts/qd_calculator.md
backend/nemoclaw_prompts/lead_tagger.md
backend/nemoclaw_prompts/cctv_profiler.md

Prompt runtime location on node:

/opt/dlami/nvme/nemoclaw/prompts/qd_calculator.md
/opt/dlami/nvme/nemoclaw/prompts/lead_tagger.md
/opt/dlami/nvme/nemoclaw/prompts/cctv_profiler.md

Why copied to NVMe:

Keeps runtime prompts off the root volume
Aligns with the NVMe-first deployment strategy
Prevents storage-eviction regressions

10. Known Operational Risks

JSON compliance risk

The NVIDIA model sometimes returns malformed or partially malformed JSON for the full QD prompt. The backend now includes partial-response recovery, but this is the biggest remaining correctness risk.

Dynamic IP risk

The public IP has changed during execution. A stable Elastic IP or DNS entry is still recommended.

Trust-chain risk

nginx TLS exists, but a production-trusted certificate should replace self-signed cert material.

External producer gap

The OCR bridge script exists, but a production ONVIF/RTSP/OCR producer still needs to be pointed at the ingestion endpoint.

Catalog drift risk

If new property videos are copied to NVMe without updating catalog.json, they will still be discoverable through directory scanning, but order, title, and display color may drift from the intended broker-facing presentation.

11. Validation Commands

Health:

curl -k https://54.152.236.10/health
curl -k https://54.152.236.10/api/videos/marketing

Backend service:

sudo systemctl status velocity-backend.service

Gateway bootstrap:

sudo systemctl status nemoclaw-velocity.service

PostgreSQL:

sudo systemctl status postgresql@14-velocity.service
sudo -u postgres psql -d velocity -c '\dt'

Local inference health from backend env:

source /opt/dlami/nvme/velocity/env
PYTHONPATH=/opt/dlami/nvme/velocity/current /opt/dlami/nvme/velocity/venv/bin/python - <<'PY'
import asyncio, json
from backend.services.nemoclaw_client import health_check
print(asyncio.run(health_check()))
PY

12. What to Update If the Truth Changes

Update this document whenever any of the following change:

Public IP or DNS target
Primary inference provider
Primary model
Prompt directory
nginx port or TLS behavior
OpenShell gateway port
service unit names
NVMe runtime paths

12 KiB Raw Blame History