Files
Project_Velocity/.Agent Context/Sprint 1/nemoclaw_setup_truth.md
2026-04-12 02:02:58 +05:30

9.3 KiB

NemoClaw Setup Truth

Updated: April 2, 2026

1. Purpose

This document records the actual NemoClaw-related deployment state for Project Velocity. It explains what exists, where it exists, why it exists, which ports are involved, and how the reasoning path works today.

This is not the original intended architecture. This is the current operational truth.

2. High-Level Summary

Project Velocity uses the term "NemoClaw" for the reasoning and prompt layer attached to the Sentinel QD Engine. In practice, this is now split into two different concerns:

  1. Prompted reasoning used by the FastAPI backend
  2. OpenShell / gateway infrastructure that remains installed on the AWS node

The active FastAPI inference path is NVIDIA-hosted OpenAI-compatible chat completions.

The OpenShell gateway and Ollama are still installed and running as adjacent infrastructure, but they are not the active primary scoring path used by backend/services/nemoclaw_client.py.

3. Node and Network Truth

AWS region: us-east-1 Current public IP: 54.152.236.10 SSH user: ubuntu

Port Map

22 SSH access to the AWS node.

443 nginx TLS reverse proxy. Public entry point for the backend.

127.0.0.1:8001 FastAPI/Uvicorn backend. Not directly public.

127.0.0.1:5432 PostgreSQL. Local-only.

8080 OpenShell/NemoClaw gateway target. Internal service path for gateway bootstrap and sandbox-related flows.

11434 Local Ollama runtime. Installed and reachable on the node, but not the current primary backend scoring path.

/api/videos/marketing Backend catalog endpoint for Sentinel live-session marketing videos.

4. File and Directory Layout

NVMe-backed runtime directories

/opt/dlami/nvme/velocity/current Active backend code.

/opt/dlami/nvme/velocity/env Environment file used by velocity-backend.service.

/opt/dlami/nvme/velocity/venv Python virtual environment for the backend.

/opt/dlami/nvme/velocity/tls TLS cert and key used by nginx.

/opt/dlami/nvme/nemoclaw/prompts Prompt files used by the backend reasoning client.

/opt/dlami/nvme/assets/videos Runtime marketing-video directory served by FastAPI static assets.

/opt/dlami/nvme/assets/videos/catalog.json Optional checked catalog that controls video ordering, labels, and display metadata for the live-session picker.

/opt/dlami/nvme/pgdata/14/velocity PostgreSQL 14 data directory.

Repo paths

backend/services/nemoclaw_client.py Primary reasoning client used by the FastAPI backend.

backend/routers/videos.py Marketing-video catalog endpoint for the Sentinel live-session picker.

backend/config/marketing_videos.catalog.json Checked source catalog for the four current property walkthrough videos.

backend/nemoclaw_prompts/qd_calculator.md QD scoring prompt.

backend/nemoclaw_prompts/lead_tagger.md Lead enrichment prompt.

backend/nemoclaw_prompts/cctv_profiler.md CCTV vehicle and plate profiling prompt.

backend/scripts/nemoclaw_deploy.sh Historical deployment/bootstrap script for OpenShell/Ollama-style setup. Useful as reference, but no longer fully aligned with the active NVIDIA-primary truth.

5. Services

velocity-backend.service

Purpose: Runs FastAPI/Uvicorn from the NVMe release tree.

Why it exists: Provides the production API and WebSocket layer for Sentinel, Vault, Scenes, CCTV, and Auth.

Key behavior:

  • Reads /opt/dlami/nvme/velocity/env
  • Starts uvicorn backend.main:app --host 127.0.0.1 --port 8001

nemoclaw-velocity.service

Purpose: Bootstraps the OpenShell/NemoClaw gateway state.

Why it exists: Keeps the local gateway selection and related tooling available on the node even though FastAPI currently scores against NVIDIA directly.

Current truth:

  • Implemented as a non-blocking oneshot systemd unit
  • Leaves the service in active (exited) when successful

nginx

Purpose: TLS reverse proxy for the backend.

Why it exists: Exposes the backend on 443, terminates TLS, and forwards both HTTP and WebSocket traffic to Uvicorn.

postgresql@14-velocity.service

Purpose: Owns the NVMe-backed PostgreSQL cluster.

Why it exists: The Sentinel and Vault flows persist state in PostgreSQL, not Supabase.

6. Environment Variables

Active variables relevant to NemoClaw reasoning:

NVIDIA_API_KEY Used by the backend to authenticate against NVIDIA hosted completions.

NVIDIA_BASE_URL Set to https://integrate.api.nvidia.com/v1

NVIDIA_MODEL Set to nvidia/nemotron-3-super-120b-a12b

NVIDIA_FALLBACK_MODEL Set to nvidia/llama-3.3-nemotron-super-49b-v1

ALLOW_LOCAL_FALLBACK Currently false

NEMOCLAW_PROMPT_DIR Set to /opt/dlami/nvme/nemoclaw/prompts

Historical-but-not-primary variables:

OLLAMA_BASE_URL Still relevant if local fallback is re-enabled.

NEMOCLAW_BASE_URL No longer the primary path for backend scoring.

7. Inference Flow

Current backend inference flow

  1. Frontend emits biometric packet over /api/sentinel/ws/perception
  2. backend/routers/sentinel.py receives the packet
  3. Scene context is resolved from video_scene_maps if video_asset_id and video_ts_ms are present
  4. backend/services/nemoclaw_client.py builds an OpenAI-compatible messages payload
  5. The backend calls NVIDIA hosted completions using nvidia/nemotron-3-super-120b-a12b
  6. The result updates QD score state and is broadcast back over WebSocket

Current lead-tagging flow

  1. Broker or system calls /api/sentinel/tag-lead
  2. tag_lead() uses the NVIDIA path
  3. Lead tags are updated in leads_intelligence
  4. LEAD_TAGGED is broadcast to notifications

Current CCTV flow

  1. OCR/bridge posts to /api/cctv/event
  2. profile_cctv_visitor() uses the NVIDIA path
  3. cctv_events row is written
  4. Session evidence is updated
  5. Session can later be finalized through auto-mode matching

Current live-session video flow

  1. Frontend calls GET /api/videos/marketing
  2. Backend reads /opt/dlami/nvme/assets/videos/catalog.json if present
  3. Backend falls back to scanning /opt/dlami/nvme/assets/videos recursively for playable files if the catalog is missing or incomplete
  4. FastAPI serves the MP4 files through /assets/videos/...
  5. SentinelLiveSession.tsx renders smaller preview cards that autoplay in 3-second bursts on hover and advance 10 seconds between bursts
  6. PerceptionPlayer.tsx plays the selected asset through the same /assets/videos/... path

8. OpenShell and Ollama Truth

OpenShell and Ollama still matter, but in a narrower way than originally planned.

Ollama

Location: Runs locally on port 11434

Why it still exists:

  • Historical deployment compatibility
  • Potential local fallback if NVIDIA is disabled
  • OpenShell-related infrastructure expectations

OpenShell gateway

Location: Gateway target on port 8080

Why it still exists:

  • NemoClaw sandbox bootstrap
  • Local gateway control path
  • Operational continuity for the previously onboarded sandbox

What it is not:

  • It is not the current primary inference path for backend scoring

9. Prompts

Prompt source-of-truth in repo:

  • backend/nemoclaw_prompts/qd_calculator.md
  • backend/nemoclaw_prompts/lead_tagger.md
  • backend/nemoclaw_prompts/cctv_profiler.md

Prompt runtime location on node:

  • /opt/dlami/nvme/nemoclaw/prompts/qd_calculator.md
  • /opt/dlami/nvme/nemoclaw/prompts/lead_tagger.md
  • /opt/dlami/nvme/nemoclaw/prompts/cctv_profiler.md

Why copied to NVMe:

  • Keeps runtime prompts off the root volume
  • Aligns with the NVMe-first deployment strategy
  • Prevents storage-eviction regressions

10. Known Operational Risks

JSON compliance risk

The NVIDIA model sometimes returns malformed or partially malformed JSON for the full QD prompt. The backend now includes partial-response recovery, but this is the biggest remaining correctness risk.

Dynamic IP risk

The public IP has changed during execution. A stable Elastic IP or DNS entry is still recommended.

Trust-chain risk

nginx TLS exists, but a production-trusted certificate should replace self-signed cert material.

External producer gap

The OCR bridge script exists, but a production ONVIF/RTSP/OCR producer still needs to be pointed at the ingestion endpoint.

Catalog drift risk

If new property videos are copied to NVMe without updating catalog.json, they will still be discoverable through directory scanning, but order, title, and display color may drift from the intended broker-facing presentation.

11. Validation Commands

Health:

curl -k https://54.152.236.10/health
curl -k https://54.152.236.10/api/videos/marketing

Backend service:

sudo systemctl status velocity-backend.service

Gateway bootstrap:

sudo systemctl status nemoclaw-velocity.service

PostgreSQL:

sudo systemctl status postgresql@14-velocity.service
sudo -u postgres psql -d velocity -c '\dt'

Local inference health from backend env:

source /opt/dlami/nvme/velocity/env
PYTHONPATH=/opt/dlami/nvme/velocity/current /opt/dlami/nvme/velocity/venv/bin/python - <<'PY'
import asyncio, json
from backend.services.nemoclaw_client import health_check
print(asyncio.run(health_check()))
PY

12. What to Update If the Truth Changes

Update this document whenever any of the following change:

  • Public IP or DNS target
  • Primary inference provider
  • Primary model
  • Prompt directory
  • nginx port or TLS behavior
  • OpenShell gateway port
  • service unit names
  • NVMe runtime paths