Built the Sentinel Tab

2026-04-12 02:02:58 +05:30
parent fb656d1443
commit 075ab280ad
526 changed files with 17646 additions and 70931 deletions
--- a/1/nemoclaw_setup_truth.md
+++ b/1/nemoclaw_setup_truth.md
@@ -0,0 +1,327 @@
+# NemoClaw Setup Truth
+
+Updated: April 2, 2026
+
+## 1. Purpose
+
+This document records the actual NemoClaw-related deployment state for Project Velocity. It explains what exists, where it exists, why it exists, which ports are involved, and how the reasoning path works today.
+
+This is not the original intended architecture. This is the current operational truth.
+
+## 2. High-Level Summary
+
+Project Velocity uses the term "NemoClaw" for the reasoning and prompt layer attached to the Sentinel QD Engine. In practice, this is now split into two different concerns:
+
+1. Prompted reasoning used by the FastAPI backend
+2. OpenShell / gateway infrastructure that remains installed on the AWS node
+
+The active FastAPI inference path is NVIDIA-hosted OpenAI-compatible chat completions.
+
+The OpenShell gateway and Ollama are still installed and running as adjacent infrastructure, but they are not the active primary scoring path used by `backend/services/nemoclaw_client.py`.
+
+## 3. Node and Network Truth
+
+AWS region: `us-east-1`
+Current public IP: `54.152.236.10`
+SSH user: `ubuntu`
+
+### Port Map
+
+`22`
+SSH access to the AWS node.
+
+`443`
+nginx TLS reverse proxy. Public entry point for the backend.
+
+`127.0.0.1:8001`
+FastAPI/Uvicorn backend. Not directly public.
+
+`127.0.0.1:5432`
+PostgreSQL. Local-only.
+
+`8080`
+OpenShell/NemoClaw gateway target. Internal service path for gateway bootstrap and sandbox-related flows.
+
+`11434`
+Local Ollama runtime. Installed and reachable on the node, but not the current primary backend scoring path.
+
+`/api/videos/marketing`
+Backend catalog endpoint for Sentinel live-session marketing videos.
+
+## 4. File and Directory Layout
+
+### NVMe-backed runtime directories
+
+`/opt/dlami/nvme/velocity/current`
+Active backend code.
+
+`/opt/dlami/nvme/velocity/env`
+Environment file used by `velocity-backend.service`.
+
+`/opt/dlami/nvme/velocity/venv`
+Python virtual environment for the backend.
+
+`/opt/dlami/nvme/velocity/tls`
+TLS cert and key used by nginx.
+
+`/opt/dlami/nvme/nemoclaw/prompts`
+Prompt files used by the backend reasoning client.
+
+`/opt/dlami/nvme/assets/videos`
+Runtime marketing-video directory served by FastAPI static assets.
+
+`/opt/dlami/nvme/assets/videos/catalog.json`
+Optional checked catalog that controls video ordering, labels, and display metadata for the live-session picker.
+
+`/opt/dlami/nvme/pgdata/14/velocity`
+PostgreSQL 14 data directory.
+
+### Repo paths
+
+`backend/services/nemoclaw_client.py`
+Primary reasoning client used by the FastAPI backend.
+
+`backend/routers/videos.py`
+Marketing-video catalog endpoint for the Sentinel live-session picker.
+
+`backend/config/marketing_videos.catalog.json`
+Checked source catalog for the four current property walkthrough videos.
+
+`backend/nemoclaw_prompts/qd_calculator.md`
+QD scoring prompt.
+
+`backend/nemoclaw_prompts/lead_tagger.md`
+Lead enrichment prompt.
+
+`backend/nemoclaw_prompts/cctv_profiler.md`
+CCTV vehicle and plate profiling prompt.
+
+`backend/scripts/nemoclaw_deploy.sh`
+Historical deployment/bootstrap script for OpenShell/Ollama-style setup. Useful as reference, but no longer fully aligned with the active NVIDIA-primary truth.
+
+## 5. Services
+
+### `velocity-backend.service`
+
+Purpose:
+Runs FastAPI/Uvicorn from the NVMe release tree.
+
+Why it exists:
+Provides the production API and WebSocket layer for Sentinel, Vault, Scenes, CCTV, and Auth.
+
+Key behavior:
+- Reads `/opt/dlami/nvme/velocity/env`
+- Starts `uvicorn backend.main:app --host 127.0.0.1 --port 8001`
+
+### `nemoclaw-velocity.service`
+
+Purpose:
+Bootstraps the OpenShell/NemoClaw gateway state.
+
+Why it exists:
+Keeps the local gateway selection and related tooling available on the node even though FastAPI currently scores against NVIDIA directly.
+
+Current truth:
+- Implemented as a non-blocking `oneshot` systemd unit
+- Leaves the service in `active (exited)` when successful
+
+### `nginx`
+
+Purpose:
+TLS reverse proxy for the backend.
+
+Why it exists:
+Exposes the backend on `443`, terminates TLS, and forwards both HTTP and WebSocket traffic to Uvicorn.
+
+### `postgresql@14-velocity.service`
+
+Purpose:
+Owns the NVMe-backed PostgreSQL cluster.
+
+Why it exists:
+The Sentinel and Vault flows persist state in PostgreSQL, not Supabase.
+
+## 6. Environment Variables
+
+Active variables relevant to NemoClaw reasoning:
+
+`NVIDIA_API_KEY`
+Used by the backend to authenticate against NVIDIA hosted completions.
+
+`NVIDIA_BASE_URL`
+Set to `https://integrate.api.nvidia.com/v1`
+
+`NVIDIA_MODEL`
+Set to `nvidia/nemotron-3-super-120b-a12b`
+
+`NVIDIA_FALLBACK_MODEL`
+Set to `nvidia/llama-3.3-nemotron-super-49b-v1`
+
+`ALLOW_LOCAL_FALLBACK`
+Currently `false`
+
+`NEMOCLAW_PROMPT_DIR`
+Set to `/opt/dlami/nvme/nemoclaw/prompts`
+
+Historical-but-not-primary variables:
+
+`OLLAMA_BASE_URL`
+Still relevant if local fallback is re-enabled.
+
+`NEMOCLAW_BASE_URL`
+No longer the primary path for backend scoring.
+
+## 7. Inference Flow
+
+### Current backend inference flow
+
+1. Frontend emits biometric packet over `/api/sentinel/ws/perception`
+2. `backend/routers/sentinel.py` receives the packet
+3. Scene context is resolved from `video_scene_maps` if `video_asset_id` and `video_ts_ms` are present
+4. `backend/services/nemoclaw_client.py` builds an OpenAI-compatible messages payload
+5. The backend calls NVIDIA hosted completions using `nvidia/nemotron-3-super-120b-a12b`
+6. The result updates QD score state and is broadcast back over WebSocket
+
+### Current lead-tagging flow
+
+1. Broker or system calls `/api/sentinel/tag-lead`
+2. `tag_lead()` uses the NVIDIA path
+3. Lead tags are updated in `leads_intelligence`
+4. `LEAD_TAGGED` is broadcast to notifications
+
+### Current CCTV flow
+
+1. OCR/bridge posts to `/api/cctv/event`
+2. `profile_cctv_visitor()` uses the NVIDIA path
+3. `cctv_events` row is written
+4. Session evidence is updated
+5. Session can later be finalized through auto-mode matching
+
+### Current live-session video flow
+
+1. Frontend calls `GET /api/videos/marketing`
+2. Backend reads `/opt/dlami/nvme/assets/videos/catalog.json` if present
+3. Backend falls back to scanning `/opt/dlami/nvme/assets/videos` recursively for playable files if the catalog is missing or incomplete
+4. FastAPI serves the MP4 files through `/assets/videos/...`
+5. `SentinelLiveSession.tsx` renders smaller preview cards that autoplay in 3-second bursts on hover and advance 10 seconds between bursts
+6. `PerceptionPlayer.tsx` plays the selected asset through the same `/assets/videos/...` path
+
+## 8. OpenShell and Ollama Truth
+
+OpenShell and Ollama still matter, but in a narrower way than originally planned.
+
+### Ollama
+
+Location:
+Runs locally on port `11434`
+
+Why it still exists:
+- Historical deployment compatibility
+- Potential local fallback if NVIDIA is disabled
+- OpenShell-related infrastructure expectations
+
+### OpenShell gateway
+
+Location:
+Gateway target on port `8080`
+
+Why it still exists:
+- NemoClaw sandbox bootstrap
+- Local gateway control path
+- Operational continuity for the previously onboarded sandbox
+
+What it is not:
+- It is not the current primary inference path for backend scoring
+
+## 9. Prompts
+
+Prompt source-of-truth in repo:
+
+- `backend/nemoclaw_prompts/qd_calculator.md`
+- `backend/nemoclaw_prompts/lead_tagger.md`
+- `backend/nemoclaw_prompts/cctv_profiler.md`
+
+Prompt runtime location on node:
+
+- `/opt/dlami/nvme/nemoclaw/prompts/qd_calculator.md`
+- `/opt/dlami/nvme/nemoclaw/prompts/lead_tagger.md`
+- `/opt/dlami/nvme/nemoclaw/prompts/cctv_profiler.md`
+
+Why copied to NVMe:
+- Keeps runtime prompts off the root volume
+- Aligns with the NVMe-first deployment strategy
+- Prevents storage-eviction regressions
+
+## 10. Known Operational Risks
+
+### JSON compliance risk
+
+The NVIDIA model sometimes returns malformed or partially malformed JSON for the full QD prompt. The backend now includes partial-response recovery, but this is the biggest remaining correctness risk.
+
+### Dynamic IP risk
+
+The public IP has changed during execution. A stable Elastic IP or DNS entry is still recommended.
+
+### Trust-chain risk
+
+nginx TLS exists, but a production-trusted certificate should replace self-signed cert material.
+
+### External producer gap
+
+The OCR bridge script exists, but a production ONVIF/RTSP/OCR producer still needs to be pointed at the ingestion endpoint.
+
+### Catalog drift risk
+
+If new property videos are copied to NVMe without updating `catalog.json`, they will still be discoverable through directory scanning, but order, title, and display color may drift from the intended broker-facing presentation.
+
+## 11. Validation Commands
+
+Health:
+
+```bash
+curl -k https://54.152.236.10/health
+curl -k https://54.152.236.10/api/videos/marketing
+```
+
+Backend service:
+
+```bash
+sudo systemctl status velocity-backend.service
+```
+
+Gateway bootstrap:
+
+```bash
+sudo systemctl status nemoclaw-velocity.service
+```
+
+PostgreSQL:
+
+```bash
+sudo systemctl status postgresql@14-velocity.service
+sudo -u postgres psql -d velocity -c '\dt'
+```
+
+Local inference health from backend env:
+
+```bash
+source /opt/dlami/nvme/velocity/env
+PYTHONPATH=/opt/dlami/nvme/velocity/current /opt/dlami/nvme/velocity/venv/bin/python - <<'PY'
+import asyncio, json
+from backend.services.nemoclaw_client import health_check
+print(asyncio.run(health_check()))
+PY
+```
+
+## 12. What to Update If the Truth Changes
+
+Update this document whenever any of the following change:
+
+- Public IP or DNS target
+- Primary inference provider
+- Primary model
+- Prompt directory
+- nginx port or TLS behavior
+- OpenShell gateway port
+- service unit names
+- NVMe runtime paths