Built the Sentinel Tab
This commit is contained in:
327
.Agent Context/Sprint 1/nemoclaw_setup_truth.md
Normal file
327
.Agent Context/Sprint 1/nemoclaw_setup_truth.md
Normal file
@@ -0,0 +1,327 @@
|
||||
# NemoClaw Setup Truth
|
||||
|
||||
Updated: April 2, 2026
|
||||
|
||||
## 1. Purpose
|
||||
|
||||
This document records the actual NemoClaw-related deployment state for Project Velocity. It explains what exists, where it exists, why it exists, which ports are involved, and how the reasoning path works today.
|
||||
|
||||
This is not the original intended architecture. This is the current operational truth.
|
||||
|
||||
## 2. High-Level Summary
|
||||
|
||||
Project Velocity uses the term "NemoClaw" for the reasoning and prompt layer attached to the Sentinel QD Engine. In practice, this is now split into two different concerns:
|
||||
|
||||
1. Prompted reasoning used by the FastAPI backend
|
||||
2. OpenShell / gateway infrastructure that remains installed on the AWS node
|
||||
|
||||
The active FastAPI inference path is NVIDIA-hosted OpenAI-compatible chat completions.
|
||||
|
||||
The OpenShell gateway and Ollama are still installed and running as adjacent infrastructure, but they are not the active primary scoring path used by `backend/services/nemoclaw_client.py`.
|
||||
|
||||
## 3. Node and Network Truth
|
||||
|
||||
AWS region: `us-east-1`
|
||||
Current public IP: `54.152.236.10`
|
||||
SSH user: `ubuntu`
|
||||
|
||||
### Port Map
|
||||
|
||||
`22`
|
||||
SSH access to the AWS node.
|
||||
|
||||
`443`
|
||||
nginx TLS reverse proxy. Public entry point for the backend.
|
||||
|
||||
`127.0.0.1:8001`
|
||||
FastAPI/Uvicorn backend. Not directly public.
|
||||
|
||||
`127.0.0.1:5432`
|
||||
PostgreSQL. Local-only.
|
||||
|
||||
`8080`
|
||||
OpenShell/NemoClaw gateway target. Internal service path for gateway bootstrap and sandbox-related flows.
|
||||
|
||||
`11434`
|
||||
Local Ollama runtime. Installed and reachable on the node, but not the current primary backend scoring path.
|
||||
|
||||
`/api/videos/marketing`
|
||||
Backend catalog endpoint for Sentinel live-session marketing videos.
|
||||
|
||||
## 4. File and Directory Layout
|
||||
|
||||
### NVMe-backed runtime directories
|
||||
|
||||
`/opt/dlami/nvme/velocity/current`
|
||||
Active backend code.
|
||||
|
||||
`/opt/dlami/nvme/velocity/env`
|
||||
Environment file used by `velocity-backend.service`.
|
||||
|
||||
`/opt/dlami/nvme/velocity/venv`
|
||||
Python virtual environment for the backend.
|
||||
|
||||
`/opt/dlami/nvme/velocity/tls`
|
||||
TLS cert and key used by nginx.
|
||||
|
||||
`/opt/dlami/nvme/nemoclaw/prompts`
|
||||
Prompt files used by the backend reasoning client.
|
||||
|
||||
`/opt/dlami/nvme/assets/videos`
|
||||
Runtime marketing-video directory served by FastAPI static assets.
|
||||
|
||||
`/opt/dlami/nvme/assets/videos/catalog.json`
|
||||
Optional checked catalog that controls video ordering, labels, and display metadata for the live-session picker.
|
||||
|
||||
`/opt/dlami/nvme/pgdata/14/velocity`
|
||||
PostgreSQL 14 data directory.
|
||||
|
||||
### Repo paths
|
||||
|
||||
`backend/services/nemoclaw_client.py`
|
||||
Primary reasoning client used by the FastAPI backend.
|
||||
|
||||
`backend/routers/videos.py`
|
||||
Marketing-video catalog endpoint for the Sentinel live-session picker.
|
||||
|
||||
`backend/config/marketing_videos.catalog.json`
|
||||
Checked source catalog for the four current property walkthrough videos.
|
||||
|
||||
`backend/nemoclaw_prompts/qd_calculator.md`
|
||||
QD scoring prompt.
|
||||
|
||||
`backend/nemoclaw_prompts/lead_tagger.md`
|
||||
Lead enrichment prompt.
|
||||
|
||||
`backend/nemoclaw_prompts/cctv_profiler.md`
|
||||
CCTV vehicle and plate profiling prompt.
|
||||
|
||||
`backend/scripts/nemoclaw_deploy.sh`
|
||||
Historical deployment/bootstrap script for OpenShell/Ollama-style setup. Useful as reference, but no longer fully aligned with the active NVIDIA-primary truth.
|
||||
|
||||
## 5. Services
|
||||
|
||||
### `velocity-backend.service`
|
||||
|
||||
Purpose:
|
||||
Runs FastAPI/Uvicorn from the NVMe release tree.
|
||||
|
||||
Why it exists:
|
||||
Provides the production API and WebSocket layer for Sentinel, Vault, Scenes, CCTV, and Auth.
|
||||
|
||||
Key behavior:
|
||||
- Reads `/opt/dlami/nvme/velocity/env`
|
||||
- Starts `uvicorn backend.main:app --host 127.0.0.1 --port 8001`
|
||||
|
||||
### `nemoclaw-velocity.service`
|
||||
|
||||
Purpose:
|
||||
Bootstraps the OpenShell/NemoClaw gateway state.
|
||||
|
||||
Why it exists:
|
||||
Keeps the local gateway selection and related tooling available on the node even though FastAPI currently scores against NVIDIA directly.
|
||||
|
||||
Current truth:
|
||||
- Implemented as a non-blocking `oneshot` systemd unit
|
||||
- Leaves the service in `active (exited)` when successful
|
||||
|
||||
### `nginx`
|
||||
|
||||
Purpose:
|
||||
TLS reverse proxy for the backend.
|
||||
|
||||
Why it exists:
|
||||
Exposes the backend on `443`, terminates TLS, and forwards both HTTP and WebSocket traffic to Uvicorn.
|
||||
|
||||
### `postgresql@14-velocity.service`
|
||||
|
||||
Purpose:
|
||||
Owns the NVMe-backed PostgreSQL cluster.
|
||||
|
||||
Why it exists:
|
||||
The Sentinel and Vault flows persist state in PostgreSQL, not Supabase.
|
||||
|
||||
## 6. Environment Variables
|
||||
|
||||
Active variables relevant to NemoClaw reasoning:
|
||||
|
||||
`NVIDIA_API_KEY`
|
||||
Used by the backend to authenticate against NVIDIA hosted completions.
|
||||
|
||||
`NVIDIA_BASE_URL`
|
||||
Set to `https://integrate.api.nvidia.com/v1`
|
||||
|
||||
`NVIDIA_MODEL`
|
||||
Set to `nvidia/nemotron-3-super-120b-a12b`
|
||||
|
||||
`NVIDIA_FALLBACK_MODEL`
|
||||
Set to `nvidia/llama-3.3-nemotron-super-49b-v1`
|
||||
|
||||
`ALLOW_LOCAL_FALLBACK`
|
||||
Currently `false`
|
||||
|
||||
`NEMOCLAW_PROMPT_DIR`
|
||||
Set to `/opt/dlami/nvme/nemoclaw/prompts`
|
||||
|
||||
Historical-but-not-primary variables:
|
||||
|
||||
`OLLAMA_BASE_URL`
|
||||
Still relevant if local fallback is re-enabled.
|
||||
|
||||
`NEMOCLAW_BASE_URL`
|
||||
No longer the primary path for backend scoring.
|
||||
|
||||
## 7. Inference Flow
|
||||
|
||||
### Current backend inference flow
|
||||
|
||||
1. Frontend emits biometric packet over `/api/sentinel/ws/perception`
|
||||
2. `backend/routers/sentinel.py` receives the packet
|
||||
3. Scene context is resolved from `video_scene_maps` if `video_asset_id` and `video_ts_ms` are present
|
||||
4. `backend/services/nemoclaw_client.py` builds an OpenAI-compatible messages payload
|
||||
5. The backend calls NVIDIA hosted completions using `nvidia/nemotron-3-super-120b-a12b`
|
||||
6. The result updates QD score state and is broadcast back over WebSocket
|
||||
|
||||
### Current lead-tagging flow
|
||||
|
||||
1. Broker or system calls `/api/sentinel/tag-lead`
|
||||
2. `tag_lead()` uses the NVIDIA path
|
||||
3. Lead tags are updated in `leads_intelligence`
|
||||
4. `LEAD_TAGGED` is broadcast to notifications
|
||||
|
||||
### Current CCTV flow
|
||||
|
||||
1. OCR/bridge posts to `/api/cctv/event`
|
||||
2. `profile_cctv_visitor()` uses the NVIDIA path
|
||||
3. `cctv_events` row is written
|
||||
4. Session evidence is updated
|
||||
5. Session can later be finalized through auto-mode matching
|
||||
|
||||
### Current live-session video flow
|
||||
|
||||
1. Frontend calls `GET /api/videos/marketing`
|
||||
2. Backend reads `/opt/dlami/nvme/assets/videos/catalog.json` if present
|
||||
3. Backend falls back to scanning `/opt/dlami/nvme/assets/videos` recursively for playable files if the catalog is missing or incomplete
|
||||
4. FastAPI serves the MP4 files through `/assets/videos/...`
|
||||
5. `SentinelLiveSession.tsx` renders smaller preview cards that autoplay in 3-second bursts on hover and advance 10 seconds between bursts
|
||||
6. `PerceptionPlayer.tsx` plays the selected asset through the same `/assets/videos/...` path
|
||||
|
||||
## 8. OpenShell and Ollama Truth
|
||||
|
||||
OpenShell and Ollama still matter, but in a narrower way than originally planned.
|
||||
|
||||
### Ollama
|
||||
|
||||
Location:
|
||||
Runs locally on port `11434`
|
||||
|
||||
Why it still exists:
|
||||
- Historical deployment compatibility
|
||||
- Potential local fallback if NVIDIA is disabled
|
||||
- OpenShell-related infrastructure expectations
|
||||
|
||||
### OpenShell gateway
|
||||
|
||||
Location:
|
||||
Gateway target on port `8080`
|
||||
|
||||
Why it still exists:
|
||||
- NemoClaw sandbox bootstrap
|
||||
- Local gateway control path
|
||||
- Operational continuity for the previously onboarded sandbox
|
||||
|
||||
What it is not:
|
||||
- It is not the current primary inference path for backend scoring
|
||||
|
||||
## 9. Prompts
|
||||
|
||||
Prompt source-of-truth in repo:
|
||||
|
||||
- `backend/nemoclaw_prompts/qd_calculator.md`
|
||||
- `backend/nemoclaw_prompts/lead_tagger.md`
|
||||
- `backend/nemoclaw_prompts/cctv_profiler.md`
|
||||
|
||||
Prompt runtime location on node:
|
||||
|
||||
- `/opt/dlami/nvme/nemoclaw/prompts/qd_calculator.md`
|
||||
- `/opt/dlami/nvme/nemoclaw/prompts/lead_tagger.md`
|
||||
- `/opt/dlami/nvme/nemoclaw/prompts/cctv_profiler.md`
|
||||
|
||||
Why copied to NVMe:
|
||||
- Keeps runtime prompts off the root volume
|
||||
- Aligns with the NVMe-first deployment strategy
|
||||
- Prevents storage-eviction regressions
|
||||
|
||||
## 10. Known Operational Risks
|
||||
|
||||
### JSON compliance risk
|
||||
|
||||
The NVIDIA model sometimes returns malformed or partially malformed JSON for the full QD prompt. The backend now includes partial-response recovery, but this is the biggest remaining correctness risk.
|
||||
|
||||
### Dynamic IP risk
|
||||
|
||||
The public IP has changed during execution. A stable Elastic IP or DNS entry is still recommended.
|
||||
|
||||
### Trust-chain risk
|
||||
|
||||
nginx TLS exists, but a production-trusted certificate should replace self-signed cert material.
|
||||
|
||||
### External producer gap
|
||||
|
||||
The OCR bridge script exists, but a production ONVIF/RTSP/OCR producer still needs to be pointed at the ingestion endpoint.
|
||||
|
||||
### Catalog drift risk
|
||||
|
||||
If new property videos are copied to NVMe without updating `catalog.json`, they will still be discoverable through directory scanning, but order, title, and display color may drift from the intended broker-facing presentation.
|
||||
|
||||
## 11. Validation Commands
|
||||
|
||||
Health:
|
||||
|
||||
```bash
|
||||
curl -k https://54.152.236.10/health
|
||||
curl -k https://54.152.236.10/api/videos/marketing
|
||||
```
|
||||
|
||||
Backend service:
|
||||
|
||||
```bash
|
||||
sudo systemctl status velocity-backend.service
|
||||
```
|
||||
|
||||
Gateway bootstrap:
|
||||
|
||||
```bash
|
||||
sudo systemctl status nemoclaw-velocity.service
|
||||
```
|
||||
|
||||
PostgreSQL:
|
||||
|
||||
```bash
|
||||
sudo systemctl status postgresql@14-velocity.service
|
||||
sudo -u postgres psql -d velocity -c '\dt'
|
||||
```
|
||||
|
||||
Local inference health from backend env:
|
||||
|
||||
```bash
|
||||
source /opt/dlami/nvme/velocity/env
|
||||
PYTHONPATH=/opt/dlami/nvme/velocity/current /opt/dlami/nvme/velocity/venv/bin/python - <<'PY'
|
||||
import asyncio, json
|
||||
from backend.services.nemoclaw_client import health_check
|
||||
print(asyncio.run(health_check()))
|
||||
PY
|
||||
```
|
||||
|
||||
## 12. What to Update If the Truth Changes
|
||||
|
||||
Update this document whenever any of the following change:
|
||||
|
||||
- Public IP or DNS target
|
||||
- Primary inference provider
|
||||
- Primary model
|
||||
- Prompt directory
|
||||
- nginx port or TLS behavior
|
||||
- OpenShell gateway port
|
||||
- service unit names
|
||||
- NVMe runtime paths
|
||||
Reference in New Issue
Block a user