forked from sagnik/Project_Velocity
328 lines
9.3 KiB
Markdown
328 lines
9.3 KiB
Markdown
# NemoClaw Setup Truth
|
|
|
|
Updated: April 2, 2026
|
|
|
|
## 1. Purpose
|
|
|
|
This document records the actual NemoClaw-related deployment state for Project Velocity. It explains what exists, where it exists, why it exists, which ports are involved, and how the reasoning path works today.
|
|
|
|
This is not the original intended architecture. This is the current operational truth.
|
|
|
|
## 2. High-Level Summary
|
|
|
|
Project Velocity uses the term "NemoClaw" for the reasoning and prompt layer attached to the Sentinel QD Engine. In practice, this is now split into two different concerns:
|
|
|
|
1. Prompted reasoning used by the FastAPI backend
|
|
2. OpenShell / gateway infrastructure that remains installed on the AWS node
|
|
|
|
The active FastAPI inference path is NVIDIA-hosted OpenAI-compatible chat completions.
|
|
|
|
The OpenShell gateway and Ollama are still installed and running as adjacent infrastructure, but they are not the active primary scoring path used by `backend/services/nemoclaw_client.py`.
|
|
|
|
## 3. Node and Network Truth
|
|
|
|
AWS region: `us-east-1`
|
|
Current public IP: `54.152.236.10`
|
|
SSH user: `ubuntu`
|
|
|
|
### Port Map
|
|
|
|
`22`
|
|
SSH access to the AWS node.
|
|
|
|
`443`
|
|
nginx TLS reverse proxy. Public entry point for the backend.
|
|
|
|
`127.0.0.1:8001`
|
|
FastAPI/Uvicorn backend. Not directly public.
|
|
|
|
`127.0.0.1:5432`
|
|
PostgreSQL. Local-only.
|
|
|
|
`8080`
|
|
OpenShell/NemoClaw gateway target. Internal service path for gateway bootstrap and sandbox-related flows.
|
|
|
|
`11434`
|
|
Local Ollama runtime. Installed and reachable on the node, but not the current primary backend scoring path.
|
|
|
|
`/api/videos/marketing`
|
|
Backend catalog endpoint for Sentinel live-session marketing videos.
|
|
|
|
## 4. File and Directory Layout
|
|
|
|
### NVMe-backed runtime directories
|
|
|
|
`/opt/dlami/nvme/velocity/current`
|
|
Active backend code.
|
|
|
|
`/opt/dlami/nvme/velocity/env`
|
|
Environment file used by `velocity-backend.service`.
|
|
|
|
`/opt/dlami/nvme/velocity/venv`
|
|
Python virtual environment for the backend.
|
|
|
|
`/opt/dlami/nvme/velocity/tls`
|
|
TLS cert and key used by nginx.
|
|
|
|
`/opt/dlami/nvme/nemoclaw/prompts`
|
|
Prompt files used by the backend reasoning client.
|
|
|
|
`/opt/dlami/nvme/assets/videos`
|
|
Runtime marketing-video directory served by FastAPI static assets.
|
|
|
|
`/opt/dlami/nvme/assets/videos/catalog.json`
|
|
Optional checked catalog that controls video ordering, labels, and display metadata for the live-session picker.
|
|
|
|
`/opt/dlami/nvme/pgdata/14/velocity`
|
|
PostgreSQL 14 data directory.
|
|
|
|
### Repo paths
|
|
|
|
`backend/services/nemoclaw_client.py`
|
|
Primary reasoning client used by the FastAPI backend.
|
|
|
|
`backend/routers/videos.py`
|
|
Marketing-video catalog endpoint for the Sentinel live-session picker.
|
|
|
|
`backend/config/marketing_videos.catalog.json`
|
|
Checked source catalog for the four current property walkthrough videos.
|
|
|
|
`backend/nemoclaw_prompts/qd_calculator.md`
|
|
QD scoring prompt.
|
|
|
|
`backend/nemoclaw_prompts/lead_tagger.md`
|
|
Lead enrichment prompt.
|
|
|
|
`backend/nemoclaw_prompts/cctv_profiler.md`
|
|
CCTV vehicle and plate profiling prompt.
|
|
|
|
`backend/scripts/nemoclaw_deploy.sh`
|
|
Historical deployment/bootstrap script for OpenShell/Ollama-style setup. Useful as reference, but no longer fully aligned with the active NVIDIA-primary truth.
|
|
|
|
## 5. Services
|
|
|
|
### `velocity-backend.service`
|
|
|
|
Purpose:
|
|
Runs FastAPI/Uvicorn from the NVMe release tree.
|
|
|
|
Why it exists:
|
|
Provides the production API and WebSocket layer for Sentinel, Vault, Scenes, CCTV, and Auth.
|
|
|
|
Key behavior:
|
|
- Reads `/opt/dlami/nvme/velocity/env`
|
|
- Starts `uvicorn backend.main:app --host 127.0.0.1 --port 8001`
|
|
|
|
### `nemoclaw-velocity.service`
|
|
|
|
Purpose:
|
|
Bootstraps the OpenShell/NemoClaw gateway state.
|
|
|
|
Why it exists:
|
|
Keeps the local gateway selection and related tooling available on the node even though FastAPI currently scores against NVIDIA directly.
|
|
|
|
Current truth:
|
|
- Implemented as a non-blocking `oneshot` systemd unit
|
|
- Leaves the service in `active (exited)` when successful
|
|
|
|
### `nginx`
|
|
|
|
Purpose:
|
|
TLS reverse proxy for the backend.
|
|
|
|
Why it exists:
|
|
Exposes the backend on `443`, terminates TLS, and forwards both HTTP and WebSocket traffic to Uvicorn.
|
|
|
|
### `postgresql@14-velocity.service`
|
|
|
|
Purpose:
|
|
Owns the NVMe-backed PostgreSQL cluster.
|
|
|
|
Why it exists:
|
|
The Sentinel and Vault flows persist state in PostgreSQL, not Supabase.
|
|
|
|
## 6. Environment Variables
|
|
|
|
Active variables relevant to NemoClaw reasoning:
|
|
|
|
`NVIDIA_API_KEY`
|
|
Used by the backend to authenticate against NVIDIA hosted completions.
|
|
|
|
`NVIDIA_BASE_URL`
|
|
Set to `https://integrate.api.nvidia.com/v1`
|
|
|
|
`NVIDIA_MODEL`
|
|
Set to `nvidia/nemotron-3-super-120b-a12b`
|
|
|
|
`NVIDIA_FALLBACK_MODEL`
|
|
Set to `nvidia/llama-3.3-nemotron-super-49b-v1`
|
|
|
|
`ALLOW_LOCAL_FALLBACK`
|
|
Currently `false`
|
|
|
|
`NEMOCLAW_PROMPT_DIR`
|
|
Set to `/opt/dlami/nvme/nemoclaw/prompts`
|
|
|
|
Historical-but-not-primary variables:
|
|
|
|
`OLLAMA_BASE_URL`
|
|
Still relevant if local fallback is re-enabled.
|
|
|
|
`NEMOCLAW_BASE_URL`
|
|
No longer the primary path for backend scoring.
|
|
|
|
## 7. Inference Flow
|
|
|
|
### Current backend inference flow
|
|
|
|
1. Frontend emits biometric packet over `/api/sentinel/ws/perception`
|
|
2. `backend/routers/sentinel.py` receives the packet
|
|
3. Scene context is resolved from `video_scene_maps` if `video_asset_id` and `video_ts_ms` are present
|
|
4. `backend/services/nemoclaw_client.py` builds an OpenAI-compatible messages payload
|
|
5. The backend calls NVIDIA hosted completions using `nvidia/nemotron-3-super-120b-a12b`
|
|
6. The result updates QD score state and is broadcast back over WebSocket
|
|
|
|
### Current lead-tagging flow
|
|
|
|
1. Broker or system calls `/api/sentinel/tag-lead`
|
|
2. `tag_lead()` uses the NVIDIA path
|
|
3. Lead tags are updated in `leads_intelligence`
|
|
4. `LEAD_TAGGED` is broadcast to notifications
|
|
|
|
### Current CCTV flow
|
|
|
|
1. OCR/bridge posts to `/api/cctv/event`
|
|
2. `profile_cctv_visitor()` uses the NVIDIA path
|
|
3. `cctv_events` row is written
|
|
4. Session evidence is updated
|
|
5. Session can later be finalized through auto-mode matching
|
|
|
|
### Current live-session video flow
|
|
|
|
1. Frontend calls `GET /api/videos/marketing`
|
|
2. Backend reads `/opt/dlami/nvme/assets/videos/catalog.json` if present
|
|
3. Backend falls back to scanning `/opt/dlami/nvme/assets/videos` recursively for playable files if the catalog is missing or incomplete
|
|
4. FastAPI serves the MP4 files through `/assets/videos/...`
|
|
5. `SentinelLiveSession.tsx` renders smaller preview cards that autoplay in 3-second bursts on hover and advance 10 seconds between bursts
|
|
6. `PerceptionPlayer.tsx` plays the selected asset through the same `/assets/videos/...` path
|
|
|
|
## 8. OpenShell and Ollama Truth
|
|
|
|
OpenShell and Ollama still matter, but in a narrower way than originally planned.
|
|
|
|
### Ollama
|
|
|
|
Location:
|
|
Runs locally on port `11434`
|
|
|
|
Why it still exists:
|
|
- Historical deployment compatibility
|
|
- Potential local fallback if NVIDIA is disabled
|
|
- OpenShell-related infrastructure expectations
|
|
|
|
### OpenShell gateway
|
|
|
|
Location:
|
|
Gateway target on port `8080`
|
|
|
|
Why it still exists:
|
|
- NemoClaw sandbox bootstrap
|
|
- Local gateway control path
|
|
- Operational continuity for the previously onboarded sandbox
|
|
|
|
What it is not:
|
|
- It is not the current primary inference path for backend scoring
|
|
|
|
## 9. Prompts
|
|
|
|
Prompt source-of-truth in repo:
|
|
|
|
- `backend/nemoclaw_prompts/qd_calculator.md`
|
|
- `backend/nemoclaw_prompts/lead_tagger.md`
|
|
- `backend/nemoclaw_prompts/cctv_profiler.md`
|
|
|
|
Prompt runtime location on node:
|
|
|
|
- `/opt/dlami/nvme/nemoclaw/prompts/qd_calculator.md`
|
|
- `/opt/dlami/nvme/nemoclaw/prompts/lead_tagger.md`
|
|
- `/opt/dlami/nvme/nemoclaw/prompts/cctv_profiler.md`
|
|
|
|
Why copied to NVMe:
|
|
- Keeps runtime prompts off the root volume
|
|
- Aligns with the NVMe-first deployment strategy
|
|
- Prevents storage-eviction regressions
|
|
|
|
## 10. Known Operational Risks
|
|
|
|
### JSON compliance risk
|
|
|
|
The NVIDIA model sometimes returns malformed or partially malformed JSON for the full QD prompt. The backend now includes partial-response recovery, but this is the biggest remaining correctness risk.
|
|
|
|
### Dynamic IP risk
|
|
|
|
The public IP has changed during execution. A stable Elastic IP or DNS entry is still recommended.
|
|
|
|
### Trust-chain risk
|
|
|
|
nginx TLS exists, but a production-trusted certificate should replace self-signed cert material.
|
|
|
|
### External producer gap
|
|
|
|
The OCR bridge script exists, but a production ONVIF/RTSP/OCR producer still needs to be pointed at the ingestion endpoint.
|
|
|
|
### Catalog drift risk
|
|
|
|
If new property videos are copied to NVMe without updating `catalog.json`, they will still be discoverable through directory scanning, but order, title, and display color may drift from the intended broker-facing presentation.
|
|
|
|
## 11. Validation Commands
|
|
|
|
Health:
|
|
|
|
```bash
|
|
curl -k https://54.152.236.10/health
|
|
curl -k https://54.152.236.10/api/videos/marketing
|
|
```
|
|
|
|
Backend service:
|
|
|
|
```bash
|
|
sudo systemctl status velocity-backend.service
|
|
```
|
|
|
|
Gateway bootstrap:
|
|
|
|
```bash
|
|
sudo systemctl status nemoclaw-velocity.service
|
|
```
|
|
|
|
PostgreSQL:
|
|
|
|
```bash
|
|
sudo systemctl status postgresql@14-velocity.service
|
|
sudo -u postgres psql -d velocity -c '\dt'
|
|
```
|
|
|
|
Local inference health from backend env:
|
|
|
|
```bash
|
|
source /opt/dlami/nvme/velocity/env
|
|
PYTHONPATH=/opt/dlami/nvme/velocity/current /opt/dlami/nvme/velocity/venv/bin/python - <<'PY'
|
|
import asyncio, json
|
|
from backend.services.nemoclaw_client import health_check
|
|
print(asyncio.run(health_check()))
|
|
PY
|
|
```
|
|
|
|
## 12. What to Update If the Truth Changes
|
|
|
|
Update this document whenever any of the following change:
|
|
|
|
- Public IP or DNS target
|
|
- Primary inference provider
|
|
- Primary model
|
|
- Prompt directory
|
|
- nginx port or TLS behavior
|
|
- OpenShell gateway port
|
|
- service unit names
|
|
- NVMe runtime paths
|