Infrastructure and Operator Truth: Linux, Ingress, GPU, Comfy

Date: 2026-04-18
Status: Draft
Owner: Sagnik
Reviewers: Sayan, Sourik
Scope: Canonical founder-owned operator reference for Linux origin, ingress, GPU/Comfy, and owner boundaries
Purpose: Consolidate the live infrastructure truth needed by anyone operating or extending Project Velocity.
Decision Boundary: This document describes current truth and operational non-negotiables. It does not redefine infrastructure ownership or deployment architecture without later approval.

1. Canonical Topology

Linux origin

  • host role: application origin for Velocity frontend/backend and related platform services
  • current LAN IP: 192.168.1.2
  • responsibility:
    • app hosting
    • backend hosting
    • asset/storage concerns that do not belong on GPU runtime

Ingress node

  • public edge role: stable entrypoint and reverse proxy/routing layer
  • responsibility:
    • public DNS routing
    • TLS
    • path routing to internal services
    • stable public access pattern

GPU worker / Comfy runtime

  • current AWS GPU worker referenced in truth docs:
    • instance: i-0e4eab5fe67cf9abe
    • type: g6.12xlarge
    • private IP: 172.31.46.190
  • responsibility:
    • ComfyUI runtime
    • Wan and other heavy generation model execution
    • GPU-bound workflow execution
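The topology above can be captured as a small lookup so that planning scripts can ask which node owns a given responsibility. This is only a sketch: the `Node` type, the `TOPOLOGY` tuple, the short node names, and `node_for` are hypothetical and are not part of any existing Velocity code. The addresses and responsibilities are copied from this section.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    """One node in the documented topology (hypothetical helper type)."""
    name: str
    responsibilities: Tuple[str, ...]
    address: Optional[str] = None  # None where this document gives no address


# Values copied from the topology section; node names are illustrative.
TOPOLOGY = (
    Node("linux_origin",
         ("app hosting", "backend hosting", "asset/storage"),
         "192.168.1.2"),
    Node("ingress",
         ("public DNS routing", "TLS", "path routing", "stable public access")),
    Node("gpu_worker",
         ("ComfyUI runtime", "heavy model execution",
          "GPU-bound workflow execution"),
         "172.31.46.190"),
)


def node_for(responsibility: str) -> str:
    """Return the name of the node that owns a responsibility."""
    for node in TOPOLOGY:
        if responsibility in node.responsibilities:
            return node.name
    raise KeyError(responsibility)
```

For example, `node_for("TLS")` resolves to the ingress node, and an unknown responsibility raises `KeyError` rather than silently defaulting to some host.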

2. Canonical Public Access Rule

Public Comfy access must use:

  • https://comfy.desineuron.in

Never use:

  • raw GPU public IPs
  • ad hoc direct node access in application code

3. Canonical GPU NVMe Rule

GPU model/runtime work must use NVMe-backed storage only.

Canonical roots documented today:

  • /opt/dlami/nvme/ComfyUI
  • /opt/dlami/nvme/hf
  • /opt/dlami/nvme/model-staging
  • /opt/dlami/nvme/model-logs

Non-negotiable:

  • do not stage heavyweight models on slower or incidental volumes when the GPU path is expected to be production-like
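A staging script could refuse to write model files outside the canonical roots. The following is a minimal sketch under that assumption; `is_on_nvme_root` is a hypothetical helper. It checks the path lexically only and does not verify that the underlying mount is actually NVMe.

```python
from pathlib import PurePosixPath

# Canonical roots copied from this document.
NVME_ROOTS = (
    PurePosixPath("/opt/dlami/nvme/ComfyUI"),
    PurePosixPath("/opt/dlami/nvme/hf"),
    PurePosixPath("/opt/dlami/nvme/model-staging"),
    PurePosixPath("/opt/dlami/nvme/model-logs"),
)


def is_on_nvme_root(path: str) -> bool:
    """Return True if path is one of the canonical roots or lies beneath one."""
    p = PurePosixPath(path)
    if ".." in p.parts:
        return False  # reject paths that could climb out of a root
    return any(p == root or root in p.parents for root in NVME_ROOTS)
```

For instance, `is_on_nvme_root("/opt/dlami/nvme/hf/models/x.safetensors")` is accepted, while a path under a home directory is rejected.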

4. Comfy Endpoint Appendix

Current valid Comfy endpoint family:

  • /
  • /prompt
  • /history/{prompt_id}
  • /queue
  • /upload/image

Application code and tooling must target only these operator-facing paths.
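Keeping the endpoint family listed above in a single table makes it harder for callers to drift onto undocumented paths. The sketch below assumes the canonical base URL from section 2; the `ENDPOINTS` keys and the `comfy_url` helper are hypothetical names chosen for illustration.

```python
from urllib.parse import quote

COMFY_BASE_URL = "https://comfy.desineuron.in"  # canonical public base URL

# Path templates copied from the endpoint appendix; the keys are illustrative.
ENDPOINTS = {
    "root": "/",
    "prompt": "/prompt",
    "history": "/history/{prompt_id}",
    "queue": "/queue",
    "upload_image": "/upload/image",
}


def comfy_url(name: str, **params: str) -> str:
    """Build a full URL for a documented endpoint; unknown names raise KeyError."""
    template = ENDPOINTS[name]
    # Percent-encode parameters so they cannot inject extra path segments.
    safe_params = {key: quote(value, safe="") for key, value in params.items()}
    return COMFY_BASE_URL + template.format(**safe_params)
```

A call such as `comfy_url("history", prompt_id="abc-123")` yields `https://comfy.desineuron.in/history/abc-123`.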

5. Access Pattern

Correct access pattern:

  1. frontend and backend are hosted through Linux origin and ingress routing
  2. public users reach the platform through stable ingress-hosted domains
  3. backend or orchestration layers call Comfy through https://comfy.desineuron.in
  4. GPU workers keep models and runtime assets on NVMe
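Step 3 of the pattern above could look like the following on the backend side. This sketch only builds the request and does not send it. The JSON body shape (`prompt` plus `client_id`) follows the upstream ComfyUI API as commonly used and is an assumption here, since this document does not specify the payload. `build_prompt_request` is a hypothetical helper name.

```python
import json
import urllib.request

COMFY_BASE_URL = "https://comfy.desineuron.in"  # always the ingress hostname


def build_prompt_request(workflow: dict, client_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the documented /prompt endpoint."""
    # Assumed payload shape: the workflow graph under "prompt" plus a client id.
    body = json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")
    return urllib.request.Request(
        COMFY_BASE_URL + "/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A caller would pass the result to `urllib.request.urlopen` with a timeout. Keeping construction separate from sending makes the target URL easy to assert in tests without touching the GPU worker.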

6. Ownership Matrix

| Owner | Primary Scope | Notes |
| --- | --- | --- |
| Founder | CRM, client graph, import architecture, QD-linked client intelligence, monolithic truth | owns the customer-operating model |
| Sayan | Oracle templates, inventory, admin/control surfaces, multi-surface delivery | owns major UI/product surfaces beside founder CRM scope |
| Sourik | colony/runtime/orchestration surfaces, agentic architecture, MCP-adjacent runtime work | owns deeper runtime and orchestration layer planning |

7. Planning Implications

  • CRM should not be planned as a GPU concern
  • ComfyUI should not be planned as a Linux-origin heavy-model runtime
  • Oracle and CRM route families should assume ingress-safe public access patterns
  • any future endpoint planning must respect owner boundaries and the current topology

8. Source Documents

Primary source docs reconciled into this file:

  • Sprint 1/comfyui_setup_truth.md
  • Sprint 1/Desineuron Stable Ingress Handoff.md

This document is intended to become the faster founder/operator reference when those older docs are too fragmented for day-to-day use.

9. Non-Negotiables

  • never use GPU public IP directly in app/runtime assumptions
  • always use NVMe for GPU-hosted models and staging
  • Linux origin hosts application and platform services, not heavyweight Comfy model execution
  • public routing must go through stable ingress

10. Bottom Line

Project Velocity now has enough documented infrastructure truth that the main risk is no longer a missing deployment but operator confusion. This document exists to remove that confusion.