Project_Astral/.agent Context/Project Astral_ SRS & Sprint Plan.md
2026-02-25 00:50:23 +05:30


Part 1: Software Requirements Specification (SRS)

1. Project Overview

  • Project Name: Project Astral (Ethical Celebrity AI Platform)
  • Client: Cine Bahini Studios
  • Developer: Desineuron Labs
  • Objective: To build a locally hosted, air-gapped AI production suite that generates high-fidelity, consent-driven commercial videos using LTX-2 (Audio+Video) and LiDAR data.

2. System Architecture

The system follows a Hybrid-Local Architecture to satisfy the "Air-Gapped Security" requirement while maintaining ease of use.

  • Frontend (The Studio): Next.js + Tailwind CSS (Glassmorphism UI).

  • Middleware (The Agent): OpenClaw running as a local API Gateway to orchestrate tasks between the UI and the GPU.

  • Backend (The Engine): ComfyUI (Headless Mode) executing hidden JSON workflows.

  • AI Models:

    • Video: LTX-2 (19B Param Asymmetric Dual-Stream) for joint Audio-Video generation.

    • Image/Refinement: Flux or SDXL for the initial frame; FaceDetailer for restoration.

    • Geometry: ControlNet driven by LiDAR Depth Maps.

  • Storage: Synology NAS (Local 10GbE Mount) for saving final assets.

3. Functional Requirements (FR)

FR-01: Identity Ingestion (The "Astral Capture")

  • Input: System must accept .obj or .usdz files from iPhone Pro LiDAR + 48MP reference photos.

  • Processing: System must convert LiDAR point clouds into Grayscale Depth Maps for ControlNet usage.

  • Security: Raw biometric data must be encrypted and stored in the "Astral Vault" (isolated directory structure).

FR-02: The "Hidden" Orchestrator

  • Logic: The user interacts with a simple text prompt (e.g., "Drinking coffee in rain"). The backend must inject a "System Prompt" (e.g., "Arri Alexa, 8k, highly detailed") automatically.

  • Routing: The OpenClaw agent must route requests to the available GPU (RTX 6000 #1 vs #2) to balance load between Training and Inference.
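The two FR-02 behaviours can be sketched in plain Python. This is a minimal illustration, not OpenClaw's actual API: the `Gpu` record, `build_prompt`, and `pick_gpu` are hypothetical names, and the routing rule (skip cards that are training, then take the shortest queue) is one possible assumption about how "balancing" might work.

```python
from dataclasses import dataclass

# Example style suffix taken from FR-02; the real system prompt is a project decision.
SYSTEM_PROMPT = "Arri Alexa, 8k, highly detailed"


@dataclass
class Gpu:
    index: int            # CUDA device index (0 or 1 for the two RTX 6000s)
    busy_training: bool   # True while a LoRA training job holds the card
    queued_jobs: int      # inference jobs already waiting on this card


def build_prompt(user_prompt: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Append the hidden system prompt to the user's text."""
    return f"{user_prompt.strip()}, {system_prompt}"


def pick_gpu(gpus: list[Gpu]) -> Gpu:
    """Prefer cards that are not training; among those, pick the shortest queue."""
    candidates = [g for g in gpus if not g.busy_training] or gpus
    return min(candidates, key=lambda g: g.queued_jobs)
```

If every card is training, the sketch falls back to the least-loaded card rather than rejecting the request; whether to queue or reject in that case is left open by the SRS.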

FR-03: Audio-Visual Synchronization

  • Mechanism: System must use LTX-2's Asymmetric Dual-Stream architecture to generate foley (background sound) and video simultaneously to ensure synchronization (e.g., cup hitting table = "clink" sound).

FR-04: Ethics & Compliance

  • The Kill-Switch: An admin toggle that instantly unloads a celebrity's LoRA from VRAM and locks their dataset if a contract expires.

  • Watermarking: Every generated frame must include an invisible watermark denoting it as "Synthetic".

---

Part 2: The 14-Day "Skunkworks" Sprint

This roadmap is strictly derived from your Handwritten Note and the Pitch Deck Timeline.

Phase 1: The Spine (Infrastructure & Workflow)

Goal: A working "Ugly" Prototype that generates video.

  • Day 1: Hardware & Environment

    • Task: Mount RTX 6000s. Install Ubuntu + CUDA 12.x.
    • Task: Install ComfyUI and the LTX-2 Nodes.
    • Task: Verify NAS connectivity (/mnt/nas/output).
  • Day 2: The "Hidden" Workflow

    • Task (Handwritten): Plan the initial ComfyUI Workflow.
    • Execution: Build a Comfy JSON that takes Image_Input + Depth_Map -> LTX-2 Img2Vid.
    • Test: Generate a generic "Man holding bottle" video to prove LTX-2 audio-video sync works.
  • Day 3: OpenClaw Integration

    • Task (Handwritten): Setup Claw Bot.
    • Execution: Configure OpenClaw to listen on a local port. Write a custom "Skill" (comfy_skill.py) that allows OpenClaw to send POST requests to ComfyUI's /prompt endpoint.
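A rough sketch of what the core of comfy_skill.py might look like, using only the standard library. It assumes ComfyUI is listening on its default local address (127.0.0.1:8188) and that the workflow has been exported in API format. The function names are hypothetical, and how the code is registered as an OpenClaw skill is not shown because that interface is not described here.

```python
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188"  # assumed default; change to match the local install


def build_prompt_request(
    workflow: dict, client_id: str, base_url: str = COMFY_URL
) -> urllib.request.Request:
    """Wrap an API-format workflow in the JSON body that /prompt expects."""
    body = json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow: dict, client_id: str) -> dict:
    """Send the workflow to ComfyUI and return its decoded JSON reply."""
    with urllib.request.urlopen(build_prompt_request(workflow, client_id)) as resp:
        return json.loads(resp.read())
```

Splitting request construction from sending keeps the payload testable without a running ComfyUI instance. The reply should be inspected for an identifier of the queued job so that progress can be tracked later.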

Phase 2: The Identity (Fine-Tuning & Ingestion)

Goal: Putting the "Celebrity" inside the machine.

  • Day 4: LiDAR Pipeline
    • Task: Write a Python script to convert iPhone LiDAR .obj dumps into normalized Depth Maps (White = Near, Black = Far) for ControlNet.
  • Day 5: Training (LoRA)
    • Task (Handwritten): Fine Tune Model.
    • Execution: Train a test LoRA on a Cine Bahini actor (or yourself) using the captured photos. Target 2000 steps on the RTX 6000.
  • Day 6: Verification
    • Task: Run the LoRA through the Workflow. Tweak the "System Prompt" to ensure the actor doesn't look plastic.
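The Day 4 conversion could start from something like the sketch below. It is deliberately simplified and rests on stated assumptions: it reads only the `v x y z` vertex lines of the .obj, projects them orthographically onto the XY plane, and treats z as distance from the camera. It splats points rather than rasterizing triangles, so a real scan would still need hole filling and a proper camera model. The function names are hypothetical.

```python
def parse_obj_vertices(obj_text: str) -> list[tuple[float, float, float]]:
    """Collect 'v x y z' vertex lines from Wavefront OBJ text."""
    verts = []
    for line in obj_text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == "v":
            verts.append((float(parts[1]), float(parts[2]), float(parts[3])))
    return verts


def depth_map(verts, width: int, height: int) -> list[list[int]]:
    """Return rows of 0-255 values: 255 = nearest, 0 = farthest or empty."""
    img = [[0] * width for _ in range(height)]
    if not verts:
        return img
    xs, ys, zs = zip(*verts)
    x0, y0, z0 = min(xs), min(ys), min(zs)
    # Fall back to 1.0 on flat ranges to avoid dividing by zero.
    xr = (max(xs) - x0) or 1.0
    yr = (max(ys) - y0) or 1.0
    zr = (max(zs) - z0) or 1.0
    for x, y, z in verts:
        col = min(int((x - x0) / xr * (width - 1) + 0.5), width - 1)
        # Flip Y because image rows grow downward.
        row = min(int((1 - (y - y0) / yr) * (height - 1) + 0.5), height - 1)
        # Assumption: z is distance from the camera, so small z maps to white.
        value = round(255 * (1 - (z - z0) / zr))
        img[row][col] = max(img[row][col], value)  # nearest point wins per pixel
    return img
```

The nested list can be written out as an 8-bit grayscale image with whichever imaging library the pipeline already uses. Note that the farthest point and empty pixels both map to 0 in this sketch.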

Phase 3: The Skin (Frontend & Dashboard)

Goal: Making it look like a "Studio," not a "Lab."

  • Day 7: UI Skeleton

    • Task (Handwritten): Plan the Dashboard and all components.
    • Execution: Sketch the "Glassmorphism" layout. Sidebar (Nav), Center (Drop Zone), Bottom (Task Strip).
  • Day 8: Frontend Build

    • Task (Handwritten): Make the initial Dashboard.
    • Execution: Initialize Next.js project. Build the DragAndDrop component that accepts files and sends them to the OpenClaw API.
  • Day 9: Real-Time Feedback

    • Task: Implement WebSockets to show the "Green/Red pulse" for API health and the Progress Bar.

Phase 4: The Brain (Auth & Logic)

Goal: Security and "Productization."

  • Day 10: Authentication

    • Task (Handwritten): Setup Authentication.
    • Execution: Integrate Firebase Auth. Create the "Admin" vs "Creative" roles. Ensure "Creative" users cannot access the "Model Forge" page.
  • Day 11: The "Kill Switch"

    • Task: Code the logic where the OpenClaw agent checks Firebase for contract_status: active before loading any LoRA.
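One way to express the kill-switch gate is a fail-closed check that runs before any LoRA is loaded. The sketch below is an assumption-laden illustration: `authorize_lora` and `ContractInactiveError` are hypothetical names, and the contract lookup is passed in as a callable so that the Firebase query itself (not shown) stays outside the gate.

```python
from typing import Callable, Optional


class ContractInactiveError(RuntimeError):
    """Raised when a LoRA is requested for an identity without an active contract."""


def authorize_lora(
    identity_id: str,
    get_contract_status: Callable[[str], Optional[str]],
) -> None:
    """Fail closed: anything other than an explicit 'active' blocks the load."""
    status = get_contract_status(identity_id)
    if status != "active":
        raise ContractInactiveError(f"{identity_id}: contract_status={status!r}")
```

For example, with `records = {"actor_a": "active", "actor_b": "expired"}`, calling `authorize_lora("actor_a", records.get)` returns normally, while `"actor_b"` or an unknown identity raises `ContractInactiveError`. Treating a missing record as inactive means a lookup failure cannot accidentally unlock a dataset.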

Phase 5: Final Polish & "The Shot"

Goal: The Demo Asset.

  • Day 12: Dashboard Polish

    • Task (Handwritten): Finalise Dashboard.
    • Execution: Apply the "Midnight Black" theme. Ensure LiDAR 3D previews render using Three.js.
  • Day 13: Stress Test

    • Task: Queue 20 videos. Watch GPU thermals and VRAM usage.
  • Day 14: The Demo

    • Task (Handwritten): Generate 5 Sec Demo Video with custom product.
    • Execution: Generate the "Astral Shot" (e.g., the actor holding a specific Cine Bahini product, perfectly lit, speaking a line). This is your deliverable.
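For the Day 13 stress test, GPU temperature and VRAM usage can be logged instead of watched by eye. The sketch below assumes `nvidia-smi` is on the PATH and polls it with its CSV query options; the helper names and dictionary keys are hypothetical.

```python
import subprocess

QUERY = "index,temperature.gpu,memory.used,memory.total"


def parse_gpu_stats(csv_text: str) -> list[dict]:
    """Parse 'csv,noheader,nounits' output into one dict per GPU."""
    stats = []
    for line in csv_text.strip().splitlines():
        index, temp_c, used_mib, total_mib = (int(v.strip()) for v in line.split(","))
        stats.append({
            "index": index,
            "temp_c": temp_c,
            "vram_used_mib": used_mib,
            "vram_total_mib": total_mib,
        })
    return stats


def read_gpu_stats() -> list[dict]:
    """Run nvidia-smi once and return the parsed readings."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_stats(out)
```

Calling `read_gpu_stats()` every few seconds while the 20 videos are queued gives a simple log of thermals and memory. The parser assumes every field is numeric, so a driver that reports a field as unavailable would need extra handling.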

---

Part 3: Project Folder Structure

To keep this organized for you and Sayan, use this structure:


/Project_Astral
├── /docs # Pitch decks, RFP, Architecture diagrams
├── /infrastructure # Docker compose files for OpenClaw, Redis, NAS mount scripts
├── /models
│ ├── /checkpoints # LTX-2, Flux, SDXL weights
│ ├── /loras # The "Astral Vault" (Client LoRAs)
│ └── /controlnet # Depth/Canny models
├── /backend-agent # OpenClaw Custom Skills
│ ├── /skills # Python scripts for ComfyUI interaction
│ └── agent_config.yaml # Routing logic
├── /comfy-workflows # JSON files for the "Hidden" workflows
│ ├── workflow_dev.json
│ └── workflow_prod.json
└── /frontend-studio # Next.js Application
  ├── /components # DragDrop, LiDARPreview, ProgressBar
  ├── /pages # ProductionHub, AssetLibrary, Admin
  └── /lib # Firebase config, WebSocket hooks