sayan/Project_Astral

forked from sagnik/Project_Astral

GitLab Admin 4df35610ca Initial commit for Project Astral blueprint

2026-02-25 00:50:23 +05:30

Part 1: Software Requirements Specification (SRS)

1. Project Overview

Project Name: Project Astral (Ethical Celebrity AI Platform)
Client: Cine Bahini Studios
Developer: Desineuron Labs
Objective: To build a locally hosted, air-gapped AI production suite that generates high-fidelity, consent-driven commercial videos using LTX-2 (Audio+Video) and LiDAR data.

2. System Architecture

The system follows a Hybrid-Local Architecture to satisfy the "Air-Gapped Security" requirement while maintaining ease of use.

Frontend (The Studio): Next.js + Tailwind CSS (Glassmorphism UI).
Middleware (The Agent): OpenClaw running as a local API Gateway to orchestrate tasks between the UI and the GPU.
Backend (The Engine): ComfyUI (Headless Mode) executing hidden JSON workflows.
AI Models:
- Video: LTX-2 (19B Param Asymmetric Dual-Stream) for joint Audio-Video generation.
- Image/Refinement: Flux or SDXL for the initial frame; FaceDetailer for restoration.
- Geometry: ControlNet driven by LiDAR Depth Maps.
Storage: Synology NAS (Local 10GbE Mount) for saving final assets.

3. Functional Requirements (FR)

FR-01: Identity Ingestion (The "Astral Capture")

Input: System must accept .obj or .usdz files from iPhone Pro LiDAR + 48MP reference photos.
Processing: System must convert LiDAR point clouds into Grayscale Depth Maps for ControlNet usage.
Security: Raw biometric data must be encrypted and stored in the "Astral Vault" (isolated directory structure).

FR-02: The "Hidden" Orchestrator

Logic: The user interacts with a simple text prompt (e.g., "Drinking coffee in rain"). The backend must inject a "System Prompt" (e.g., "Arri Alexa, 8k, highly detailed") automatically.
Routing: The OpenClaw agent must route requests to the available GPU (RTX 6000 #1 vs #2) to balance load between Training and Inference.
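
The two behaviours in FR-02 can be sketched in a few lines of Python. This is a minimal illustration, not the OpenClaw implementation: the names `inject_system_prompt`, `Gpu`, and `pick_gpu` are hypothetical, and "least active jobs" is only one possible routing policy, since the requirement does not specify how load is measured.

```python
from dataclasses import dataclass

# Example system prompt taken from FR-02; the real text would live in config.
SYSTEM_PROMPT = "Arri Alexa, 8k, highly detailed"


def inject_system_prompt(user_prompt: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Append the hidden system prompt to the user's short text prompt."""
    cleaned = user_prompt.strip().rstrip(".,")
    return f"{cleaned}, {system_prompt}"


@dataclass
class Gpu:
    index: int            # assumed mapping: 0 = RTX 6000 #1, 1 = RTX 6000 #2
    active_jobs: int = 0  # jobs currently assigned (training or inference)


def pick_gpu(gpus: list[Gpu]) -> Gpu:
    """Return the GPU with the fewest active jobs; ties go to the lowest index."""
    return min(gpus, key=lambda gpu: (gpu.active_jobs, gpu.index))


if __name__ == "__main__":
    print(inject_system_prompt("Drinking coffee in rain"))
    print(pick_gpu([Gpu(0, active_jobs=2), Gpu(1, active_jobs=1)]).index)
```

A policy that pins training to one card and inference to the other would satisfy the same requirement; the choice depends on how long LoRA training runs are expected to hold a GPU.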

FR-03: Audio-Visual Synchronization

Mechanism: System must use LTX-2's Asymmetric Dual-Stream architecture to generate foley (synchronized sound effects) and video in a single pass so that sound and picture stay aligned (e.g., cup hitting table = "clink" sound).

FR-04: Ethics & Compliance

The Kill-Switch: An admin toggle that instantly unloads a celebrity's LoRA from VRAM and locks their dataset if a contract expires.
Watermarking: Every generated frame must include an invisible watermark denoting it as "Synthetic".
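
The watermarking requirement does not name a scheme, so the sketch below substitutes the simplest one available: writing a marker into the least significant bit (LSB) of pixel bytes. It is only meant to show what "invisible" means at the byte level. LSB marks do not survive video re-encoding, so a production system would need a robust watermarking method instead. The function names and the flat `bytes` frame representation are assumptions.

```python
MARKER = b"Synthetic"  # label required by FR-04


def embed_marker(pixels: bytes, marker: bytes = MARKER) -> bytes:
    """Write the marker bits into the LSB of the first len(marker) * 8 pixel bytes."""
    bits = [(byte >> shift) & 1 for byte in marker for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("frame too small to hold the marker")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the LSB, then set it to the marker bit
    return bytes(out)


def extract_marker(pixels: bytes, length: int = len(MARKER)) -> bytes:
    """Read `length` bytes back out of the pixel LSBs."""
    result = bytearray()
    for i in range(length):
        value = 0
        for bit_index in range(8):
            value = (value << 1) | (pixels[i * 8 + bit_index] & 1)
        result.append(value)
    return bytes(result)


if __name__ == "__main__":
    frame = bytes(range(256))  # stand-in for one channel of a tiny frame
    print(extract_marker(embed_marker(frame)))
```

Each marked byte differs from the original by at most 1, which is why the change is not visible to the eye.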

---

Part 2: The 14-Day "Skunkworks" Sprint

This roadmap is strictly derived from your Handwritten Note and the Pitch Deck Timeline.

Phase 1: The Spine (Infrastructure & Workflow)

Goal: A working "Ugly" Prototype that generates video.

Day 1: Hardware & Environment
- Task: Mount RTX 6000s. Install Ubuntu + CUDA 12.x.
- Task: Install ComfyUI and the LTX-2 Nodes.
- Task: Verify NAS connectivity (/mnt/nas/output).
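
One way to script the NAS check is to confirm that the output directory exists and accepts writes. The helper below is a hypothetical sketch (`check_output_dir` is not part of any tool named here); it probes with a temporary file that is deleted immediately.

```python
import os
import tempfile


def check_output_dir(path: str) -> bool:
    """Return True if `path` is a directory in which a file can be created and removed."""
    if not os.path.isdir(path):
        return False
    try:
        # The probe file is removed automatically when the context exits.
        with tempfile.NamedTemporaryFile(dir=path, prefix=".astral_probe_"):
            pass
    except OSError:
        return False
    return True


if __name__ == "__main__":
    print(check_output_dir("/mnt/nas/output"))
```

This only proves the path is writable. To confirm the NAS share itself is mounted rather than an empty local folder, `os.path.ismount` can be called on the mount point as well.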
Day 2: The "Hidden" Workflow
- Task (Handwritten): Plan the initial ComfyUI Workflow.
- Execution: Build a Comfy JSON that takes Image_Input + Depth_Map -> LTX-2 Img2Vid.
- Test: Generate a generic "Man holding bottle" video to prove LTX-2 audio-video sync works.
Day 3: OpenClaw Integration
- Task (Handwritten): Setup Claw Bot.
- Execution: Configure OpenClaw to listen on a local port. Write a custom "Skill" (comfy_skill.py) that allows OpenClaw to send POST requests to ComfyUI's /prompt endpoint.
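
A rough idea of what the core of comfy_skill.py might contain, using only the standard library. The assumptions here are the base URL (ComfyUI's usual local default of port 8188), the helper names, and the idea that the workflow is already loaded as a Python dict in ComfyUI's API format. How the function is registered as an OpenClaw skill is not shown, because that interface is not described in this document.

```python
import json
import urllib.request
import uuid

COMFY_URL = "http://127.0.0.1:8188"  # assumption: ComfyUI on its default local port


def build_prompt_request(workflow: dict, client_id: str,
                         base_url: str = COMFY_URL) -> urllib.request.Request:
    """Build the POST request for ComfyUI's /prompt endpoint without sending it."""
    payload = json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow: dict, base_url: str = COMFY_URL) -> dict:
    """Send the workflow to ComfyUI and return its decoded JSON response."""
    request = build_prompt_request(workflow, uuid.uuid4().hex, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())
```

Splitting request construction from sending keeps the payload testable without a running ComfyUI instance.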

Phase 2: The Identity (Fine-Tuning & Ingestion)

Goal: Putting the "Celebrity" inside the machine.

Day 4: LiDAR Pipeline
- Task: Write a Python script to convert iPhone LiDAR .obj dumps into normalized Depth Maps (White = Near, Black = Far) for ControlNet.
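
A dependency-free sketch of that conversion is shown below. It makes several simplifying assumptions that a real pipeline would need to revisit: only the `v` vertex lines of the .obj are used (faces are ignored, so the map is sparse), the projection is orthographic onto the X-Y plane, and a vertex's Z value is treated as its distance from the camera. All function names are hypothetical.

```python
def parse_obj_vertices(obj_text: str) -> list[tuple[float, float, float]]:
    """Collect (x, y, z) from the 'v' lines of a Wavefront .obj file."""
    vertices = []
    for line in obj_text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == "v":
            vertices.append((float(parts[1]), float(parts[2]), float(parts[3])))
    return vertices


def render_depth_map(vertices, width: int = 64, height: int = 64) -> list[list[int]]:
    """Project vertices to a grid of 0-255 values where white (255) is nearest."""
    if not vertices:
        raise ValueError("no vertices found")
    xs, ys, zs = zip(*vertices)
    z_min, z_max = min(zs), max(zs)

    def to_cell(value: float, low: float, high: float, size: int) -> int:
        if high == low:
            return 0
        return min(size - 1, int((value - low) / (high - low) * size))

    image = [[0] * width for _ in range(height)]  # empty cells stay black (far)
    for x, y, z in vertices:
        col = to_cell(x, min(xs), max(xs), width)
        row = height - 1 - to_cell(y, min(ys), max(ys), height)  # image rows grow downward
        gray = 255 if z_max == z_min else round(255 * (z_max - z) / (z_max - z_min))
        image[row][col] = max(image[row][col], gray)  # keep the nearest point per cell
    return image


def to_pgm(image: list[list[int]]) -> str:
    """Serialize the grid as an ASCII PGM (P2) image."""
    rows = "\n".join(" ".join(str(value) for value in row) for row in image)
    return f"P2\n{len(image[0])} {len(image)}\n255\n{rows}\n"
```

Writing the result of `to_pgm(render_depth_map(parse_obj_vertices(text)))` to a `.pgm` file gives a grayscale image that most viewers can open for a quick visual check before it is converted to whatever format the ControlNet node expects.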
Day 5: Training (LoRA)
- Task (Handwritten): Fine Tune Model.
- Execution: Train a test LoRA on a Cine Bahini actor (or yourself) using the captured photos. Target 2000 steps on the RTX 6000.
Day 6: Verification
- Task: Run the LoRA through the Workflow. Tweak the "System Prompt" to ensure the actor doesn't look plastic.

Phase 3: The Skin (Frontend & Dashboard)

Goal: Making it look like a "Studio," not a "Lab."

Day 7: UI Skeleton
- Task (Handwritten): Plan the Dashboard and all components.
- Execution: Sketch the "Glassmorphism" layout. Sidebar (Nav), Center (Drop Zone), Bottom (Task Strip).
Day 8: Frontend Build
- Task (Handwritten): Make the initial Dashboard.
- Execution: Initialize Next.js project. Build the DragAndDrop component that accepts files and sends them to the OpenClaw API.
Day 9: Real-Time Feedback
- Task: Implement WebSockets to show the "Green/Red pulse" for API health and the Progress Bar.

Phase 4: The Brain (Auth & Logic)

Goal: Security and "Productization."

Day 10: Authentication
- Task (Handwritten): Setup Authentication.
- Execution: Integrate Firebase Auth. Create the "Admin" vs "Creative" roles. Ensure "Creative" users cannot access the "Model Forge" page.
Day 11: The "Kill Switch"
- Task: Code the logic where the OpenClaw agent checks Firebase for contract_status: active before loading any LoRA.
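
The check can be sketched as a gate that fails closed. In this illustration a plain dict stands in for the Firebase lookup and `loader` stands in for whatever actually loads the LoRA; both, along with the names `ContractInactiveError` and `load_lora_if_permitted`, are assumptions made for the example.

```python
from typing import Callable, Mapping, Optional


class ContractInactiveError(Exception):
    """Raised when a LoRA load is refused because the contract is not active."""


def load_lora_if_permitted(talent_id: str,
                           contracts: Mapping[str, dict],
                           loader: Callable[[str], object]) -> object:
    """Call `loader` only if the talent's contract_status is exactly 'active'."""
    record: Optional[dict] = contracts.get(talent_id)
    status = record.get("contract_status") if record else None
    if status != "active":
        # Missing records and unknown statuses are refused, not just 'expired'.
        raise ContractInactiveError(f"LoRA load refused for {talent_id!r}: status={status!r}")
    return loader(talent_id)


if __name__ == "__main__":
    contracts = {"actor_a": {"contract_status": "active"},
                 "actor_b": {"contract_status": "expired"}}
    print(load_lora_if_permitted("actor_a", contracts, lambda tid: f"loaded {tid}"))
```

Refusing on anything other than an explicit `active` means a network failure or a deleted record blocks generation instead of silently allowing it.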

Phase 5: Final Polish & "The Shot"

Goal: The Demo Asset.

Day 12: Dashboard Polish
- Task (Handwritten): Finalise Dashboard.
- Execution: Apply the "Midnight Black" theme. Ensure LiDAR 3D previews render using Three.js.
Day 13: Stress Test
- Task: Queue 20 videos. Watch GPU thermals and VRAM usage.
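
Thermals and VRAM can be logged during the run by polling `nvidia-smi` in its CSV query mode. The sketch below separates parsing from the subprocess call so the parser can be checked without a GPU; the function names are hypothetical and the sample values in the test line are made up.

```python
import subprocess

QUERY_FIELDS = "index,temperature.gpu,memory.used,memory.total"


def parse_gpu_stats(csv_text: str) -> list[dict]:
    """Parse 'csv,noheader,nounits' output into one dict per GPU."""
    stats = []
    for line in csv_text.strip().splitlines():
        index, temp_c, used_mib, total_mib = (int(field.strip()) for field in line.split(","))
        stats.append({
            "index": index,
            "temp_c": temp_c,
            "vram_used_mib": used_mib,
            "vram_total_mib": total_mib,
        })
    return stats


def read_gpu_stats() -> list[dict]:
    """Run nvidia-smi once and return the parsed readings."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_stats(result.stdout)


if __name__ == "__main__":
    # Made-up sample line, used only to show the parsed shape.
    print(parse_gpu_stats("0, 71, 30120, 49140"))
```

Calling `read_gpu_stats()` every few seconds while the 20 jobs run, and appending the results to a log, gives a record to review if a job fails or the cards throttle.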
Day 14: The Demo
- Task (Handwritten): Generate 5 Sec Demo Video with custom product.
- Execution: Generate the "Astral Shot" (e.g., The actor holding a specific Cine Bahini product, perfectly lit, speaking a line). This is your deliverable.

---

Part 3: Project Folder Structure

To keep this organized for you and Sayan, use this structure:

/Project_Astral
├── /docs # Pitch decks, RFP, Architecture diagrams
├── /infrastructure # Docker compose files for OpenClaw, Redis, NAS mount scripts
├── /models
│ ├── /checkpoints # LTX-2, Flux, SDXL weights
│ ├── /loras # The "Astral Vault" (Client LoRAs)
│ └── /controlnet # Depth/Canny models
├── /backend-agent # OpenClaw Custom Skills
│ ├── /skills # Python scripts for ComfyUI interaction
│ └── agent_config.yaml # Routing logic
├── /comfy-workflows # JSON files for the "Hidden" workflows
│ ├── workflow_dev.json
│ └── workflow_prod.json
└── /frontend-studio # Next.js Application
    ├── /components # DragDrop, LiDARPreview, ProgressBar
    ├── /pages # ProductionHub, AssetLibrary, Admin
    └── /lib # Firebase config, WebSocket hooks
