Project_Velocity/.Agent Context/Sprint 1/Project Velocity_ Dream Weaver.md
sagnik 55bb5e5a90 feat: Build the Dream Weaver interior restyling workflow to preserve room geometry while changing aesthetics (#5)
#3 Self-approved and unit tests passed with flying colors.

Co-authored-by: Sagnik <sagnik7896@gmail.com>
Reviewed-on: #5
2026-03-10 01:36:27 +05:30


1. Executive Summary: The "Dream Weaver" Objective
The goal is to move beyond plain image-to-image generation, which often hallucinates new walls or windows. "Dream Weaver" uses Structural Constraint Logic so that the furniture, wallpaper, and flooring can change while the physical dimensions, window placements, and vanishing points of the original room stay faithful to the real-world property.

2. Technical Architecture & Component Research
A. The Foundation: RealVisXL V5.0 (Lightning)

  • Why: Unlike Juggernaut, which leans cinematic, RealVisXL (https://civitai.com/models/139562?modelVersionId=789646) is tuned for photorealistic output. It handles the white balance of a real room well and doesn't over-saturate colors.
  • V5.0 Lightning Advantage: It allows for high-quality generation in just 4-8 steps, making the "visualizer" tool feel snappy and responsive for the end-user.

B. The Guidance Layer: Dual-ControlNet Strategy
To preserve geometry, a single ControlNet is rarely enough. We will use a stacked approach:

  1. M-LSD (Line Segment Detection): Best for architecture. It identifies straight lines (ceiling joints, floor corners, door frames). This prevents the walls from "bending."
  2. Depth (Zoe or MiDaS): Provides the model with a 3D map of the room. This ensures that a new rug placed on the floor correctly recedes into the distance.

C. The Isolation Layer: SAM (Segment Anything Model)

  • Purpose: We don't want to change the view out of the window or the specific crown molding if it's a selling point.
  • Implementation: SAM allows the workflow to "mask" specific areas (e.g., only the back wall) so the AI only repaints the pixels within that mask.
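
The masking idea above can be sketched in a few lines. This is a minimal illustration, assuming grayscale images stored as nested lists and a mask of values in [0, 1]; the function name composite_with_mask is hypothetical and is not part of SAM or ComfyUI.

```python
from typing import List


def composite_with_mask(
    original: List[List[float]],
    generated: List[List[float]],
    mask: List[List[float]],
) -> List[List[float]]:
    """Blend generated pixels into the original wherever the mask is set.

    A mask value of 1.0 takes the generated pixel, 0.0 keeps the original,
    and fractional values give a soft blend at the mask edge.
    """
    if not (len(original) == len(generated) == len(mask)):
        raise ValueError("original, generated and mask must have the same height")
    out = []
    for row_o, row_g, row_m in zip(original, generated, mask):
        if not (len(row_o) == len(row_g) == len(row_m)):
            raise ValueError("all rows must have the same width")
        out.append([m * g + (1.0 - m) * o for o, g, m in zip(row_o, row_g, row_m)])
    return out


# Toy 2x3 image: the last column stands in for the view out of the window.
original = [[0.2, 0.2, 0.9], [0.2, 0.2, 0.9]]
generated = [[0.7, 0.7, 0.1], [0.7, 0.7, 0.1]]  # the model repainted everything
mask = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0]]       # only the wall is editable
result = composite_with_mask(original, generated, mask)
```

In this toy case the window column of result keeps its original values, while the wall columns take the generated ones.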

3. Implementation Guide: Step-by-Step Build
Phase 1: Input & Pre-Processing

  1. Image Load & Rescale: Input image must be scaled to 1024x1024 (SDXL native) while maintaining aspect ratio via padding.
  2. Analysis: Pass the image through two parallel pre-processor nodes:
    1. M-LSD Lines Preprocessor: Set threshold to detect only structural lines.
    2. Zoe-DepthMap Preprocessor: Generate a high-contrast depth map.
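
The rescale-with-padding step in Phase 1 comes down to a small amount of arithmetic. The sketch below computes only the geometry (resized dimensions and padding per side) and leaves the actual pixel resampling to whatever image library or node the workflow uses. The names Letterbox and letterbox_geometry are assumptions for illustration, and centering the image in the canvas is one possible choice, not something the workflow prescribes.

```python
from typing import NamedTuple


class Letterbox(NamedTuple):
    new_width: int
    new_height: int
    pad_left: int
    pad_top: int
    pad_right: int
    pad_bottom: int


def letterbox_geometry(width: int, height: int, target: int = 1024) -> Letterbox:
    """Fit a width x height image inside a target x target canvas.

    The longer side is scaled to `target`, the shorter side keeps the
    aspect ratio, and the leftover space is split into padding on both sides.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    new_w = max(1, round(width * target / longest))
    new_h = max(1, round(height * target / longest))
    pad_x = target - new_w
    pad_y = target - new_h
    pad_left = pad_x // 2
    pad_top = pad_y // 2
    return Letterbox(new_w, new_h, pad_left, pad_top, pad_x - pad_left, pad_y - pad_top)


# A 1920x1080 listing photo becomes 1024x576 with 224 px bars above and below.
geometry = letterbox_geometry(1920, 1080)
```

Keeping the padding values around is useful later: the same numbers let the final render be cropped back to the original aspect ratio.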

Phase 2: Semantic Masking (The "Wall Selector")

  1. GroundingDINO + SAM: Use a text-based segmenter.
    1. Prompt: "walls, floor, ceiling."
  2. Mask Refinement: Use a Mask Dilate node (2-5 pixels) to ensure the AI "bleeds" slightly into the corners, avoiding ugly seams between the new style and the old structure.
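
To make the Mask Dilate step concrete, here is a minimal pure-Python sketch of binary dilation with a square neighbourhood. It assumes the mask is a nested list of 0/1 values; an actual Mask Dilate node may use a different kernel shape, so treat dilate_mask as a hypothetical stand-in rather than the node's implementation.

```python
from typing import List


def dilate_mask(mask: List[List[int]], radius: int) -> List[List[int]]:
    """Grow a binary mask by `radius` pixels using a square neighbourhood."""
    if radius < 0:
        raise ValueError("radius must be non-negative")
    height = len(mask)
    width = len(mask[0]) if height else 0
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            # Mark every pixel within `radius` of a set pixel, clamped to the image.
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    out[ny][nx] = 1
    return out


# A single set pixel in the middle of a 5x5 mask grows into a 3x3 block.
seed = [[0] * 5 for _ in range(5)]
seed[2][2] = 1
grown = dilate_mask(seed, radius=1)
```

With the 2-5 pixel range suggested above, the mask edge moves a few pixels past the detected wall boundary, which is what lets the new style overlap the corners.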

Phase 3: The K-Sampler Logic (The "Restyler")

  1. Positive Prompting (The Style): Use a LoRA-weighted prompt.
    1. Example: <lora:Interior_Style_Modern_Scandi:0.8>, hyper-realistic interior design, oak wood textures, minimalist furniture, soft sunlight, 8k architectural photography.
  2. ControlNet Integration:
    1. Apply M-LSD ControlNet at a strength of 0.8 (High structural adherence).
    2. Apply Depth ControlNet at a strength of 0.5 (Medium adherence for furniture placement).
  3. Inpainting / Latent Noise:
    1. Set denoising_strength to 0.65-0.75.
    2. Below 0.6, too much of the original "empty" wall survives.
    3. Above 0.8, the sampler may override the ControlNet guidance and hallucinate a new room.
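
The Phase 3 numbers can be kept in one place and checked before a job is queued. The sketch below is one way to do that. RestyleSettings is a hypothetical helper, not a ComfyUI type, and the default denoise of 0.7 is simply the midpoint of the range suggested above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RestyleSettings:
    """Hypothetical container for the Phase 3 knobs."""

    mlsd_strength: float = 0.8   # high structural adherence
    depth_strength: float = 0.5  # medium adherence for furniture placement
    denoise: float = 0.7         # assumed default: midpoint of 0.65-0.75

    def __post_init__(self) -> None:
        if not 0.0 <= self.denoise <= 1.0:
            raise ValueError("denoise must be between 0.0 and 1.0")
        if self.mlsd_strength < 0 or self.depth_strength < 0:
            raise ValueError("ControlNet strengths must be non-negative")

    def warnings(self) -> List[str]:
        """Return human-readable notes for values outside the advised window."""
        notes = []
        if self.denoise < 0.6:
            notes.append("denoise below 0.6: much of the original wall may survive")
        elif self.denoise > 0.8:
            notes.append("denoise above 0.8: the sampler may ignore ControlNet guidance")
        return notes


defaults = RestyleSettings()
aggressive = RestyleSettings(denoise=0.9)
```

Here defaults.warnings() is empty, while aggressive.warnings() carries the note about ControlNet guidance being ignored.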

4. SWOT Analysis of the "Dream Weaver" Workflow

STRENGTHS

  • High Fidelity: M-LSD ensures the "bones" of the house never change.
  • Lightning Speed: RealVisXL V5.0 allows for sub-10 second renders.

WEAKNESSES

  • Hardware Intensive: SDXL + Dual ControlNet + SAM requires at least 12 GB of VRAM.
  • Prompt Sensitivity: Requires specific "Architectural" keywords to avoid looking like a render.

OPPORTUNITIES

  • Custom LoRAs: Can train a LoRA on a developer's specific "Signature Style" or furniture catalog.
  • API Integration: JSON workflows allow this to be the backend for a mobile app.

THREATS

  • Copyright: Ensure the LoRAs used aren't trained on copyrighted photographer assets.
  • Edge Cases: Very dark rooms or highly reflective surfaces can confuse Depth maps.

5. Best Practices & "Gotchas"

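  • Lighting Consistency: Always include "global illumination" or "soft natural light" in the positive prompt so the AI does not create conflicting light sources (e.g., two suns). Terms describing unwanted lighting, such as "unrealistic lighting", belong in the negative prompt.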
  • The "Straight Lines" Rule: Real estate photos are shot at eye level with "verticals" corrected. If the input photo is tilted, the AI will struggle. Use a Perspective Correction node at the start of the workflow.
  • Negative Prompting: This is crucial for RealVisXL.
    • Standard Negative: (worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch), blurry, distorted, deformed, extra windows, unrealistic lighting.
  • JSON Portability: When exporting the workflow, use "API Format" in ComfyUI. Ensure all custom nodes (like Impact Pack for SAM) are version-locked to prevent the internal tool from breaking during updates.
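
As a rough illustration of the API-format point, the sketch below patches an exported workflow before submission. It assumes the API-format export is a JSON object mapping node IDs to entries with "class_type" and "inputs", and that the sampler node exposes the denoising strength as "denoise". The node IDs "3" and "6", the two-node fragment, and the helper names are all hypothetical. A real export contains many more nodes, and the resulting body would typically be POSTed to the ComfyUI server's /prompt endpoint.

```python
import copy
import json
from typing import Any, Dict

Workflow = Dict[str, Dict[str, Any]]


def patch_workflow(
    workflow: Workflow,
    prompt_text: str,
    denoise: float,
    positive_node_id: str,
    sampler_node_id: str,
) -> Workflow:
    """Return a copy of the workflow with a new style prompt and denoise value."""
    patched = copy.deepcopy(workflow)  # never mutate the exported template
    patched[positive_node_id]["inputs"]["text"] = prompt_text
    patched[sampler_node_id]["inputs"]["denoise"] = denoise
    return patched


def build_request_body(workflow: Workflow) -> bytes:
    """Wrap the workflow in the {"prompt": ...} envelope and encode it as JSON."""
    return json.dumps({"prompt": workflow}).encode("utf-8")


# Hypothetical two-node fragment of an API-format export.
template: Workflow = {
    "3": {"class_type": "KSampler", "inputs": {"denoise": 1.0, "positive": ["6", 0]}},
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": ""}},
}
patched = patch_workflow(
    template,
    prompt_text="minimalist furniture, oak wood textures, soft natural light",
    denoise=0.7,
    positive_node_id="6",
    sampler_node_id="3",
)
body = build_request_body(patched)
```

Because the template is deep-copied, the same exported file can serve every request from the mobile app, with only the prompt and denoise value changing per job.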