#3 Self-approved and unit tests passed with flying colors. Co-authored-by: Sagnik <sagnik7896@gmail.com> Reviewed-on: #5
1\. Executive Summary: The "Dream Weaver" Objective

The goal is to move beyond simple image-to-image generation, which often "hallucinates" new walls or windows. "Dream Weaver" uses **Structural Constraint Logic** so that while the furniture, wallpaper, and flooring change, the **physical dimensions, window placements, and vanishing points** of the original room stay true to the real-world property.

---

2\. Technical Architecture & Component Research

A. The Foundation: RealVisXL V5.0 (Lightning)

* **Why:** Unlike Juggernaut (which leans cinematic), RealVisXL ([https://civitai.com/models/139562?modelVersionId=789646](https://civitai.com/models/139562?modelVersionId=789646)) is geared toward photorealism, which suits architectural photography. It handles the white balance of a real room well and doesn't over-saturate colors.
* **V5.0 Lightning Advantage:** It allows for high-quality generation in just 4–8 steps, making the "visualizer" tool feel snappy and responsive for the end user.

B. The Guidance Layer: Dual-ControlNet Strategy

To preserve geometry, a single ControlNet is rarely enough. We will use a **stacked approach**:

1. **M-LSD (Mobile Line Segment Detection):** Best for architecture. It identifies straight lines (ceiling joints, floor corners, door frames), which prevents the walls from "bending."
2. **Depth (Zoe or MiDaS):** Provides the model with a per-pixel depth map of the room, so that a new rug placed on the floor correctly recedes into the distance.

C. The Isolation Layer: SAM (Segment Anything Model)

* **Purpose:** We don't want to change the view out of the window, or a specific crown molding if it's a selling point.
* **Implementation:** SAM lets the workflow "mask" specific areas (e.g., *only* the back wall) so the AI repaints only the pixels within that mask.
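
The masking idea can be illustrated without any model at all. The sketch below is a toy, pure-Python composite on tiny grayscale grids; the `composite` helper and the sample data are hypothetical and only show what "repaint only the pixels within the mask" means.

```python
from typing import List

Image = List[List[int]]   # toy grayscale image: rows of pixel values
Mask = List[List[bool]]   # True = pixel may be repainted


def composite(original: Image, generated: Image, mask: Mask) -> Image:
    """Take generated pixels inside the mask and original pixels outside it."""
    if not (len(original) == len(generated) == len(mask)):
        raise ValueError("original, generated and mask must have the same height")
    result: Image = []
    for o_row, g_row, m_row in zip(original, generated, mask):
        if not (len(o_row) == len(g_row) == len(m_row)):
            raise ValueError("original, generated and mask must have the same width")
        result.append([g if m else o for o, g, m in zip(o_row, g_row, m_row)])
    return result


# Hypothetical 2x3 example: only the "back wall" pixels are replaced.
original = [[10, 10, 10], [10, 10, 10]]
generated = [[99, 99, 99], [99, 99, 99]]
back_wall = [[False, True, True], [False, False, True]]
result = composite(original, generated, back_wall)
```

In the real workflow the mask comes from SAM and the compositing happens inside the inpainting nodes; the toy version only makes the pixel-level contract explicit.
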

---

3\. Implementation Guide: Step-by-Step Build

Phase 1: Input & Pre-Processing

1. **Image Load & Rescale:** The input image must be scaled to **1024×1024** (SDXL native), preserving the aspect ratio by padding the shorter side.
2. **Analysis:** Pass the image through two parallel pre-processor nodes:
    1. `M-LSD Lines Preprocessor`: Set the thresholds so that only structural lines are detected.
    2. `Zoe-DepthMap Preprocessor`: Generate a high-contrast depth map.
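
The rescale-with-padding step comes down to a little arithmetic. Here is a minimal sketch, assuming a centered letterbox; the `letterbox_params` helper and its field names are hypothetical, not part of any ComfyUI node.

```python
from typing import NamedTuple


class Letterbox(NamedTuple):
    new_width: int
    new_height: int
    pad_left: int
    pad_top: int
    pad_right: int
    pad_bottom: int


def letterbox_params(width: int, height: int, target: int = 1024) -> Letterbox:
    """Scale the longer side to `target`, then pad the shorter side symmetrically."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    # Round to the nearest pixel; never let a very thin image collapse to 0.
    new_w = max(1, round(width * target / longest))
    new_h = max(1, round(height * target / longest))
    pad_x = target - new_w
    pad_y = target - new_h
    left, top = pad_x // 2, pad_y // 2
    return Letterbox(new_w, new_h, left, top, pad_x - left, pad_y - top)


# A 1920x1080 listing photo becomes 1024x576 with 224 px of padding above and below.
params = letterbox_params(1920, 1080)
```

Keeping the padding offsets around is useful later: the same numbers tell you which region to crop out of the generated 1024×1024 result before scaling it back to the original size.
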

Phase 2: Semantic Masking (The "Wall Selector")

1. **GroundingDINO \+ SAM:** Use a text-based segmenter.
    1. *Prompt:* "walls, floor, ceiling."
2. **Mask Refinement:** Use a `Mask Dilate` node (2–5 pixels) to ensure the AI "bleeds" slightly into the corners, avoiding ugly seams between the new style and the old structure.
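
To make the dilation step concrete, the following pure-Python sketch grows a binary mask by a given radius. It assumes a square neighbourhood, which may differ from the kernel the actual `Mask Dilate` node uses; the `dilate` helper is hypothetical.

```python
from typing import List

Mask = List[List[int]]  # 1 = repaint, 0 = keep


def dilate(mask: Mask, radius: int) -> Mask:
    """Grow every masked pixel by `radius` pixels in all directions (square kernel)."""
    if radius < 0:
        raise ValueError("radius must be non-negative")
    height = len(mask)
    width = len(mask[0]) if height else 0
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                # Mark the clipped (2 * radius + 1)-wide square around this pixel.
                for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                        out[ny][nx] = 1
    return out


wall = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
grown = dilate(wall, 1)  # the single masked pixel becomes a 3x3 block
```

A radius of 2 to 5 pixels on a 1024×1024 image is a thin band, which is why it hides seams without letting the repaint eat into neighbouring surfaces.
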

Phase 3: The K-Sampler Logic (The "Restyler")

1. **Positive Prompting (The Style):** Use a LoRA-weighted prompt.
    1. *Example:* `<lora:Interior_Style_Modern_Scandi:0.8>, hyper-realistic interior design, oak wood textures, minimalist furniture, soft sunlight, 8k architectural photography.`
2. **ControlNet Integration:**
    1. Apply **M-LSD ControlNet** at a strength of **0.8** (high structural adherence).
    2. Apply **Depth ControlNet** at a strength of **0.5** (medium adherence for furniture placement).
3. **Inpainting / Latent Noise:**
    1. Set the denoising strength (`denoising_strength`, exposed as `denoise` on ComfyUI's KSampler node) to **0.65–0.75**.
    2. Below 0.6, too much of the original "empty" wall is kept.
    3. Above 0.8, the sampler may override the ControlNet guidance and hallucinate a new room.
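
As a rough illustration, the stacked ControlNets and the sampler settings above could look like this in ComfyUI's API-format JSON, built here as a Python dict. This is a fragment, not a full workflow. The node ids, the ControlNet filenames, and the `cfg`, sampler, and scheduler values are all assumptions. The class names and input keys reflect stock ComfyUI as I understand it, so check them against your installation.

```python
import json
from typing import Any, Dict


def build_restyle_nodes(
    denoise: float = 0.7,
    mlsd_strength: float = 0.8,
    depth_strength: float = 0.5,
    steps: int = 6,
) -> Dict[str, Dict[str, Any]]:
    """Return the ControlNet + sampler part of an API-format workflow.

    Links are [node_id, output_index]. Ids "1" (checkpoint), "4"/"5"
    (positive/negative text encoders), "6"/"7" (M-LSD and depth preprocessor
    images) and "8" (masked latent) refer to nodes that are NOT defined here.
    """
    if not 0.0 < denoise <= 1.0:
        raise ValueError("denoise must be in (0, 1]")
    return {
        # Hypothetical filenames: use whatever ControlNet files you have installed.
        "10": {"class_type": "ControlNetLoader",
               "inputs": {"control_net_name": "mlsd_sdxl.safetensors"}},
        "11": {"class_type": "ControlNetLoader",
               "inputs": {"control_net_name": "depth_sdxl.safetensors"}},
        # First ControlNet: line structure applied to the positive conditioning.
        "12": {"class_type": "ControlNetApply",
               "inputs": {"conditioning": ["4", 0], "control_net": ["10", 0],
                          "image": ["6", 0], "strength": mlsd_strength}},
        # Second ControlNet: chained onto the output of the first (the "stack").
        "13": {"class_type": "ControlNetApply",
               "inputs": {"conditioning": ["12", 0], "control_net": ["11", 0],
                          "image": ["7", 0], "strength": depth_strength}},
        # cfg, sampler_name and scheduler are placeholder values, not recommendations.
        "14": {"class_type": "KSampler",
               "inputs": {"model": ["1", 0], "seed": 0, "steps": steps,
                          "cfg": 1.5, "sampler_name": "dpmpp_sde",
                          "scheduler": "karras", "positive": ["13", 0],
                          "negative": ["5", 0], "latent_image": ["8", 0],
                          "denoise": denoise}},
    }


nodes = build_restyle_nodes()
fragment_json = json.dumps(nodes, indent=2)
```

The point of the sketch is the chaining: the depth node takes the M-LSD node's output as its conditioning, so both constraints reach the sampler through a single `positive` input.
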

---

4\. SWOT Analysis of the "Dream Weaver" Workflow

| STRENGTHS | WEAKNESSES |
| :---- | :---- |
| **High Fidelity:** M-LSD keeps the "bones" of the house from changing. | **Hardware Intensive:** SDXL \+ Dual ControlNet \+ SAM requires at least 12 GB of VRAM. |
| **Lightning Speed:** RealVisXL V5.0 allows for sub-10-second renders. | **Prompt Sensitivity:** Requires specific "architectural" keywords to avoid looking like a render. |
| **OPPORTUNITIES** | **THREATS** |
| **Custom LoRAs:** Can train a LoRA on a developer's specific "Signature Style" or furniture catalog. | **Copyright:** Ensure the LoRAs used aren't trained on copyrighted photographer assets. |
| **API Integration:** JSON workflows allow this to be the backend for a mobile app. | **Edge Cases:** Very dark rooms or highly reflective surfaces can confuse depth maps. |

---

5\. Best Practices & "Gotchas"

* **Lighting Consistency:** Always include "global illumination" or "soft natural light" in the positive prompt (terms placed in the negative prompt are suppressed, not encouraged) to avoid the AI creating conflicting light sources (e.g., two suns).
* **The "Straight Lines" Rule:** Real estate photos are shot at eye level with the verticals corrected. If the input photo is tilted, the AI will struggle, so use a **Perspective Correction** node at the start of the workflow.
* **Negative Prompting:** This is crucial for RealVisXL.
    * *Standard Negative:* `(worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch), blurry, distorted, deformed, extra windows, unrealistic lighting.`
* **JSON Portability:** When exporting the workflow, use **"API Format"** in ComfyUI. Pin all custom nodes (like Impact Pack for SAM) to fixed versions so that updates don't break the internal tool.
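
Once a workflow is exported in API format, a backend can queue it over HTTP. The sketch below only builds the request and does not send it. It assumes a local ComfyUI server on the default `127.0.0.1:8188` and its `/prompt` endpoint; the `build_queue_request` helper, the `client_id` value, and the one-node workflow are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict


def build_queue_request(
    workflow: Dict[str, Any],
    server: str = "http://127.0.0.1:8188",
    client_id: str = "dream-weaver",
) -> urllib.request.Request:
    """Wrap an API-format workflow in a POST request for ComfyUI's /prompt endpoint."""
    payload = json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{server}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder workflow; in practice, load the exported API-format JSON file instead.
workflow = {"1": {"class_type": "LoadImage", "inputs": {"image": "room.png"}}}
request = build_queue_request(workflow)

# With a running server, the request would be sent with:
#     urllib.request.urlopen(request)
```

Separating "build" from "send" keeps the payload easy to unit-test, which helps when custom-node updates change what the server accepts.
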

---