Animatrix Monolithic SRS - Wan 2.2 Flow Studio
Date: 2026-04-15
Authoring context: This document defines the first production-ready Animatrix system built on top of the existing Desineuron ingress, the current ComfyUI GPU service, and the Wan 2.2 model family.
1. Purpose
Animatrix is a focused product for guided character video generation. It is not a general-purpose node editor. It is a constrained, operator-safe application that exposes two production workflows behind one simple frontend:
- Character Animation and Replacement using Wan2.2-Animate-14B
- Audio-Driven Character Performance using Wan2.2-S2V-14B
The frontend interaction model is inspired by the simplicity and compositional feel of Google Flow, but the execution runtime is ComfyUI-backed and Desineuron-hosted.
The objective is to give users a minimal interface:
- prompt box
- ground-truth starting image upload
- optional reference images and pose sheet uploads
- optional audio upload
- simple mode selection
- one-click generation
while the backend handles:
- asset ingestion
- workflow selection
- parameter validation
- ComfyUI prompt orchestration
- queueing
- status tracking
- result persistence
- streaming-ready delivery
2. Executive Product Truth
Animatrix v1 must be built around the actual Wan 2.2 model split, not a blended assumption.
Capability mapping:
- Wan2.2-Animate-14B is for character animation and character replacement.
- Wan2.2-S2V-14B is for audio-driven video generation with dialogue, singing, and performance.
- Wan2.2 Fun Inp is the Wan family workflow for strict first-frame and last-frame control.
- Wan2.2 Fun Control is the Wan family workflow for stronger control-video inputs such as OpenPose, depth, canny, and trajectory control.
Therefore the first release must not falsely claim that one single model covers all of the following natively:
- character replacement
- motion transfer
- audio lip-sync
- exact first/last-frame constraints
It does not.
The correct v1 product line is:
- Workflow A: Animate Studio on Wan2.2-Animate-14B
- Workflow B: Audio Performance Studio on Wan2.2-S2V-14B
The correct v1.1 or v2 expansion is:
- Workflow C: Start/End Frame Studio on Wan2.2 Fun Inp
- Workflow D: Pose/Trajectory Control Studio on Wan2.2 Fun Control
This distinction is mandatory because it affects UI truthfulness, node graphs, validation rules, asset requirements, and customer expectations.
3. Source Truth and Rationale
This SRS is grounded in the following current sources:
- official Wan 2.2 GitHub repository: https://github.com/Wan-Video/Wan2.2
- official Wan 2.2 Animate model page: https://huggingface.co/Wan-AI/Wan2.2-Animate-14B
- official ComfyUI Wan 2.2 docs:
  - https://docs.comfy.org/tutorials/video/wan/wan2_2
  - https://docs.comfy.org/tutorials/video/wan/wan2-2-animate
  - https://docs.comfy.org/tutorials/video/wan/wan2-2-s2v
  - https://docs.comfy.org/tutorials/video/wan/wan2-2-fun-inp
  - https://docs.comfy.org/tutorials/video/wan/wan2-2-fun-control
- current Desineuron infrastructure truth:
  - [comfyui_setup_truth.md](F:\Workin In Progress\DESINEURON\GITLAB\Project_Velocity.Agent Context\Sprint 1\comfyui_setup_truth.md)
  - [Desineuron Stable Ingress Handoff.md](F:\Workin In Progress\DESINEURON\GITLAB\Project_Velocity.Agent Context\Sprint 1\Desineuron Stable Ingress Handoff.md)
Critical source-backed facts that drive the design:
- ComfyUI is already exposed safely through https://comfy.desineuron.in
- the GPU service already runs behind stable ingress
- Wan2.2-Animate-14B supports two operating modes in ComfyUI docs: Mix and Move
- Wan2.2-S2V-14B is the audio-driven workflow with image plus audio inputs
- ComfyUI’s official Animate docs require additional custom nodes for the full direct workflow
- exact start/end-frame control is documented under Wan2.2 Fun Inp, not Animate
4. Product Vision
Animatrix should behave like a focused video creation surface, not like a research sandbox.
The product promise is:
"Upload a hero frame, optionally attach references, pose guidance, or audio, write a prompt, and generate a directed character video without touching ComfyUI nodes."
The UI must feel lightweight, but the execution system behind it must be opinionated and rigid enough to be supportable.
That means:
- limited number of modes
- strict validation
- controlled presets
- reproducible workflow JSON
- consistent output formats
- no raw-node exposure in the customer-facing frontend
5. Scope
5.1 In Scope for v1
- one frontend
- one backend API
- two ComfyUI production workflows
- status and result tracking
- stable ingress compatibility
- persistent storage for uploads and outputs
- preview and download experience
- operator-oriented logging and troubleshooting
- support for team usage through the existing Desineuron architecture
5.2 Out of Scope for v1
- arbitrary node editing by end users
- live collaborative editing
- in-browser timeline editing
- multi-scene stitching
- automatic sound effects design
- full NLE replacement
- customer-facing batch farms
- fine-tuning or LoRA training
5.3 Deferred but Planned
- first/last-frame exact control as a third workflow using Wan2.2 Fun Inp
- stronger pose or trajectory control using Wan2.2 Fun Control
- style packs and prompt presets
- branded credits or quota system
- user libraries and reusable character packs
6. User Personas
6.1 Internal Creative Operator
This user understands creative direction but should not need to edit node graphs. They need:
- fast iteration
- predictable inputs
- reliable outputs
- access to previous runs
6.2 Sales Demo Operator
This user needs a polished experience that can be shown live. They need:
- simple UX
- low operator error
- dependable queue feedback
- visible result cards
6.3 Technical Media Designer
This user understands reference material quality and wants more control without dropping into raw ComfyUI. They need:
- reference images
- pose sheet upload
- clear mode distinctions
- optional advanced settings
7. Functional Overview
Animatrix v1 will contain one shell product with two generation modes.
7.1 Mode A: Animate Studio
Underlying engine:
Wan2.2-Animate-14B
Primary purpose:
- animate a character from a source image using the motion and expression from a source video
- replace the subject in a video with a new character image
Sub-modes:
- Move
- Mix
User inputs:
- prompt
- ground-truth character image, required
- source motion video, required
- optional reference images
- optional pose sheet image set
- optional aspect preset
- optional duration target
7.2 Mode B: Audio Performance Studio
Underlying engine:
Wan2.2-S2V-14B
Primary purpose:
- generate a character video from a static image and audio input
- support dialogue, singing, and audio-driven performance
User inputs:
- prompt
- ground-truth character image, required
- source audio, required
- optional reference images
- optional pose sheet image set
- optional full-body / half-body framing preset
- optional duration target inferred from audio length
8. Frontend Vision
The frontend must preserve the interaction language shown in the reference screenshots:
- one large prompt composer
- image chips at the top-left of the composer
- plus button for additional attachments
- compact right-aligned mode selector
- advanced settings revealed through a controlled panel, not always visible
The frontend should feel immediate, not enterprise-heavy.
8.1 Core Layout
Top-level zones:
- Attachment rail
- Prompt composer
- Optional advanced drawer
- Generate action and mode switch
- Run history / output gallery below
8.2 Attachment Types
Attachment chips in v1:
- Ground Truth
- Reference
- Pose Sheet
- Audio
- Motion Video
Visibility rules:
- Ground Truth always available
- Motion Video visible only in Animate Studio
- Audio visible only in Audio Performance Studio
- Pose Sheet optional in both modes
- Reference optional in both modes
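The visibility rules can be captured as one small lookup table that both the upload UI and the backend validator consult. The sketch below is illustrative only: the chip names come from section 8.2, while `Mode`, `visible_chips`, and `missing_required` are hypothetical names, and Python is used purely for consistency with the backend recommendation in section 13.2.

```python
from enum import Enum


class Mode(str, Enum):
    ANIMATE = "animate"
    AUDIO = "audio"


# Chips shown in every mode (section 8.2).
_ALWAYS = ["Ground Truth", "Reference", "Pose Sheet"]
# Chips that only appear in one mode.
_MODE_ONLY = {Mode.ANIMATE: ["Motion Video"], Mode.AUDIO: ["Audio"]}
# Required inputs per mode (sections 7.1 and 7.2).
_REQUIRED = {
    Mode.ANIMATE: {"Ground Truth", "Motion Video"},
    Mode.AUDIO: {"Ground Truth", "Audio"},
}


def visible_chips(mode: Mode) -> list:
    """Return the attachment chips the composer should offer for a mode."""
    return _ALWAYS + _MODE_ONLY[mode]


def missing_required(mode: Mode, attached: set) -> list:
    """Return required chips that are not attached yet, sorted for stable display."""
    return sorted(_REQUIRED[mode] - attached)
```

Keeping the table in one place means the pre-submission warning in section 8.4 and the server-side rejection in section 23A cannot drift apart.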
8.3 Frontend Controls
Base controls:
- prompt text area
- optional keyword helper line
- mode toggle: Animate / Audio
- output aspect toggle: 9:16, 16:9, later 1:1
- quality profile: Draft, Standard, High
- generate button
Advanced controls:
- Animate sub-mode: Move / Mix
- target duration
- seed
- negative prompt
- extension segments
- background preservation flag
- relighting flag
- lip-sync intensity or audio adherence preset
8.4 UX Rules
- do not expose raw model names to standard users
- use user language like Animate, Replace Character, Audio Performance
- surface warnings before submission if required inputs are missing
- show asset previews as compact rounded chips
- keep advanced panel collapsed by default
9. Exact Capability Mapping by Workflow
9.1 Workflow A: Animate Studio
Supported in v1:
- character animation from image plus motion video
- character replacement from image plus source video
- prompt conditioning
- optional pose preprocessing
- iterative video extension
Not truly supported by Animate Studio itself:
- direct audio-driven lip sync
- strict start/end-frame guarantees
9.2 Workflow B: Audio Performance Studio
Supported in v1:
- image plus audio driven generation
- prompt-conditioned motion/environment
- dialogue and singing style use cases
- long-form generation by extension chunks
Not truly supported by S2V itself:
- guaranteed subject replacement from an existing motion video
- exact last-frame lock
9.3 Pose Sheet Truth
The user-requested pose sheet can be supported in two ways:
- Soft support in v1
  - pose sheet stored as reference asset
  - backend uses it for prompt augmentation and optional preprocessing assistance
  - operator can map selected sheet frames to manual key pose hints
- Hard support in later release
  - migrate pose guidance to a dedicated Wan2.2 Fun Control or equivalent control-video workflow
The v1 document must state this honestly. A static pose sheet is not the same as a control video. It helps guide generation but does not become full deterministic motion control without an additional preprocessing and control pipeline.
10. Ground Truth Asset Model
The user’s "ground truth" image is the canonical identity anchor.
In both workflows it must serve as:
- the primary subject identity reference
- the default starting visual state
- the basis for preview thumbnails
Rules:
- exactly one primary ground-truth image per run
- image must pass minimum size and aspect checks
- a clean background is preferred but not mandatory
- user may crop or center the character before submission
Optional extension:
- future support for multiple identity references per character pack
11. Workflow Architecture
11.1 System Shape
Browser
  -> Animatrix frontend
    -> Animatrix API
      -> job store
      -> asset store
      -> workflow composer
      -> ComfyUI client
        -> https://comfy.desineuron.in
          -> GPU ComfyUI service
            -> Wan2.2 workflow execution
      -> result collector
      -> output persistence
      -> result CDN / static delivery
11.2 Architectural Rule
The frontend must never submit raw prompts directly to ComfyUI.
The backend must always mediate:
- asset upload
- workflow selection
- workflow JSON parameter binding
- run metadata persistence
- output tracking
This is required for observability, rate control, product safety, and sales-readiness.
12. Ingress and Deployment Compatibility
Animatrix must be designed around the current Desineuron ingress truth.
Current infrastructure constraints:
- ComfyUI is already live at https://comfy.desineuron.in
- ComfyUI runs behind AWS ingress and stable TLS
- GPU private IP is not a stable application contract
- Linux origin is currently 192.168.1.2
12.1 Mandatory Integration Rule
Animatrix backend must integrate with ComfyUI through the stable hostname or through a controlled internal service abstraction that resolves to the same managed route.
Do not bind Animatrix to:
- the GPU public IP
- direct 8188 public traffic
- hardcoded current private IP
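One way to enforce this rule is a guard that runs when the backend loads its configuration. The following is a minimal sketch, assuming the backend reads a single ComfyUI base URL from config; `check_comfy_base_url` is a hypothetical helper, and it implements the strict public-hostname variant (an internal service abstraction might need a looser policy).

```python
import ipaddress
from urllib.parse import urlsplit


def check_comfy_base_url(url: str) -> str:
    """Reject base URLs that bypass the stable ingress; return a normalized URL."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("ComfyUI base URL must use https")
    host = parts.hostname or ""
    if not host:
        raise ValueError("ComfyUI base URL has no hostname")
    try:
        ipaddress.ip_address(host)
    except ValueError:
        pass  # Not an IP literal, which is what we want.
    else:
        raise ValueError("ComfyUI base URL must be a hostname, not an IP address")
    if parts.port == 8188:
        raise ValueError("direct traffic to port 8188 is not allowed")
    return url.rstrip("/")
```

Failing at startup rather than at first job submission keeps a misconfigured deployment from ever reaching a demo.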
12.2 Recommended Host Layout
Recommended public routing:
- animatrix.desineuron.in -> frontend and public product shell
- api.animatrix.desineuron.in or animatrix.desineuron.in/api -> backend API
- comfy.desineuron.in -> internal execution dependency only, not user-facing
If separate subdomains are not created immediately, the fallback deployment pattern may mirror the current Velocity site pattern:
- frontend served from Linux origin through ingress
- backend served from Linux origin through ingress
- backend calls ComfyUI through https://comfy.desineuron.in
13. Runtime Components
13.1 Frontend Application
Responsibilities:
- render simplified generation interface
- manage uploads
- validate user fields before submit
- create job requests
- poll or subscribe to job progress
- render previews and outputs
Suggested stack:
- Next.js or Vite React app
- Tailwind or CSS modules
- upload components with image/audio/video preview
13.2 Animatrix Backend API
Responsibilities:
- receive upload metadata
- store files
- generate canonical run record
- choose workflow template
- bind node inputs
- submit prompt payload to ComfyUI
- track prompt ID and history
- collect generated outputs
- persist result artifacts
Suggested stack:
- FastAPI if aligned with existing Python-heavy operations
- or Node/TypeScript only if the team wants one frontend-backend language
Recommendation:
- use Python FastAPI for v1 if reusing current Desineuron operational style and image/media tooling
13.3 Workflow Composer
Responsibilities:
- keep frozen template JSON files in version control
- inject prompt text, model selections, size, length, and asset paths
- enforce mode-specific constraints
This component must be deterministic. It is not a prompt improviser.
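A minimal sketch of deterministic binding follows, assuming templates are stored in ComfyUI's API prompt format (a mapping of node ID to `class_type` and `inputs`) and that a separate node map names which input each parameter feeds. The node IDs, the parameter names, and the `bind_workflow` helper are hypothetical; a real Wan 2.2 template would have many more nodes.

```python
import copy


def bind_workflow(template: dict, node_map: dict, params: dict) -> dict:
    """Return a new prompt graph with each parameter written to its mapped node input.

    node_map maps a parameter name to a (node_id, input_name) pair.
    The template itself is never mutated.
    """
    missing = sorted(set(node_map) - set(params))
    if missing:
        raise KeyError(f"missing parameters: {missing}")
    unknown = sorted(set(params) - set(node_map))
    if unknown:
        raise KeyError(f"parameters with no node mapping: {unknown}")
    graph = copy.deepcopy(template)
    for name in sorted(node_map):
        node_id, input_name = node_map[name]
        if node_id not in graph:
            raise KeyError(f"node {node_id} not present in template")
        graph[node_id]["inputs"][input_name] = params[name]
    return graph


# Tiny illustrative template; node IDs "3" and "6" are placeholders.
TEMPLATE = {
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "", "clip": ["4", 0]}},
    "3": {"class_type": "KSampler", "inputs": {"seed": 0}},
}
NODE_MAP = {"prompt": ("6", "text"), "seed": ("3", "seed")}

bound = bind_workflow(TEMPLATE, NODE_MAP, {"prompt": "hero walks forward", "seed": 42})
```

Rejecting both missing and unmapped parameters is a design choice: it turns a silent mis-binding into a workflow binding failure that operators can see.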
13.4 ComfyUI Execution Layer
Responsibilities:
- execute pre-approved workflow JSON
- expose queue, prompt, history, upload endpoints
- return output metadata
13.5 Asset Store
Responsibilities:
- raw upload persistence
- normalized derivative generation
- final output video persistence
- preview image generation
Recommended storage split:
- hot local cache on Linux origin
- durable object storage in S3 for long-term retention
13A. Current Infrastructure Contract
Animatrix v1 must be compatible with the currently operating Desineuron media stack as it exists today.
Live execution truth:
- public ComfyUI hostname: https://comfy.desineuron.in
- ingress elastic IP: 98.87.120.120
- GPU private target currently managed behind ingress
- Linux origin currently: 192.168.1.2
Current GPU-side storage truth:
- ComfyUI app root: /opt/dlami/nvme/ComfyUI
- HF cache: /opt/dlami/nvme/hf
- model staging root: /opt/dlami/nvme/model-staging
- model logs: /opt/dlami/nvme/model-logs
Current model hydration truth:
- durable bucket family already in use: s3://project-velocity/models/
- existing Wan hydration prefix: s3://project-velocity/models/Wan2.2-Animate-14B/
Animatrix must not introduce a second contradictory deployment path for ComfyUI. It must reuse this stable route and storage discipline.
13B. ComfyUI API Contract
The backend integration layer must be implemented against the current ComfyUI HTTP contract.
Required endpoints:
- GET /
- POST /prompt
- GET /history/{prompt_id}
- GET /queue
- POST /upload/image
Recommended extension checks:
- health probe against /
- prompt submission response validation
- history polling with bounded backoff
- queue introspection for operator dashboards
The backend must wrap these endpoints in a typed client and must not scatter raw HTTP calls throughout business logic.
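A typed client along these lines keeps the HTTP details in one module. This is a sketch under stated assumptions: the request body shape (`prompt`, `client_id`) and the response fields (`prompt_id`, `node_errors`, history keyed by prompt ID) reflect our understanding of the current ComfyUI server and should be verified against the deployed version. `ComfyClient`, `PromptReceipt`, and the injectable `transport` are hypothetical names. The multipart `POST /upload/image` call and the `GET /` health probe are omitted for brevity.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional

Transport = Callable[[str, str, Optional[dict]], dict]


def http_transport(method: str, url: str, body: Optional[dict]) -> dict:
    """Default transport: send JSON over HTTP and decode a JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


@dataclass(frozen=True)
class PromptReceipt:
    prompt_id: str


class ComfyClient:
    def __init__(self, base_url: str, transport: Transport = http_transport):
        self._base = base_url.rstrip("/")
        self._transport = transport

    def submit_prompt(self, graph: dict, client_id: str) -> PromptReceipt:
        payload = self._transport(
            "POST", f"{self._base}/prompt", {"prompt": graph, "client_id": client_id}
        )
        if payload.get("node_errors"):
            raise ValueError(f"ComfyUI rejected nodes: {payload['node_errors']}")
        prompt_id = payload.get("prompt_id")
        if not isinstance(prompt_id, str) or not prompt_id:
            raise ValueError("prompt submission response has no prompt_id")
        return PromptReceipt(prompt_id)

    def history(self, prompt_id: str) -> Optional[dict]:
        """Return the history entry for a prompt, or None if it is not recorded yet."""
        payload = self._transport("GET", f"{self._base}/history/{prompt_id}", None)
        return payload.get(prompt_id)

    def queue(self) -> dict:
        return self._transport("GET", f"{self._base}/queue", None)
```

Injecting the transport lets the API tests exercise business logic against a fake without touching the GPU service.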
13C. Model and Node Manifest
Workflow A: Animate Studio Required Assets
Required model family:
- Wan2.2-Animate-14B
- clip_vision_h.safetensors
- wan_2.1_vae.safetensors
- umt5_xxl_fp8_e4m3fn_scaled.safetensors
Required custom nodes:
- ComfyUI-KJNodes
- ComfyUI-comfyui_controlnet_aux
Suggested placement contract:
- diffusion model files under ComfyUI/models/diffusion_models/
- text encoder under ComfyUI/models/text_encoders/
- VAE under ComfyUI/models/vae/
- CLIP Vision under ComfyUI/models/clip_vision/
Workflow B: Audio Performance Studio Required Assets
Required model family:
- wan2.2_s2v_14B_fp8_scaled.safetensors or wan2.2_s2v_14B_bf16.safetensors
- wav2vec2_large_english_fp16.safetensors
- wan_2.1_vae.safetensors
- umt5_xxl_fp8_e4m3fn_scaled.safetensors
Suggested placement contract:
- diffusion model under ComfyUI/models/diffusion_models/
- text encoder under ComfyUI/models/text_encoders/
- audio encoder under ComfyUI/models/audio_encoders/
- VAE under ComfyUI/models/vae/
Deferred Workflow Assets
For future strict start/end-frame control:
- Wan2.2 Fun Inp models and optional associated LoRAs
For future stronger pose control:
- Wan2.2 Fun Control
The frontend and API must be written so these workflows can be added later without reworking the entire product shell.
14. File and Repository Blueprint
Animatrix should be structured as an application repository or top-level product directory with explicit separation between app, API, and workflow assets.
Recommended layout:
Animatrix/
  docs/
    Animatrix Monolithic SRS - Wan 2.2 Flow Studio.md
  frontend/
    src/
      app/
      components/
      features/
      lib/
      styles/
  backend/
    app/
      api/
      services/
      models/
      repositories/
      workers/
  workflows/
    animate/
      wan22_animate_mix.json
      wan22_animate_move.json
    s2v/
      wan22_s2v_base.json
    shared/
      prompt_profiles/
      node_maps/
  scripts/
    deploy/
    media/
    sync/
  infra/
    systemd/
    nginx/
    caddy/
  tests/
    api/
    workflows/
    ui/
15. Workflow A Detailed Design: Animate Studio
15.1 Objective
Deliver a workflow that supports:
- character replacement from a source video
- character animation from a performer video
- prompt-guided visual refinement
15.2 Input Contract
Required:
- prompt
- ground_truth_image
- motion_video
- mode: move or mix
Optional:
- reference_images[]
- pose_sheet_images[]
- negative_prompt
- duration_override_seconds
- aspect_ratio
- quality_profile
- seed
15.3 Output Contract
Primary outputs:
- video_mp4
- poster_frame_jpg
- job_manifest.json
- debug_metadata.json
Secondary outputs:
- pose preview if preprocessing is enabled
- first-frame snapshot
15.4 Internal Workflow Stages
- Ingest image and video
- Normalize formats and dimensions
- Extract first frame and thumbnail
- Run optional DWPose or auxiliary preprocessing
- Bind workflow JSON for move or mix
- Upload normalized assets to ComfyUI
- Submit workflow
- Poll queue and history
- Collect result paths
- Persist final outputs and metadata
15.5 ComfyUI Notes
The official Animate workflow requires:
- clip_vision_h.safetensors
- wan_2.1_vae.safetensors
- umt5_xxl_fp8_e4m3fn_scaled.safetensors
- Animate diffusion model
- optional Lightning LoRA
- custom nodes:
  - ComfyUI-KJNodes
  - ComfyUI-comfyui_controlnet_aux
15.6 Product-Level Rule
Animatrix v1 must hide these internals from the standard UI, but the backend and operator docs must track them exactly.
16. Workflow B Detailed Design: Audio Performance Studio
16.1 Objective
Deliver a workflow that supports:
- talking-head and half-body performance
- singing and dialogue use cases
- audio-driven facial and motion synthesis
16.2 Input Contract
Required:
- prompt
- ground_truth_image
- audio_file
Optional:
- reference_images[]
- pose_sheet_images[]
- negative_prompt
- framing_mode: portrait, half_body, full_body
- quality_profile
- seed
16.3 Output Contract
Primary outputs:
- video_mp4
- poster_frame_jpg
- job_manifest.json
- debug_metadata.json
16.4 Internal Workflow Stages
- Ingest image and audio
- Normalize sample rate and file format
- Infer required frame count from audio duration
- Determine required S2V extension chunks
- Bind workflow JSON
- Upload image and audio to ComfyUI
- Submit workflow
- Poll queue and history
- Collect output video
- Persist artifacts
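The frame-count and chunk-count stages above are simple arithmetic. The sketch below assumes a fixed output frame rate and a fixed number of frames per S2V extension chunk; the defaults `fps=16`, `chunk_frames=77`, and `max_chunks=8` are placeholder values that must be confirmed against the frozen workflow template, and `plan_s2v_chunks` is a hypothetical helper.

```python
import math


def plan_s2v_chunks(audio_seconds: float, fps: int = 16,
                    chunk_frames: int = 77, max_chunks: int = 8) -> tuple:
    """Return (total_frames, chunks) needed to cover the audio duration.

    total_frames = ceil(audio_seconds * fps)
    chunks       = ceil(total_frames / chunk_frames)
    """
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    total_frames = math.ceil(audio_seconds * fps)
    chunks = math.ceil(total_frames / chunk_frames)
    if chunks > max_chunks:
        raise ValueError(f"audio needs {chunks} chunks, limit is {max_chunks}")
    return total_frames, chunks


# 12.5 s of audio at 16 fps is 200 frames, which needs 3 chunks of 77 frames.
print(plan_s2v_chunks(12.5))  # (200, 3)
```

The `max_chunks` guard is where the "audio longer than configured maximum duration" rejection from the validation rules would naturally be enforced.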
16.5 ComfyUI Notes
The official S2V workflow requires:
- wan2.2_s2v_14B_fp8_scaled.safetensors or bf16 variant
- wav2vec2_large_english_fp16.safetensors
- wan_2.1_vae.safetensors
- umt5_xxl_fp8_e4m3fn_scaled.safetensors
The ComfyUI docs note that:
- fp8 uses less VRAM
- bf16 may reduce quality degradation
- Lightning LoRA can reduce generation time but can also significantly reduce quality and dynamics
Therefore Animatrix must apply the following defaults per quality profile:
- Standard: fp8 without aggressive LoRA by default for customer-facing quality stability
- Draft: fp8 with acceleration options
- High: bf16 where hardware allows
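These per-profile defaults can live in a small frozen table so that the composer never decides model variants ad hoc. The sketch below is an assumption-laden illustration: the filenames come from section 13C, reading "acceleration options" as "Lightning LoRA enabled" is our interpretation, falling back from High to Standard when bf16 is unavailable is one possible policy, and `S2VProfile` and `resolve_s2v_profile` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class S2VProfile:
    diffusion_model: str
    use_lightning_lora: bool


FP8 = "wan2.2_s2v_14B_fp8_scaled.safetensors"
BF16 = "wan2.2_s2v_14B_bf16.safetensors"

S2V_PROFILES = {
    "draft": S2VProfile(FP8, use_lightning_lora=True),
    "standard": S2VProfile(FP8, use_lightning_lora=False),
    "high": S2VProfile(BF16, use_lightning_lora=False),
}


def resolve_s2v_profile(name: str, bf16_available: bool = True) -> S2VProfile:
    """Map a UI quality profile to concrete model settings."""
    key = name.strip().lower()
    if key not in S2V_PROFILES:
        raise ValueError(f"unknown quality profile: {name!r}")
    if key == "high" and not bf16_available:
        # Assumed policy: degrade to the Standard settings rather than fail.
        return S2V_PROFILES["standard"]
    return S2V_PROFILES[key]
```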
17. UI-to-Workflow Mapping
The UI must map cleanly to backend request objects.
17.1 Shared Fields
- mode
- prompt
- negative_prompt
- ground_truth_asset_id
- reference_asset_ids[]
- pose_sheet_asset_ids[]
- aspect_ratio
- quality_profile
- seed
17.2 Animate-Specific Fields
- motion_video_asset_id
- animate_submode
- background_preservation
- relighting
- extension_segments
17.3 Audio-Specific Fields
- audio_asset_id
- framing_mode
- audio_adherence_profile
- extension_segments
18. Suggested Backend API
18.1 Asset Endpoints
- POST /api/assets/image
- POST /api/assets/video
- POST /api/assets/audio
- GET /api/assets/{asset_id}
18.2 Job Endpoints
- POST /api/jobs/animate
- POST /api/jobs/audio-performance
- GET /api/jobs/{job_id}
- GET /api/jobs/{job_id}/events
- GET /api/jobs/{job_id}/outputs
- POST /api/jobs/{job_id}/cancel
18.3 Admin Endpoints
- GET /api/admin/workflows
- GET /api/admin/health
- GET /api/admin/queue
- POST /api/admin/retry/{job_id}
18.4 WebSocket or SSE Progress Channel
Recommended:
GET /api/jobs/{job_id}/stream
This should emit:
- accepted
- uploaded
- queued
- executing
- collecting_outputs
- completed
- failed
The frontend should use this channel if available and fall back to polling if the connection drops.
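The status list above reads naturally as a small state machine. The sketch below assumes the states are visited strictly in the listed order, with `failed` reachable from any non-terminal state; that ordering is an assumption, not something this document mandates. `JobStatus`, `can_transition`, and `sse_frame` are hypothetical names, and the frame layout follows the standard Server-Sent Events text format.

```python
import json
from enum import Enum


class JobStatus(str, Enum):
    ACCEPTED = "accepted"
    UPLOADED = "uploaded"
    QUEUED = "queued"
    EXECUTING = "executing"
    COLLECTING_OUTPUTS = "collecting_outputs"
    COMPLETED = "completed"
    FAILED = "failed"


_HAPPY_PATH = [
    JobStatus.ACCEPTED, JobStatus.UPLOADED, JobStatus.QUEUED,
    JobStatus.EXECUTING, JobStatus.COLLECTING_OUTPUTS, JobStatus.COMPLETED,
]
_TERMINAL = {JobStatus.COMPLETED, JobStatus.FAILED}


def can_transition(current: JobStatus, nxt: JobStatus) -> bool:
    """Allow only the next happy-path step, or failure from any live state."""
    if current in _TERMINAL:
        return False
    if nxt is JobStatus.FAILED:
        return True
    return _HAPPY_PATH.index(nxt) == _HAPPY_PATH.index(current) + 1


def sse_frame(job_id: str, status: JobStatus) -> str:
    """Format one status update as a Server-Sent Events frame."""
    payload = json.dumps({"job_id": job_id, "status": status.value}, sort_keys=True)
    return f"event: status\ndata: {payload}\n\n"
```

Because the same `JobStatus` values are persisted on the job record, a client that falls back to polling sees exactly the states it would have received on the stream.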
19. Data Model
19.1 Asset
Fields:
- asset_id
- asset_type
- mime_type
- original_filename
- storage_url
- thumbnail_url
- width
- height
- duration_seconds
- size_bytes
- created_at
19.2 Job
Fields:
- job_id
- mode
- workflow_template
- status
- submitted_by
- prompt
- negative_prompt
- settings_json
- comfy_prompt_id
- created_at
- updated_at
19.3 JobOutput
Fields:
- output_id
- job_id
- video_url
- poster_url
- manifest_url
- duration_seconds
- resolution
- fps
- created_at
20. Workflow Template Governance
Workflow JSON must be treated as versioned product assets.
Rules:
- each production workflow JSON must have an immutable version identifier
- node IDs must be mapped in a dedicated config file
- backend parameter injection must never depend on informal manual node lookup
- each workflow change must pass snapshot regression checks
Required metadata for every workflow:
- workflow_name
- workflow_version
- model_family
- required_assets
- required_models
- custom_nodes
- compatible_backend_version
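Two of these governance rules lend themselves to simple automated checks: a completeness check on the metadata, and a content fingerprint that a snapshot regression test can pin. The sketch below is one possible approach, not a prescribed one; using a SHA-256 over canonical JSON as the fingerprint is our assumption, and `missing_metadata` and `workflow_fingerprint` are hypothetical helpers.

```python
import hashlib
import json

REQUIRED_METADATA = (
    "workflow_name", "workflow_version", "model_family", "required_assets",
    "required_models", "custom_nodes", "compatible_backend_version",
)


def missing_metadata(meta: dict) -> list:
    """Return required metadata keys that are absent, in declaration order."""
    return [key for key in REQUIRED_METADATA if key not in meta]


def workflow_fingerprint(graph: dict) -> str:
    """Hash the workflow JSON in a canonical form so key order does not matter."""
    canonical = json.dumps(graph, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A regression test could store the fingerprint next to `workflow_version` and fail whenever the JSON changes without a version bump.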
21. Storage and Delivery Design
21.1 Inputs
Store raw uploads in durable storage with stable references.
Recommended:
- object storage in S3
- local temporary cache for preprocessing
21.2 Outputs
Store:
- mp4 output
- poster image
- optional animated preview
- manifest json
21.3 Delivery
Outputs must be streamable from a public HTTPS origin via ingress.
If using Linux origin:
- serve final assets through nginx under Animatrix public domain
If using S3-backed storage:
- use signed or public-read delivery depending on account mode
22. Quality Profiles
Animatrix must expose productized quality profiles rather than raw step counts to users.
22.1 Draft
Purpose:
- internal ideation
- faster previews
Behavior:
- lower resolution
- lower steps
- acceleration LoRA allowed
22.2 Standard
Purpose:
- most normal production runs
Behavior:
- balanced speed and quality
- conservative defaults
- no quality-destructive shortcuts unless explicitly enabled
22.3 High
Purpose:
- demo and delivery quality
Behavior:
- higher quality model variant when available
- larger resolution
- longer runtime accepted
23. Error Handling
Failure classes:
- missing asset
- invalid asset format
- unsupported aspect ratio
- workflow binding failure
- ComfyUI upload failure
- ComfyUI queue failure
- generation timeout
- result collection failure
User-facing errors must be simplified.
Operator-facing logs must preserve exact failure cause.
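The split between simplified user messages and exact operator logs can be sketched as follows. The failure classes mirror the list above; the message wording, the `FailureClass` enum, and the `report_failure` helper are hypothetical and shown only to illustrate the separation.

```python
import logging
from enum import Enum


class FailureClass(str, Enum):
    MISSING_ASSET = "missing_asset"
    INVALID_ASSET_FORMAT = "invalid_asset_format"
    UNSUPPORTED_ASPECT_RATIO = "unsupported_aspect_ratio"
    WORKFLOW_BINDING_FAILURE = "workflow_binding_failure"
    COMFYUI_UPLOAD_FAILURE = "comfyui_upload_failure"
    COMFYUI_QUEUE_FAILURE = "comfyui_queue_failure"
    GENERATION_TIMEOUT = "generation_timeout"
    RESULT_COLLECTION_FAILURE = "result_collection_failure"


# Only failures the user can act on get a specific message.
_USER_MESSAGES = {
    FailureClass.MISSING_ASSET: "A required file is missing. Please re-attach it.",
    FailureClass.INVALID_ASSET_FORMAT: "One of your files is in an unsupported format.",
    FailureClass.UNSUPPORTED_ASPECT_RATIO: "This aspect ratio is not supported.",
    FailureClass.GENERATION_TIMEOUT: "Generation took too long. Please try again.",
}
_GENERIC_MESSAGE = "Generation failed. Please try again or contact an operator."


def report_failure(job_id: str, failure: FailureClass, detail: str,
                   logger: logging.Logger) -> str:
    """Log the exact cause for operators and return the simplified user message."""
    logger.error("job_id=%s failure=%s detail=%s", job_id, failure.value, detail)
    return _USER_MESSAGES.get(failure, _GENERIC_MESSAGE)
```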
23A. Validation Rules
Shared Validation
- reject empty prompt if prompt is required by the selected workflow profile
- reject missing ground-truth image
- reject unsupported file extensions
- reject files above configured upload limit
Animate Studio Validation
- reject missing motion video
- reject unsupported source video codecs that cannot be normalized
- reject conflicting move and mix settings
Audio Performance Studio Validation
- reject missing audio
- reject audio longer than configured maximum duration for the selected profile
- normalize sample rate before workflow submission
Pose Sheet Validation
- accept only supported image formats
- cap pose sheet image count in v1
- mark pose sheet as "soft guidance" in job metadata unless a later hard-control pipeline is introduced
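A compact validator covering a subset of these rules might look like the sketch below. The extension sets, size limit, and pose sheet cap are placeholder values that belong in deployment configuration, the field names follow the input contracts in sections 15.2 and 16.2, and `check_file` and `validate_job` are hypothetical helpers. Codec inspection and audio duration checks are left out.

```python
import os

# Hypothetical limits and formats; real values belong in configuration.
ALLOWED_EXTENSIONS = {
    "image": {".png", ".jpg", ".jpeg", ".webp"},
    "video": {".mp4", ".mov"},
    "audio": {".wav", ".mp3"},
}
MAX_UPLOAD_BYTES = 200 * 1024 * 1024
MAX_POSE_SHEET_IMAGES = 8


def check_file(kind: str, filename: str, size_bytes: int) -> list:
    """Return validation errors for one uploaded file (empty list means OK)."""
    errors = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS[kind]:
        errors.append(f"unsupported {kind} extension: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        errors.append(f"{filename} exceeds upload limit")
    return errors


def validate_job(mode: str, fields: dict) -> list:
    """Return validation errors for a job request; fields may omit optional keys."""
    errors = []
    if not fields.get("ground_truth_image"):
        errors.append("missing ground-truth image")
    if mode == "animate":
        if not fields.get("motion_video"):
            errors.append("missing motion video")
        if fields.get("mode") not in ("move", "mix"):
            errors.append("animate sub-mode must be exactly one of move or mix")
    elif mode == "audio":
        if not fields.get("audio_file"):
            errors.append("missing audio")
    else:
        errors.append(f"unknown mode: {mode}")
    if len(fields.get("pose_sheet_images", [])) > MAX_POSE_SHEET_IMAGES:
        errors.append("too many pose sheet images")
    return errors
```

Returning a list of errors rather than raising on the first one lets the frontend surface every missing input in a single pre-submission warning.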
24. Observability
Minimum operational telemetry:
- job creation rate
- queue depth
- mean wait time
- mean generation time by workflow
- failure rate by workflow version
- storage growth
- top asset sizes
Required correlation identifiers:
- job_id
- asset_id
- comfy_prompt_id
25. Security and Access Control
Rules:
- do not expose raw ComfyUI publicly to end users as the product surface
- backend owns ComfyUI credentials and workflow orchestration
- validate file size and MIME type on upload
- reject executable uploads
- limit accepted formats
- preserve audit trail for every run
26. Team and Operator UX
The system must support:
- internal team usage through the stable ingress
- supportable operator triage
- easy workflow version rollback
- safe demo usage during sales calls
Operators need:
- admin queue view
- job replay
- access to input and output manifests
- workflow version annotation
27. Non-Functional Requirements
27.1 Reliability
- no direct dependency on ephemeral GPU public IP
- graceful retry around ComfyUI upload and history polling
- job state persisted outside memory
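Graceful retry around history polling can be sketched as a capped exponential backoff. The parameter defaults below are illustrative, not tuned values, and `backoff_delays` and `poll_until` are hypothetical helpers; `fetch` stands in for whatever call returns the history entry or `None` when it is not ready.

```python
import time


def backoff_delays(base=1.0, factor=2.0, cap=15.0, max_attempts=8):
    """Yield max_attempts delays that grow geometrically and never exceed cap."""
    delay = base
    for _ in range(max_attempts):
        yield min(delay, cap)
        delay *= factor


def poll_until(fetch, delays, sleep=time.sleep):
    """Call fetch once per delay; return the first non-None result.

    Sleeps for the given delay after each empty result and raises
    TimeoutError once the delays are exhausted.
    """
    for delay in delays:
        result = fetch()
        if result is not None:
            return result
        sleep(delay)
    raise TimeoutError("result not available after bounded polling")
```

Bounding both the delay and the attempt count is what lets a stuck job surface as a generation timeout instead of polling forever.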
27.2 Performance
- fast upload validation
- async polling and result collection
- cached thumbnails
27.3 Scalability
- workflow templates stateless
- API horizontally scalable
- storage externalized
27.4 Maintainability
- one source-of-truth workflow config per mode
- explicit model manifest
- no hidden hand-edited production JSON
27.5 Sales Readiness
- stable hostname
- reliable queue messaging
- polished success and failure states
- deterministic demo inputs
27A. Demo and Commercial Readiness Requirements
Animatrix will be used in live demos and pre-sales conversations. That changes the bar.
Required product behavior:
- first meaningful UI paint fast enough for live sales use
- one-click sample project loading for demo mode
- clear progress messaging during long generations
- shareable output URL or operator download path
- no raw ComfyUI terminology in the customer-facing layer unless explicitly in admin mode
Required operator support behavior:
- known-good demo assets packaged and versioned
- visible warning when GPU queue is saturated
- ability to retry a failed job without recreating all metadata manually
28. MVP Acceptance Criteria
Animatrix v1 is only considered complete when all of the following are true:
- A user can upload a ground-truth image, type a prompt, attach a motion video, select Move or Mix, and receive a finished video output.
- A user can upload a ground-truth image, type a prompt, attach audio, and receive an audio-driven character video.
- Both flows work through the stable Desineuron ingress model and do not depend on hardcoded GPU IPs.
- Every run produces a persisted job record and output manifest.
- Generated videos are streamable over HTTPS.
- Operators can inspect job state and correlate product job ID to ComfyUI prompt ID.
- The UI remains simple enough for a non-technical demo operator.
29. Explicit Product Decisions
29.1 What v1 Must Say No To
Animatrix v1 must not claim:
- perfect deterministic pose-sheet control
- exact first and last frame locking
- full timeline editing
- full audio mastering
29.2 What v1 Must Say Yes To
Animatrix v1 can truthfully claim:
- guided character animation
- guided character replacement
- audio-driven talking or performance video
- reference-assisted generation
- production-safe simplified UI on top of ComfyUI
30. Recommended Delivery Phases
Phase 1
- backend skeleton
- asset model
- one frozen Animate workflow
- one frozen S2V workflow
- barebones frontend
Phase 2
- quality profiles
- operator dashboard
- output gallery
- S3 persistence
Phase 3
- first/last-frame workflow
- stronger pose control
- reusable character libraries
31. Final Architecture Recommendation
Build Animatrix as a thin product layer over stable infrastructure that already exists:
- keep ComfyUI where it is
- keep ingress where it is
- add a dedicated Animatrix backend
- keep the frontend intentionally minimal
- treat workflow JSON as versioned software artifacts
Do not begin by building a large generic creative suite.
Build the narrowest saleable product first:
- Animate Studio
- Audio Performance Studio
Then expand to:
- Start/End Frame Studio
- Pose Control Studio
32. Bottom Line
Animatrix v1 should be a Flow-like creative surface backed by two real Wan 2.2 workflows, not one imaginary super-workflow.
The correct implementation target is:
- one frontend
- one orchestration backend
- two workflow families
- one stable ingress-compatible execution path
- one durable output system
If the team follows this document strictly, the result will be productizable, supportable, and compatible with the current Desineuron infrastructure without lying about model capabilities.