Desineuron Stable Ingress Handoff
Date: 2026-04-08
Chapters
- Outcome
- Final Architecture
- AWS Resources
- Linux Origin State
- Migration Changes Applied
- Validation Results
- ComfyUI Recovery and GPU Route
- Files and Config Artifacts
- Dynamic Home IP Sync
- Operational Commands
- Future Service Mapping Runbook
- Security Notes
- Remaining Improvement Ideas
- Rollback
- Team Summary
- Current Status Snapshot - 2026-04-12
- Linux Ops Control Plane
Outcome
The Cloudflare Tunnel dependency for the six public desineuron.in services has been replaced with a self-hosted AWS ingress layer:
- Public edge: AWS EC2 t4g.micro
- Stable public IP: 98.87.120.120
- TLS termination: Caddy on the ingress node
- Private backend relay: rathole
- Origin: Linux box at 192.168.1.4
- DNS: Cloudflare, DNS only
Public hostnames now route through AWS instead of Cloudflare Tunnel:
- office.desineuron.in
- git.desineuron.in
- cloud.desineuron.in
- projects.desineuron.in
- talk.desineuron.in
- vpn.desineuron.in
- comfy.desineuron.in (ingress route created for AWS GPU ComfyUI)
- ops.desineuron.in (private operator control surface on the Linux box)
Final Architecture
Internet
-> Cloudflare DNS
-> 98.87.120.120
-> EC2 ingress: desineuron-ingress-01
-> Caddy :443
-> rathole server (control on 2333, local relay on 127.0.0.1:8443)
-> Linux origin tunnel client
-> Linux nginx :443
-> per-host upstream routing
  -> Gitea
  -> Nextcloud
  -> Taiga
  -> OnlyOffice
  -> NetBird
-> comfy.desineuron.in
  -> EC2 ingress Caddy
  -> private proxy to AWS GPU box `172.31.46.190:8188`
  -> ComfyUI endpoints on systemd-managed GPU service
AWS Resources
- Instance name: desineuron-ingress-01
- Instance ID: i-094df09acafb72494
- Type: t4g.micro
- Region: us-east-1
- Subnet: subnet-03d684ed15f327151
- VPC: vpc-081d2397920aad268
- Root disk: 20 GB gp3
- Elastic IP: 98.87.120.120
- IAM role: desineuron-ingress-role
- Instance profile: desineuron-ingress-profile
- Security group: sg-0721b8b48e12c531d
Current GPU worker:
- Instance ID: i-0e4eab5fe67cf9abe
- Type: g6.12xlarge
- Region: us-east-1
- Private IP: 172.31.46.190
- Current public IP: 100.31.64.121
- Launch time: 2026-04-11T06:14:04Z
Open ingress ports:
- 80/tcp from internet
- 443/tcp from internet
- 22/tcp restricted to the current home public IP and auto-synced from the Linux origin
- 2333/tcp from internet for rathole control and data relay
GPU node security posture for ComfyUI:
- public 8118/tcp removed
- public 8188/tcp removed
- 8188/tcp now allowed only from ingress security group sg-0721b8b48e12c531d
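The posture above can be spot-checked from a security group's rule list. The following is a minimal sketch, not part of the repo: the helper name `ports_open_to_world` is hypothetical, and it assumes the rules are supplied in the `IpPermissions` shape that the EC2 `describe_security_groups` API returns.

```python
from typing import Iterable, Set

# CIDRs that mean "the whole internet" for IPv4 and IPv6.
WORLD_CIDRS = {"0.0.0.0/0", "::/0"}


def ports_open_to_world(ip_permissions: Iterable[dict], ports: Iterable[int]) -> Set[int]:
    """Return the subset of `ports` that some rule exposes to the whole internet."""
    wanted = set(ports)
    exposed: Set[int] = set()
    for rule in ip_permissions:
        cidrs = {r["CidrIp"] for r in rule.get("IpRanges", [])}
        cidrs |= {r["CidrIpv6"] for r in rule.get("Ipv6Ranges", [])}
        if not cidrs & WORLD_CIDRS:
            continue  # rule is scoped to specific CIDRs or to security groups
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            exposed |= wanted  # "all traffic" rule covers every port
        elif protocol == "tcp":
            low, high = rule["FromPort"], rule["ToPort"]
            exposed |= {p for p in wanted if low <= p <= high}
    return exposed
```

For the GPU node, an empty result for ports 8118 and 8188 would match the state described here. The GPU node's own security group ID is not recorded in this document, so it has to be looked up before running such a check.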
Linux Origin State
Services exposed to local nginx:
- git.desineuron.in -> 127.0.0.1:3000 (gitea)
- cloud.desineuron.in -> 127.0.0.1:11000 (nextcloud_app)
- talk.desineuron.in -> 127.0.0.1:11000 (nextcloud_app, Talk-focused hostname)
- projects.desineuron.in -> 127.0.0.1:9100 (taiga-gateway)
- office.desineuron.in -> 127.0.0.1:9980 (nextcloud_onlyoffice)
- vpn.desineuron.in -> 127.0.0.1:8080 / 127.0.0.1:8081 (netbird)
Tunnel state:
- rathole-client.service active on Linux
- rathole-server.service active on AWS
- cloudflared inactive on Linux
Migration Changes Applied
Cloudflare
Old CNAME tunnel records were removed for the six public hostnames.
New records were created:
- Type: A
- Value: 98.87.120.120
- Proxy status: DNS only
- TTL: 300
AWS Ingress
Installed and configured:
- Caddy
- rathole
- amazon-ssm-agent
- Linux-driven SSH allowlist sync for the ingress node
TLS:
- Existing valid certificate/key pair from the Linux origin was copied to the ingress node.
- Caddy now terminates HTTPS at the edge.
Linux Origin
nginx was already routing by hostname and remains the origin router.
Nextcloud was adjusted so talk.desineuron.in no longer canonicalizes to cloud.desineuron.in:
- removed overwritehost pin
- added talk.desineuron.in to trusted domains
- restarted nextcloud_app
Validation Results
Public hostname checks through the new ingress:
- office.desineuron.in -> 200 /welcome/
- git.desineuron.in -> 200
- cloud.desineuron.in -> 200 /login
- projects.desineuron.in -> 200
- talk.desineuron.in -> 200 /login on talk.desineuron.in
- vpn.desineuron.in -> 200
- ops.desineuron.in/login -> 200
- comfy.desineuron.in -> 200
Important note:
- talk.desineuron.in now stays on the talk hostname.
- It is still backed by the same Nextcloud origin and presents the Nextcloud login flow, which is expected given the current Linux-side app layout.
ComfyUI Recovery and GPU Route
Root cause of the earlier 502:
- ingress route and TLS were correct
- the GPU spot node had lost the actual /opt/dlami/nvme/ComfyUI app tree
- nothing was listening on 172.31.46.190:8188
Permanent fix applied:
- restored /opt/dlami/nvme/ComfyUI from upstream source control
- installed ComfyUI Python requirements on the GPU node
- created systemd unit comfyui.service
- enabled comfyui.service at boot with automatic restart
- kept comfy.desineuron.in mapped through ingress Caddy
- removed direct public access to 8118 and 8188
- allowed 8188 only from ingress security group
Current live path:
https://comfy.desineuron.in -> ingress 98.87.120.120 -> Caddy reverse proxy -> GPU private IP 172.31.46.190:8188 -> comfyui.service
Current public result:
- comfy.desineuron.in currently returns 200 OK
- ingress route is now managed dynamically instead of hardcoded to one GPU private IP
Current GPU service:
- comfyui.service
- app path: /opt/dlami/nvme/ComfyUI
- log path: /var/log/comfyui/service.log
- port: 8188/tcp
Current backend state on 2026-04-12:
- comfyui.service is active
- main.py is present under /opt/dlami/nvme/ComfyUI
- the process is listening on 0.0.0.0:8188
- the public ingress path is healthy again
Auto-healing fix applied:
- ComfyUI systemd service now runs an ExecStartPre recovery script at /usr/local/bin/desineuron-ensure-comfyui.sh
- that script reclones/repairs /opt/dlami/nvme/ComfyUI if the tree is missing or damaged
- Linux now runs desineuron-comfy-route-sync.timer
- the timer updates the managed Caddy route for comfy.desineuron.in to the current private IP of the AWS instance tagged DesineuronRole=comfyui
- this protects the public route from GPU instance IP drift without manual Caddy edits
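The "missing or damaged" decision made before service start can be illustrated as follows. This is only a sketch in Python, not the contents of desineuron-ensure-comfyui.sh (which is a shell script and is not reproduced here). It assumes "damaged" means that an expected file such as main.py is absent; the function name and the `required` parameter are hypothetical.

```python
from pathlib import Path
from typing import Sequence


def comfyui_tree_needs_repair(app_dir: Path, required: Sequence[str] = ("main.py",)) -> bool:
    """Return True when the app tree is missing or lacks a required file.

    Assumption: a tree counts as damaged if any name in `required` is not a
    regular file directly under `app_dir`. The real script may check more.
    """
    if not app_dir.is_dir():
        return True  # whole tree is gone, e.g. after a spot reclaim wiped the NVMe volume
    return any(not (app_dir / name).is_file() for name in required)
```

A recovery step would call this with `Path("/opt/dlami/nvme/ComfyUI")` and reclone only when it returns True, which keeps a healthy start fast.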
Expected endpoints:
- https://comfy.desineuron.in/
- https://comfy.desineuron.in/prompt
- https://comfy.desineuron.in/history/{prompt_id}
- https://comfy.desineuron.in/queue
- https://comfy.desineuron.in/upload/image
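A client for the endpoints above might build its requests like this. It is a hedged sketch: the JSON body shape `{"prompt": <workflow>}` reflects the stock ComfyUI HTTP API as commonly documented and should be verified against the deployed version. The helper names are hypothetical, and the workflow content is a placeholder.

```python
import json
import urllib.request

BASE_URL = "https://comfy.desineuron.in"


def build_prompt_request(workflow: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST to /prompt that queues a workflow."""
    # Assumption: the server expects the workflow graph under the "prompt" key.
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def history_url(prompt_id: str, base_url: str = BASE_URL) -> str:
    """Return the URL that reports the result of a queued prompt."""
    return f"{base_url}/history/{prompt_id}"
```

Sending the request with `urllib.request.urlopen(req)` and then polling `history_url(...)` with the returned prompt ID is the usual flow. Building the request separately from sending it keeps the sketch testable without touching the live GPU node.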
Files and Config Artifacts
Infrastructure artifacts in repo:
- README.md
- Caddyfile
- rathole-server.toml
- rathole-client.toml
- install_linux_rathole_client.sh
- user_data.sh
- install_gpu_comfyui_service.sh
- map_gpu_comfy_security.ps1
- sync_ingress_home_ip.py
- desineuron-ingress-home-ip-sync.service
- desineuron-ingress-home-ip-sync.timer
- install_linux_ingress_ip_sync.sh
- sync_comfy_route.py
- desineuron-comfy-route-sync.service
- desineuron-comfy-route-sync.timer
- install_linux_comfy_route_sync.sh
- README.md
- Desineuron Ops Control Plane Bibel.md
Linux origin files touched:
- /etc/nginx/sites-enabled/desineuron.conf
- /mnt/ServerStorage/docker_apps/nextcloud/.env
- /mnt/ServerStorage/docker_apps/nextcloud/data/config/config.php
- /mnt/ServerStorage/docker_apps/nextcloud/data/config/reverse-proxy.config.php
Backups created on Linux:
- /mnt/ServerStorage/docker_apps/nextcloud/.env.pre_ingress_backup_2026-04-08
- /mnt/ServerStorage/docker_apps/nextcloud/data/config/reverse-proxy.config.php.pre_ingress_backup_2026-04-08
Dynamic Home IP Sync
Purpose:
- Keep ingress 22/tcp restricted to the current Airtel public IP even when the ISP changes it
- Prevent future manual outages for SSH fallback caused by stale home-IP security-group rules
Design:
- Linux origin runs desineuron-ingress-home-ip-sync.timer
- Timer fires on boot and every 5 minutes
- Service resolves the current home public IP via https://api.ipify.org
- Service updates only the ingress security group sg-0721b8b48e12c531d
- Only the SSH fallback rule is mutated
- rathole is no longer dependent on the Airtel IP because 2333/tcp remains open on the ingress
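The core of that design can be sketched as a small planning step followed by two EC2 calls. This is an illustration under stated assumptions, not the contents of sync_ingress_home_ip.py: the function names are hypothetical, and `ec2` is assumed to be a boto3 EC2 client (or anything exposing the same `authorize_security_group_ingress` and `revoke_security_group_ingress` methods).

```python
import ipaddress
from typing import Iterable, List, Tuple


def plan_ssh_rule_update(current_cidrs: Iterable[str], home_ip: str) -> Tuple[List[str], List[str]]:
    """Return (to_revoke, to_authorize) so that only home_ip/32 stays allowed on 22/tcp."""
    # IPv4Address() rejects garbage, so a bad lookup response never reaches AWS.
    wanted = f"{ipaddress.IPv4Address(home_ip.strip())}/32"
    current = set(current_cidrs)
    to_revoke = sorted(c for c in current if c != wanted)
    to_authorize = [] if wanted in current else [wanted]
    return to_revoke, to_authorize


def ssh_permission(cidr: str) -> dict:
    """Describe a single 22/tcp rule in the IpPermissions shape used by EC2."""
    return {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22, "IpRanges": [{"CidrIp": cidr}]}


def apply_plan(ec2, group_id: str, to_revoke: List[str], to_authorize: List[str]) -> None:
    """Apply the plan; authorize first so SSH is never left without a valid rule."""
    for cidr in to_authorize:
        ec2.authorize_security_group_ingress(GroupId=group_id, IpPermissions=[ssh_permission(cidr)])
    for cidr in to_revoke:
        ec2.revoke_security_group_ingress(GroupId=group_id, IpPermissions=[ssh_permission(cidr)])
```

When the recorded IP already matches, the plan is empty and no AWS call is made, which suits a timer that fires every 5 minutes.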
Installed Linux paths:
- /usr/local/bin/sync_ingress_home_ip.py
- /etc/systemd/system/desineuron-ingress-home-ip-sync.service
- /etc/systemd/system/desineuron-ingress-home-ip-sync.timer
- /etc/desineuron-ingress-home-ip-sync.env
- /opt/desineuron-ingress-ip-sync/.venv
- /var/lib/desineuron-ingress-ip-sync/current_ip.txt
Current state:
- Timer: enabled and active
- Last recorded home public IP: 223.185.28.89
- Ingress SSH rule CIDR: 223.185.28.89/32
Dynamic Comfy Route Sync
Purpose:
- keep comfy.desineuron.in mapped to the correct AWS GPU private IP even if the GPU instance public/private IP changes
- remove the need to hand-edit /etc/caddy/Caddyfile for ComfyUI moves
Design:
- Linux runs desineuron-comfy-route-sync.timer
- timer fires on boot and every 2 minutes
- service looks for the newest running EC2 instance tagged DesineuronRole=comfyui
- service reads its current private IP
- service connects to the ingress node and updates the managed Caddy route with /usr/local/bin/manage_desineuron_routes.py
- Caddy is validated and reloaded only after a successful route update
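The target-selection step in that design can be sketched as a pure function over an EC2 `describe_instances` response. This is an assumption-labeled illustration rather than the code in sync_comfy_route.py: the function name is hypothetical, and the input is assumed to have the `Reservations` / `Instances` shape that boto3 returns, with `LaunchTime` as a datetime.

```python
from typing import Optional


def newest_running_private_ip(
    describe_response: dict,
    tag_key: str = "DesineuronRole",
    tag_value: str = "comfyui",
) -> Optional[str]:
    """Return the private IP of the most recently launched running instance with the tag."""
    candidates = []
    for reservation in describe_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
            if tags.get(tag_key) != tag_value:
                continue
            if instance.get("State", {}).get("Name") != "running":
                continue  # skip pending, stopping, and terminated workers
            if "PrivateIpAddress" not in instance:
                continue
            candidates.append(instance)
    if not candidates:
        return None  # no target: leave the existing Caddy route untouched
    newest = max(candidates, key=lambda inst: inst["LaunchTime"])
    return newest["PrivateIpAddress"]
```

Returning None when nothing matches lets the caller skip the route update entirely, so a brief gap between spot instances does not rewrite the route to an empty target.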
Installed Linux paths:
- /usr/local/bin/sync_comfy_route.py
- /etc/systemd/system/desineuron-comfy-route-sync.service
- /etc/systemd/system/desineuron-comfy-route-sync.timer
- /etc/desineuron-comfy-route-sync.env
- /opt/desineuron-comfy-route-sync/.venv
- /var/lib/desineuron-comfy-route-sync/current_target.txt
Current state:
- Timer: enabled and active
- Current synced target: 172.31.46.190
- Current target instance tag: DesineuronRole=comfyui
Operational Commands
Check AWS ingress status:
aws ec2 describe-instances --instance-ids i-094df09acafb72494 --region us-east-1
aws ec2 describe-addresses --allocation-ids eipalloc-0d54fc0f827450e7b --region us-east-1
Check ingress services:
aws ssm send-command --region us-east-1 --instance-ids i-094df09acafb72494 --document-name AWS-RunShellScript --parameters commands="sudo systemctl status caddy rathole-server --no-pager"
Check GPU ComfyUI service:
aws ssm send-command --region us-east-1 --instance-ids i-0e4eab5fe67cf9abe --document-name AWS-RunShellScript --parameters commands="sudo systemctl status comfyui --no-pager","ss -ltnp | grep 8188 || true","tail -n 40 /var/log/comfyui/service.log || true"
Check Linux origin services:
ssh -i "$env:USERPROFILE\.ssh\id_ed25519_desineuron_lan" desineuron-node-01@192.168.1.4 "echo '***' | sudo -S systemctl status rathole-client nginx"
ssh -i "$env:USERPROFILE\.ssh\id_ed25519_desineuron_lan" desineuron-node-01@192.168.1.4 "echo '***' | sudo -S systemctl status desineuron-ingress-home-ip-sync.service desineuron-ingress-home-ip-sync.timer"
ssh -i "$env:USERPROFILE\.ssh\id_ed25519_desineuron_lan" desineuron-node-01@192.168.1.4 "echo '***' | sudo -S journalctl -u desineuron-ingress-home-ip-sync -n 50 --no-pager"
ssh -i "$env:USERPROFILE\.ssh\id_ed25519_desineuron_lan" desineuron-node-01@192.168.1.4 "echo '***' | sudo -S systemctl status desineuron-ops-control-plane.service --no-pager"
ssh -i "$env:USERPROFILE\.ssh\id_ed25519_desineuron_lan" desineuron-node-01@192.168.1.4 "echo '***' | sudo -S docker compose -f /opt/desineuron-ops-control-plane/docker-compose.yml ps"
ssh -i "$env:USERPROFILE\.ssh\id_ed25519_desineuron_lan" desineuron-node-01@192.168.1.4 "echo '***' | sudo -S systemctl status desineuron-comfy-route-sync.service desineuron-comfy-route-sync.timer --no-pager"
Public endpoint validation:
curl.exe -I https://office.desineuron.in
curl.exe -I https://git.desineuron.in
curl.exe -I https://cloud.desineuron.in
curl.exe -I https://projects.desineuron.in
curl.exe -I https://talk.desineuron.in
curl.exe -I https://vpn.desineuron.in
curl.exe -I https://comfy.desineuron.in
curl.exe -I https://ops.desineuron.in/login
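The same validation can be scripted so that failures are collected in one pass. The sketch below is an illustration, not an existing repo artifact; the function names are hypothetical. Note one difference from `curl -I`: `urllib` follows redirects, so the status reported is that of the final response.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, Union

URLS = [
    "https://office.desineuron.in",
    "https://git.desineuron.in",
    "https://cloud.desineuron.in",
    "https://projects.desineuron.in",
    "https://talk.desineuron.in",
    "https://vpn.desineuron.in",
    "https://comfy.desineuron.in",
    "https://ops.desineuron.in/login",
]


def head_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the final HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # 4xx/5xx still carry a status worth reporting


def failing_endpoints(
    urls: Iterable[str],
    fetch: Callable[[str], int] = head_status,
    expected: int = 200,
) -> Dict[str, Union[int, str]]:
    """Return {url: status or error text} for every URL that is not healthy."""
    failures: Dict[str, Union[int, str]] = {}
    for url in urls:
        try:
            status = fetch(url)
        except OSError as exc:  # DNS, TLS, or connection failure
            failures[url] = repr(exc)
            continue
        if status != expected:
            failures[url] = status
    return failures
```

Calling `failing_endpoints(URLS)` returns an empty dict when every hostname answers 200. The `fetch` parameter exists so the logic can be exercised without network access.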
Future Service Mapping Runbook
Use this pattern for any future public service behind the stable ingress layer.
- Decide the backend location.
  - Linux origin behind rathole
  - AWS GPU/private EC2 node
  - another private backend later
- Decide whether the service should terminate TLS at ingress.
  - default: yes
  - Caddy on ingress should own the public hostname and certificate
- Create the DNS record in Cloudflare.
  - type: A
  - value: 98.87.120.120
  - proxy mode: DNS only
  - low TTL during rollout
- Add the ingress route in Caddyfile.
  Patterns:
  - Linux-origin service:
    - proxy to https://127.0.0.1:8443
    - preserve Host
  - private AWS backend service:
    - proxy to http://<private-ip>:<port> or https://<private-ip>:<port>
- Restrict backend network access.
  - never leave backend app ports open to 0.0.0.0/0 unless absolutely necessary
  - prefer security-group rule allowing traffic only from ingress security group
  - for home-origin services, keep them private behind rathole
- Reload ingress.
  ssh -i "F:\Workin In Progress\DESINEURON\GITLAB\Project_Velocity\desineuron-l4-node.pem" ec2-user@98.87.120.120 "sudo caddy validate --config /etc/caddy/Caddyfile && sudo systemctl reload caddy"
- Validate TLS and app response.
  - check certificate subject matches hostname
  - check curl -I https://<host>
  - check login page or health endpoint
  - check browser behavior
- If the backend is stateful, create a persistent service.
  - prefer systemd
  - enable restart on failure
  - log to a stable path
  - record service name, working directory, ports, and restart policy in this handoff doc
- Update team docs immediately.
  - hostname
  - DNS record type
  - ingress route target
  - backend service owner
  - service name
  - health check command
  - rollback step
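The "certificate subject matches hostname" check in the runbook can be made explicit with a short helper. This is a sketch under stated assumptions: the function names are hypothetical, the certificate is the dict returned by Python's `ssl.SSLSocket.getpeercert()`, and only DNS subject-alternative-name entries with an optional single leading wildcard label are considered.

```python
import socket
import ssl


def cert_covers_hostname(cert: dict, hostname: str) -> bool:
    """Return True if a DNS subjectAltName entry in `cert` covers `hostname`."""
    host = hostname.lower().rstrip(".")
    for kind, value in cert.get("subjectAltName", ()):
        if kind != "DNS":
            continue
        pattern = value.lower()
        if pattern == host:
            return True
        # A leading "*." stands for exactly one left-most label.
        if pattern.startswith("*.") and "." in host:
            if host.split(".", 1)[1] == pattern[2:]:
                return True
    return False


def fetch_cert(hostname: str, port: int = 443, timeout: float = 10.0) -> dict:
    """Connect with normal verification and return the peer certificate dict."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()
```

Because `ssl.create_default_context()` already verifies the hostname, `fetch_cert` raises on a mismatch. `cert_covers_hostname` is mainly useful for confirming ahead of time that a copied certificate will cover a hostname that is about to be added.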
Security Notes
- Public traffic terminates only at the AWS edge.
- The Linux box no longer needs Cloudflare Tunnel for these six routes.
- The Linux origin is reached through an outbound tunnel, not by directly exposing the home machine to the public for app traffic.
- SSH on the Linux box remains key-only.
- The AWS ingress IAM role is limited to SSM core.
- ComfyUI is no longer directly exposed on the GPU public IP; only the ingress layer can reach 8188.
- Ingress 22/tcp stays restricted and is now auto-synced from the Linux origin.
- Ingress 2333/tcp is intentionally open so rathole survives Airtel IP changes without operator action.
Remaining Improvement Ideas
- Move the Linux nginx certificate issuance/renewal model to the AWS edge permanently instead of copying an existing certificate.
- Clean up nginx warnings about duplicated protocol options.
- Separate talk.desineuron.in more fully from general Nextcloud if a distinct Talk-only UX is desired.
- Add authentication in front of comfy.desineuron.in; internet scanners started hitting the route immediately after it went live.
- Consider putting Basic Auth or an allowlist in front of comfy.desineuron.in before broader team rollout.
- Add monitoring and alerting on:
  - caddy
  - rathole-server
  - rathole-client
  - public HTTPS checks
- Add infrastructure-as-code for the EC2 ingress node if this should be reproducible by the team without manual AWS CLI steps.
Rollback
If rollback is needed:
- Recreate Cloudflare CNAME/tunnel routes or repoint the DNS records away from 98.87.120.120.
- Stop caddy and rathole-server on AWS.
- Stop rathole-client on Linux.
- Restore Nextcloud files from:
  - .env.pre_ingress_backup_2026-04-08
  - reverse-proxy.config.php.pre_ingress_backup_2026-04-08
- Restart nextcloud_app and nginx.
Team Summary
This migration is complete.
Cloudflare Tunnel is no longer the production path for the six public service hostnames. The stable production ingress is now the AWS t4g.micro node with Elastic IP 98.87.120.120, and the Linux machine remains the private origin behind rathole.
Additional mapped route:
- comfy.desineuron.in now terminates on the same stable ingress and forwards to the GPU node's private address 172.31.46.190:8188.
- No further DNS change is needed for ComfyUI.
- The backend is supervised by systemd and currently healthy.
- The route is now auto-synced from Linux based on the tagged AWS ComfyUI worker, so future IP changes do not require manual ingress edits.
- The team can use:
  - https://comfy.desineuron.in/prompt
  - https://comfy.desineuron.in/history/{prompt_id}
  - https://comfy.desineuron.in/queue
  - https://comfy.desineuron.in/upload/image
Current Status Snapshot - 2026-04-12
Live public service state:
- office.desineuron.in -> 200
- git.desineuron.in -> 200
- cloud.desineuron.in -> 200
- projects.desineuron.in -> 200
- talk.desineuron.in -> 200
- vpn.desineuron.in -> 200
- ops.desineuron.in/login -> 200
- comfy.desineuron.in -> 200
Linux-origin health:
- nginx.service -> active
- rathole-client.service -> active
- desineuron-ingress-home-ip-sync.timer -> active
- desineuron-ops-control-plane.service -> active
Linux ops stack containers:
- desineuron-ops-api -> Up
- desineuron-ops-db -> Up (healthy)
- desineuron-ops-worker -> Up
Ingress health:
- caddy -> active
- rathole-server -> active
- comfy.desineuron.in Caddy route is present in /etc/caddy/Caddyfile
GPU ComfyUI state:
- comfyui.service -> active
- main.py present under /opt/dlami/nvme/ComfyUI
- listener present on 0.0.0.0:8188
- public ingress path is healthy
Comfy auto-heal state:
- desineuron-comfy-route-sync.timer -> active
- synced target file -> /var/lib/desineuron-comfy-route-sync/current_target.txt
- current synced target -> 172.31.46.190
Linux Ops Control Plane
The Linux box now also hosts the private AWS control surface for the team.
Public operator URL:
https://ops.desineuron.in/login
Purpose:
- launch/stop/terminate AWS machines
- view spot/on-demand market data
- track runtime and estimated cost
- ingest model directories from the Linux box into S3
- hydrate models from S3 to AWS GPU nodes
- manage ingress routes through the t4g.micro
- export session/cost CSVs
Linux runtime paths:
- stack root: /opt/desineuron-ops-control-plane
- env file: /opt/desineuron-ops-control-plane/.env
- exports: /opt/desineuron-ops-control-plane/exports
- state: /opt/desineuron-ops-control-plane/state
Canonical S3 bucket:
desineuron-ops-control-plane-819079556187-us-east-1
Model library source on Linux:
/mnt/ServerStorage/ai-models/models
Current operator accounts:
- sagnik@desineuron.in
- sayan@desineuron.in
- sourik@desineuron.in
Reference docs: