Has anyone managed to run ComfyUI + GGUF in ZeroGPU Pro hosting Space?
A ZeroGPU Space is more like a "CPU Space with occasional GPU availability" than a dedicated GPU Space, which makes it difficult to run ComfyUI itself. On top of that, ZeroGPU can only be used in Gradio Spaces, which makes it even harder. (ComfyUI's GUI is not built on Gradio.)
It might be usable if the workflow is converted to Python first…
You generally cannot run the actual ComfyUI web app (the long-running UI server) on ZeroGPU, even with Pro. The reason is structural: ZeroGPU is Gradio-only and “GPU-per-function-call”, while ComfyUI is a persistent server process that expects a GPU to exist continuously.
What ZeroGPU actually is
ZeroGPU is “serverless GPU” for Spaces:
- The Space only gets a GPU while a decorated function runs.
- After the function finishes, the GPU is released.
- You opt into this by decorating GPU work with `@spaces.GPU(...)` (see the minimal sketch after this list).
- There is an explicit duration budget per call (default 60s, configurable). (Hugging Face)
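For orientation, here is a minimal sketch of that call pattern in a Gradio Space; the function body is a placeholder, not a real workflow:

```python
import spaces  # preinstalled on ZeroGPU Spaces
import gradio as gr

@spaces.GPU(duration=120)  # a GPU exists only while this function runs
def generate(prompt: str) -> str:
    # Do the CUDA work here; when the function returns, the GPU is released.
    return f"ran on GPU for: {prompt}"  # placeholder body

gr.Interface(fn=generate, inputs="text", outputs="text").launch()
```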
ZeroGPU also has a hard compatibility constraint:
- “ZeroGPU Spaces are exclusively compatible with the Gradio SDK.” (Hugging Face)
- The ZeroGPU Explorers page repeats the same: “only works with the Gradio SDK.” (Hugging Face)
Pro affects quota and queue priority, not the runtime model:
- Pro users get “x7 more daily usage quota and highest priority in GPU queues.” (Hugging Face)
Why ComfyUI conflicts with ZeroGPU
When you "run ComfyUI," you are running a web server that stays up, maintains a job queue, holds models in VRAM, and runs inference over time. That deployment style usually maps to a Docker Space or a VM.
But:
- ZeroGPU is not available for Docker SDK Spaces. Hugging Face staff state this directly in the ZeroGPU Explorers discussions: “unfortunately ZeroGPU is not available for Docker SDK Spaces.” (Hugging Face)
- The official ComfyUI Space example on Hugging Face is a Docker-based setup (CUDA base image, apt installs, etc.), which is exactly the type of Space ZeroGPU does not support. (Hugging Face)
So the common failure pattern is:
- You try to host ComfyUI in Docker → ZeroGPU cannot be selected or does nothing → ComfyUI starts without a usable CUDA GPU → “No CUDA GPUs are available” or similar.
Where GGUF fits (and why it adds friction)
“ComfyUI + GGUF” usually means using ComfyUI-GGUF, a custom node pack that loads GGUF-quantized diffusion/DiT models (Flux, SD 3.5, etc.).
Key points from the project itself:
- It is “very much WIP.”
- It relies on custom ops support in ComfyUI and wants a “recent-enough” ComfyUI.
- Installation is `pip install --upgrade gguf`.
- You place `.gguf` models under `ComfyUI/models/unet` and swap in the "Unet Loader (GGUF)" node. (GitHub)
That is very doable in a Docker + dedicated GPU Space. It is much less reliable in a serverless, Gradio-only environment.
What does work on Hugging Face today
Option A: Full ComfyUI UI + GGUF (recommended if you need the real UI)
Use a Docker Space with a dedicated GPU upgrade (L4, A10G, A100, etc.), not ZeroGPU.
- Docker Spaces are explicitly meant for custom servers and non-Gradio apps. (Hugging Face)
- Hugging Face provides a ComfyUI Docker Space example you can fork and adapt. (Hugging Face)
- Then add ComfyUI-GGUF and models exactly as the repo describes. (GitHub)
Option B: “ComfyUI workflow” on ZeroGPU (not the ComfyUI UI)
If your goal is “run this workflow and expose a simple interface,” Hugging Face’s official approach is:
- Export the ComfyUI workflow to Python.
- Wrap it in a Gradio app.
- Run inference inside `@spaces.GPU(...)`.
This is documented step-by-step in Hugging Face’s blog post on running ComfyUI workflows on ZeroGPU. (Hugging Face)
Important nuance:
- This runs a workflow, not the ComfyUI web UI.
- You also typically move model initialization to global scope to avoid reloading the models on every call (see the sketch below). (Hugging Face)
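A minimal sketch of that shape, assuming you have already exported the workflow to a Python module; `exported_workflow`, `load_models`, and `run_workflow` are placeholder names for whatever your export produces, not APIs from the blog post:

```python
import spaces  # ZeroGPU helper; the decorator is what attaches the GPU
import gradio as gr
import torch

# Placeholder import: the module your ComfyUI workflow was exported to.
from exported_workflow import load_models, run_workflow

# Heavy model loading happens once, at global scope, so it is not repeated
# on every request. On ZeroGPU there is no GPU attached at this point.
models = load_models()

@spaces.GPU(duration=120)  # the GPU exists only for the duration of this call
def generate(prompt: str, steps: int = 20):
    with torch.inference_mode():
        # The exported workflow function performs the actual sampling/decoding.
        return run_workflow(models, prompt=prompt, steps=steps)

demo = gr.Interface(
    fn=generate,
    inputs=[gr.Textbox(label="Prompt"), gr.Slider(1, 50, value=20, step=1, label="Steps")],
    outputs=gr.Image(label="Result"),
)
demo.launch()
```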
GGUF in this setup is “maybe, but expect work”:
- In principle you can include custom nodes by folding their requirements into the Space and importing them once. (Hugging Face)
- In practice, ComfyUI-GGUF’s “custom ops / recent ComfyUI” requirements can collide with ZeroGPU’s constrained runtime and the need to keep execution inside the decorated call. (GitHub)
Quick checklist to confirm what’s going wrong
- If your Space uses Docker (`sdk: docker`): ZeroGPU is not supported there. That alone explains it. (Hugging Face)
- If your Space is Gradio + ZeroGPU:
  - Any GPU work must be inside `@spaces.GPU`.
  - If you see "No CUDA GPUs are available," you are likely calling CUDA code outside the decorated function or starting a persistent server that never "enters" the GPU context. (Hugging Face)
- If your inference exceeds the call budget: raise `duration=...` on the decorator, but remember this is still "per call" serverless execution (see the short sketch below). (Hugging Face)
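A hedged fragment illustrating those last two checks; `run_workflow` is a placeholder name for your own function, and the 180-second value is only an example (the allowed maximum depends on your account and quota):

```python
import spaces
import torch

@spaces.GPU(duration=180)  # raise the per-call budget above the 60s default
def run_workflow(prompt: str):
    # A GPU is attached only while this function executes. CUDA work started
    # from a separate long-lived server process never "enters" this context,
    # which is the usual source of "No CUDA GPUs are available".
    device = torch.device("cuda")
    ...  # the actual sampling / decoding happens here, on `device`
```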
Links (in one place)
ZeroGPU docs (compatibility + how GPU is allocated): https://huggingface.co/docs/hub/en/spaces-zerogpu
ZeroGPU Explorers (usage + duration): https://huggingface.co/zero-gpu-explorers
ZeroGPU not available for Docker (HF staff reply): https://huggingface.co/spaces/zero-gpu-explorers/README/discussions/42
HF blog: run ComfyUI workflows on ZeroGPU via Gradio: https://huggingface.co/blog/run-comfyui-workflows-on-spaces
ComfyUI Docker Space example: https://huggingface.co/spaces/SpacesExamples/ComfyUI/blob/main/Dockerfile
ComfyUI-GGUF repo: https://github.com/city96/ComfyUI-GGUF
Summary
- ZeroGPU is Gradio-only and GPU-per-function-call. (Hugging Face)
- Full ComfyUI UI is a persistent server, usually deployed as Docker. ZeroGPU does not support Docker. (Hugging Face)
- Best path for ComfyUI + GGUF is Docker Space + paid GPU. (Hugging Face)
- Best path for ZeroGPU is workflow-to-Gradio, not running the ComfyUI UI. (Hugging Face)