ComfyUI workflows are node graphs serialised to JSON. comfy-pilot is a copilot
that reads those graphs, explains them, and edits them from a chat box. It works
with Ollama, Claude, and friends. The interesting part was everything the model
got confidently wrong.
Graphs are not prose
The first, naive version dumped the whole workflow JSON into the prompt and asked
for edits. The model would hallucinate node IDs, invent input slots, and wire
edges that pointed nowhere. The fix was to never let the LLM emit raw graph JSON.
Instead, it calls a small set of typed operations (add_node, connect,
set_param) that I validate against the actual node schema before applying.
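To make that concrete, here is a minimal sketch of what typed operations plus a validator could look like. It is not comfy-pilot's actual code: the AddNode, Connect, and SetParam classes, the validate function, the graph layout, and the two-node toy schema are all assumptions made up for illustration, and the schema is far simpler than ComfyUI's real node definitions.

```python
from dataclasses import dataclass
from typing import Union

# Toy schema (an assumption for illustration, not ComfyUI's real node definitions):
# node type -> allowed parameters with their types, and named input slots.
SCHEMA = {
    "LoadImage": {"params": {"path": str}, "inputs": set()},
    "Blur": {"params": {"radius": int}, "inputs": {"image"}},
}


@dataclass(frozen=True)
class AddNode:
    node_id: str
    node_type: str


@dataclass(frozen=True)
class Connect:
    src: str
    dst: str
    slot: str


@dataclass(frozen=True)
class SetParam:
    node_id: str
    name: str
    value: object


Op = Union[AddNode, Connect, SetParam]


def validate(graph: dict, op: Op) -> list[str]:
    """Return a list of problems; an empty list means the op is safe to apply."""
    errors = []
    if isinstance(op, AddNode):
        if op.node_type not in SCHEMA:
            errors.append(f"unknown node type: {op.node_type}")
        if op.node_id in graph:
            errors.append(f"node id already in use: {op.node_id}")
    elif isinstance(op, Connect):
        # Both endpoints must exist before the slot can be checked.
        for node_id in (op.src, op.dst):
            if node_id not in graph:
                errors.append(f"no such node: {node_id}")
        if op.dst in graph:
            slots = SCHEMA[graph[op.dst]["type"]]["inputs"]
            if op.slot not in slots:
                errors.append(f"no input slot {op.slot!r} on node {op.dst}")
    elif isinstance(op, SetParam):
        if op.node_id not in graph:
            errors.append(f"no such node: {op.node_id}")
        else:
            params = SCHEMA[graph[op.node_id]["type"]]["params"]
            if op.name not in params:
                errors.append(f"no parameter {op.name!r} on node {op.node_id}")
            elif not isinstance(op.value, params[op.name]):
                errors.append(f"wrong type for {op.name!r} on node {op.node_id}")
    return errors


# Hypothetical graph layout: node id -> node type plus current parameters.
graph = {
    "1": {"type": "LoadImage", "params": {"path": "cat.png"}},
    "2": {"type": "Blur", "params": {"radius": 3}},
}
```

With this shape, a hallucinated node ID or input slot comes back as a readable error string that can be fed to the model for a retry, instead of ending up as a dangling edge in the saved workflow.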
Make wrong answers cheap
Every operation is reversible and dry-run-able. The model proposes, the validator disposes, and the user sees a diff before anything touches their canvas. Treating the LLM as a fallible intern who needs a code reviewer, rather than as an oracle, is the whole design.
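One way to get both properties, sketched here for a single operation: applying an edit returns the arguments of its own inverse, and a dry run applies the edit to a deep copy and diffs the two serialised graphs. The names apply_set_param and dry_run, the _MISSING sentinel, and the graph layout are hypothetical, chosen for this example rather than taken from comfy-pilot.

```python
import copy
import difflib
import json

_MISSING = object()  # sentinel: the parameter did not exist before the edit


def apply_set_param(graph, node_id, name, value):
    """Mutate graph in place and return the arguments of the inverse operation."""
    params = graph[node_id]["params"]
    previous = params.get(name, _MISSING)
    if value is _MISSING:
        # Undoing an edit that created the parameter means removing it again.
        params.pop(name, None)
    else:
        params[name] = value
    return (node_id, name, previous)


def dry_run(graph, node_id, name, value):
    """Apply the edit to a copy and return a unified diff; graph is left untouched."""
    preview = copy.deepcopy(graph)
    apply_set_param(preview, node_id, name, value)
    before = json.dumps(graph, indent=2, sort_keys=True).splitlines()
    after = json.dumps(preview, indent=2, sort_keys=True).splitlines()
    diff = difflib.unified_diff(
        before, after, fromfile="current", tofile="proposed", lineterm=""
    )
    return "\n".join(diff)


# Hypothetical graph layout: node id -> node type plus current parameters.
graph = {"2": {"type": "Blur", "params": {"radius": 3}}}

print(dry_run(graph, "2", "radius", 8))         # show the user the proposed change
undo = apply_set_param(graph, "2", "radius", 8)  # user accepted: apply for real
apply_set_param(graph, *undo)                    # user changed their mind: revert
```

Returning the inverse from each apply call keeps an undo stack trivial to build: push the returned tuple, and pop and re-apply it to step back.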
What surprised me
The biggest quality jump didn't come from a bigger model. It came from giving the smaller one better tools and a tighter schema. Constraint beats capability more often than the demos admit.