# ZeroCLIP: Pre-Built Model Files
This repository hosts pre-built model artifacts for ZeroCLIP, a text-free deterministic conditioning system for Stable Diffusion. Instead of writing text prompts, ZeroCLIP maps integer seeds to CLIP-compatible conditioning vectors, producing deterministic, reproducible imagery with no language model involved.
These artifacts are the result of offline build processes (anchor extraction, MLP training, PCA projection, and self-bootstrapped prior discovery) that would otherwise need to be run locally. By downloading them here, you skip the build step entirely and can start generating immediately.
## What's Included
Pre-built artifacts are provided for both SDXL and SD1.5 checkpoints, covering all four ZeroCLIP conditioning variants:
### ZeroCLIP-A: Basis Decomposition Anchors

- Files: `.npy` anchor libraries
- What they are: Libraries of CLIP text embeddings extracted from a curated vocabulary. At runtime, a seed selects a weighted combination of these anchors via coherent noise and softmax, producing a conditioning vector that is a smooth blend of known concepts.
- Why pre-built: Extraction requires running a CLIP text encoder over the full vocabulary (~2 min on CPU with `transformers` installed). These files let you skip that dependency.
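The anchor-blending step can be sketched roughly as follows. This is a minimal illustration, not the shipped node code: the shapes are assumptions, and the real nodes use coherent noise rather than the plain Gaussian draw used here.

```python
import numpy as np

def blend_anchors(anchors: np.ndarray, seed: int, temperature: float = 1.0) -> np.ndarray:
    """Blend anchor embeddings with seed-derived softmax weights.

    anchors: (num_anchors, embed_dim) array of CLIP text embeddings,
    as loaded from an anchor library .npy file.
    """
    rng = np.random.default_rng(seed)           # deterministic: same seed, same weights
    logits = rng.standard_normal(anchors.shape[0]) / temperature
    weights = np.exp(logits - logits.max())     # numerically stable softmax
    weights /= weights.sum()
    return weights @ anchors                    # convex combination of known concepts

# Toy stand-in anchors; real anchors come from the downloaded .npy files.
anchors = np.random.default_rng(0).standard_normal((8, 768))
vec = blend_anchors(anchors, seed=42)
assert np.allclose(vec, blend_anchors(anchors, seed=42))  # reproducible from seed
```

Because the weights come from a seeded generator, the same seed always selects the same blend, which is what makes the output reproducible across machines.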
### ZeroCLIP-B: Latent Coordinate MLP

- Files: `.pt` model checkpoints (~50k parameters)
- What they are: Tiny neural networks trained to map 3D coordinates (derived from seed values) directly into CLIP embedding space. The MLP learns a continuous manifold over the embedding space, enabling microsecond-scale inference.
- Why pre-built: Training requires generating a dataset of CLIP embeddings and fitting the MLP (~10 min on CPU). These checkpoints are ready to load and run.
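A minimal sketch of what such a coordinate-to-embedding MLP might look like. The layer sizes and the seed-to-coordinate mapping below are assumptions chosen to land near the stated ~50k parameter budget, not the shipped architecture:

```python
import torch
import torch.nn as nn

class CoordToCLIP(nn.Module):
    """Tiny MLP mapping a 3D seed-derived coordinate to a CLIP-sized embedding."""
    def __init__(self, embed_dim: int = 768, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, embed_dim),       # ~54k parameters total at these sizes
        )

    def forward(self, coords: torch.Tensor) -> torch.Tensor:
        return self.net(coords)

def seed_to_coord(seed: int) -> torch.Tensor:
    """Deterministically derive a point in [-1, 1]^3 from an integer seed."""
    g = torch.Generator().manual_seed(seed)
    return torch.rand(1, 3, generator=g) * 2 - 1

model = CoordToCLIP()
emb = model(seed_to_coord(42))  # (1, 768) conditioning vector
```

A forward pass through a network this small is effectively free at inference time, which is the point of variant B: once trained, no encoder or anchor lookup is needed at all.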
### ZeroCLIP-C: PCA Projection (Guided Mode)

- Files: `.npz` projection matrices
- What they are: PCA-derived projection matrices that constrain entropy-sampled vectors to the principal components of real CLIP embedding space. This produces outputs that are statistically closer to natural-language embeddings while remaining fully deterministic from the seed.
- Why pre-built: Computing the projection requires CLIP encoder access (~2 min on CPU). Note: ZeroCLIP-C pure mode requires no artifacts at all; it works immediately with zero setup.
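Guided mode can be sketched as sampling coefficients in PCA coordinates and projecting back into embedding space. The array names, shapes, and per-component scales below are assumptions for illustration; the real matrices ship inside the `.npz` files:

```python
import numpy as np

def guided_sample(seed: int, components: np.ndarray, mean: np.ndarray,
                  scales: np.ndarray) -> np.ndarray:
    """Entropy sample constrained to the principal subspace of CLIP embeddings.

    components: (k, embed_dim) principal axes; mean: (embed_dim,) embedding mean;
    scales: (k,) per-component standard deviations.
    """
    rng = np.random.default_rng(seed)
    coeffs = rng.standard_normal(components.shape[0]) * scales  # entropy in PCA coords
    return mean + coeffs @ components                           # back to embedding space

# Toy orthonormal basis standing in for a real PCA projection.
k, d = 16, 768
comps = np.linalg.qr(np.random.default_rng(1).standard_normal((d, k)))[0].T
vec = guided_sample(7, comps, np.zeros(d), np.ones(k))
```

Restricting the sample to the top principal components is what keeps guided outputs near the statistics of genuine text embeddings, while pure mode simply skips the projection.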
### ZeroCLIP-D: Self-Bootstrapped Prior Anchors

- Files: `.npy` anchor libraries
- What they are: Anchor embeddings discovered by the diffusion model itself through an iterative bootstrapping process. Unlike variant A (which derives anchors from text), variant D finds conditioning vectors that the model responds to strongly, with no text encoder involved at any stage. This is a fully text-free pipeline from build to inference.
- Why pre-built: The bootstrap process requires running hundreds of diffusion inference steps to discover high-response anchors (~5 hours on GPU with `diffusers` and `accelerate` installed). These files save significant compute time.
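The discovery loop can be sketched as a greedy hill-climb over candidate conditioning vectors. The scoring function below is a stand-in lambda; in the real build it would involve running diffusion inference steps and measuring how strongly the model responds, which is exactly the expensive part the pre-built files let you skip:

```python
import numpy as np

def bootstrap_anchors(score, dim=768, n_anchors=4, iters=200, step=0.05, seed=0):
    """Greedy hill-climb sketch of self-bootstrapped anchor discovery.

    `score` stands in for the model-response measurement; the real
    criterion (and the search strategy) is an assumption here.
    """
    rng = np.random.default_rng(seed)
    anchors = []
    for _ in range(n_anchors):
        v = rng.standard_normal(dim)
        best = score(v)
        for _ in range(iters):
            cand = v + step * rng.standard_normal(dim)  # random perturbation
            s = score(cand)
            if s > best:                                # keep only improvements
                v, best = cand, s
        anchors.append(v)
    return np.stack(anchors)

# Toy stand-in score: prefer vectors aligned with a hidden target direction.
target = np.random.default_rng(1).standard_normal(768)
anchors = bootstrap_anchors(lambda v: v @ target / np.linalg.norm(v))
```

With a diffusion model in the loop, each `score` call costs inference steps, which is why the full bootstrap takes hours on GPU while loading the resulting `.npy` files takes milliseconds.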
## Installation

1. Download the model files for your target checkpoint type (SDXL, SD1.5, or both).
2. Place all downloaded files into your ComfyUI models directory: `ComfyUI/models/zeroclip/`
   This is the default path that the ZeroCLIP loader nodes scan. The directory is created automatically on first load of the custom nodes, but you can also create it manually.
3. Install the ZeroCLIP custom nodes from https://github.com/MushroomFleet/ComfyUI-ZeroCLIP-nodes by copying the node pack into your ComfyUI custom nodes directory: `ComfyUI/custom_nodes/ZeroClip-nodes/`
4. Restart ComfyUI. The loader nodes (e.g., ZeroClip-A Load Anchors, ZeroClip-B Load MLP) will present dropdown file pickers populated from `models/zeroclip/`.
## Example Workflows

The ComfyUI-ZeroCLIP-nodes repository includes ready-to-use workflow JSON files in the `/workflows/` folder. Import them into ComfyUI by dragging the JSON file onto the canvas.
### SD1.5 Workflows

| Workflow | Description |
|---|---|
| `zeroclip_a_basic.json` | SeedPack -> A-Load Anchors -> A-Conditioning -> KSampler |
| `zeroclip_b_basic.json` | SeedPack -> B-Load MLP -> B-Conditioning -> KSampler |
| `zeroclip_c_pure.json` | C-Conditioning (pure, no model files needed) -> KSampler |
| `zeroclip_c_guided.json` | C-Load Projection -> C-Conditioning (guided) -> KSampler |
| `zeroclip_d_basic.json` | SeedPack -> D-Load Anchors -> D-Conditioning -> KSampler |
| `zeroclip_blend_example.json` | Two seeds -> Two conditionings -> Blend -> KSampler |
### SDXL Workflows

| Workflow | Description |
|---|---|
| `SDXL_zeroclip_a_basic.json` | SeedPack -> A-Load Anchors (seq+pooled) -> A-Conditioning SDXL -> KSampler |
| `SDXL_zeroclip_b_basic.json` | SeedPack -> B-Load MLP (seq+pooled) -> B-Conditioning SDXL -> KSampler |
| `SDXL_zeroclip_c_pure.json` | C-Conditioning SDXL (pure, no model files needed) -> KSampler |
| `SDXL_zeroclip_c_guided.json` | C-Load Projection -> C-Conditioning SDXL (guided) -> KSampler |
| `SDXL_zeroclip_d_basic.json` | SeedPack -> D-Load Anchors (seq+pooled) -> D-Conditioning SDXL -> KSampler |
| `SDXL_zeroclip_blend_example.json` | Two seeds -> Two conditionings -> Blend -> KSampler |
## How ZeroCLIP Works
Traditional Stable Diffusion workflows require a text prompt that gets encoded into a conditioning vector by CLIP. ZeroCLIP replaces this entire text pipeline with deterministic mathematical operations:
- You provide integer seeds (a `concept_id`, `style_id`, `mood_salt`, and `world_seed`) packed into a seed tuple.
- The seed maps to a conditioning vector through one of four strategies (A/B/C/D), each offering different trade-offs between setup cost, runtime speed, and visual character.
- The conditioning vector is standard CLIP-compatible output: it plugs directly into KSampler as positive conditioning, just like text-encoded conditioning would.
The result: the same seed always produces the same image, on any machine, in any session, with no text encoder required at inference time. Seeds can be swept, gridded, blended, and explored procedurally.
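A seed tuple like the one above might be folded into a single generator seed roughly as follows. The hashing scheme here is purely illustrative (the nodes' actual packing is not documented in this README); it only demonstrates the determinism property the text describes:

```python
import hashlib

def pack_seed(concept_id: int, style_id: int, mood_salt: int, world_seed: int) -> int:
    """Fold four seed fields into one 64-bit integer (illustrative scheme only)."""
    data = f"{concept_id}:{style_id}:{mood_salt}:{world_seed}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

assert pack_seed(1, 2, 3, 4) == pack_seed(1, 2, 3, 4)   # same tuple, same result
assert pack_seed(1, 2, 3, 4) != pack_seed(1, 2, 3, 5)   # any field change alters it
```

Whatever the actual packing, the key property is the one shown: the mapping from tuple to conditioning is a pure function, so sweeps and grids over any single field are reproducible.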
## Which Variant Should I Use?

| Variant | Artifacts Needed | Runtime Speed | Visual Character | Best For |
|---|---|---|---|---|
| A | Anchor library (`.npy`) | Fast | Smooth blends of known concepts | General-purpose text-free conditioning |
| B | MLP checkpoint (`.pt`) | Fastest (microseconds) | Continuous manifold navigation | Minimal runtime footprint |
| C pure | None | Fast | Unpredictable, pure prior resonance | Zero-setup exploration |
| C guided | Projection matrix (`.npz`) | Fast | Near-language, semantically plausible | Structured entropy |
| D | Bootstrap anchors (`.npy`) | Fast | Model-discovered semantics | Research into text-free priors |
## Citation

If you use this codebase in your research or project, please cite:
```bibtex
@software{zeroclip,
  title   = {ZeroCLIP: Text-Free Deterministic CLIP Conditioning for Stable Diffusion},
  author  = {Drift Johnson},
  year    = {2025},
  url     = {https://github.com/MushroomFleet/ComfyUI-ZeroCLIP-nodes},
  version = {1.0.0}
}
```
Base model: `stabilityai/stable-diffusion-xl-base-1.0`