ZeroCLIP β€” Pre-Built Model Files

This repository hosts pre-built bootstrapped model artifacts for ZeroCLIP, a text-free deterministic conditioning system for Stable Diffusion. Instead of writing text prompts, ZeroCLIP maps integer seeds to CLIP-compatible conditioning vectors, producing deterministic, reproducible imagery with no language model involved.

These artifacts are the result of offline build processes (anchor extraction, MLP training, PCA projection, and self-bootstrapped prior discovery) that would otherwise need to be run locally. By downloading them here, you skip the build step entirely and can start generating immediately.

What's Included

Pre-built artifacts are provided for both SDXL and SD1.5 checkpoints, covering all four ZeroCLIP conditioning variants:

ZeroCLIP-A: Basis Decomposition Anchors

  • Files: .npy anchor libraries
  • What they are: Libraries of CLIP text embeddings extracted from a curated vocabulary. At runtime, a seed selects a weighted combination of these anchors via coherent noise and softmax, producing a conditioning vector that is a smooth blend of known concepts.
  • Why pre-built: Extraction requires running a CLIP text encoder over the full vocabulary (~2 min on CPU with transformers installed). These files let you skip that dependency.
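
The anchor-blending idea can be sketched in a few lines. This is a minimal illustration, not the actual ZeroCLIP code: the function name `blend_anchors`, the `temperature` parameter, and the plain Gaussian draw (standing in for the coherent-noise source) are all assumptions.

```python
import numpy as np

def blend_anchors(anchors: np.ndarray, seed: int, temperature: float = 1.0) -> np.ndarray:
    """Blend anchor embeddings with softmax weights derived from a seed.

    `anchors` has shape (num_anchors, embed_dim). Illustrative only; a
    deterministic Gaussian draw stands in for ZeroCLIP's coherent noise.
    """
    rng = np.random.default_rng(seed)                 # deterministic per seed
    logits = rng.standard_normal(anchors.shape[0]) / temperature
    weights = np.exp(logits - logits.max())           # numerically stable softmax
    weights /= weights.sum()
    return weights @ anchors                          # weighted blend, shape (embed_dim,)

anchors = np.random.default_rng(0).standard_normal((128, 768))
cond = blend_anchors(anchors, seed=42)                # same seed -> same vector
```

Because the weights are a softmax over all anchors, the result is always a convex combination of known concepts, which is why variant A produces smooth blends rather than arbitrary points in embedding space.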

ZeroCLIP-B: Latent Coordinate MLP

  • Files: .pt model checkpoints (~50k parameters)
  • What they are: Tiny neural networks trained to map 3D coordinates (derived from seed values) directly to CLIP embedding space. The MLP learns a continuous manifold over the embedding space, enabling microsecond-speed inference.
  • Why pre-built: Training requires generating a dataset of CLIP embeddings and fitting the MLP (~10 min on CPU). These checkpoints are ready to load and run.
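
A forward pass of such a network is tiny. The sketch below is an assumption about the architecture (the real checkpoint may use different layer sizes and a different seed-to-coordinate mapping); with a hidden width of 64 and a 768-dimensional output it lands at roughly 50k parameters, matching the stated checkpoint size.

```python
import numpy as np

def seed_to_coords(seed: int) -> np.ndarray:
    # Hypothetical seed -> 3D coordinate mapping, not the actual ZeroCLIP scheme.
    rng = np.random.default_rng(seed)
    return rng.uniform(-1.0, 1.0, size=3)

class CoordMLP:
    """Forward pass of a tiny 3 -> hidden -> 768 MLP (illustrative sizes).

    3*64 + 64 + 64*768 + 768 = ~50k parameters.
    """
    def __init__(self, hidden: int = 64, out_dim: int = 768, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.standard_normal((3, hidden)) * 0.1
        self.b1 = np.zeros(hidden)
        self.w2 = rng.standard_normal((hidden, out_dim)) * 0.1
        self.b2 = np.zeros(out_dim)

    def __call__(self, coords: np.ndarray) -> np.ndarray:
        h = np.tanh(coords @ self.w1 + self.b1)       # single hidden layer
        return h @ self.w2 + self.b2                  # CLIP-sized output vector

mlp = CoordMLP()
cond = mlp(seed_to_coords(42))                        # deterministic per seed
```

A forward pass of this size is just two small matrix multiplies, which is why inference is effectively instantaneous compared with running a CLIP text encoder.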

ZeroCLIP-C: PCA Projection (Guided Mode)

  • Files: .npz projection matrices
  • What they are: PCA-derived projection matrices that constrain entropy-sampled vectors to the principal components of real CLIP embedding space. This produces outputs that are statistically closer to natural language embeddings while remaining fully deterministic from seed.
  • Why pre-built: Computing the projection requires CLIP encoder access (~2 min on CPU). Note: ZeroCLIP-C pure mode requires no artifacts at all β€” it works immediately with zero setup.

ZeroCLIP-D: Self-Bootstrapped Prior Anchors

  • Files: .npy anchor libraries
  • What they are: Anchor embeddings discovered by the diffusion model itself through an iterative bootstrapping process. Unlike variant A (which derives anchors from text), variant D finds conditioning vectors that the model responds to strongly, with no text encoder involved at any stage. This is a fully text-free pipeline from build to inference.
  • Why pre-built: The bootstrap process requires running hundreds of diffusion inference steps to discover high-response anchors (~5 hours on GPU with diffusers and accelerate). These files save significant compute time.
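
The shape of that bootstrap loop can be illustrated with a simple random search. Everything here is a stand-in: `model_response` replaces the expensive diffusion-based scoring the real build performs, and the search schedule is invented for illustration.

```python
import numpy as np

def model_response(cond: np.ndarray) -> float:
    # Stand-in for the diffusion model's response score; the real bootstrap
    # would run diffusion inference steps here. Fixed quadratic for demo.
    target = np.linspace(-1, 1, cond.size)
    return -float(np.sum((cond - target) ** 2))

def bootstrap_anchor(dim: int = 16, iters: int = 200, pop: int = 32, seed: int = 0) -> np.ndarray:
    """Random-search sketch of prior discovery: keep the candidate the
    'model' responds to most strongly, refining around it each round."""
    rng = np.random.default_rng(seed)
    best = rng.standard_normal(dim)
    best_score = model_response(best)
    for step in range(iters):
        scale = 1.0 / (1 + 0.1 * step)                 # shrink search radius
        cands = best + scale * rng.standard_normal((pop, dim))
        scores = [model_response(c) for c in cands]
        i = int(np.argmax(scores))
        if scores[i] > best_score:
            best, best_score = cands[i], scores[i]
    return best

anchor = bootstrap_anchor()
```

Each round proposes candidates around the current best and keeps whichever the model scores highest; with real diffusion scoring, each of those evaluations costs full inference steps, which is where the ~5 GPU-hours go.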

Installation

  1. Download the model files for your target checkpoint type (SDXL, SD1.5, or both).

  2. Place all downloaded files into your ComfyUI models directory:

ComfyUI/models/zeroclip/

This is the default path that ZeroCLIP loader nodes scan. The directory is created automatically on first load of the custom nodes, but you can also create it manually.

  3. Install the ZeroCLIP custom nodes from: https://github.com/MushroomFleet/ComfyUI-ZeroCLIP-nodes

Copy the node pack into your ComfyUI custom nodes directory:

ComfyUI/custom_nodes/ZeroClip-nodes/

  4. Restart ComfyUI. The loader nodes (e.g., ZeroClip-A Load Anchors, ZeroClip-B Load MLP) will present dropdown file pickers populated from models/zeroclip/.

Example Workflows

The ComfyUI-ZeroCLIP-nodes repository includes ready-to-use workflow JSON files in the /workflows/ folder. Import them into ComfyUI by dragging the JSON file onto the canvas.

SD1.5 Workflows

| Workflow | Description |
| --- | --- |
| zeroclip_a_basic.json | SeedPack -> A-Load Anchors -> A-Conditioning -> KSampler |
| zeroclip_b_basic.json | SeedPack -> B-Load MLP -> B-Conditioning -> KSampler |
| zeroclip_c_pure.json | C-Conditioning (pure, no model files needed) -> KSampler |
| zeroclip_c_guided.json | C-Load Projection -> C-Conditioning (guided) -> KSampler |
| zeroclip_d_basic.json | SeedPack -> D-Load Anchors -> D-Conditioning -> KSampler |
| zeroclip_blend_example.json | Two seeds -> Two conditionings -> Blend -> KSampler |

SDXL Workflows

| Workflow | Description |
| --- | --- |
| SDXL_zeroclip_a_basic.json | SeedPack -> A-Load Anchors (seq+pooled) -> A-Conditioning SDXL -> KSampler |
| SDXL_zeroclip_b_basic.json | SeedPack -> B-Load MLP (seq+pooled) -> B-Conditioning SDXL -> KSampler |
| SDXL_zeroclip_c_pure.json | C-Conditioning SDXL (pure, no model files needed) -> KSampler |
| SDXL_zeroclip_c_guided.json | C-Load Projection -> C-Conditioning SDXL (guided) -> KSampler |
| SDXL_zeroclip_d_basic.json | SeedPack -> D-Load Anchors (seq+pooled) -> D-Conditioning SDXL -> KSampler |
| SDXL_zeroclip_blend_example.json | Two seeds -> Two conditionings -> Blend -> KSampler |

How ZeroCLIP Works

Traditional Stable Diffusion workflows require a text prompt that gets encoded into a conditioning vector by CLIP. ZeroCLIP replaces this entire text pipeline with deterministic mathematical operations:

  1. You provide integer seeds (a concept_id, style_id, mood_salt, and world_seed) packed into a seed tuple.
  2. The seed maps to a conditioning vector through one of four strategies (A/B/C/D), each offering different trade-offs between setup cost, runtime speed, and visual character.
  3. The conditioning vector is standard CLIP-compatible output: it plugs directly into KSampler as positive conditioning, just like text-encoded conditioning would.

The result: the same seed always produces the same image, on any machine, in any session, with no text encoder required at inference time. Seeds can be swept, gridded, blended, and explored procedurally.
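
The determinism guarantee can be demonstrated in miniature. Both functions below are hypothetical: the real SeedPack node may combine the four integers differently, and a plain Gaussian draw stands in for the A/B/C/D strategies.

```python
import numpy as np

def pack_seeds(concept_id: int, style_id: int, mood_salt: int, world_seed: int) -> int:
    # Hypothetical packing: fold four integers into one 64-bit seed.
    h = 0
    for s in (concept_id, style_id, mood_salt, world_seed):
        h = (h * 1_000_003 + s) % (2**64)
    return h

def seed_to_conditioning(packed: int, dim: int = 768) -> np.ndarray:
    # Simplest possible strategy: a deterministic Gaussian draw keyed on the
    # packed seed, standing in for any of the A/B/C/D variants.
    return np.random.default_rng(packed).standard_normal(dim)

cond = seed_to_conditioning(pack_seeds(1, 2, 3, 4))
```

Because every step is a pure function of the seed tuple, the same four integers reproduce the same conditioning vector on any machine, and sweeping any one of them gives a controlled axis to explore.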

Which Variant Should I Use?

| Variant | Artifacts Needed | Runtime Speed | Visual Character | Best For |
| --- | --- | --- | --- | --- |
| A | Anchor library (.npy) | Fast | Smooth blends of known concepts | General-purpose text-free conditioning |
| B | MLP checkpoint (.pt) | Fastest (microseconds) | Continuous manifold navigation | Minimal runtime footprint |
| C pure | None | Fast | Unpredictable, pure prior resonance | Zero-setup exploration |
| C guided | Projection matrix (.npz) | Fast | Near-language, semantically plausible | Structured entropy |
| D | Bootstrap anchors (.npy) | Fast | Model-discovered semantics | Research into text-free priors |

📚 Citation

Academic Citation

If you use this codebase in your research or project, please cite:

@software{zeroclip,
  title = {ZeroCLIP: Text-Free Deterministic CLIP Conditioning for Stable Diffusion},
  author = {Drift Johnson},
  year = {2025},
  url = {https://github.com/MushroomFleet/ComfyUI-ZeroCLIP-nodes},
  version = {1.0.0}
}

Donate: Ko-Fi
