---
library_name: diffusers
license: mit
pipeline_tag: image-to-image
tags:
- computed-tomography
- ct-reconstruction
- diffusion-model
- inverse-problems
- dm4ct
- sparse-view-ct
---
# Pixel Diffusion UNet – LoDoInd (DM4CT)
This repository contains the pretrained **pixel-space diffusion UNet** used in the benchmark study **DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction (ICLR 2026)**.
- **Paper:** [DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction](https://huggingface.co/papers/2602.18589)
- **ArXiv:** [https://arxiv.org/abs/2602.18589](https://arxiv.org/abs/2602.18589)
- **Project Page:** [https://dm4ct.github.io/DM4CT/](https://dm4ct.github.io/DM4CT/)
- **Codebase:** [https://github.com/DM4CT/DM4CT](https://github.com/DM4CT/DM4CT)
---
## 🔬 Model Overview
This model learns a **prior over CT reconstruction images** using a denoising diffusion probabilistic model (DDPM). It operates directly in **pixel space** (not latent space).
- **Architecture**: 2D UNet (Diffusers `UNet2DModel`)
- **Input resolution**: 512 × 512
- **Channels**: 1 (grayscale CT slice)
- **Training objective**: ε-prediction (standard DDPM formulation)
- **Noise schedule**: Linear beta schedule
- **Training dataset**: Industrial CT dataset (LoDoInd)
- **Intensity normalization**: Rescaled to (-1, 1)
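
The ε-prediction objective and linear beta schedule listed above can be illustrated with a minimal sketch in plain Python. The schedule endpoints (`1e-4` to `0.02`) and the 1000 steps are assumptions taken from the Diffusers `DDPMScheduler` defaults, not values confirmed for this checkpoint, and the helper names are hypothetical. The sketch builds the schedule, noises a toy signal, and returns the noise that the network would be trained to predict.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (endpoints are assumed)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * v + math.sqrt(1.0 - a) * e for v, e in zip(x0, eps)]
    # Under epsilon-prediction, eps is the regression target for the UNet.
    return xt, eps


betas = linear_beta_schedule()
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
x0 = [-1.0, -0.5, 0.0, 0.5, 1.0]  # toy "pixels" already scaled to (-1, 1)
xt, eps = add_noise(x0, t=500, alpha_bars=alpha_bars, rng=rng)
```

During training, the loss would be the mean squared error between `eps` and the network output for `(xt, t)`. The network itself is omitted here.
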
This model is intended to be combined with data-consistency correction for CT reconstruction tasks.
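
As a rough illustration of what a data-consistency correction can look like, the sketch below takes gradient steps on a least-squares data-fidelity term `||A x - y||^2 / 2`. This is one generic form, not necessarily the correction used by any method in the benchmark. The matrix `A` is a hypothetical toy operator standing in for a CT forward projector, and all function names are illustrative.

```python
def matvec(A, x):
    """Apply the forward operator A to x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Apply the adjoint A^T to r."""
    n_rows, n_cols = len(A), len(A[0])
    return [sum(A[i][j] * r[i] for i in range(n_rows)) for j in range(n_cols)]


def data_consistency_step(x, A, y, step_size):
    """One gradient step on 0.5 * ||A x - y||^2: x <- x - step * A^T (A x - y)."""
    residual = [p - m for p, m in zip(matvec(A, x), y)]
    grad = rmatvec(A, residual)
    return [v - step_size * g for v, g in zip(x, grad)]


def residual_norm(x, A, y):
    """Euclidean norm of the measurement residual A x - y."""
    return sum((p - m) ** 2 for p, m in zip(matvec(A, x), y)) ** 0.5


# Toy setup: two "rays", each summing two of three pixels.
A = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
x_true = [0.2, -0.4, 0.6]
y = matvec(A, x_true)

x = [0.0, 0.0, 0.0]  # stand-in for a denoised diffusion estimate
before = residual_norm(x, A, y)
for _ in range(50):
    x = data_consistency_step(x, A, y, step_size=0.3)
after = residual_norm(x, A, y)
```

In a reconstruction loop, a step like this would typically be interleaved with the reverse diffusion updates so that samples stay close to both the learned prior and the measured projections.
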
---
## 📊 Dataset: LoDoInd
Source: [LoDoInd on Zenodo](https://zenodo.org/records/10391412)
Preprocessing steps:
- Train/test split
- Rescale reconstructed slices to (-1, 1)
- No geometry information is embedded in the model
The model learns an unconditional image prior over CT slices.
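
The rescaling step can be sketched as a simple min-max mapping. Whether the original preprocessing uses per-slice or dataset-wide bounds is not stated here, so the `lo` and `hi` arguments below are an assumption that covers both cases, and the function names are hypothetical. Note that this sketch maps the bounds exactly onto -1 and 1.

```python
def rescale_to_unit_range(values, lo=None, hi=None):
    """Map values linearly so that lo -> -1 and hi -> 1.

    If lo/hi are omitted, the minimum and maximum of `values` are used.
    """
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    if hi == lo:
        raise ValueError("cannot rescale a constant signal")
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def rescale_back(scaled, lo, hi):
    """Invert rescale_to_unit_range for known bounds lo and hi."""
    return [(s + 1.0) / 2.0 * (hi - lo) + lo for s in scaled]


slice_values = [0.0, 5.0, 10.0]  # toy intensities from one slice
scaled = rescale_to_unit_range(slice_values)
restored = rescale_back(scaled, lo=0.0, hi=10.0)
```

Keeping the bounds around makes it possible to map generated or reconstructed slices back to the original intensity scale.
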
---
## 🧠 Training Details
- **Optimizer:** AdamW
- **Learning rate:** 1e-4
- **Hardware:** NVIDIA A100 GPU
- **Training script:** [train_pixel.py](https://github.com/DM4CT/DM4CT/blob/main/train_pixel.py)
---
## 🚀 Usage
```python
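import torch
from diffusers import DDPMPipeline

# Load the pretrained pipeline from the Hugging Face Hub
pipeline = DDPMPipeline.from_pretrained("jiayangshi/lodoind_pixel_diffusion")
# Use the GPU when available; sampling on CPU works but is slow
pipeline.to("cuda" if torch.cuda.is_available() else "cpu")

# Draw one unconditional sample from the learned CT slice prior
image = pipeline().images[0]
image.save("generated_ct_slice.png")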
```
---
## Citation
```bibtex
@inproceedings{shi2026dmct,
title={{DM}4{CT}: Benchmarking Diffusion Models for Computed Tomography Reconstruction},
author={Shi, Jiayang and Pelt, Dani{\"{e}}l M and Batenburg, K Joost},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=YE5scJekg5}
}
```