---
library_name: diffusers
license: mit
pipeline_tag: image-to-image
tags:
- computed-tomography
- ct-reconstruction
- diffusion-model
- inverse-problems
- dm4ct
- sparse-view-ct
---

# Pixel Diffusion UNet – LoDoInd (DM4CT)

This repository contains the pretrained **pixel-space diffusion UNet** used in the benchmark study **DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction** (ICLR 2026).

- **Paper:** [DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction](https://huggingface.co/papers/2602.18589)
- **ArXiv:** [https://arxiv.org/abs/2602.18589](https://arxiv.org/abs/2602.18589)
- **Project Page:** [https://dm4ct.github.io/DM4CT/](https://dm4ct.github.io/DM4CT/)
- **Codebase:** [https://github.com/DM4CT/DM4CT](https://github.com/DM4CT/DM4CT)

---

## 🔬 Model Overview

This model learns a **prior over CT reconstruction images** using a denoising diffusion probabilistic model (DDPM). It operates directly in **pixel space** (not latent space).

- **Architecture**: 2D UNet (Diffusers `UNet2DModel`)
- **Input resolution**: 512 × 512
- **Channels**: 1 (grayscale CT slice)
- **Training objective**: ε-prediction (standard DDPM formulation)
- **Noise schedule**: Linear beta schedule
- **Training dataset**: Industrial CT dataset (LoDoInd)
- **Intensity normalization**: Rescaled to (-1, 1)
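
As a rough illustration of the training objective listed above, the following NumPy sketch builds a linear beta schedule, noises a clean slice, and evaluates the ε-prediction loss. The schedule endpoints (1e-4 to 0.02), the 1000 timesteps, and the 64 × 64 stand-in image are assumptions based on common DDPM defaults, not values stated in this card, and the function names are hypothetical.

```python
import numpy as np

# Assumed hyperparameters (common DDPM defaults, not taken from this model card).
NUM_TRAIN_TIMESTEPS = 1000
BETA_START, BETA_END = 1e-4, 0.02


def linear_beta_schedule(num_steps=NUM_TRAIN_TIMESTEPS,
                         beta_start=BETA_START, beta_end=BETA_END):
    """Return betas and the cumulative products alpha_bar_t of (1 - beta_t)."""
    betas = np.linspace(beta_start, beta_end, num_steps, dtype=np.float64)
    alpha_bars = np.cumprod(1.0 - betas)
    return betas, alpha_bars


def add_noise(x0, noise, t, alpha_bars):
    """Sample x_t from q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I)."""
    a = alpha_bars[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise


def epsilon_loss(predicted_noise, true_noise):
    """Mean squared error between predicted and true noise."""
    return float(np.mean((predicted_noise - true_noise) ** 2))


rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(1, 64, 64))  # stand-in for a normalized CT slice
noise = rng.standard_normal(x0.shape)
betas, alpha_bars = linear_beta_schedule()
x_t = add_noise(x0, noise, t=500, alpha_bars=alpha_bars)
```

During training, the network would receive `x_t` and the timestep and be asked to predict `noise`; `epsilon_loss` is the quantity that would be minimized.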

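The model provides an image prior only. For CT reconstruction tasks it is intended to be combined with a data-consistency correction that keeps the estimate in agreement with the measured projections.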
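
To make the idea of a data-consistency correction concrete, here is a minimal sketch of one generic form: a gradient step on the least-squares data term. It is not the specific scheme of any method in the DM4CT benchmark. The dense random matrix `A` is a toy stand-in for a CT projection operator, and the function name, sizes, and step size are all illustrative assumptions.

```python
import numpy as np


def data_consistency_step(x, forward_op, measurements, step_size):
    """One gradient step on 0.5 * ||A x - y||^2, i.e. x - step * A^T (A x - y)."""
    residual = forward_op @ x - measurements
    return x - step_size * (forward_op.T @ residual)


rng = np.random.default_rng(0)
num_pixels, num_measurements = 16, 8  # toy sizes; underdetermined, like sparse-view CT
A = rng.standard_normal((num_measurements, num_pixels))  # toy forward operator
x_true = rng.uniform(-1.0, 1.0, size=num_pixels)
y = A @ x_true  # noiseless toy measurements

x = np.zeros(num_pixels)  # stand-in for an estimate produced by the diffusion prior
step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / largest eigenvalue of A^T A
residual_before = np.linalg.norm(A @ x - y)
for _ in range(50):
    x = data_consistency_step(x, A, y, step)
residual_after = np.linalg.norm(A @ x - y)
```

In a diffusion-based reconstruction loop, a step of this kind would typically be interleaved with the reverse diffusion updates so that samples from the prior stay consistent with the measurements.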

---

## 📊 Dataset: LoDoInd

Source: [LoDoInd on Zenodo](https://zenodo.org/records/10391412)

Preprocessing steps:
- Split the data into training and test sets
- Rescale reconstructed slices to (-1, 1)

No acquisition geometry is embedded in the model; it learns an unconditional image prior over CT slices.
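
The rescaling step can be sketched as a linear min-max mapping. The card does not say whether the bounds are computed per slice or over the whole dataset, so the per-slice default below is an assumption, and both helper names are hypothetical.

```python
import numpy as np


def rescale_to_unit_range(slice_2d, lo=None, hi=None):
    """Linearly map intensities from [lo, hi] to [-1, 1].

    If lo/hi are omitted, the slice's own min/max are used (an assumption).
    """
    lo = float(slice_2d.min()) if lo is None else lo
    hi = float(slice_2d.max()) if hi is None else hi
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return 2.0 * (slice_2d - lo) / (hi - lo) - 1.0


def rescale_from_unit_range(normalized, lo, hi):
    """Invert rescale_to_unit_range, mapping [-1, 1] back to [lo, hi]."""
    return (normalized + 1.0) / 2.0 * (hi - lo) + lo


ct_slice = np.array([[0.0, 0.5], [1.0, 2.0]])  # tiny stand-in for a reconstructed slice
normalized = rescale_to_unit_range(ct_slice)
restored = rescale_from_unit_range(normalized, lo=0.0, hi=2.0)
```

The inverse mapping is useful for converting generated samples back to the original intensity scale, provided the same bounds are used in both directions.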

---

## 🧠 Training Details

- **Optimizer:** AdamW
- **Learning rate:** 1e-4
- **Hardware:** NVIDIA A100 GPU
- **Training script:** [train_pixel.py](https://github.com/DM4CT/DM4CT/blob/main/train_pixel.py)

---

## 🚀 Usage

```python
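import torch
from diffusers import DDPMPipeline

# Load the pipeline and move it to the GPU when one is available
pipeline = DDPMPipeline.from_pretrained("jiayangshi/lodoind_pixel_diffusion")
pipeline.to("cuda" if torch.cuda.is_available() else "cpu")

# Draw one unconditional sample from the learned CT slice prior
image = pipeline().images[0]
image.save("generated_ct_slice.png")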
```

---

## Citation

```bibtex
@inproceedings{shi2026dmct,
title={{DM}4{CT}: Benchmarking Diffusion Models for Computed Tomography Reconstruction},
author={Shi, Jiayang and Pelt, Dani{\"{e}}l M and Batenburg, K Joost},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=YE5scJekg5}
}
```