---
library_name: diffusers
license: mit
pipeline_tag: image-to-image
tags:
- computed-tomography
- ct-reconstruction
- diffusion-model
- inverse-problems
- dm4ct
- sparse-view-ct
---

# Pixel Diffusion UNet – LoDoInd (DM4CT)

This repository contains the pretrained **pixel-space diffusion UNet** used in the benchmark study **DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction** (ICLR 2026).

- **Paper:** [DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction](https://huggingface.co/papers/2602.18589)
- **arXiv:** [https://arxiv.org/abs/2602.18589](https://arxiv.org/abs/2602.18589)
- **Project Page:** [https://dm4ct.github.io/DM4CT/](https://dm4ct.github.io/DM4CT/)
- **Codebase:** [https://github.com/DM4CT/DM4CT](https://github.com/DM4CT/DM4CT)

---

## Model Overview

This model learns a **prior over reconstructed CT images** using a denoising diffusion probabilistic model (DDPM). It operates directly in **pixel space** rather than in a learned latent space.

- **Architecture**: 2D UNet (Diffusers `UNet2DModel`)
- **Input resolution**: 512 × 512
- **Channels**: 1 (grayscale CT slice)
- **Training objective**: ε-prediction (standard DDPM formulation)
- **Noise schedule**: Linear beta schedule
- **Training dataset**: Industrial CT dataset (LoDoInd)
- **Intensity normalization**: Rescaled to (-1, 1)

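
The linear beta schedule and the ε-prediction formulation listed above can be illustrated with a small NumPy sketch. The schedule endpoints and the number of timesteps are assumptions (they match the defaults of Diffusers' `DDPMScheduler`); the model card itself only states that the schedule is linear. The helper names `add_noise` and `predict_x0` are hypothetical and chosen for illustration.

```python
import numpy as np

# Assumed hyperparameters (Diffusers' DDPMScheduler defaults); the model card
# only states that the schedule is linear.
NUM_TRAIN_TIMESTEPS = 1000
BETA_START, BETA_END = 1e-4, 0.02

betas = np.linspace(BETA_START, BETA_END, NUM_TRAIN_TIMESTEPS)
alphas_cumprod = np.cumprod(1.0 - betas)  # cumulative product, often written abar_t


def add_noise(x0, eps, t):
    """Forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    abar = alphas_cumprod[t]
    return np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * eps


def predict_x0(xt, eps_pred, t):
    """Invert the forward process given a noise estimate eps_pred."""
    abar = alphas_cumprod[t]
    return (xt - np.sqrt(1.0 - abar) * eps_pred) / np.sqrt(abar)


rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(1, 512, 512))  # synthetic stand-in for a normalized slice
eps = rng.standard_normal(x0.shape)
t = 500

xt = add_noise(x0, eps, t)
# An epsilon-prediction network is trained to estimate eps from (xt, t).
# Plugging in the true noise recovers the clean slice up to rounding error.
x0_hat = predict_x0(xt, eps, t)
```
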
This model is intended to be combined with a data-consistency correction when used for CT reconstruction; on its own it only provides the image prior.

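
To make the idea of a data-consistency correction concrete, here is a minimal sketch of one generic option: a gradient step on the least-squares data term. It is not taken from the DM4CT codebase, and the methods in the benchmark may enforce consistency differently. The random matrix `A` is a hypothetical stand-in for a CT forward projector, and `data_consistency_step` is an illustrative name.

```python
import numpy as np


def data_consistency_step(x, A, y, step_size):
    """One gradient step on 0.5 * ||A x - y||^2."""
    residual = A @ x - y
    return x - step_size * (A.T @ residual)


rng = np.random.default_rng(0)
n_pixels, n_measurements = 64, 32  # toy sizes; fewer measurements than unknowns
A = rng.standard_normal((n_measurements, n_pixels))  # hypothetical stand-in for the projector
x_true = rng.uniform(-1.0, 1.0, size=n_pixels)
y = A @ x_true  # noiseless toy measurements

x = np.zeros(n_pixels)  # stand-in for the current estimate from the diffusion prior
step_size = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / (largest singular value squared)

residual_before = np.linalg.norm(A @ x - y)
for _ in range(50):
    x = data_consistency_step(x, A, y, step_size)
residual_after = np.linalg.norm(A @ x - y)
```
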
---

## Dataset: LoDoInd

Source: [LoDoInd on Zenodo](https://zenodo.org/records/10391412)

Preprocessing steps:
- Split the data into training and test sets
- Rescale reconstructed slices to (-1, 1)

The model learns an unconditional image prior over CT slices; no acquisition geometry information is embedded in the model.

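
The rescaling step can be sketched as a linear min-max mapping. This is an assumption: the model card gives only the target range, not how the bounds are chosen (per slice, per volume, or over the whole dataset). The function names `normalize_slice` and `denormalize_slice` are hypothetical.

```python
import numpy as np


def normalize_slice(slice_2d, vmin, vmax):
    """Linearly map intensities from [vmin, vmax] to [-1, 1]."""
    return 2.0 * (slice_2d - vmin) / (vmax - vmin) - 1.0


def denormalize_slice(normalized, vmin, vmax):
    """Inverse map from [-1, 1] back to the original intensity range."""
    return (normalized + 1.0) / 2.0 * (vmax - vmin) + vmin


rng = np.random.default_rng(0)
ct_slice = rng.uniform(0.0, 3.5, size=(512, 512))  # synthetic stand-in for a reconstructed slice

# Assumption: bounds are taken from the slice itself.
vmin, vmax = float(ct_slice.min()), float(ct_slice.max())

normalized = normalize_slice(ct_slice, vmin, vmax)
restored = denormalize_slice(normalized, vmin, vmax)
```
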
---

## Training Details

- **Optimizer:** AdamW
- **Learning rate:** 1e-4
- **Hardware:** NVIDIA A100 GPU
- **Training script:** [train_pixel.py](https://github.com/DM4CT/DM4CT/blob/main/train_pixel.py)

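
As a rough illustration of how the optimizer settings above fit into an ε-prediction update, the sketch below runs a few training steps in plain PyTorch. It is not the linked `train_pixel.py`: the tiny convolutional network is a stand-in for the `UNet2DModel` (which also receives the timestep), and the schedule endpoints, batch size, and image size are assumed values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Tiny stand-in network; the actual model is a Diffusers UNet2DModel that also
# takes the timestep as input.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.SiLU(),
    nn.Conv2d(8, 1, kernel_size=3, padding=1),
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # settings from the list above

# Linear beta schedule (endpoints and step count are assumed values).
betas = torch.linspace(1e-4, 0.02, 1000)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)


def training_step(x0):
    """One epsilon-prediction update on a batch of slices scaled to (-1, 1)."""
    t = torch.randint(0, len(betas), (x0.shape[0],))
    eps = torch.randn_like(x0)
    abar = alphas_cumprod[t].view(-1, 1, 1, 1)
    xt = abar.sqrt() * x0 + (1.0 - abar).sqrt() * eps  # forward noising
    loss = F.mse_loss(model(xt), eps)  # the network should predict the added noise
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


batch = torch.rand(4, 1, 64, 64) * 2.0 - 1.0  # synthetic batch, smaller than 512 x 512
losses = [training_step(batch) for _ in range(3)]
```
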
---

## Usage

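```python
import torch
from diffusers import DDPMPipeline

# Load the pretrained pipeline from the Hugging Face Hub
pipeline = DDPMPipeline.from_pretrained("jiayangshi/lodoind_pixel_diffusion")

# Use the GPU when available; sampling on the CPU works but is slow
device = "cuda" if torch.cuda.is_available() else "cpu"
pipeline.to(device)

# Draw one unconditional sample from the learned CT image prior
image = pipeline(batch_size=1).images[0]
image.save("generated_ct_slice.png")
```
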
---

## Citation

```bibtex
@inproceedings{shi2026dmct,
  title={{DM}4{CT}: Benchmarking Diffusion Models for Computed Tomography Reconstruction},
  author={Shi, Jiayang and Pelt, Dani{\"{e}}l M and Batenburg, K Joost},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=YE5scJekg5}
}
```