# Diffusion4Sim (CaloDiT)

## Description
With this repository you can train and evaluate diffusion models for fast simulation of calorimeter showers. Currently the following methods are supported:
- DDPM [1][2] and DDIM [3]
- EDM [4]
- DPM-Solver [5] and DPM-Solver++ [6]
- Consistency Distillation [7]
- Easy Consistency Tuning [8]
It also features:
- DiT [9]-like transformer architecture
- multi-GPU training with Accelerate
- training progress tracking with Weights & Biases
- conversion to C++-inference-compatible formats (ONNX & TorchScript)
- training, distillation, validation and generation pipelines
- ...and more
To train the models, you can use one of the datasets provided in the CaloChallenge. Keep in mind that some changes in the `src/data/geometry.py` file might be needed to load the data properly.
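For orientation, the public CaloChallenge files are HDF5 with `incident_energies` and `showers` datasets. A minimal sketch of reading that layout (the shapes and file name below are made up for illustration; the repo's actual loading logic lives in `src/data/geometry.py`):

```python
import h5py
import numpy as np

# Create a small dummy file in the CaloChallenge-style HDF5 layout
# (keys assumed from the public datasets: "incident_energies", "showers")
with h5py.File("dummy_dataset.h5", "w") as f:
    f.create_dataset("incident_energies",
                     data=np.random.uniform(1.0, 100.0, (8, 1)).astype(np.float32))
    f.create_dataset("showers",
                     data=np.random.rand(8, 368).astype(np.float32))

# Load it back the way a geometry-aware loader might
with h5py.File("dummy_dataset.h5", "r") as f:
    energies = f["incident_energies"][:]  # (n_showers, 1)
    showers = f["showers"][:]             # (n_showers, n_voxels), flattened

print(energies.shape, showers.shape)
```

A custom geometry mainly changes how the flattened voxel axis is reshaped back into layers and cells, which is what the `geometry.py` edits are about.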
Note: the model available in Par04 is based on EDM and was distilled using consistency distillation.
## Installation

```shell
# clone repository
git clone https://gitlab.cern.ch/fastsim/diffusion4sim.git
cd diffusion4sim

# create environment and install dependencies
conda create -n diffusion4sim python=3.9
conda activate diffusion4sim
pip install -r requirements.txt

# copy the example .env file and fill in the paths and W&B credentials
cp .env-example .env && vim .env
```
## Usage

### Training

Modify the appropriate configuration file in the `configs/train` directory to specify all hyperparameters (refer to `configs/train/edm_allegro_scratch.yaml` as an example). Then, run the training script with the path to the config file as the first argument. You can also specify command line arguments to override the config values. For example:
```shell
python scripts/train.py configs/train/edm.yaml experiment.run_name="edm_calo_challenge_dataset2"
```
The training script will save the model checkpoints, logs and intermediate evaluation results in the specified `output_dir` directory. You can track the training progress with Weights & Biases by setting the `use_wandb` parameter in the config file.
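The dotted-key overrides can be thought of as being merged into the nested config before training starts. A minimal sketch of that mechanism (`apply_override` is a hypothetical helper, not the repo's actual parser):

```python
def apply_override(config: dict, dotted_key: str, value):
    """Set a nested config value from a dotted key like 'experiment.run_name'."""
    keys = dotted_key.split(".")
    node = config
    for k in keys[:-1]:
        node = node.setdefault(k, {})  # walk/create intermediate dicts
    node[keys[-1]] = value
    return config

# Mimics: python scripts/train.py ... experiment.run_name="edm_calo_challenge_dataset2"
cfg = {"experiment": {"run_name": "default"}, "model": {"lr": 1e-4}}
apply_override(cfg, "experiment.run_name", "edm_calo_challenge_dataset2")
print(cfg["experiment"]["run_name"])  # edm_calo_challenge_dataset2
```

Values not mentioned on the command line (here `model.lr`) keep what the config file specifies.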
To run multi-GPU training, use the following command:

```shell
accelerate launch --multi_gpu --num_processes=<num_gpus> scripts/train.py configs/train/edm_allegro_scratch.yaml
```
For more information, see the Accelerate documentation.
### Distillation

Similarly to training, you run distillation using the `scripts/distill.py` script. The distillation process requires the teacher model checkpoint, which should be specified in the config file as `model_path`. Currently only Consistency Distillation [7] is supported. Refer to `configs/distill/cd_allegro_scratch.yaml` as an example.
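Conceptually, consistency distillation trains a student to produce the same output at adjacent points of the teacher's probability-flow ODE trajectory, with an EMA copy of the student providing the target. A toy sketch of one such step under the EDM parametrization (all module and variable names here are illustrative stand-ins, not the repo's API):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class ToyDenoiser(nn.Module):
    """Toy denoiser mapping (x, sigma) -> denoised x; stand-in for the real networks."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(3, 32), nn.SiLU(), nn.Linear(32, 2))
    def forward(self, x, sigma):
        inp = torch.cat([x, sigma.expand(x.shape[0], 1)], dim=1)
        return self.net(inp)

teacher = ToyDenoiser()      # frozen, pretrained diffusion model
student = ToyDenoiser()      # consistency model being distilled
ema_student = ToyDenoiser()  # EMA copy of the student (target network)
ema_student.load_state_dict(student.state_dict())

x0 = torch.randn(16, 2)  # clean toy "showers"
sigma_hi, sigma_lo = torch.tensor(2.0), torch.tensor(1.5)

# Noisy sample at the higher noise level
x_hi = x0 + sigma_hi * torch.randn_like(x0)

with torch.no_grad():
    # One Euler step of the probability-flow ODE using the teacher:
    # dx/dsigma = (x - D(x, sigma)) / sigma
    d = (x_hi - teacher(x_hi, sigma_hi)) / sigma_hi
    x_lo = x_hi + (sigma_lo - sigma_hi) * d
    target = ema_student(x_lo, sigma_lo)

# Student must match the EMA target across adjacent noise levels
loss = ((student(x_hi, sigma_hi) - target) ** 2).mean()
loss.backward()

# EMA update of the target network
with torch.no_grad():
    for p_ema, p in zip(ema_student.parameters(), student.parameters()):
        p_ema.mul_(0.95).add_(p, alpha=0.05)

print(float(loss))
```

After enough such steps the student can generate in one (or a few) network evaluations, which is what makes the distilled model attractive for fast simulation.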
### Conversion

Conversion to ONNX or TorchScript models can be done using `scripts/convert.py`. Refer to the file as an example.
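As a sketch of what the TorchScript half of such a conversion involves (the model and file name below are illustrative; `scripts/convert.py` handles the real models and also covers ONNX):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a trained network (illustrative only)
model = nn.Sequential(nn.Linear(4, 16), nn.SiLU(), nn.Linear(16, 4)).eval()
example = torch.randn(1, 4)

# Trace to TorchScript, save, and reload as a C++-loadable artifact
traced = torch.jit.trace(model, example)
traced.save("model_ts.pt")
reloaded = torch.jit.load("model_ts.pt")

with torch.no_grad():
    diff = (model(example) - reloaded(example)).abs().max()
print(f"max deviation after round-trip: {diff.item():.2e}")
```

The saved `.pt` file can then be loaded from C++ via `torch::jit::load` without a Python runtime.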
### Validation

Use `scripts/validate.py` to generate showers under the same conditions as in the provided file and compare them with this reference data. The script will save the generated showers and create plots comparing observables of the generated samples against the reference showers. See the `run_evaluation.sh` script for an example.
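As an illustration of the kind of observable comparison involved (toy data and a simple binned distance, not the repo's actual metrics or plots):

```python
import numpy as np

rng = np.random.default_rng(0)
reference = rng.exponential(1.0, size=(1000, 100))   # toy reference showers
generated = rng.exponential(1.05, size=(1000, 100))  # toy generated showers

# Observable: total deposited energy per shower
obs_ref = reference.sum(axis=1)
obs_gen = generated.sum(axis=1)

# Compare binned distributions with a simple chi-square-like distance
bins = np.histogram_bin_edges(np.concatenate([obs_ref, obs_gen]), bins=30)
h_ref, _ = np.histogram(obs_ref, bins=bins, density=True)
h_gen, _ = np.histogram(obs_gen, bins=bins, density=True)
chi2 = np.sum((h_ref - h_gen) ** 2 / (h_ref + h_gen + 1e-12))
print(f"chi2-like distance: {chi2:.4f}")
```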
### Generation

Use `scripts/generate.py` to generate showers from the trained model under the specified conditions. The generated showers will be saved in the `output_dir` directory in `.h5` format for further analysis.
## Multi-geometry training and adaptation

### Training

If you need to pretrain on multiple geometries, follow the instructions below. A few changes are needed to inform the model about the geometry, mostly in the `.yaml` files. See `configs/train/edm_multi.yaml` for an example.
- Change the way data files are described, both during training and testing: use `[geometry_name, file_path]` instead of `file_path`.
- Set up `need_geo_condn` and `train_on`. In the model architecture, change `conditions_size`; this should be one more than the length of `train_on` if you are interested in fine-tuning on a new detector later.
- It is recommended to use standardization based on all of your pretraining geometries.
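Put together, the relevant fragment of a multi-geometry config might look like the following (field names taken from the list above; the geometry names, paths and exact nesting are illustrative, so compare against `configs/train/edm_multi.yaml`):

```yaml
data:
  # [geometry_name, file_path] pairs instead of bare file paths
  train_files:
    - [geometry_a, /path/to/geometry_a_train.h5]
    - [geometry_b, /path/to/geometry_b_train.h5]
  need_geo_condn: true
  train_on: [geometry_a, geometry_b]

model:
  # one more than len(train_on), reserving a slot for a future detector
  conditions_size: 3
```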
**Distillation:** Make the above changes in the distillation `.yaml` file and specify the teacher checkpoint of the pretrained model as described previously. Refer to `configs/distill/cd_multi.yaml`.
### Adaptation

You need a pretrained model to fine-tune. Get one from this link or by following the previous steps. Adaptation is done by re-training the pretrained model with a low learning rate, so the `.yaml` file will be the same as the one used in multi-geometry training, except:
- Don't use a very high learning rate; anything less than or equal to 1e-3 should be fine.
- Use fewer training steps, i.e., adjust `max_steps` and the corresponding lr_scheduler arguments.
- Update the data files, but keep `train_on` the same as before. Any detector geometry not in the list will be encoded as the last one, i.e., `[0 0 ... 1]`.
- Specify the checkpoint of the pretrained model as `model_path` under the `model` params.
- Refer to `configs/train/edm_allegro_adapt.yaml`.
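The geometry conditioning described above amounts to a one-hot vector whose last slot is reserved for detectors outside `train_on`. A small sketch of that encoding (`geometry_one_hot` is a hypothetical helper, not the repo's code):

```python
def geometry_one_hot(geometry: str, train_on: list) -> list:
    """One-hot encode a geometry; unseen geometries map to the reserved last slot."""
    size = len(train_on) + 1  # conditions_size = len(train_on) + 1
    vec = [0] * size
    if geometry in train_on:
        vec[train_on.index(geometry)] = 1
    else:
        vec[-1] = 1  # new detector -> [0, 0, ..., 1]
    return vec

# Illustrative geometry names
train_on = ["geometry_a", "geometry_b"]
print(geometry_one_hot("geometry_a", train_on))  # [1, 0, 0]
print(geometry_one_hot("new_detector", train_on))  # [0, 0, 1]
```

This is why `conditions_size` must be one more than `len(train_on)` if you plan to adapt to a new detector later: the extra slot is what the new geometry occupies during fine-tuning.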
**Distillation:**

- If you are not doing the pretraining yourself, get the pretrained distilled model from this link.
- Specify the checkpoints in the distillation `.yaml` file: the teacher checkpoint is the pretrained EDM model, and the student checkpoint is the pretrained distilled model.
- Make sure `init_student_from_teacher` is `false`.
- Refer to `configs/distill/cd_allegro_adapt.yaml` for more details.
## Credits

This repository is derived from Diffusion4FastSim; it also benefits a lot from the code implemented in the following repositories:
## References

1. Ho, Jonathan, Ajay Jain, and Pieter Abbeel. "Denoising diffusion probabilistic models." Advances in Neural Information Processing Systems 33 (2020): 6840-6851.
2. Nichol, Alexander Quinn, and Prafulla Dhariwal. "Improved denoising diffusion probabilistic models." International Conference on Machine Learning. PMLR, 2021.
3. Song, Jiaming, Chenlin Meng, and Stefano Ermon. "Denoising diffusion implicit models." arXiv preprint arXiv:2010.02502 (2020).
4. Karras, Tero, et al. "Elucidating the design space of diffusion-based generative models." Advances in Neural Information Processing Systems 35 (2022): 26565-26577.
5. Lu, Cheng, et al. "DPM-Solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps." Advances in Neural Information Processing Systems 35 (2022): 5775-5787.
6. Lu, Cheng, et al. "DPM-Solver++: Fast solver for guided sampling of diffusion probabilistic models." arXiv preprint arXiv:2211.01095 (2022).
7. Song, Yang, et al. "Consistency models." arXiv preprint arXiv:2303.01469 (2023).
8. Geng, Zhengyang, et al. "Consistency Models Made Easy." arXiv preprint arXiv:2406.14548 (2024).
9. Peebles, William, and Saining Xie. "Scalable diffusion models with transformers." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023.