(11/06/2024): We are currently releasing only the Train-small subset of our dataset, which is about 10% of the full dataset. The full dataset will be released in the coming weeks. The Public version of our evaluation set is also available for download. Instructions for reconstructing the Comprehensive evaluation set will be released in the future.
(11/20/2024): The full training dataset is released. We are also working on releasing the classifier model weights.
(01/13/2025): The filtered training dataset is released.
(01/16/2025): The model weights are released.
The dataset is provided for research purposes only. Each image in this dataset was generated by a model that has its own license, and that license applies to the image. Please refer to the metadata for the license information of each image.
Train-full: Full dataset with minimal filtering.
Train-filtered: Full dataset filtered using the Stable Diffusion Safety Checker (available on Hugging Face). Roughly 1.33% of the data is filtered out.
Train-small: Small dataset containing about 10% of the full dataset.
Eval-public: Evaluation set in which generated images are paired with the MS-COCO dataset.
Metadata: Metadata for the dataset.
Model Weights: Classifier model weights. Please check the GitHub repository for the model code.
Download the dataset by clicking the links below:
Download Train-full (1.1T)
Download Train-filtered (1.1T)
Download Train-small (77G)
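Because the larger subsets are over a terabyte, it may be worth checking free disk space before starting a download. The following is a minimal sketch, not part of the official tooling: it assumes that "T" and "G" in the sizes above mean decimal terabytes and gigabytes, and the names `SUBSET_SIZES_BYTES` and `has_room` as well as the 10% headroom factor are hypothetical choices for illustration.

```python
import shutil

# Assumed sizes: "1.1T" and "77G" read as decimal terabytes / gigabytes.
SUBSET_SIZES_BYTES = {
    "Train-full": 1_100 * 10**9,
    "Train-filtered": 1_100 * 10**9,
    "Train-small": 77 * 10**9,
}


def has_room(subset: str, target_dir: str = ".", headroom: float = 1.1) -> bool:
    """Return True if target_dir has enough free space for the subset.

    headroom is a hypothetical safety factor that leaves some extra space
    on top of the listed download size (for example, for extraction).
    """
    free_bytes = shutil.disk_usage(target_dir).free
    return free_bytes >= SUBSET_SIZES_BYTES[subset] * headroom


if __name__ == "__main__":
    # Report, for each subset, whether it would fit in the current directory.
    for name in SUBSET_SIZES_BYTES:
        verdict = "fits" if has_room(name) else "does not fit"
        print(f"{name}: {verdict}")
```

If the archives need to be unpacked next to the download, a larger `headroom` value (for example 2.0) is probably a safer assumption.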