Release of Diffusion-generated Deepfake Detection dataset (D3)

Existing deepfake detection datasets cover only a narrow range of image generators and contain too few images. To address this limitation, we have developed and released a new dataset, the Diffusion-generated Deepfake Detection (D3) dataset. It comprises almost 2.3 million records and 11.5 million images. Each record consists of a prompt, a genuine image, and four fake images produced by different text-to-image generators. The prompts and genuine images are sourced from LAION-400M. The dataset is intended to support training deepfake detection methods from scratch.
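To make the record layout concrete, the following is a minimal Python sketch of how one record might be represented and flattened into labeled training samples. The class name `D3Record`, its field names, and the helper `to_labeled_samples` are hypothetical and chosen for illustration only; they are not the dataset's actual schema. The final lines simply check that one genuine image plus four fake images per record is consistent with the approximate totals quoted above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class D3Record:
    """Hypothetical layout of one record (field names are assumptions)."""
    prompt: str              # text prompt taken from LAION-400M
    real_image: str          # path or URL of the genuine image
    fake_images: List[str]   # the four generated images


def to_labeled_samples(record: D3Record) -> List[Tuple[str, int]]:
    """Flatten one record into (image, label) pairs: 0 = real, 1 = fake."""
    if len(record.fake_images) != 4:
        raise ValueError("expected exactly four generated images per record")
    samples = [(record.real_image, 0)]
    samples.extend((image, 1) for image in record.fake_images)
    return samples


# Sanity check on the quoted sizes: 1 genuine + 4 fake images per record.
IMAGES_PER_RECORD = 1 + 4
approx_records = 2_300_000
approx_images = approx_records * IMAGES_PER_RECORD  # 11_500_000
```

Under this layout every record yields one real and four fake samples, so a detector trained on the flattened pairs would see a 1:4 class ratio unless the samples are rebalanced.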

More details about the D3 training dataset are available here.