method: Swin Transformer DCT (2023-09-04)

Authors: Davide Alessandro Coccomini, Giuseppe Amato, Fabrizio Falchi, Claudio Gennaro

Affiliation: ISTI-CNR

Email: davidealessandro.coccomini@isti.cnr.it

Description: We fine-tuned a Swin Transformer Base, pre-trained on ImageNet, on the provided training set. The training images underwent heavy random data augmentation (inspired by [1]) to encourage the model to generalize better. Because images generated by Diffusion Models are known to contain characteristic noise, a model could overfit by learning to recognize that noise exclusively. To avoid this, the applied transformations include many noise-addition and compression techniques, also in combination, as well as random rotations, brightness changes, crops, dropouts, resizing, and other manipulations that boost generalization.
During training, each image is also transformed into the DCT domain with a probability of 50%, since, as shown in [2], this should emphasize the artifacts.
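As an illustration, the probabilistic DCT transformation described above could be sketched as follows. This is an assumed implementation, not the authors' code: the exact DCT variant and scaling they use may differ (log-scaling of the magnitude spectrum is a common choice).

```python
import numpy as np
from scipy.fftpack import dct

def dct2(channel):
    # 2-D type-II DCT: apply the 1-D DCT along rows, then along columns
    return dct(dct(channel, axis=0, norm="ortho"), axis=1, norm="ortho")

def maybe_dct(image, p=0.5, rng=None):
    """With probability p, replace each channel of an HxWxC image
    with the log-scaled magnitude of its DCT spectrum."""
    rng = rng or np.random.default_rng()
    if rng.random() >= p:          # keep the image in the pixel domain
        return image
    out = np.empty_like(image, dtype=np.float64)
    for c in range(image.shape[2]):
        coeffs = dct2(image[:, :, c].astype(np.float64))
        out[:, :, c] = np.log(np.abs(coeffs) + 1e-8)  # compress dynamic range
    return out
```

In a training pipeline this would be applied per sample, so roughly half of the batches the model sees are frequency-domain representations.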

In order to choose the best model, we also created a custom validation set composed of real images taken from the Flickr dataset and images generated by GANs (ProGAN, StyleGAN, StyleGAN2, and RelGAN) and by Diffusion Models (Stable Diffusion and GLIDE), inspired by "Detecting Images Generated by Diffusers".

method: Swin Transformer + Swin Transformer DCT (2023-08-31)

Authors: Davide Alessandro Coccomini, Giuseppe Amato, Fabrizio Falchi, Claudio Gennaro

Description: We fine-tuned two Deep Learning models pre-trained on ImageNet, specifically two Swin Transformer Base models. The training images underwent heavy random data augmentation (inspired by [1]) to encourage the models to generalize better. Because images generated by Diffusion Models are known to contain characteristic noise, a model could overfit by learning to recognize that noise exclusively. To avoid this, the applied transformations include many noise-addition and compression techniques, also in combination, as well as random rotations, brightness changes, crops, dropouts, resizing, and other manipulations that boost generalization.
During the training of one of the two Swin Transformers, each image is also transformed into the DCT domain with a probability of 50%, since, as shown in [1], this should emphasize the artifacts.
Both models make a prediction on each image in the test set, and the final prediction is the mean of the two predictions.
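The averaging step can be sketched as below. The scores `p_rgb` and `p_dct` are hypothetical per-image fake probabilities for the pixel-domain and DCT-domain models; they are not the authors' actual outputs.

```python
import numpy as np

def ensemble_predict(prob_a, prob_b):
    """Average the per-image probabilities of two detectors."""
    return (np.asarray(prob_a) + np.asarray(prob_b)) / 2.0

# hypothetical scores from the two models on three test images
p_rgb = [0.92, 0.10, 0.55]
p_dct = [0.88, 0.30, 0.45]

final = ensemble_predict(p_rgb, p_dct)   # [0.90, 0.20, 0.50]
labels = (final >= 0.5).astype(int)      # 1 = generated, 0 = real
```

Averaging probabilities (rather than hard labels) lets a confident model outvote an uncertain one, which is the usual motivation for this kind of two-model ensemble.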

In order to choose the best model, we also created a custom validation set composed of real images taken from the Flickr dataset and images generated by GANs (ProGAN, StyleGAN, StyleGAN2, and RelGAN) and by Diffusion Models (Stable Diffusion and GLIDE), inspired by "Detecting Images Generated by Diffusers".

method: Swin Transformer (2023-08-24)

Authors: Davide Alessandro Coccomini, Giuseppe Amato, Fabrizio Falchi, Claudio Gennaro

Affiliation: ISTI-CNR

Email: davidealessandro.coccomini@isti.cnr.it

Description: We fine-tuned a Swin Transformer Base pre-trained on ImageNet. The architecture was chosen because of a previous analysis of its generalization capabilities in the field of deepfake detection [1]. Training applied heavy random data augmentation (inspired by [2]) to camouflage the trace left by diffusion models on the images and force the model to focus on visual inconsistencies and introduced artifacts. In addition to transformations such as shift, scale, and rotation, we used noise manipulation via FFT, the introduction and combination of noises such as Multiplicative Noise and ISO Noise, and various levels of compression.
Image resizing was also handled with different techniques, including random crop and Isotropic Resize.
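Isotropic Resize is not specified further here; a common interpretation, scaling the longer side to a target size while preserving the aspect ratio and padding to a square, could be sketched as follows. This is an assumed implementation using nearest-neighbour resampling for simplicity; the authors likely rely on an augmentation library such as Albumentations.

```python
import numpy as np

def isotropic_resize(image, target=224):
    """Scale the longer side of an HxW[xC] image to `target`, preserving
    aspect ratio, then zero-pad to a square target x target canvas."""
    h, w = image.shape[:2]
    scale = target / max(h, w)
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    # nearest-neighbour resampling via integer index maps
    rows = (np.arange(new_h) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(new_w) / scale).astype(int).clip(0, w - 1)
    resized = image[rows][:, cols]
    canvas = np.zeros((target, target) + image.shape[2:], dtype=image.dtype)
    canvas[:new_h, :new_w] = resized   # image sits in the top-left corner
    return canvas
```

Unlike a plain resize, this keeps shapes undistorted, so any stretching artifacts the model sees come from the generator rather than from preprocessing.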
In order to choose the best model, we also created a custom validation set composed of real images taken from the Flickr dataset and images generated by GANs (ProGAN, StyleGAN, StyleGAN2, and RelGAN) and by Diffusion Models (Stable Diffusion and GLIDE), inspired by "Detecting Images Generated by Diffusers".

Coccomini, D.A.; Caldelli, R.; Falchi, F.; Gennaro, C. On the Generalization of Deep Learning Models in Video Deepfake Detection. J. Imaging 2023, 9, 89.

Coccomini, D.A.; Zilos, G.K.; Caldelli, R.; Falchi, F.; Amato, G.; Papadopoulos, S.; Gennaro, C. MINTIME: Multi-Identity Size-Invariant Video Deepfake Detection. arXiv, 2022.

Source code

Source code 2

Ranking Table

Date        Method                                     f1_score
2023-09-04  Swin Transformer DCT                       0.97725668575014
2023-08-31  Swin Transformer + Swin Transformer DCT    0.97365746892832
2023-08-24  Swin Transformer                           0.97105355677956
2023-08-24  Swin Transformer + Resnet50 DCT            0.95234775873754
2023-08-22  Resnet50 + Swin Transformer                0.94966915523661
2023-09-28  CNN detection with Multi-modal             0.88971233544612
2023-09-08  Basic                                      0.80222598068634
2023-09-08  MiniVGG                                    0.8006292644557
2023-10-27  First Submission                           0.79736329918108
2023-09-02  Baseline                                   0.77303002356799
2023-09-10  Task1 testing submission                   0.68246036940662
2023-08-26  swin baseline                              0.20702247191011
2023-09-25  grag 2epoch                                0.13617305480316
2023-09-25  grag 3epoch                                0.063666215955186
2023-09-25  grag 5epoch                                0.059210526315789
2023-09-25  grag 4epoch                                0.035153797865662
2023-08-02  Random                                     0
2023-08-24  Random                                     0
2023-08-24  Random 01                                  0
2023-08-24  Random 02                                  0

Ranking Graphic