- Task 1 - Track 1 – Single-domain training (Models must be trained exclusively on the published Cityscapes dataset) - Method: DINOv2, ViT-L, 8x8 patch size, linear decoder
Method: DINOv2, ViT-L, 8x8 patch size, linear decoder
Date: 2024-08-23
Authors: Tommie Kerssies, Daan de Geus, and Gijs Dubbelman
Affiliation: Eindhoven University of Technology
Email: t.kerssies@tue.nl
Description: Fine-tuning for ~40 epochs on Cityscapes, following the setup described in: "How to Benchmark Vision Foundation Models for Semantic Segmentation?" (https://www.tue-mps.org/benchmark-vfm-ss/)
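To make the method name concrete, here is a rough sketch of the shape arithmetic that an 8x8 patch size and a linear decoder imply. It is not the authors' code. The 1024x1024 crop size is a hypothetical value chosen for illustration, the helper names are made up, and the bias term in the decoder is an assumption. The token width of 1024 for ViT-L and the 19 Cityscapes evaluation classes are standard values.

```python
# Illustrative shape arithmetic only; not taken from the authors' implementation.
EMBED_DIM = 1024    # token width of ViT-L
NUM_CLASSES = 19    # Cityscapes evaluation classes
PATCH_SIZE = 8      # from the method name ("8x8 patch size")


def token_grid(height, width, patch_size=PATCH_SIZE):
    """Return (rows, cols) of patch tokens for an input whose sides divide evenly."""
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of the patch size")
    return height // patch_size, width // patch_size


def linear_decoder_params(embed_dim=EMBED_DIM, num_classes=NUM_CLASSES):
    """Weights plus biases of one linear layer applied to every token (bias assumed)."""
    return embed_dim * num_classes + num_classes


rows, cols = token_grid(1024, 1024)  # hypothetical crop size, not stated in the entry
print(rows, cols, rows * cols)       # token grid and total token count
print(linear_decoder_params())       # size of the per-token classifier
```

Under these assumptions the decoder is tiny compared with the backbone, so the segmentation quality depends almost entirely on the fine-tuned DINOv2 features. The per-token class scores would still need to be upsampled to the input resolution, a step this sketch leaves out.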