Overview - Document Intelligence

Check out the slides and videos of our NeurIPS 2023 workshop

 

The objective of the Privacy Preserving Federated Learning Document VQA (PFL-DocVQA) competition is to develop privacy-preserving solutions for fine-tuning multi-modal language models for document understanding on distributed data. We seek efficient federated learning solutions for fine-tuning a pre-trained generic Document Visual Question Answering (DocVQA) model on a new domain: invoice processing.

Automatically managing the information of document workflows is a core aspect of business intelligence and process automation. Reasoning over the information extracted from documents fuels subsequent decision-making processes that can directly affect humans, especially in sectors such as finance, legal or insurance. At the same time, documents tend to contain private information, restricting access to them during training. This common scenario requires training large-scale models over private and widely distributed data.

[Figure: invoicing scenario (invoicing_scenario.png)]

 

The participating teams will create methods to train Document Visual Question Answering models on the provided documents with privacy guarantees, using a federated learning setup. The competition is structured in two tracks:

  • Track 1 - Federated Learning only: The methods will be trained within a federated learning framework, simulating the need for cooperation between different entities to achieve the best-performing model in the most efficient way. The objective of Track 1 participants is to reduce the total communication used (measured in bytes) while achieving performance comparable to the baseline.
  • Track 2 - Federated Learning + Privacy-preserving: In this track, in addition to training over distributed data, we seek to protect the identity of providers, which could be exposed through textual (provider company name) or visual (logo, presentation) information. If a malicious competitor (adversary) manages to infer information about a company's providers, it could have a direct impact on the company's business.
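
To make the Track 1 objective concrete, the following is a minimal sketch of how total communication in bytes could be tallied for a simple federated averaging setup. It assumes that every selected client downloads the full set of trainable parameters and uploads an update of the same size in each round. The names FLConfig and total_communication_bytes, and all the numbers used, are hypothetical and for illustration only; the competition's official accounting may differ.

```python
from dataclasses import dataclass


@dataclass
class FLConfig:
    """Hypothetical description of a federated training run."""
    num_params: int         # trainable parameters exchanged per transfer
    bytes_per_param: int    # e.g. 4 for float32, 2 for float16
    clients_per_round: int  # clients selected in each round
    num_rounds: int         # number of federated rounds


def total_communication_bytes(cfg: FLConfig) -> int:
    """Bytes sent over all rounds, counting download and upload per client."""
    payload = cfg.num_params * cfg.bytes_per_param
    # Assumption: one server-to-client and one client-to-server transfer
    # of the same size for every selected client in every round.
    per_round = 2 * payload * cfg.clients_per_round
    return per_round * cfg.num_rounds


# Illustrative numbers only, not taken from the competition.
full_precision = FLConfig(num_params=1_000_000, bytes_per_param=4,
                          clients_per_round=2, num_rounds=10)
half_precision = FLConfig(num_params=1_000_000, bytes_per_param=2,
                          clients_per_round=2, num_rounds=10)

print(total_communication_bytes(full_precision))  # 160000000
print(total_communication_bytes(half_precision))  # 80000000
```

Under these assumptions, the total shrinks when fewer parameters are exchanged, when each parameter is encoded in fewer bytes, when fewer clients participate per round, or when fewer rounds are needed to reach baseline-level performance.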

 

PFL-DocVQA Workshop at NeurIPS 2023

We will host a half-day workshop at NeurIPS 2023 on Friday, December 15, 2023, from 7 a.m. to 10 a.m. PST, in hybrid format.

  • Competition workshop link

 

Prizes

ELSA will reimburse travel costs of up to 3000 EUR and provide a complimentary registration for a single representative from the top competition teams to facilitate their participation in the NeurIPS workshop.

 

Contact information

For any questions about this challenge, please contact info_pfl@cvc.uab.cat

 

Publications

  1. Rubèn Tito, Khanh Nguyen, Marlon Tobaben, Raouf Kerkouche, Mohamed Ali Souibgui, Kangsoo Jung, Lei Kang, Ernest Valveny, Antti Honkela, Mario Fritz, and Dimosthenis Karatzas. Privacy-Aware Document Visual Question Answering. arXiv preprint, 2023. [arXiv:2312.10108]

 

Citation

If you use this dataset or code, please cite our paper.

@article{tito2023privacy,
  title={Privacy-Aware Document Visual Question Answering},
  author={Tito, Rub{\`e}n and Nguyen, Khanh and Tobaben, Marlon and Kerkouche, Raouf and Souibgui, Mohamed Ali and Jung, Kangsoo and Kang, Lei and Valveny, Ernest and Honkela, Antti and Fritz, Mario and Karatzas, Dimosthenis},
  journal={arXiv preprint arXiv:2312.10108},
  year={2023}
}

Important Dates

November 15, 2023: Winning teams announced.

November 1, 2023: Privacy proof reports due for Track 2 participant teams.

October 27, 2023: End of the competition. Submission data deadline. 

June 30, 2023: Release of training and validation splits.

June 15, 2023: Competition registration opens.