Text Corpora


Fact Checking Dataset

Sentiment Analysis Datasets

Czech Text Document Corpus

Czech Historical Named Entity Corpus

Posel od Cerchova

OCR Corpora and Tools


Image Corpora


ChronSeg: Dataset for Segmentation of Handwritten Historical Chronicles

Heimatkunde: Dataset for Multi-modal Historical Document Analysis

COMICORDA: Dialogue Act Recognition in Comic Books

Unconstrained Facial Images: Database for Face Recognition under Real-world Conditions


Historical Maps Corpora


Historical Map Dataset v 1.0: Dataset for Detection and Segmentation tasks in Historical Maps

Historical Map Dataset v 2.0: Extended Dataset for Detection and Segmentation tasks in Historical Cadastral Maps

Nomenclature Dataset: Dataset for Detection and Recognition of Handwritten Nomenclatures and Toponyms from Historical Cadastral Maps
