Kunal Purkayastha

Datasets

Below are the datasets I've worked with so far.

CTW-1500

Description: The SCUT-CTW1500 dataset contains 1,500 images: 1,000 for training and 500 for testing. It provides 10,751 cropped text instance images, 3,530 of which contain curved text. The images were collected manually from the Internet, from image libraries such as Google Open-Image, and from phone cameras. Alongside the curved text, the dataset also contains many horizontal and multi-oriented text instances.

Source

Total-Text

Description: The Total-Text dataset contains 1,555 images with polygonal annotations for text instances in three orientations: horizontal, multi-oriented, and curved. It is split into 1,255 training images and 300 testing images. It is commonly used as a benchmark for text detection and recognition algorithms on real-world scenes with varied text layouts.

Source