1. Lightning channel segmentation dataset

This dataset contains 20 images captured by total-sky imagers or CCD cameras. The images are grouped into four types: 1) classic lightning channels, which have a single background and simple shapes; 2) low-contrast lightning channels, which have low contrast against a bright background; 3) interfered lightning channels, which are accompanied by raindrops or terrestrial objects; 4) complicated lightning channels, which are characterized by complicated shapes.

2. Total-sky cloud image set (TCIS)

This dataset contains 5000 total-sky images obtained from a total-sky cloud imager located in Tibet (29.25° N, 88.88° E). The dataset was developed by the Chinese Academy of Meteorological Sciences and Beijing Jiaotong University. All images are stored in color JPEG format with a resolution of 1392*1040 pixels. Note that the images are rectangular, but the mapped sky region is circular, with the zenith at the center and the horizon along the border. The images are divided into five sky conditions: cirriform, cumuliform, stratiform, clear sky, and mixed cloudiness.
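
Because only the circular region of each rectangular image shows sky, it can help to separate sky pixels from the surrounding border before processing. The following is a minimal sketch, not part of the dataset tooling. It assumes the zenith sits exactly at the image center and that the horizon circle spans the shorter image side; neither is stated in the dataset description, so both should be checked against real images. The function names are hypothetical.

```python
import math

WIDTH, HEIGHT = 1392, 1040  # image size stated for TCIS


def radial_position(row, col, width=WIDTH, height=HEIGHT, radius=None):
    """Distance of a pixel from the assumed zenith, normalized so that
    0.0 is the image center and 1.0 is the assumed horizon circle."""
    # Assumption: the zenith is at the geometric center of the image.
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    if radius is None:
        # Assumption: the horizon circle spans the shorter image side.
        radius = min(width, height) / 2.0
    return math.hypot(row - cy, col - cx) / radius


def is_sky_pixel(row, col, **kwargs):
    """True if the pixel lies inside the assumed circular sky region."""
    return radial_position(row, col, **kwargs) <= 1.0
```

For example, `is_sky_pixel(520, 696)` tests a pixel near the center, while `is_sky_pixel(0, 0)` tests a corner pixel that falls outside the circle under these assumptions. If the true circle is offset or smaller, pass an explicit `radius` or adjust the center computation.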

3. RSDDs (Rail surface discrete defects) dataset

This dataset contains two sub-datasets. The first is the Type-I RSDDs dataset, captured from express rails, which has 67 challenging images. The second is the Type-II RSDDs dataset, captured from common/heavy haul rails, which has 128 challenging images. Note that each image in these two datasets contains at least one defect and has a complex background with heavy noise.

4. Singapore dataset

This dataset contains 1086 images with a resolution of 256*256 pixels, constructed from Google Earth imagery of Singapore. The spatial resolution is about 0.6 m per pixel, and all images are stored in color TIFF format. The manually labeled dataset contains 9 challenging scene categories (i.e., airplane, forest, harbor, industry, meadow, overpass, residential, river, and runway).
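
The stated image size and spatial resolution imply an approximate ground footprint per image. The short sketch below only works through that arithmetic; the 0.6 m figure is approximate in the description, so the results are approximate as well, and the variable names are illustrative.

```python
IMAGE_SIDE_PX = 256        # image width and height in pixels
METERS_PER_PIXEL = 0.6     # approximate spatial resolution from the description

# Approximate ground length covered by one image side, in meters.
side_m = IMAGE_SIDE_PX * METERS_PER_PIXEL

# Approximate ground area covered by one image, in square meters.
area_m2 = side_m ** 2

print(f"side: about {side_m:.1f} m, area: about {area_m2:.0f} m^2")
```

Each image therefore covers roughly 154 m on a side.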

5. Chinese Medical Documents Dataset

This dataset contains 357 images that are divided into three groups. All images are stored in color JPEG format. The images in the first group have a resolution of 2500*3490 pixels. The images in the other two groups have a resolution of 2448*3264 pixels. Each image file is named “xx_xx_xx”. The first field represents the control variable: scan, illumination, or rotation. The second field indicates whether the report contains more than 10 items. The last field is the sequence number within the combination of the first two fields.
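
The three-field naming scheme can be split programmatically. The sketch below is a hypothetical helper, not part of the dataset. It assumes the fields are separated by underscores and that a file extension may follow; the concrete codes used in each field are not specified in the description, so the fields are kept as raw strings rather than interpreted.

```python
from pathlib import Path
from typing import NamedTuple


class DocImageName(NamedTuple):
    control: str     # first field: scan / illumination / rotation code
    item_flag: str   # second field: whether reported items exceed 10
    sequence: str    # third field: sequence number within the first two


def parse_image_name(filename):
    """Split an 'xx_xx_xx' style file name into its three raw fields."""
    # Path.stem drops the directory and the final extension, if any.
    fields = Path(filename).stem.split("_")
    if len(fields) != 3:
        raise ValueError(f"expected three underscore-separated fields: {filename!r}")
    return DocImageName(*fields)
```

With a made-up name such as `01_02_03.jpg`, `parse_image_name` returns the fields `"01"`, `"02"`, and `"03"`, which can then be used to group images by control variable or by item count.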