Theos JSON
Our official bounding box label format.
Overview
In the Theos JSON format, each dataset image has its own label file. For example, an image named image.jpg should have a matching image.json file that stores the labels for that image. When you import a folder into Theos, every image and label file across all subdirectories must have a unique name; otherwise, files that share a name will be incorrectly recognized as the same image.
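The naming rules above can be checked before importing a folder. The following is a minimal sketch, not part of Theos itself: the function names, the set of image extensions, and the decision to skip classes.json are all assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import Path

# Assumption: which extensions count as images is not specified in this page.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}
CLASSES_FILE = "classes.json"  # the dataset-level file, not a per-image label file


def find_duplicate_names(paths):
    """Map each image or label file name that occurs more than once to its paths."""
    seen = defaultdict(list)
    for path in map(Path, paths):
        if path.name == CLASSES_FILE:
            continue
        if path.suffix.lower() in IMAGE_EXTENSIONS | {".json"}:
            seen[path.name].append(path.as_posix())
    return {name: found for name, found in seen.items() if len(found) > 1}


def find_images_without_labels(paths):
    """List images that have no .json file with the same base name."""
    paths = [Path(p) for p in paths]
    label_stems = {
        p.stem for p in paths
        if p.suffix.lower() == ".json" and p.name != CLASSES_FILE
    }
    return sorted(
        p.as_posix() for p in paths
        if p.suffix.lower() in IMAGE_EXTENSIONS and p.stem not in label_stems
    )
```

To run it against a real folder, pass in the files found by a recursive walk, for example `[p for p in Path("my_dataset").rglob("*") if p.is_file()]`, where `my_dataset` is a placeholder for your own dataset directory.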
Folder structure
classes.json
train
    images
        image1.jpg
        image2.jpg
        image(N).jpg
    labels
        image1.json
        image2.json
        image(N).json
valid
    images
        image5.jpg
        image6.jpg
        image(N+2).jpg
    labels
        image5.json
        image6.json
        image(N+2).json
test
    images
        image7.jpg
        image8.jpg
        image(N+3).jpg
    labels
        image7.json
        image8.json
        image(N+3).json
The classes file
This is where all the classes of the dataset are defined. Each class is composed of its id, name, color, an is_superclass flag (which should always be false), and a with_text flag that indicates whether the class carries text (for OCR labeling). The first id must always be 0, and subsequent ids must follow in order: 0, 1, 2, 3, 4, 5, etc.
Following is an example of a classes.json file.
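As a rough illustration of the rules above, the Python sketch below builds a list of classes and checks the id rule before serializing it. Only the field names (id, name, color, is_superclass, with_text) come from this page. The top-level layout (a JSON array of class objects), the hex color strings, the class names, and the helper functions are assumptions, so compare the output against a classes.json exported from Theos before relying on it.

```python
import json


def make_class(class_id, name, color, with_text=False):
    """Build one class entry; is_superclass is always false, as the format requires."""
    return {
        "id": class_id,
        "name": name,
        "color": color,  # assumption: color encoded as a hex string
        "is_superclass": False,
        "with_text": with_text,
    }


def validate_class_ids(classes):
    """Raise ValueError unless the ids are exactly 0, 1, 2, ... in order."""
    ids = [c["id"] for c in classes]
    if ids != list(range(len(ids))):
        raise ValueError(f"class ids must be 0..{len(ids) - 1} in order, got {ids}")


# Hypothetical classes, for illustration only.
classes = [
    make_class(0, "person", "#ff0000"),
    make_class(1, "license_plate", "#00ff00", with_text=True),
]
validate_class_ids(classes)

# Assumption: the file is a plain JSON array of class objects.
classes_json = json.dumps(classes, indent=2)
```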
The label file
This file represents all the labels present in a particular image.
einstein.jpg
einstein.json
Each bounding box is composed of its class_id, the (x, y) position of its top-left point, and its width and height.
If you are also labeling for OCR (optical character recognition), you must add a text field.
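The sketch below shows one way to assemble such bounding boxes in Python, including a conversion from corner coordinates to the top-left plus width and height form described above. The class_id and text names come from this page. The key names x, y, width, and height, the use of pixel units, the top-level array layout, and all sample values are assumptions for illustration.

```python
import json


def make_box(class_id, x, y, width, height, text=None):
    """Build one bounding box from its top-left point and its size."""
    box = {"class_id": class_id, "x": x, "y": y, "width": width, "height": height}
    if text is not None:
        box["text"] = text  # only needed when labeling for OCR
    return box


def corners_to_box(class_id, x_min, y_min, x_max, y_max, text=None):
    """Convert (x_min, y_min, x_max, y_max) corners to top-left + width/height."""
    return make_box(class_id, x_min, y_min, x_max - x_min, y_max - y_min, text)


# Hypothetical labels for a single image.
labels = [
    make_box(0, 120, 45, 200, 260),
    corners_to_box(1, 10, 20, 110, 50, text="ABC123"),
]

# Assumption: a label file is a plain JSON array of box objects.
labels_json = json.dumps(labels, indent=2)
```

Many detection tools store boxes as two corners or as a center point plus size, so a conversion step like `corners_to_box` is usually where off-by-origin mistakes creep in.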