Datasets

Make API calls to manage your datasets.

Overview

Copy your Project Key and your API Token from your project settings and substitute them for the <project_key> and <api_token> placeholders in the API calls you want to use.
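As a rough illustration of how the two credentials fit into a call, the sketch below builds (but does not send) an authenticated request in Python. The base URL and the Authorization header format are taken from the curl examples on this page; the helper name build_request and the placeholder credentials are hypothetical.

```python
import urllib.request

API_BASE = "https://api.theos.ai/v1"  # base URL taken from the curl examples


def build_request(project_key, api_token, path, method="GET"):
    """Build (but do not send) an authenticated request for a project endpoint."""
    url = f"{API_BASE}/project/{project_key}/{path.lstrip('/')}"
    request = urllib.request.Request(url, method=method)
    request.add_header("Authorization", f"Token {api_token}")
    return request


# Placeholder credentials; substitute the values from your project settings.
req = build_request("my-project-key", "my-api-token", "datasets/")
print(req.get_method(), req.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it; that step is left out here so the sketch runs without network access.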

Create a new dataset

There is 1 possible field on this request.

  • name (required): the name of your new dataset.

curl https://api.theos.ai/v1/project/<project_key>/datasets/create/ \
     -F "name=Vehicles" \
     -H "Authorization: Token <api_token>" \
     -X POST

List all your datasets

curl https://api.theos.ai/v1/project/<project_key>/datasets/ \
     -H "Authorization: Token <api_token>"

Get a specific dataset

curl https://api.theos.ai/v1/project/<project_key>/datasets/<dataset_id>/ \
     -H "Authorization: Token <api_token>"

Delete a specific dataset

curl https://api.theos.ai/v1/project/<project_key>/datasets/<dataset_id>/delete/ \
     -H "Authorization: Token <api_token>" \
     -X POST

Upload images to your dataset

There is 1 possible field on this request.

  • manifest_url (required): the URL of a manifest JSON file that lists the URLs of the images to upload, along with the dataset split (train, valid or test) in which each image should be placed. A manifest JSON file can be at most 100MB in size and can list at most 100k (100,000) images. If you need to upload more images, wait for the previous 100k images to finish uploading before sending a new upload task.

curl https://api.theos.ai/v1/project/<project_key>/datasets/<dataset_id>/upload/ \
     -F "manifest_url=<manifest_url>" \
     -H "Authorization: Token <api_token>" \
     -X POST

The manifest JSON file must have the following structure.

[
    {
        "url":"https://your-publicly-accessible-bucket.com/obscure-path/image1.jpg",
        "split":"train"
    },
    {
        "url":"https://your-publicly-accessible-bucket.com/obscure-path/image2.jpg",
        "split":"valid"
    },
    {
        "url":"https://your-publicly-accessible-bucket.com/obscure-path/image3.jpg",
        "split":"test"
    }
]

You can optionally add an external_id field to each image, so that later operations can reference the image by your own ID instead of our internal IDs.

[
    {
        "url":"https://your-publicly-accessible-bucket.com/obscure-path/image1.jpg",
        "split":"train",
        "external_id":"d03ae221-523a-4edd-a525-64357af0bd5c"
    }
]
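If you generate manifests from your own image inventory, something along the following lines may help. It is only a sketch: the function name build_manifests and the input tuple layout are hypothetical, and the byte limit uses a conservative reading of "100MB" as 100,000,000 bytes. It produces the manifest text only; you still need to host each file at a URL that Theos can reach.

```python
import json

VALID_SPLITS = {"train", "valid", "test"}
MAX_IMAGES = 100_000       # per-manifest image limit from the upload section
MAX_BYTES = 100_000_000    # assumption: conservative reading of "100MB"


def build_manifests(images):
    """Turn (url, split, external_id or None) tuples into manifest JSON strings.

    Returns one JSON string per manifest, each with at most MAX_IMAGES entries.
    """
    entries = []
    for url, split, external_id in images:
        if split not in VALID_SPLITS:
            raise ValueError(f"unknown split {split!r} for {url}")
        entry = {"url": url, "split": split}
        if external_id is not None:
            entry["external_id"] = external_id  # optional field
        entries.append(entry)

    manifests = []
    for start in range(0, len(entries), MAX_IMAGES):
        text = json.dumps(entries[start:start + MAX_IMAGES], indent=4)
        if len(text.encode("utf-8")) > MAX_BYTES:
            raise ValueError("manifest exceeds the size limit; list fewer images per file")
        manifests.append(text)
    return manifests


images = [
    ("https://your-publicly-accessible-bucket.com/obscure-path/image1.jpg",
     "train", "d03ae221-523a-4edd-a525-64357af0bd5c"),
    ("https://your-publicly-accessible-bucket.com/obscure-path/image2.jpg",
     "valid", None),
]
manifests = build_manifests(images)
print(manifests[0])
```

When the input holds more than 100,000 images, the function returns several manifests, which would then be uploaded one after another, waiting for each upload task to finish before starting the next.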

Keep in mind that this is a time-consuming task, as the Theos backend has to download, downscale (for thumbnails) and upload all your images to our infrastructure.

The response includes a task_id that you can use to follow the progress of your upload task.

curl https://api.theos.ai/v1/tasks/<task_id>/ \
     -H "Authorization: Token <api_token>"
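Because the task runs in the background, a client typically polls this endpoint until the task is finished. The sketch below shows one way to structure such a loop in Python. This page does not document the task response schema, so the loop takes the fetching function and the "finished" check as parameters; the "status" key and the "done" value in the usage example are purely hypothetical stand-ins.

```python
import time


def wait_for_task(fetch_task, is_finished, interval_seconds=5.0, max_polls=120):
    """Poll fetch_task() until is_finished(task) is true or max_polls is reached."""
    for attempt in range(max_polls):
        task = fetch_task()
        if is_finished(task):
            return task
        if attempt < max_polls - 1:
            time.sleep(interval_seconds)  # wait before asking again
    raise TimeoutError(f"task not finished after {max_polls} polls")


# Stand-in for the real GET /v1/tasks/<task_id>/ call.
# The "status" key and its values are hypothetical, not the documented schema.
responses = iter([{"status": "running"}, {"status": "running"}, {"status": "done"}])
result = wait_for_task(
    fetch_task=lambda: next(responses),
    is_finished=lambda task: task["status"] == "done",
    interval_seconds=0,
)
print(result)
```

In real use, fetch_task would perform the authenticated request shown above and decode the JSON body, and is_finished would inspect whichever field the actual response uses to signal completion.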

Stop the upload

If you want to stop the upload task for any reason, you can run the following request.

curl https://api.theos.ai/v1/project/<project_key>/datasets/<dataset_id>/upload/stop/ \
     -H "Authorization: Token <api_token>" \
     -X POST

Create a new class on your dataset

There are 3 possible fields on this request.

  • name (required): the name of your new class.

  • color (required): the hexadecimal color of your new class.

  • with_text (optional): whether or not this class will be used for OCR labeling. Possible values are true or false. If not specified, it defaults to false.

curl https://api.theos.ai/v1/project/<project_key>/datasets/<dataset_id>/classes/create/ \
     -F "name=Car" \
     -F "color=#4AFFBA" \
     -F "with_text=false" \
     -H "Authorization: Token <api_token>" \
     -X POST

Delete classes on your dataset

There is 1 possible field on this request.

  • db_ids (required): a list of the DB IDs of the dataset classes to delete (1 or more). They're called db_id because the id field on a class refers to its order index, while db_id is its real ID in our database.

curl https://api.theos.ai/v1/project/<project_key>/datasets/<dataset_id>/classes/delete/ \
     -F "db_ids=[1,4,6]" \
     -H "Authorization: Token <api_token>" \
     -X POST
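When the IDs come from a program rather than being typed by hand, the db_ids value can be produced by serializing a list. The sketch below assumes the API accepts the same bracketed, comma-separated form shown in the curl example; the helper name encode_db_ids and its minimum parameter are hypothetical.

```python
import json


def encode_db_ids(db_ids, minimum=1):
    """Serialize class DB IDs as the bracketed list used in the db_ids form field."""
    ids = [int(i) for i in db_ids]
    if len(ids) < minimum:
        raise ValueError(f"expected at least {minimum} class DB ID(s), got {len(ids)}")
    # Compact separators give the same form as the curl example.
    return json.dumps(ids, separators=(",", ":"))


print(encode_db_ids([1, 4, 6]))  # [1,4,6]
```

The same helper could be reused with minimum=2 for the merge request, which requires two or more classes.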

Keep in mind that this is a time-consuming task, as the Theos backend also has to delete all the labels of each deleted class from all of your dataset images.

The response includes a task_id that you can use to follow the progress of your deletion task.

curl https://api.theos.ai/v1/tasks/<task_id>/ \
     -H "Authorization: Token <api_token>"

Edit a class on your dataset

There are 2 possible fields on this request.

  • name (optional): the new name of your class.

  • color (optional): the new hexadecimal color of your class.

curl https://api.theos.ai/v1/project/<project_key>/datasets/<dataset_id>/classes/<class_db_id>/edit/ \
     -F "name=Ferrari" \
     -F "color=#4AFFBA" \
     -H "Authorization: Token <api_token>" \
     -X POST

Merge classes on your dataset

There are 3 possible fields on this request.

  • db_ids (required): a list of dataset classes DB IDs to merge (2 or more).

  • color (required): the hexadecimal color of your new merged class.

  • name (required): the name of your new merged class.

curl https://api.theos.ai/v1/project/<project_key>/datasets/<dataset_id>/classes/merge/ \
     -F "db_ids=[1,4,6]" \
     -F "name=Truck" \
     -F "color=#A8327F" \
     -H "Authorization: Token <api_token>" \
     -X POST

Keep in mind that this is a time-consuming task, as the Theos backend has to replace all the classes you specified with the new merged class in all of the images of your dataset.

The response includes a task_id that you can use to follow the progress of your merge task.

curl https://api.theos.ai/v1/tasks/<task_id>/ \
     -H "Authorization: Token <api_token>"

Get dataset examples

In Theos, a dataset example is an image paired with its labels. An example always has an image, but may not have labels yet.

There are 2 possible fields in this request.

  • split (optional): the split of the dataset to get examples from. Possible values are train, valid and test. If not specified, the default value is train.

  • page (optional): the page number from which to fetch examples (like on the dataset overview of the Theos UI). Each page contains a maximum of 100 images. Possible values range from 1 to N, where N = (number of examples on the split)/100, rounded up. When you fetch the first page, the response includes the page_count, so you know how many pages you can go through on your chosen split.

curl https://api.theos.ai/v1/project/<project_key>/datasets/<dataset_id>/examples/ \
     -F "split=train" \
     -F "page=1" \
     -H "Authorization: Token <api_token>"
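The page arithmetic can be sketched as follows, assuming a partially filled last page still counts as a page. The function names are hypothetical. In practice, the page_count returned with the first page is the authoritative value; this is only useful for estimating ahead of time.

```python
PAGE_SIZE = 100  # maximum number of examples per page, as stated above


def page_count(example_count):
    """Number of pages needed to list example_count examples (rounded up)."""
    return (example_count + PAGE_SIZE - 1) // PAGE_SIZE


def page_numbers(example_count):
    """Page numbers to request, starting at 1."""
    return range(1, page_count(example_count) + 1)


print(page_count(250), list(page_numbers(250)))  # 3 [1, 2, 3]
```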

Get a specific example

There are 2 possible requests. The first one uses our internal example_id.

curl https://api.theos.ai/v1/project/<project_key>/datasets/<dataset_id>/examples/<example_id>/ \
     -H "Authorization: Token <api_token>"

The second one uses the external_id that you specified when uploading your image.

curl https://api.theos.ai/v1/project/<project_key>/datasets/<dataset_id>/examples/external/<external_example_id>/ \
     -H "Authorization: Token <api_token>"
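A client that supports both lookup styles might choose the URL like this. The sketch only builds the URL from the two patterns above; the function name example_url is hypothetical, and percent-encoding the ID is a defensive assumption rather than a documented requirement.

```python
from urllib.parse import quote

API_BASE = "https://api.theos.ai/v1"  # base URL taken from the curl examples


def example_url(project_key, dataset_id, example_id=None, external_id=None):
    """Return the example URL for exactly one of example_id or external_id."""
    if (example_id is None) == (external_id is None):
        raise ValueError("pass exactly one of example_id or external_id")
    base = f"{API_BASE}/project/{project_key}/datasets/{dataset_id}/examples"
    if external_id is not None:
        # Assumption: encode the ID so unusual characters cannot alter the path.
        return f"{base}/external/{quote(str(external_id), safe='')}/"
    return f"{base}/{quote(str(example_id), safe='')}/"


print(example_url("my-project-key", 12,
                  external_id="d03ae221-523a-4edd-a525-64357af0bd5c"))
```

The same two URL shapes recur in the label and skip requests, with labels/set/ or skip/ appended.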

Set the labels of a specific example

There is 1 possible field on this request.

  • labels (required): the labels to set. They must be in Theos JSON format.

This request can be called by specifying our internal example_id.

curl https://api.theos.ai/v1/project/<project_key>/datasets/<dataset_id>/examples/<example_id>/labels/set/ \
     -F "labels=<theos_json_labels>" \
     -H "Authorization: Token <api_token>" \
     -X POST

It can also be called by specifying your external_id.

curl https://api.theos.ai/v1/project/<project_key>/datasets/<dataset_id>/examples/external/<external_example_id>/labels/set/ \
     -F "labels=<theos_json_labels>" \
     -H "Authorization: Token <api_token>" \
     -X POST

Skip a specific example

When labeling, some images won't contain any objects to label. To mark such an image as "labeled", you must skip it.

This request can be called by specifying our internal example_id.

curl https://api.theos.ai/v1/project/<project_key>/datasets/<dataset_id>/examples/<example_id>/skip/ \
     -H "Authorization: Token <api_token>" \
     -X POST

It can also be called by specifying your external_id.

curl https://api.theos.ai/v1/project/<project_key>/datasets/<dataset_id>/examples/external/<external_example_id>/skip/ \
     -H "Authorization: Token <api_token>" \
     -X POST
