
Train

Train your AI to perform the cognitive task of your choice.


Last updated 1 year ago


Overview

Train is the fourth section of the platform. Each training session is composed of the neural network's algorithm version taken from the Library, your Dataset, and the Machine that will perform the neural network training. Each training session contains a set of experiments. Each experiment is one attempt at training your AI on your dataset.

Create a new training session

Configure your training session

Choose the algorithm

Choose the dataset

Select the dataset you want your AI to learn from.

Choose the machine

Set your experiment's training configuration

An Epoch is the act of your AI going through the entire dataset and attempting to predict all the labels you created during the labeling process. The first time this happens, your AI will likely fail to correctly predict the class, position and dimensions of almost all your labels. This is why we must let our AI make many attempts, so it will learn from its mistakes. It is common practice to set a few hundred epochs per experiment. For most cases 300 epochs will be fine for an initial training.

The Batch size is the number of images your AI will predict in parallel. The higher this number, the less time each epoch will take to complete, but the more GPU memory it will require. Theos Cloud Machines come with 16GB of GPU memory. If you are on the free plan, you may need to experiment with this value to avoid overloading your GPU memory. But don't worry, Theos will let you know if this happens and let you change the value so you can restart your training experiment.
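To make the relationship between dataset size, batch size and epochs concrete, here is a minimal sketch in Python. It is not part of Theos; the function names are hypothetical and it only assumes that each batch processes up to `batch_size` images and that an epoch covers every image once.

```python
import math


def batches_per_epoch(num_images: int, batch_size: int) -> int:
    """Number of batches needed to go through every image once (one epoch)."""
    if num_images < 0 or batch_size <= 0:
        raise ValueError("num_images must be >= 0 and batch_size must be > 0")
    # The final batch may be only partially filled, hence the ceiling.
    return math.ceil(num_images / batch_size)


def total_batches(num_images: int, batch_size: int, epochs: int) -> int:
    """Total number of batches processed over a whole experiment."""
    if epochs < 0:
        raise ValueError("epochs must be >= 0")
    return batches_per_epoch(num_images, batch_size) * epochs


if __name__ == "__main__":
    # Hypothetical dataset of 1000 images trained for 300 epochs.
    for size in (8, 16, 32):
        print(size, batches_per_epoch(1000, size), total_batches(1000, size, 300))
```

Doubling the batch size roughly halves the number of batches per epoch, which is why epochs finish sooner, while each batch needs more GPU memory.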

Finally, you can also set Initial weights if you want your AI to start with the knowledge of a previously trained AI, instead of starting from scratch. This will make it achieve good accuracy in fewer epochs if the previous knowledge is sufficiently transferable to your current dataset.

Start training

Click the Start training button to make your AI learn from your dataset examples.

Wait for the training experiment to finish

Now you are free for a while. You can go grab a cup of coffee or watch a movie: your AI has started training and you just have to wait for it to finish.

Monitor training progress and metrics

If you want, you can check the training progress and metrics from time to time. New metric values stream directly to your browser once per minute of training, so you can monitor your AI learning in real time.

The main metric to watch is the fitness of your AI. It represents how good your AI is at predicting the class of your labels, as well as their position and dimensions. Its value goes from 0 to 1, and higher is better. Generally, a good enough object detector requires a fitness of 0.5 or above. Fitness is also the value used to determine which weights file is the Best one of the experiment. You can safely ignore most other metrics for now. We will talk about them in a future neural network debugging guide.
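This page does not say how fitness is computed. As an illustration only, the sketch below uses the default weighting from the open-source YOLOv5 code, where fitness is a weighted sum of precision, recall, mAP@0.5 and mAP@0.5:0.95 with weights 0.0, 0.0, 0.1 and 0.9. Whether Theos uses exactly these weights is an assumption, and the function name and example values are hypothetical.

```python
# Assumption: weights follow the YOLOv5 open-source default for
# [precision, recall, mAP@0.5, mAP@0.5:0.95]; Theos may differ.
FITNESS_WEIGHTS = (0.0, 0.0, 0.1, 0.9)


def fitness(precision: float, recall: float, map50: float, map50_95: float) -> float:
    """Weighted combination of detection metrics, each expected in [0, 1]."""
    metrics = (precision, recall, map50, map50_95)
    for value in metrics:
        if not 0.0 <= value <= 1.0:
            raise ValueError("every metric must be between 0 and 1")
    return sum(w * m for w, m in zip(FITNESS_WEIGHTS, metrics))


if __name__ == "__main__":
    # Hypothetical metric values, not real training results.
    print(round(fitness(0.8, 0.7, 0.6, 0.4), 4))
```

Because the weights sum to 1 and every metric lies between 0 and 1, the result also stays between 0 and 1, matching the range described above.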

Resume training

To resume your training, do the following:

  1. Stop the current training by clicking the Stop training button in the bottom right corner.

  2. Delete the previously connected Colab machine.

  3. Connect a new machine.

  4. Create a new experiment by clicking the + button in the top left.

  5. Set the Last weights from the interrupted experiment as the initial weights.

  6. Set the number of epochs to the number you had left to complete in the interrupted experiment.

  7. Click the Start training button to resume your training.
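The number of epochs in step 6 is a simple subtraction. The sketch below shows it in Python. The function name is hypothetical and it assumes you know how many epochs the interrupted experiment had completed.

```python
def remaining_epochs(planned_epochs: int, completed_epochs: int) -> int:
    """Epochs still to run after an experiment was interrupted."""
    if planned_epochs < 0 or completed_epochs < 0:
        raise ValueError("epoch counts must be non-negative")
    if completed_epochs > planned_epochs:
        raise ValueError("completed_epochs cannot exceed planned_epochs")
    return planned_epochs - completed_epochs


if __name__ == "__main__":
    # Hypothetical case: 300 epochs planned, interrupted after 120 completed.
    print(remaining_epochs(300, 120))
```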

Training has finished

Your AI has finished training. You can now review all the training metrics one more time before deploying your AI into production to test it and finally integrate it with your software.

To create a new training session, go to the Train section of Theos and click the New training button. Enter a name for it and click confirm to create your new training session.

When choosing the algorithm, note that Theos currently supports all versions of the state-of-the-art YOLOv5 Object Detector. The extra large version is the most accurate, but it also demands more computational power, so it takes longer to train and has a longer perception time when deployed. For blazing fast, real-time speeds, choose a smaller version.

When choosing the machine: if you have one of our professional plans, choose one of your always connected and ready to use Theos Cloud Machines, which come with powerful NVIDIA GPUs for lightning fast training. Otherwise, click the + button inside the add machine card to connect your own on-premise GPU machine, or click the Use google colab button to use one of Google Colab's free GPUs.

At the end of each completed epoch, the machine uploads to Theos a checkpoint of your AI's current knowledge, in what is called a Weights file. The weights represent the strengths of all the neural connections in the brain of your AI. For each experiment, Theos saves the Last epoch's weights as well as the weights generated in the epoch where the Best performance was achieved (because your AI may reach maximum accuracy at, for example, epoch 185, but start to degrade later due to overfitting). Later, when you decide to deploy your AI, you will have to choose which weights you want your AI to use.
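The difference between the Last and Best weights can be sketched as a simple bookkeeping loop. This is not Theos code. The `Checkpoint` class and `select_checkpoints` function are hypothetical, and the sketch assumes that Best means the epoch with the highest fitness, with ties going to the earlier epoch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Checkpoint:
    epoch: int      # 1-based epoch number
    fitness: float  # fitness measured at the end of that epoch


def select_checkpoints(
    fitness_per_epoch: Sequence[float],
) -> Tuple[Optional[Checkpoint], Optional[Checkpoint]]:
    """Return (best, last) checkpoints for a sequence of per-epoch fitness values."""
    best: Optional[Checkpoint] = None
    last: Optional[Checkpoint] = None
    for epoch, value in enumerate(fitness_per_epoch, start=1):
        last = Checkpoint(epoch, value)  # Last always tracks the newest epoch
        if best is None or value > best.fitness:
            best = last  # Best only changes when fitness improves
    return best, last


if __name__ == "__main__":
    # Hypothetical history: fitness peaks at epoch 3, then degrades.
    best, last = select_checkpoints([0.10, 0.30, 0.50, 0.45, 0.40])
    print(best, last)
```

In this made-up history the Best checkpoint comes from epoch 3 and the Last from epoch 5, which is the situation described above where later epochs perform worse than an earlier one.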

If you are using a machine connected from Google Colab, your training may be interrupted because Google shuts Colab instances down after a few hours.
