Object Detection
Learn how to easily train an AI to detect the exact position, dimension, and class of all the objects present within an image or video.
Before following this guide on Object Detection, make sure you have completed the steps in the Get Started guide.
Go to the Datasets section of Theos and click the + button inside the new image dataset card, to add a new image dataset. Write a name for your new dataset and click confirm.
Collect example images that contain the classes of objects you want your AI to detect, and drop them into Theos. For almost all use cases, 100 images is a good starting point for an initial training and testing of your AI. You can always upload and label more images to keep increasing the accuracy and performance of your AI.
Click the Start upload button to upload your images to Theos.
Don't close this browser tab until the upload finishes, and if you are uploading a large number of images, make sure your computer does not go to sleep.
Go to the bottom of the page and click the New class button to add a new class of object you want your AI to detect.
Write the name of the class and pick its color, then click the Confirm button to create it.
After adding all the classes you want your AI to detect, you can see their label statistics, also known as your dataset's class balance.
Click the Start labeling button on the top right corner to start teaching your AI what you want it to detect.
Click on the class you want to label or press its shortcut number on your keyboard to select it.
Place your mouse at one of the corners of the object you want to label. Then click and drag to create a new bounding box, and release the mouse button once the box encloses the object as tightly as possible.
Each label must enclose its object as tightly and precisely as possible, meaning no room should be left between the bounding box and the contours of the object. If you accidentally made the bounding box bigger or smaller than the object, hold down the space key to enter transform mode (or click the hand icon in the top left corner) and fix your label. If the object is partially occluded by another object, make your best guess and draw the bounding box out to where you think the contour of the whole object is likely to be.
Label every object you want your AI to detect in this example image. Make sure no object is left unlabeled, as this will confuse your AI and it won't perform well in production. This is very important. A single unlabeled object can significantly impact the accuracy of your AI.
After you have finished labeling the whole image, press the E shortcut key or click the Submit button in the bottom left corner to submit your labels. If the image does not have any objects in it, press the Q shortcut key or click the Skip button to skip it.
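As a rough illustration of what a label's position and dimensions amount to, the sketch below turns the two corners of a click-and-drag into a normalized center/size box, the layout used by the open-source YOLOv5 label format. How Theos stores labels internally is not stated in this guide, so the `Box` type and the `box_from_drag` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A label: class index plus center and size, all normalized to 0..1."""
    class_id: int
    x_center: float
    y_center: float
    width: float
    height: float


def box_from_drag(class_id, start, end, image_width, image_height):
    """Convert two drag corners (in pixels) into a normalized Box.

    The drag may start from any corner, so the coordinates are sorted first.
    """
    (x0, y0), (x1, y1) = start, end
    left, right = sorted((x0, x1))
    top, bottom = sorted((y0, y1))
    return Box(
        class_id=class_id,
        x_center=(left + right) / 2 / image_width,
        y_center=(top + bottom) / 2 / image_height,
        width=(right - left) / image_width,
        height=(bottom - top) / image_height,
    )


# A drag from the bottom-right corner up to the top-left corner of an object.
box = box_from_drag(0, start=(300, 200), end=(100, 100),
                    image_width=400, image_height=400)
print(box)
```

Because the box is described only by its center and size, any slack between the box and the object's contour ends up directly in the width and height your AI learns to predict.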
Finish labeling all the images in your dataset. You can inspect your dataset statistics at the bottom of its overview page. You should always strive to have a balanced dataset, meaning that all your classes have roughly the same number of labels. The more labels a class has, the more examples your AI will have to learn from. If another class has significantly fewer labels, your AI might mislabel it as the class with more labels or fail to recognize it at all. But don't worry if the intrinsic distribution of your data does not allow for this. For example, in this use case people will always have twice as many eyes as mouths, noses, and faces.
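If you keep your own list of labels outside Theos, class balance can be checked with a few lines of code. The sketch below is only an illustration: the `class_balance` helper and the sample label list are hypothetical, and the ratio it reports (each class's label count relative to the most-labeled class) is one simple way to express balance, not necessarily the statistic Theos displays.

```python
from collections import Counter


def class_balance(labels):
    """Return each class's label count divided by the largest class's count.

    A value of 1.0 marks the most-labeled class; values near 0 mark classes
    that are badly under-represented.
    """
    counts = Counter(labels)
    largest = max(counts.values())
    return {name: count / largest for name, count in counts.items()}


# Hypothetical labels from two images of faces: two eyes per face.
labels = ["eye", "eye", "mouth", "nose", "face",
          "eye", "eye", "mouth", "nose", "face"]

balance = class_balance(labels)
for name, ratio in sorted(balance.items()):
    print(f"{name}: {ratio:.2f}")
```

Here the eye class has twice as many labels as the others, which reflects the data itself rather than a labeling problem.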
You are now ready to train your AI. Go to the Train section of Theos and click the New training button. Write a name for it and click confirm to create your new training session.
Each training session is composed of the neural network's algorithm version taken from the Library, your Dataset, and the Machine that will perform the neural network training. Inside each training session there is a set of experiments. Each experiment is an attempt at training your AI on your dataset.
Currently Theos supports all the versions of the YOLOv5 state-of-the-art Object Detector. The extra large version is the most accurate one, but it also demands more computational power, so it will take more time to train and each detection will take longer when deployed. For blazing fast, real-time speeds, choose a smaller version.
Select the dataset you want your AI to learn from.
If you have one of our professional plans, choose one of your always connected and ready to use Theos Cloud Machines, which come with powerful NVIDIA GPUs for lightning fast training. Otherwise, click the + button inside the add machine card to connect your own on-premise GPU machine, or click the Use google colab button to use one of Google Colab's free GPUs.
An Epoch is the act of your AI going through the entire dataset and attempting to predict all the labels you created during the labeling process. The first time this happens, your AI will likely fail to correctly predict the class, position and dimensions of almost all your labels. This is why we must let our AI make many attempts, so it will learn from its mistakes. It is common practice to set a few hundred epochs per experiment. For most cases 300 epochs will be fine for an initial training.
The Batch size is the number of images your AI will predict in parallel. The higher this number, the less time each epoch will take to complete, but the more GPU memory it will require. Theos Cloud Machines come with 16GB of GPU memory. If you are on the free plan, you may need to experiment with this value to avoid overloading your GPU memory. But don't worry, Theos will let you know if this happens and let you change the value so you can restart your training experiment.
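To get a feel for how epochs and batch size interact, the arithmetic can be sketched as follows. This assumes the last, partially filled batch of each epoch is still processed; the `training_steps` helper is a hypothetical name, and the numbers are examples rather than recommended settings.

```python
import math


def training_steps(num_images, batch_size, epochs):
    """Return (batches per epoch, total batches over the whole experiment)."""
    steps_per_epoch = math.ceil(num_images / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


# 100 images and 300 epochs, with two example batch sizes.
for batch_size in (16, 32):
    per_epoch, total = training_steps(100, batch_size, 300)
    print(f"batch size {batch_size}: {per_epoch} batches per epoch, {total} in total")
```

Doubling the batch size roughly halves the number of batches per epoch, which is why each epoch finishes sooner, at the cost of holding more images in GPU memory at once.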
At the end of each completed epoch, the machine will upload to Theos a checkpoint of your AI's current knowledge, in what is called a Weights file. The weights are the representation of the strengths of all the neural connections in the brain of your AI. For each experiment, Theos saves the Last epoch's weights as well as the weights generated in the epoch where the Best performance was achieved (because your AI may reach maximum accuracy at, for example, epoch 185, but start to degrade its performance later due to Overfitting). Later, when you decide to deploy your AI, you will have to choose which weights you want your AI to use.
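The difference between the Last and Best weights can be sketched as a simple bookkeeping loop over a per-epoch score. This is only an illustration of the idea: the `track_checkpoints` helper and the score history are made up, and the way Theos actually stores checkpoints is not described in this guide.

```python
def track_checkpoints(score_per_epoch):
    """Return (best, last) records for a list of per-epoch scores.

    Epochs are numbered from 1. On ties, the earliest epoch stays the best.
    """
    best = {"epoch": None, "score": float("-inf")}
    last = {"epoch": None, "score": None}
    for epoch, score in enumerate(score_per_epoch, start=1):
        last = {"epoch": epoch, "score": score}
        if score > best["score"]:
            best = {"epoch": epoch, "score": score}
    return best, last


# Hypothetical history: the score peaks at epoch 4, then degrades.
history = [0.10, 0.35, 0.52, 0.61, 0.58, 0.55]
best, last = track_checkpoints(history)
print("best:", best)
print("last:", last)
```

In this made-up run the Best weights come from epoch 4 while the Last weights come from epoch 6, which is the situation the overfitting example describes.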
Finally, you can also set Initial weights if you want your AI to start with the knowledge of a previously trained AI, instead of starting from scratch. This will make it achieve good accuracy in fewer epochs if the previous knowledge is sufficiently transferable to your current dataset.
Click the Start training button to make your AI learn from your dataset examples.
Now you are free for a while. Your AI has started training and you just have to wait for it to finish, so you can go grab a cup of coffee or watch a movie.
If you want, you can check the training progress and metrics once in a while. New metric values will stream directly to your browser once per minute of training, so you can monitor your AI learning in real time.
The main metric to watch is the fitness of your AI. This represents how good your AI is at predicting the class of your labels, as well as their position and dimensions. Its value goes from 0 to 1, and higher is better. Generally, a good enough object detector requires a fitness of 0.5 or above. This is the value used to determine if a given weight file is the Best one of the experiment. You can safely ignore most other metrics for now. We will talk about them in a future neural network debugging guide.
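For reference, the open-source YOLOv5 code computes fitness as a weighted sum of four validation metrics, with all of the weight placed on the two mAP values. The sketch below reproduces that weighting under the assumption that Theos reports the same quantity, which this guide does not confirm; the example metric values are made up.

```python
def fitness(precision, recall, map50, map50_95,
            weights=(0.0, 0.0, 0.1, 0.9)):
    """Weighted sum of [precision, recall, mAP@0.5, mAP@0.5:0.95].

    The default weights follow the open-source YOLOv5 implementation;
    whether Theos uses the same weights is an assumption.
    """
    metrics = (precision, recall, map50, map50_95)
    return sum(w * m for w, m in zip(weights, metrics))


# Made-up validation metrics for one epoch.
score = fitness(precision=0.8, recall=0.7, map50=0.75, map50_95=0.5)
print(f"fitness: {score:.3f}")
```

With these weights, a fitness of 0.5 or above mostly reflects a high mAP@0.5:0.95, meaning the predicted boxes match the labels closely in both position and size.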
Your AI has finished training. You can now review all the training metrics one more time before deploying your AI into production to test it and finally integrate it with your software.
Go to the Deploy section of Theos and click the New deployment button to deploy your AI into a highly scalable REST API. Write a name for your deployment and click confirm.
Choose the algorithm version you used to train your AI.
Choose which weights you want your AI to use.
Click the Finish button to deploy your AI to a highly scalable REST API. Your AI should be deployed within a few minutes.
Drag and drop an image to Theos and click the Detect button to try your AI.
Start using your AI in your software by making simple HTTP POST requests to your deployment's URL.
The request has 6 possible fields:
image (required): the binary data of an image or video frame.
conf_thres (optional): the Minimum confidence value configurable in the Playground. Possible values go from 0 to 1.
iou_thres (optional): the Detection recall value configurable in the Playground. Possible values go from 0 to 1.
ocr_model (optional): the Text Recognition Model value configurable in the Playground. Possible values are small, medium or large.
ocr_classes (optional): the comma-separated names of the classes on which to perform OCR. For example: license-plate, billboard, signature.
ocr_language (optional): if the ocr_model is small, it is possible to set the target language for reading the special characters of a particular language. If unspecified, the default language is English. See the language code list to find your language of choice. Example for reading German: "ocr_language":"deu".
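To make the two thresholds more concrete, the sketch below shows how they behave in the open-source YOLOv5 post-processing: detections below `conf_thres` are dropped, and among overlapping boxes of the same class, any box whose intersection over union (IoU) with a stronger box exceeds `iou_thres` is suppressed. That Theos applies them in exactly this way is an assumption, as are the default values, the dictionary layout, and the helper names used here.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, conf_thres=0.25, iou_thres=0.45):
    """Drop low-confidence boxes, then suppress same-class overlaps."""
    confident = [d for d in detections if d["confidence"] >= conf_thres]
    confident.sort(key=lambda d: d["confidence"], reverse=True)
    kept = []
    for det in confident:
        overlaps_stronger_box = any(
            k["class"] == det["class"] and iou(k["box"], det["box"]) > iou_thres
            for k in kept
        )
        if not overlaps_stronger_box:
            kept.append(det)
    return kept


# Made-up raw detections: two overlapping faces and one weak guess.
raw = [
    {"class": "face", "confidence": 0.9, "box": (10, 10, 110, 110)},
    {"class": "face", "confidence": 0.6, "box": (20, 20, 120, 120)},
    {"class": "face", "confidence": 0.1, "box": (300, 300, 350, 350)},
]
print(len(filter_detections(raw)))                  # strict overlap limit
print(len(filter_detections(raw, iou_thres=0.8)))   # tolerant overlap limit
```

Raising `iou_thres` lets more overlapping boxes survive, which is consistent with the Playground describing it as a recall setting.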
For Linux and macOS.
For Windows.
We will use the requests package to make HTTP POST requests.
Add the following code to your software to send an image to your AI and receive back its detections.
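A minimal sketch of such a request with the requests package is shown below. It assumes the deployment accepts a multipart form upload with the image under the `image` field and the optional settings as plain form fields, and that the response body is JSON; none of this is confirmed by the text above. `DEPLOYMENT_URL` is a placeholder, and `build_fields` and `detect` are hypothetical helper names.

```python
from pathlib import Path

# Placeholder: replace with the URL shown for your own deployment.
DEPLOYMENT_URL = "https://example.invalid/your-deployment-url"


def build_fields(conf_thres=None, iou_thres=None, ocr_model=None,
                 ocr_classes=None, ocr_language=None):
    """Collect only the optional fields that were explicitly set."""
    if ocr_classes is not None:
        ocr_classes = ",".join(ocr_classes)  # class names are comma separated
    fields = {
        "conf_thres": conf_thres,
        "iou_thres": iou_thres,
        "ocr_model": ocr_model,
        "ocr_classes": ocr_classes,
        "ocr_language": ocr_language,
    }
    return {name: str(value) for name, value in fields.items()
            if value is not None}


def detect(image_path, url=DEPLOYMENT_URL, timeout=60, **options):
    """Send one image to the deployment and return the decoded response."""
    # Imported here so build_fields can be used without the dependency.
    import requests  # third-party: pip install requests

    with Path(image_path).open("rb") as image_file:
        response = requests.post(
            url,
            data=build_fields(**options),
            files={"image": image_file},  # assumed multipart field name
            timeout=timeout,
        )
    response.raise_for_status()
    return response.json()  # assumes the deployment answers with JSON


# Example call (needs a real deployment URL and a local image file):
# detections = detect("image.jpg", conf_thres=0.25, iou_thres=0.45)
# print(detections)
```

Leaving an optional field unset keeps it out of the request entirely, so the deployment can fall back to its own defaults.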
We will use the axios package to make HTTP POST requests.
Finally, create the following component and import it in your app to send an image to your AI and receive back its detections.
Now you should continue to add more examples to your dataset and retrain your AI to improve its accuracy. After you have deployed your AI, you can use our magical Autolabeler to label new images 100 times faster. Let your AI help you create a better version of itself.