Pose Estimation
Learn how to easily train an AI to estimate the pose of all the objects and beings present within an image or video.
Last updated
Overview
Before following this guide on Pose Estimation, make sure you have completed the steps in the Get Started guide.
Go to the Datasets section of Theos and click the + button inside the new image dataset card to add a new image dataset. Write a name for your new dataset and click confirm.
Pick one example image you will use to create your template skeleton class. The first skeleton you make will serve as the template for every other skeleton you label for this class in the future; you will then adjust each one to fit that particular object's pose.
Make sure this image is representative of the class. In our example we will use the person class, so we chose an image of a man standing upright, which gives us a good starting point when labeling all future objects.
Once you have dropped this image, click start upload to upload it.
Click the Start labeling button on the top right corner to start teaching your AI.
Click the + button to add a new class.
Write your class name and change its color.
Click on your class from the list and then click the pose estimation tool on the top bar.
Click the + button to enter the new keypoint tool.
Click on the image to set all the keypoints you want your model to detect.
Click on the new connection tool button.
Click on the color picker to change the color of the new connection.
Click keypoint A and then keypoint B to create the new connection.
Finish adding all connections and then confirm your skeleton.
Click and drag your mouse to create a new bounding box around your object.
Now that your base skeleton is created, draw a bounding box around each object and move the default keypoints to match the pose of that particular object.
Upload the rest of your images. We recommend starting with 250 images and labeling ALL of them before training an initial model. If you need more accuracy, upload another 250 images and repeat the process; with more images your model will be more accurate.
You must label ALL your dataset images before training; otherwise you will confuse your model and hurt its performance. Don't add just 30 images, as that won't work at all: label at least 250 images before training, and keep adding more if needed.
Go to the train section of Theos AI and create a new training session.
Select your dataset.
Select Pose Estimation from the menu.
Select the algorithm version to use.
Click next and connect the machine that will perform the neural network training (if you subscribe to one of our professional plans, a cloud GPU is always connected and ready to use).
Select your machine.
Click create and enter your new training session.
Epochs are the number of times the model will try to predict all the images in your dataset. If you set 300 epochs, the model will process each image in your dataset 300 times, getting better with each iteration. Leaving 300 for an initial training is fine if you have 250 to 1K+ images. You can always create a new experiment and set the weights created in the previous experiment as its initial weights to resume training.
Batch size is the number of images the model predicts in parallel. If you set 64, your model will process 64 images at the same time, speeding up training but requiring more GPU memory. We recommend leaving the default value.
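The arithmetic behind these two settings can be sketched in a few lines. The numbers below are an assumption for illustration (a 250-image dataset with the values discussed above), not output from Theos itself:

```python
# Rough arithmetic behind the epochs and batch size settings.
# Hypothetical example: 250 images, 300 epochs, batch size 64.
import math

dataset_size = 250   # images in your dataset
epochs = 300         # full passes over the dataset
batch_size = 64      # images processed in parallel per step

# The model takes ceil(dataset_size / batch_size) steps to see every image once.
steps_per_epoch = math.ceil(dataset_size / batch_size)  # 4 steps per epoch
total_steps = steps_per_epoch * epochs                  # 1200 steps in total

print(steps_per_epoch, total_steps)
```

Doubling the batch size roughly halves the steps per epoch, at the cost of more GPU memory.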
Click start training and wait for the training metrics to come in to your browser.
The main metrics you want to watch are the fitness and the Pose Estimation mAP.
Fitness is an overall metric that encompasses many other metrics; it should always go up.
Pose estimation mAP is a measure of how well the keypoints are predicted by the model.
Go to the Deploy section of Theos AI and create a new deployment, select the algorithm you used and the weights to deploy.
Go to the deployment's playground tab and drop a test image.
Click detect and see the results.
Start using your AI in your software by making simple HTTP post requests to your deployment's URL.
The request has 6 possible fields:
image (required): the binary data of an image or video frame.
conf_thres (optional): the Minimum confidence value configurable in the Playground; possible values go from 0 to 1.
iou_thres (optional): the Detection recall value configurable in the Playground; possible values go from 0 to 1.
ocr_model (optional): the Text Recognition Model value configurable in the Playground; possible values are small, medium or large.
ocr_classes (optional): the comma-separated class names on which to perform OCR. For example: license-plate, billboard, signature.
ocr_language (optional): if the ocr_model is small, it is possible to set the target language for reading special characters of particular languages. If unspecified, the default language is English. See the language code list to find your language of choice. Example for reading German: "ocr_language":"deu".
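Putting the fields above together, here is a sketch in Python of how a request body could be assembled. The helper names are ours, and sending the image as a multipart form field named image is an assumption based on the field list above, not official Theos documentation:

```python
# Sketch: assemble the optional fields for a detection request.
# build_payload and detect are hypothetical helper names.
import requests


def build_payload(conf_thres=0.25, iou_thres=0.45,
                  ocr_model=None, ocr_classes=None, ocr_language=None):
    """Collect the optional form fields, dropping the ones left unset."""
    fields = {
        "conf_thres": str(conf_thres),   # minimum confidence, 0 to 1
        "iou_thres": str(iou_thres),     # detection recall, 0 to 1
        "ocr_model": ocr_model,          # small, medium or large
        "ocr_classes": ocr_classes,      # e.g. "license-plate, billboard"
        "ocr_language": ocr_language,    # e.g. "deu" for German (small model)
    }
    return {key: value for key, value in fields.items() if value is not None}


def detect(url, image_path, **options):
    """POST an image plus the optional fields to a deployment URL."""
    with open(image_path, "rb") as f:
        response = requests.post(url, data=build_payload(**options),
                                 files={"image": f})  # assumed field name
    return response.json()
```

For example, build_payload(ocr_model="small", ocr_classes="license-plate", ocr_language="deu") would add the three OCR fields on top of the two threshold defaults.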
For Linux and macOS.
For Windows.
We will use the requests package to make HTTP post requests.
Add the following code to your software to send an image to your AI and receive back its detections.
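The code sample itself is not shown on this page, so here is a minimal sketch using requests. It assumes the deployment accepts the image as a multipart form field named image; the URL is a placeholder you replace with your own deployment's URL:

```python
# Minimal sketch: send one image to your deployment and return its detections.
import requests


def get_detections(url, image_path):
    """POST the image bytes to the deployment and return the JSON response."""
    with open(image_path, "rb") as f:
        response = requests.post(url, files={"image": f})  # assumed field name
    response.raise_for_status()
    return response.json()


# Usage (replace the placeholder with your real deployment URL):
# detections = get_detections("https://YOUR-DEPLOYMENT-URL", "test.jpg")
```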
We will use the axios package to make HTTP post requests.
Finally, create the following component and import it in your app to send an image to your AI and receive back its detections.
Now you should continue adding more examples to your dataset and retraining your AI to improve its accuracy. After you have deployed your AI, you can use our magical Autolabeler to label new images 100 times faster. Let your AI help you create a better version of itself.