Pose Estimation

Learn how to easily train an AI to estimate the pose of all the objects and beings present within an image or video.

Overview

Before following this guide on Pose Estimation, make sure you completed the steps in the Get Started guide.

1. Create a new image dataset

Go to the Datasets section of Theos and click the + button inside the new image dataset card to add a new image dataset. Write a name for your new dataset and click confirm.

2. Drop a single image to upload

Pick one example image to create your template skeleton class. The first skeleton you make for a class serves as the template for every skeleton you label in the future for that class; you will then only need to adjust it to fit each particular object's pose.

Make sure this image is representative of the class. In our example we will use the person class, so we chose an image of a man standing upright, giving us a good starting point when labeling all future objects.

Once you have dropped this image, click start upload to upload it.

3. Start labeling your dataset

Click the Start labeling button on the top right corner to start teaching your AI.

4. Create a new class

Click the + button to add a new class.

Write your class name and change its color.

5. Select the pose estimation tool

Click on your class from the list and then click the pose estimation tool on the top bar.

6. Select the new keypoint tool

Click the + button to enter the new keypoint tool.

7. Add all your keypoints

Click on the image to set all the keypoints you want your model to detect.

8. Select the new connection tool

Click on the new connection tool button.

9. Change the color of the new connection

Click on the color picker to change the color of the new connection.

10. Create the new connection

Click on keypoint A and then click keypoint B to create the new connection.

11. Confirm your skeleton

Finish adding all connections and then confirm your skeleton.

12. Make a bounding box

Click and drag your mouse to create a new bounding box around your object.

13. Fix your keypoints

Now that your base skeleton is created, draw a bounding box around each object and move the default keypoints so they match the pose of that particular object.

14. Upload the rest of your images

Upload the rest of your images. We recommend starting with 250 images and labeling ALL of them before training an initial model. If you need more accuracy, upload another 250 images and repeat the process; with more images your model will be more accurate.

15. Finish labeling ALL your dataset images

You must label ALL your dataset images before training; otherwise you will confuse your model and hurt its performance. Don't add just 30 images, because that won't work at all: label at least 250 images before training, and keep adding more if needed.

16. Create a new training session

Go to the train section of Theos AI and create a new training session.

Select your dataset.

Select Pose Estimation from the menu.

Select the algorithm version to use.

Click next and connect the machine that will perform the neural network training (if you subscribe to one of our professional plans, you will have a cloud GPU always connected and ready to use).

Select your machine.

Click create and enter your new training session.

Epochs are the number of times the model will try to predict all the images in your dataset. If you set 300 epochs, the model will predict each image in your dataset 300 times, getting better with each iteration. Leaving 300 for an initial training is fine if you have 250 to 1K+ images. You can always create a new experiment and use the weights created in the previous experiment as its initial weights to resume training.

Batch size is the number of images the model will predict in parallel. If you set 64, your model will process 64 images at the same time, speeding up the training process but requiring more GPU memory. We recommend leaving the default value.
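To get a feel for what these two numbers mean together, you can estimate how many batches (weight updates) a training run will perform. This is a rough sketch with illustrative figures, not Theos internals:

```python
import math

def training_iterations(num_images, epochs, batch_size):
    """Total number of batches processed during training:
    each epoch covers the whole dataset once, in chunks of batch_size."""
    iterations_per_epoch = math.ceil(num_images / batch_size)
    return epochs * iterations_per_epoch

# Example: 250 images, 300 epochs, batch size 64
# -> 4 batches per epoch, 1200 batches in total.
print(training_iterations(250, 300, 64))
```

Doubling the batch size roughly halves the number of iterations per epoch, which is why larger batches train faster at the cost of more GPU memory.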

17. Start training your pose estimation model

Click start training and wait for the training metrics to arrive in your browser.

The main metrics you want to watch are the fitness and the Pose Estimation mAP.

  • Fitness is an overall metric that encompasses many other metrics; it should always go up.

  • Pose estimation mAP is a measure of how well the keypoints are predicted by the model.
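Pose estimation mAP is typically built on a keypoint similarity score such as COCO's Object Keypoint Similarity (OKS), which compares predicted and ground-truth keypoints relative to the object's scale. A minimal sketch of the idea follows; the tolerance constant is illustrative, and Theos may compute its mAP differently under the hood:

```python
import math

def oks(pred, truth, scale, k=0.05):
    """Object Keypoint Similarity between predicted and ground-truth
    keypoints (lists of (x, y) tuples). `scale` is the object's size
    (e.g. the square root of its bounding-box area) and `k` is a
    per-keypoint tolerance. Returns a value in (0, 1]; 1 = perfect match."""
    similarities = []
    for (px, py), (tx, ty) in zip(pred, truth):
        d2 = (px - tx) ** 2 + (py - ty) ** 2        # squared pixel distance
        similarities.append(math.exp(-d2 / (2 * (scale * k) ** 2)))
    return sum(similarities) / len(similarities)

# Identical keypoints give a perfect score of 1.0;
# the score decays smoothly as predictions drift from the ground truth.
print(oks([(10, 10), (20, 30)], [(10, 10), (20, 30)], scale=100))
```

Because the distance is normalized by the object's scale, a few pixels of error on a large person counts far less than the same error on a small one.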

18. Deploy your pose estimation model

Go to the Deploy section of Theos AI and create a new deployment, select the algorithm you used and the weights to deploy.

19. Test your pose estimation model

Go to the deployment's playground tab and drop a test image.

Click detect and see the results.

20. Use your AI in your software

Start using your AI in your software by making simple HTTP POST requests to your deployment's URL.

The request has 6 possible fields:

  • image (required): the binary data of an image or video frame.

  • conf_thres (optional): the Minimum confidence value configurable in the Playground; possible values range from 0 to 1.

  • iou_thres (optional): the Detection recall value configurable in the Playground; possible values range from 0 to 1.

  • ocr_model (optional): the Text Recognition Model value configurable in the Playground; possible values are small, medium or large.

  • ocr_classes (optional): the class names on which to perform OCR, comma separated. For example: license-plate, billboard, signature.

  • ocr_language (optional): if the ocr_model is small, you can set the target language for reading special characters of particular languages. If unspecified, the default language is English. See the language code list to find your language of choice. Example for reading German: "ocr_language":"deu".
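The field rules above can be sketched as a small helper that assembles the non-file form fields for a request. This is an illustrative helper (not part of any Theos SDK); the field names come from the list above, and you would send the result alongside the image file:

```python
def build_fields(conf_thres=0.25, iou_thres=0.45,
                 ocr_model=None, ocr_classes=None, ocr_language=None):
    """Assemble the optional form fields for a detection request.
    OCR-related fields are only included when an OCR model is chosen."""
    fields = {'conf_thres': str(conf_thres), 'iou_thres': str(iou_thres)}
    if ocr_model is not None:
        fields['ocr_model'] = ocr_model              # small, medium or large
        if ocr_classes is not None:
            fields['ocr_classes'] = ocr_classes      # e.g. 'license-plate,billboard'
        if ocr_language is not None:
            fields['ocr_language'] = ocr_language    # only used by the small model
    return fields

# Example: read German text on detected license plates.
print(build_fields(ocr_model='small', ocr_classes='license-plate',
                   ocr_language='deu'))
```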

Terminal

For Linux and macOS.

curl PASTE_YOUR_URL_HERE \
     -F "image=@image.jpg" \
     -F "conf_thres=0.25" \
     -F "iou_thres=0.45" \
     -X POST

For Windows.

curl PASTE_YOUR_URL_HERE ^
     -F "image=@image.jpg" ^
     -F "conf_thres=0.25" ^
     -F "iou_thres=0.45" ^
     -X POST

Python

We will use the requests package to make HTTP POST requests.

pip install requests

Add the following code to your software to send an image to your AI and receive back its detections.

import requests
import json
import time

URL = '' # copy and paste your URL here
FALLBACK_URL = '' # copy and paste your fallback URL here
IMAGE_PATH = './image.jpg'

def detect(image_path, url=URL, conf_thres=0.25, iou_thres=0.45, ocr_model=None, ocr_classes=None, ocr_language=None, retries=10, delay=0):
    form = {'conf_thres': conf_thres, 'iou_thres': iou_thres}
    if ocr_model is not None:
        form.update({'ocr_model': ocr_model, 'ocr_classes': ocr_classes, 'ocr_language': ocr_language})
    with open(image_path, 'rb') as image:  # close the file handle after the request
        response = requests.post(url, data=form, files={'image': image})
    if response.status_code in [200, 500]:
        data = response.json()
        if 'error' in data:
            print('[!]', data['message'])
        else:
            return data
    elif response.status_code == 403:
        print('[!] you reached your monthly requests limit. Upgrade your plan to unlock unlimited requests.')
    elif retries > 0:
        if delay > 0:
            time.sleep(delay)
        # retry on the fallback URL, keeping the original parameters
        return detect(image_path, url=FALLBACK_URL if FALLBACK_URL else URL, conf_thres=conf_thres, iou_thres=iou_thres, ocr_model=ocr_model, ocr_classes=ocr_classes, ocr_language=ocr_language, retries=retries-1, delay=2)
    return []

detections = detect(IMAGE_PATH)

if len(detections) > 0:
    print(json.dumps(detections, indent=2))
else:
    print('no objects found.')

React

We will use the axios package to make HTTP POST requests.

npm install axios

Finally, create the following component and import it into your app to send an image to your AI and receive back its detections.

import React, { useState } from 'react';
import axios from 'axios';

const URL = ''; // copy and paste your URL here
const FALLBACK_URL = ''; // copy and paste your fallback URL here

function sleep(seconds) {
  return new Promise((resolve) => setTimeout(resolve, seconds * 1000));
}

async function detect({imageFile, url=URL, confThres=0.25, iouThres=0.45, ocrModel=undefined, ocrClasses=undefined, ocrLanguage=undefined, retries=10, delay=0}={}) {
  const data = new FormData();
  data.append('image', imageFile);
  data.append('conf_thres', confThres);
  data.append('iou_thres', iouThres);
  if(ocrModel !== undefined){
    data.append('ocr_model', ocrModel);  
  }
  if(ocrClasses !== undefined){
    data.append('ocr_classes', ocrClasses);  
  }
  if(ocrLanguage !== undefined){
    data.append('ocr_language', ocrLanguage);  
  }
  try {
    const response = await axios({ method: 'post', url: url, data: data, headers:{'Content-Type':'multipart/form-data'}});
    return response.data;
  } catch (error) {
    if (error.response) {
      if(error.response.status === 0 || error.response.status === 413) throw new Error('image too large, please select an image smaller than 25MB.');
      else if(error.response.status === 403) throw new Error('you reached your monthly requests limit. Upgrade your plan to unlock unlimited requests.');
      else if(error.response.data) throw new Error(error.response.data.message);
    } else if (retries > 0) {
      if (delay > 0) await sleep(delay);
      return await detect({imageFile, url: FALLBACK_URL ? FALLBACK_URL : URL, confThres, iouThres, ocrModel, ocrClasses, ocrLanguage, retries: retries - 1, delay: 2});
    } else {
      return [];
    }
  }
}

function TheosAPI() {
  const [detecting, setDetecting] = useState(false);
  const [detected, setDetected] = useState(false);
  const [detections, setDetections] = useState('');
  const [error, setError] = useState('');

  function onFileSelected(event) {
    const file = event.target.files[0];
    setDetecting(true);
    setDetected(false);
    setDetections([]);
    setError('');
    detect({imageFile:file})
      .then(detections => {
        setDetected(true);
        setDetecting(false);
        setDetections(detections.length > 0? `${detections.length} OBJECTS FOUND\n${detections.map((detection, index) => ` ${'_'.repeat(30)}\n|\n| ${index+1}. ${detection.class}\n|\n|${'‾'.repeat(30)}\n|  ‣ confidence: ${detection.confidence*100}%\n|  ‣ x: ${detection.x}\n|  ‣ y: ${detection.y}\n|  ‣ width: ${detection.width}\n|  ‣ height: ${detection.height}\n|${'text' in detection? '  ‣ text: ' + detection.text:''}\n ${'‾'.repeat(30)}\n`).join('')}`: 'No objects found.');
      })
      .catch(error => {
        setError(error.message);
        setDetecting(false);
      });
  }

  return (
    <div style={{ padding: '20px' }}>
      <h1>Theos API</h1>
      {detecting ? <h3>Detecting...</h3> : <div><label htmlFor='file-upload' style={{cursor:'pointer', display:'inline-block', padding:'8px 12px', borderRadius: '5px', border:'1px solid #ccc'}}>Click to select an image</label><input id='file-upload' type='file' accept='image/*' onChange={onFileSelected} style={{display:'none'}}/></div>}
      {detected && <h3><pre>{detections}</pre></h3>}
      {error && <h3 style={{color:'red'}}>{error}</h3>}
    </div>
  );
}

export default TheosAPI;

21. Improve your AI

Now you should continue to add more examples to your dataset and retrain your AI to improve its accuracy. After you deploy your AI, you can use our magical Autolabeler to label new images 100 times faster. Let your AI help you create a better version of itself.
