Detect any kind of object using MAX Object Detector

Buttan Butt

Jun 4, 2017
Hi there! I'd like to share that I've successfully set up a local web service API on my LAN that detects all kinds of objects in an image. It can recognize objects from 80 different high-level classes in the COCO dataset. You simply supply a snapshot from one of your cameras and you get back a JSON response listing every object that was found in the image.

Isn't that cool? Now, this thread is neither a tutorial nor a support thread. I'd just like to give you an idea of what you can achieve quite easily.

It can be used to count persons, cats, dogs, teddy bears or whatever you are interested in detecting.
If the object type isn't among the pre-trained classes, you can train it to recognize anything, chocolate for example.

I've used MAX Object Detector from IBM, which is based on TensorFlow (an open source platform for machine learning). It's easy to set up in Docker.

You can call the API like this:
curl -F "image=@samples/dog-human.jpg" -XPOST

You should see a JSON response like the one below:

{
  "status": "ok",
  "predictions": [
      {
          "label_id": "1",
          "label": "person",
          "probability": 0.944034993648529,
          "detection_box": [
              ...
          ]
      },
      {
          "label_id": "18",
          "label": "dog",
          "probability": 0.8645511865615845,
          "detection_box": [
              ...
          ]
      }
  ]
}

You can use it to filter out false motion detections or maybe you only want to check if there is a car in the snapshot.
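
As a rough sketch of that kind of filtering, the function below counts the predictions for one label in a response shaped like the JSON above. The name `count_label` and the `min_probability` parameter are just for illustration and are not part of the MAX API, and the sample response leaves out `detection_box` because it isn't needed for counting.

```python
def count_label(response, label, min_probability=0.0):
    """Count predictions with the given label at or above min_probability."""
    if response.get("status") != "ok":
        return 0
    return sum(
        1
        for p in response.get("predictions", [])
        if p.get("label") == label and p.get("probability", 0.0) >= min_probability
    )


# Sample response using the labels and probabilities from the JSON above
# (detection_box is left out on purpose).
sample = {
    "status": "ok",
    "predictions": [
        {"label_id": "1", "label": "person", "probability": 0.944034993648529},
        {"label_id": "18", "label": "dog", "probability": 0.8645511865615845},
    ],
}

print(count_label(sample, "person"))    # 1
print(count_label(sample, "car"))       # 0
print(count_label(sample, "dog", 0.9))  # 0
```

A motion alert could then be dropped whenever `count_label(response, "person")` comes back as zero.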

For my personal use I have set up motion detection in Blue Iris so that whenever a camera is triggered, an MQTT message gets sent out. (I get false alerts sometimes.) My home automation server intercepts that message and checks a few other conditions before calling a Python script. That script immediately pulls a snapshot directly from the camera. (It's important that this happens immediately, or the object that triggered the motion detection may no longer be in the image.) The script then saves the snapshot to a file and queries the MAX Object Detector API. Finally, it counts how many persons were detected in the image, and if there is at least one, it sends a PushOver message with the image attached to my cell phone.

The Python script just serves as an example. It's not intended to work with your installation as-is. I only just wrote it and haven't spent any time tidying it up.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import ftplib
import os
import errno
import sys
import time

import requests
from requests.auth import HTTPDigestAuth

if len(sys.argv) < 2:
    print('You must supply camera name as argument 1')
    sys.exit(1)

print('Getting image from camera: {}'.format(sys.argv[1]))

probability = 0.35 # Probability Threshold
model_endpoint = 'http://MYSERVER:5000/model/predict?threshold={}'.format(probability)
headers = {'Content-Type' : 'image/jpeg'}
image_url = 'http://{}.MYLANDOMAIN/cgi-bin/snapshot.cgi'.format(sys.argv[1])
tmp_image_dir = '/tmp/Wankers'

file_name = 'motion_wanker_{}.jpg'.format(time.strftime("%Y_%m_%d__%H_%M_%S"))

file_path = '{}/{}'.format(tmp_image_dir, file_name)

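def mkdir_path(path):
    # Create the directory, ignoring the error if it already exists
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise

mkdir_path(tmp_image_dir)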


with open(file_path, 'wb') as handle:
    response = requests.get(image_url, auth=HTTPDigestAuth('admin', 'PASSWORD'), stream=True)

    if not response.ok:
        print(response)

    # Write the snapshot to disk in 1 KB chunks
    for block in response.iter_content(1024):
        if not block:
            break
        handle.write(block)


with open(file_path, 'rb') as file:
    file_form = {'image': (file_path, file, 'image/jpeg')}
    r = requests.post(url=model_endpoint, files=file_form)

assert r.status_code == 200
json_obj = r.json()

assert json_obj['status'] == 'ok'

persons = 0
for prediction in json_obj["predictions"]:
    if prediction["label"] == 'person':
        persons = persons + 1
print('Persons: {}'.format(persons))

if persons > 0:
    # Send the image in a PushOver message
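    # Put the PushOver API URL here (it is missing from this post)
    pushover_url = ''
    message = '{} person{} detected in front of the house. {}'.format(
        persons, 's' if persons > 1 else '', json_obj)
    with open(file_path, 'rb') as attachment:
        r = requests.post(
            pushover_url,
            data={'token': 'PUSHOVERTOKEN', 'user': 'PUSHOVERUSER',
                  'message': message, 'title': 'My CCTV reports'},
            files={'attachment': attachment})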

#os.remove(file_path) # SAVE FOR DEBUG

The MAX Object Detector also features a web app that's great for testing your images and helps you understand how the threshold affects the object detection.
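
To get a feel for the threshold without the web app, here is a small sketch that applies different cut-offs to a list of predictions on the client side. `labels_at_threshold` is a made-up helper, and the third prediction (a low-confidence teddy bear) is invented for illustration. The server does its own filtering through the `threshold` query parameter, so this only mimics the effect.

```python
def labels_at_threshold(predictions, threshold):
    """Return the labels whose probability is at or above the threshold."""
    return [p["label"] for p in predictions if p["probability"] >= threshold]


predictions = [
    {"label": "person", "probability": 0.944034993648529},
    {"label": "dog", "probability": 0.8645511865615845},
    {"label": "teddy bear", "probability": 0.41},  # invented example value
]

# Raising the threshold drops the less certain detections first.
for threshold in (0.35, 0.7, 0.9):
    print(threshold, labels_at_threshold(predictions, threshold))
```

A low threshold keeps more detections but lets more false positives through, so it is worth trying a few values against snapshots from your own cameras.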

There is a demo site where you can try the app, so you can evaluate whether this is useful using snapshot images from your existing surveillance cameras.

Try it out here before you decide to get involved and install anything locally.

Did I mention that it's free? (IBM/MAX-Object-Detector is licensed under the Apache License 2.0.) Use it for as many cameras as you like.

FWIW, I did initially try the Sentry Smart Alerts person detection (the one that Blue Iris has built-in support for), but it didn't work out for me. The reason is that it takes several seconds before the Sentry app determines whether there are persons in the image. By the time it decides to trigger, the object has probably left the camera's field of view, so it's too late to take a snapshot and send it to my cell phone using PushOver. In the solution presented here, the snapshot is taken immediately and evaluated afterwards, which is fine for me. If a person is found in the image, I get the PushOver alert within a few seconds. If Blue Iris has made a false detection, I don't get the PushOver alert. There will still be an alert entry in Blue Iris, but I can live with that.

On the "wish list": that Blue Iris opens up for third-party object detection software (just a simple text box for entering the name of a script that returns True or False to Blue Iris). That would be cool, since it would open up for detecting objects other than humans, e.g. "/home/me/ dog cat". Very versatile ;)
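
Purely as an illustration of that wish, such a script could take the wanted labels as arguments and answer yes or no. Everything here is hypothetical: the function names, the way the predictions are passed in, and the idea that the caller would read the printed value or the return code. Since this is only a wish, nothing in it reflects an actual Blue Iris feature.

```python
def any_label_present(predictions, wanted_labels):
    """Return True if at least one prediction has a label in wanted_labels."""
    wanted = {label.lower() for label in wanted_labels}
    return any(p.get("label", "").lower() in wanted for p in predictions)


def main(argv, predictions):
    """Print True/False and return an exit code (0 = found, 1 = not found)."""
    found = any_label_present(predictions, argv[1:])
    print(found)
    return 0 if found else 1


# Hypothetical input: in real use the predictions would come from the
# detector's JSON response for the current snapshot.
demo_predictions = [{"label": "person"}, {"label": "dog"}]
exit_code = main(["script", "dog", "cat"], demo_predictions)
```

Keeping the label matching in its own function makes it easy to reuse for persons, cars or anything else the model knows about.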
