Saturday, August 25, 2018

Tracking a simple, marked object with OpenCV / Python

In this post, I will cover a simple object tracking problem: measuring the speed of a robot. The robot is my own build, so I can mark and reshape it as needed. I want a simple, robust algorithm that works with a basic USB webcam on a low-cost computer.
Because of the speed measurement, it's important to find the exact coordinates of the robot. An approximate bounding box is not enough, because it would add too much error to the measurement.
The tracking can be divided into two parts:

  1. We have to find the (marked) robot on the first frame
  2. We have to follow the robot during fast movements
The first step is very easy, but the second can be hard because of motion blur.
Throughout this post, I will use Python 2.7 and OpenCV.

Using the difference between frames

First, I simply captured the background and looked for the object as the difference from it. This is fast and robust, but it can be hard to keep the background static: you have to stabilize the camera, keep the lighting constant and capture a clean background frame. Even if you do all of this, you still have to filter the difference between the background and the new frames.
Here is a simple script that prints the (assumed) center of your object:

 #!/usr/bin/python  
 # -*- coding: utf-8 -*-  
 import numpy as np  
 import cv2   
 cap = cv2.VideoCapture(1) # This is my USB camera.   
              # If you want to use your built-in camera,  
              # you probably have to use zero here  
 background = None   
 np.set_printoptions(threshold=np.nan) # If I print an array, I would like to see it...  
 while(True):  
   # Capture frame-by-frame  
   ret, frame = cap.read()  
   #print(frame.shape) #480x640  
   # Our operations on the frame come here  
   gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  
   if background is None:  
     cv2.imshow('Main', gray)  
   else:  
     # np.abs(gray - background) is not good for the difference, because our arrays contain unsigned bytes  
     # If we calculate 1 - 3, we get 254, not -2  
     # But OpenCV has our back  
     diff = cv2.absdiff(gray, background)  
     ret, thres = cv2.threshold(diff, 20, 255, cv2.THRESH_BINARY) # Usually, it's easier to work with binary images  
     kernel = np.ones((3,3),np.uint8) # A well-chosen kernel can do some nice tricks, but in this case  
                      # we just want to clear isolated pixels  
     filtered = cv2.erode(thres, kernel, iterations = 1)  
     # Now, we find the mass center of the image  
     nonzeros = np.nonzero(filtered)  
     try:  
       cx = int(nonzeros[1].mean()) # On an all-black image, mean() of an empty array is NaN, so int() raises a ValueError  
       cy = int(nonzeros[0].mean())  
       print "The coords are:", cx, cy  
     except ValueError:  
       print "I can't find the object"  
       cx, cy = 0, 0  
     # I show every stage in one window  
     stacked = np.concatenate((background, gray, diff, filtered), axis=0)  
     cv2.imshow('Main', stacked)  
   pressedKey = cv2.waitKey(1) & 0xFF   
   if pressedKey == ord('q'):  
     break  
   elif pressedKey == ord('b'):  
     background = gray  
 # When everything is done, release the capture  
 cap.release()  
 cv2.destroyAllWindows()  

You can see a sample output here, where the script finds a glass:


In this image, you can see the basic problems with this method.
First, I had to create an image without my object. Sometimes this can be hard.
Second, the object modified its environment: it created shadows and reflections.
And last, during movement you can't find the exact same point of the glass twice with this method.

Despite its drawbacks, this algorithm is simple and fast enough to use in some special cases. For my use case it wasn't good enough, because the moving robot changes its contour, which shifts its center of mass and adds too much noise to the speed measurement.
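For completeness, once you have a per-frame center, turning it into a speed estimate is simple arithmetic. A sketch with made-up numbers: the frame rate and the pixels-per-millimetre scale are assumptions you would measure for your own camera and setup, and the centroid track here is synthetic:

```python
import numpy as np

FPS = 30.0       # assumed camera frame rate
PX_PER_MM = 2.0  # assumed scale, measured e.g. from an object of known size

# Hypothetical (cx, cy) centroids from four successive frames
centers = np.array([[100, 50], [106, 50], [112, 50], [118, 50]], dtype=float)

# Frame-to-frame displacement in pixels, converted to mm/s
steps = np.diff(centers, axis=0)
dist_px = np.hypot(steps[:, 0], steps[:, 1])
speed_mm_s = dist_px / PX_PER_MM * FPS

print("%.1f mm/s" % speed_mm_s.mean())  # 90.0 mm/s for this synthetic track
```

This is exactly where centroid noise hurts: the differentiation in `np.diff` amplifies any jitter in the centers, which is why the changing center of mass made this method unusable for me.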
