In this tutorial we'll learn how to leverage Background Subtraction and Contours in order to detect moving objects.
Let's start by importing the required libraries.
    import cv2
    import numpy as np
    import matplotlib.pyplot as plt
    %matplotlib inline
Background subtraction is a simple yet effective technique for extracting objects from an image or video. Consider a highway with moving cars, from which you want to extract each car. One easy approach is to take a picture of the highway with the cars on it (the foreground image) and keep a saved picture of the same highway without any cars (the background image). Subtracting the background image from the foreground image gives a segmented mask of the cars, and you can then use that mask to extract them.
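The idea can be tried on synthetic images before touching a real photo. The sketch below uses plain NumPy; the helper name `subtract_background`, the threshold of 30, and the made-up "road" and "car" arrays are assumptions for illustration, not part of the tutorial's code.

```python
import numpy as np

def subtract_background(foreground, background, threshold=30):
    """Hypothetical helper: return a 0/255 mask of pixels that differ from the background."""
    # Cast to a signed type so the subtraction cannot wrap around in uint8.
    diff = np.abs(foreground.astype(np.int16) - background.astype(np.int16))
    # For colour images, use the largest difference across the channels.
    if diff.ndim == 3:
        diff = diff.max(axis=2)
    return np.where(diff > threshold, 255, 0).astype(np.uint8)

# Synthetic stand-ins: a flat gray "road" and the same road with a coloured "car".
background = np.full((100, 200, 3), 90, dtype=np.uint8)
foreground = background.copy()
foreground[40:60, 50:90] = (200, 30, 30)

mask = subtract_background(foreground, background)

# Keep only the pixels selected by the mask; everything else becomes black.
car_only = foreground * (mask[..., None] // 255)
print(int((mask == 255).sum()), "foreground pixels")
```

On real photos the two images must be taken from the same viewpoint, and the threshold usually needs tuning to absorb lighting changes and sensor noise.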
In many cases, however, you don't have a clean background image: the highway may always be busy, or a pedestrian area may always be crowded. In those cases you can separate the background by other means. In a video, for example, you can detect movement: objects that move are treated as foreground, and the parts of the scene that remain static are treated as background.
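One simple way to illustrate this is a temporal median: if every pixel shows the background in most frames, the per-pixel median over a clip recovers the background even though no single frame is empty. This is only a minimal sketch of that one technique, not the algorithm used later in this tutorial; the helper names `median_background` and `moving_mask` and the synthetic clip are assumptions.

```python
import numpy as np

def median_background(frames):
    """Hypothetical helper: estimate the static background as the per-pixel median over time."""
    return np.median(np.stack(frames), axis=0).astype(np.uint8)

def moving_mask(frame, background, threshold=25):
    """Hypothetical helper: mark pixels that differ from the estimated background."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return np.where(diff > threshold, 255, 0).astype(np.uint8)

# Synthetic grayscale clip: a bright 10x10 square moves 5 pixels to the right per frame,
# so the scene is never empty.
frames = []
for i in range(10):
    frame = np.full((60, 120), 80, dtype=np.uint8)
    frame[20:30, 5 * i:5 * i + 10] = 220
    frames.append(frame)

background = median_background(frames)
masks = [moving_mask(f, background) for f in frames]
print([int((m == 255).sum()) for m in masks])
```

The median only works when each pixel is uncovered most of the time and the camera is fixed, which is one reason adaptive models are preferred for long or busy videos.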
Several algorithms have been invented for this purpose. OpenCV has implemented a few such algorithms which are very easy to use. Let's see one of them.
The algorithm we'll use is MOG2, which OpenCV exposes through cv2.createBackgroundSubtractorMOG2(). It is a background/foreground segmentation algorithm based on two papers by Z. Zivkovic: "Improved adaptive Gaussian mixture model for background subtraction" (IEEE 2004) and "Efficient Adaptive Density Estimation per Image Pixel for the Task of Background Subtraction" (Elsevier BV 2006). One important feature of this algorithm is that it adapts better to scenes that vary, for example because of illumination changes.
cv2.createBackgroundSubtractorMOG2() takes the following parameters:
history (optional) - The length of the history, that is, the number of most recent frames that affect the background model. Its default value is 500.
varThreshold (optional) - The threshold on the squared distance between the pixel and the model, used to decide whether a pixel is well described by the background model. It does not affect the background update, and its default value is 16.
detectShadows (optional) - A boolean that determines whether the algorithm detects and marks shadows. Shadows are marked in gray. Its default value is True. Shadow detection decreases the speed a bit, so if you do not need this feature, set the parameter to False.
It returns:
object - The MOG2 background subtractor.
    # Load a video.
    cap = cv2.VideoCapture('media/videos/vtest.avi')

    # Optionally, work on the live webcam instead.
    # cap = cv2.VideoCapture(0)

    # Create the background subtractor object. You can choose whether to detect
    # shadows (if True, they are shown in gray).
    backgroundobject = cv2.createBackgroundSubtractorMOG2(history=2, detectShadows=True)

    while True:
        ret, frame = cap.read()
        if not ret:
            break

        # Apply the background subtractor to each frame.
        fgmask = backgroundobject.apply(frame)

        # Also extract the actual detected foreground part of the image (optional).
        real_part = cv2.bitwise_and(frame, frame, mask=fgmask)

        # Convert fgmask to 3 channels so it can be stacked with the others.
        fgmask_3 = cv2.cvtColor(fgmask, cv2.COLOR_GRAY2BGR)

        # Stack all three frames and show the image.
        stacked = np.hstack((fgmask_3, frame, real_part))
        cv2.imshow('All three', cv2.resize(stacked, None, fx=0.65, fy=0.65))

        # Exit when the Esc key (ASCII 27) is pressed.
        k = cv2.waitKey(30) & 0xff
        if k == 27:
            break

    cap.release()
    cv2.destroyAllWindows()
To perform the complete background-subtraction-based contour detection, we'll follow these simple steps:
1) First, we will load a video using the function cv2.VideoCapture() and create a background subtractor object using the function cv2.createBackgroundSubtractorMOG2().
2) Then we will apply the background subtractor to each frame using its apply() method to get the segmented mask.
3) We will then apply thresholding on the mask using the function cv2.threshold() to get rid of shadows, and then perform erosion and dilation to improve the mask further using the functions cv2.erode() and cv2.dilate().
4) Then we will use the function cv2.findContours() to detect the contours on the mask image and convert the contour coordinates into bounding box coordinates for each car in the frame using the function cv2.boundingRect(). We will check that the area of each contour, which we will find using the function cv2.contourArea(), is greater than a threshold to make sure that it's a car.
5) After that we will use the functions cv2.rectangle() and cv2.putText() to draw and label the bounding boxes on each frame, and then we will extract the foreground part of the video with the help of the segmented mask using the function cv2.bitwise_and().
    # Load a video.
    video = cv2.VideoCapture('media/videos/carsvid.wmv')

    # You can set a custom kernel size if you want.
    kernel = None

    # Initialize the background subtractor object.
    backgroundObject = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

    while True:
        # Read a new frame.
        ret, frame = video.read()

        # Stop if the frame was not read correctly.
        if not ret:
            break

        # Apply the background subtractor to the frame to get the segmented mask.
        fgmask = backgroundObject.apply(frame)
        # initialMask = fgmask.copy()

        # Perform thresholding to get rid of the shadows.
        _, fgmask = cv2.threshold(fgmask, 250, 255, cv2.THRESH_BINARY)
        # noisymask = fgmask.copy()

        # Apply some morphological operations to make sure you have a good mask.
        fgmask = cv2.erode(fgmask, kernel, iterations=1)
        fgmask = cv2.dilate(fgmask, kernel, iterations=2)

        # Detect contours in the mask.
        contours, _ = cv2.findContours(fgmask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

        # Create a copy of the frame to draw bounding boxes around the detected cars.
        frameCopy = frame.copy()

        # Loop over each contour found in the frame.
        for cnt in contours:
            # Make sure the contour area exceeds a threshold so that it is a car and not noise.
            if cv2.contourArea(cnt) > 400:
                # Retrieve the bounding box coordinates from the contour.
                x, y, width, height = cv2.boundingRect(cnt)

                # Draw a bounding box around the car.
                cv2.rectangle(frameCopy, (x, y), (x + width, y + height), (0, 0, 255), 2)

                # Write 'Car Detected' near the bounding box drawn.
                cv2.putText(frameCopy, 'Car Detected', (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.3, (0, 255, 0), 1, cv2.LINE_AA)

        # Extract the foreground from the frame using the segmented mask.
        foregroundPart = cv2.bitwise_and(frame, frame, mask=fgmask)

        # Stack the original frame, extracted foreground, and annotated frame.
        stacked = np.hstack((frame, foregroundPart, frameCopy))

        # Display the stacked image with an appropriate title.
        cv2.imshow('Original Frame, Extracted Foreground and Detected Cars', cv2.resize(stacked, None, fx=0.5, fy=0.5))
        # cv2.imshow('initial Mask', initialMask)
        # cv2.imshow('Noisy Mask', noisymask)
        # cv2.imshow('Clean Mask', fgmask)

        # Wait 1 ms for a key press and retrieve the ASCII code of the key pressed.
        k = cv2.waitKey(1) & 0xff

        # Break the loop if the 'q' key is pressed.
        if k == ord('q'):
            break

    # Release the VideoCapture object.
    video.release()

    # Close the windows.
    cv2.destroyAllWindows()