5 Easy & Effective Face Detection Algorithms in Python

By Taha Anwar and Rizwan Naeem

On August 30, 2021

In this post, you’ll learn in depth about five of the easiest and most effective face detection options available in Python, along with the pros and cons of each. By the end, you will be able to strike the right balance between accuracy, speed, and efficiency for a given scenario.

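The face detection methods we will be covering are:

  1. OpenCV Haar Cascade Face Detection
  2. Dlib HoG Face Detection
  3. OpenCV Deep Learning based Face Detection
  4. Dlib Deep Learning based Face Detection
  5. Mediapipe Deep Learning based Face Detection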

Face detection is one of the most common and simplest vision techniques out there. As the name implies, it detects (i.e., locates) the faces in images, and it is the first and essential step for almost every face application, such as Face Recognition, Facial Landmarks Detection, Face Gesture Recognition, and Augmented Reality (AR) Filters.

Beyond these, one of its most common applications, which you have almost certainly used, is your mobile camera, which detects your face and adjusts the focus automatically in real-time.

Also, for what it’s worth, Tony Stark’s EDITH (Even Dead I’m The Hero) glasses, inherited by Peter Parker in Spider-Man: Far From Home, also use face detection as an initial step to perform their functions. Cool 😊 … right?

Yeah, I know .. I know, I needed to add a Marvel reference, the whole post gets cooler.

Face detection also serves as the groundwork for a lot of exciting face applications, e.g., you can even appoint Mr. Bean as the President 😂 using Deepfake.

But for now, let’s just go back to Face Detection.

The idea behind face detection is to make the computer capable of identifying what exactly a human face is and detecting the features associated with faces in images/videos. This is not always easy because of changing facial expressions, orientations, lighting conditions, and occlusions due to face masks, glasses, etc.

But with enough training data covering all the possible scenarios, you can create a very robust face detector.

And people throughout the years have done just that: they have designed various algorithms for face detection, and in this post, we’ll explore 5 of them.

As this is the most common and widely used technique, there are a lot of face detectors out there.

But which Algorithm is the best?

If you’re looking for a single solution, then it’s a hard question to answer, as each of the algorithms that we’re going to cover has its own pros and cons. Take a look at the demos at the end for some comparison, and make sure to read the summary for the final verdict.

Alright, so without further ado, let’s dive in.

Import the Libraries

We will first import the required libraries.

Algorithm 1: OpenCV Haar Cascade Face Detection

This face detector was introduced in 2001 and remained the state-of-the-art face detection algorithm for many years. Besides this face detector, OpenCV also provides other detectors (for eyes, smiles, etc.) that use the same Haar cascade technique.

Load the OpenCV Haar Cascade Face Detector

To perform face detection using this algorithm, we first have to load the pre-trained Haar cascade face detection model (around 900 KB), stored on disk in .xml format, using the function cv2.CascadeClassifier().

Create a Haar Cascade Face Detection Function

Now we will create a function haarCascadeDetectFaces() that performs Haar cascade face detection on an image/frame using the function cv2.CascadeClassifier.detectMultiScale(). Depending upon the passed arguments, it will either visualize the resultant image along with the original image (when working with images) or return the resultant image along with the output of the model (when working with videos).

Function Syntax:

results = cv2.CascadeClassifier.detectMultiScale(image, scaleFactor, minNeighbors, minSize, maxSize)

Parameters:

  • image – It is the input grayscale image containing the faces.
  • scaleFactor (optional) – It specifies how much the image size is reduced at each image scale. Its default value is 1.1, which means a reduction of roughly 10% at each step.
  • minNeighbors (optional) – It is the minimum number of neighbors (overlapping candidate detections) each predicted face should have in order to be retained. Otherwise, the prediction is ignored. Its default value is 3.
  • minSize (optional) – It is the minimum possible face size. Faces smaller than that size are ignored.
  • maxSize (optional) – It is the maximum possible face size. Faces larger than that are ignored. If maxSize == minSize, then only faces of that particular size are detected.

Returns:

  • results – It is an array of bounding box coordinates (i.e., x1, y1, bbox_width, bbox_height), where each bounding box encloses a detected face. The boxes may be partially outside the original image.
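Since the returned boxes may fall partially outside the image, it can help to clip them before cropping or drawing. The following is only a minimal sketch in plain Python; clip_box() is a hypothetical helper introduced here for illustration and is not part of OpenCV.

```python
def clip_box(box, image_width, image_height):
    """Convert an (x, y, w, h) box to (x1, y1, x2, y2) corners clipped to the image."""
    x, y, w, h = box
    # Clamp the top-left corner to the image origin.
    x1 = max(0, x)
    y1 = max(0, y)
    # Clamp the bottom-right corner to the image size.
    x2 = min(image_width, x + w)
    y2 = min(image_height, y + h)
    return x1, y1, x2, y2


# A box that starts 10 pixels to the left of a 640x480 image.
print(clip_box((-10, 20, 100, 100), 640, 480))
```

The clipped corners can then be used directly for slicing the image array or for drawing a rectangle.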

Note: When the value of the minNeighbors parameter is decreased, false positives increase. When the value of scaleFactor is decreased (closer to 1), the image is shrunk in finer steps, so faces of more sizes are brought within the detector’s reach, at the cost of speed.

So the algorithm can detect very large and very small faces too by appropriately utilizing the scaleFactor argument.
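To get a feel for why a smaller scaleFactor costs speed, here is a minimal sketch that counts how many image scales would be scanned. It is a simplified model of the image pyramid and not OpenCV's actual implementation; the helper name pyramid_levels() and the 24-pixel detection window are assumptions made for illustration.

```python
def pyramid_levels(image_size, scale_factor=1.1, window=24):
    """Return the image side lengths visited when the image is repeatedly
    shrunk by scale_factor until it is smaller than the detection window."""
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1")
    sizes = []
    size = float(image_size)
    while size >= window:
        sizes.append(round(size))
        size /= scale_factor
    return sizes


# Compare a coarse and a fine scale factor on a 480-pixel image side.
coarse = pyramid_levels(480, scale_factor=1.3)
fine = pyramid_levels(480, scale_factor=1.05)
print(len(coarse), "levels vs", len(fine), "levels")
```

A scaleFactor closer to 1 produces many more levels to scan, which is where the extra detection time comes from.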

Now we will utilize the function haarCascadeDetectFaces() created above to perform face detection on a few sample images and display the results.

The time taken by the algorithm to perform detection is pretty impressive, so yeah, it can work in real-time on a CPU.

A major drawback of this algorithm is that it does not work on non-frontal and occluded faces.

It also gives a lot of false positives, but that can be controlled by increasing the value of the minNeighbors argument in the function cv2.CascadeClassifier.detectMultiScale().

Algorithm 2: Dlib HoG Face Detection

This face detector is based on HoG (Histogram of Oriented Gradients) and SVM (Support Vector Machine) and is significantly more accurate than the previous one. The technique is not invariant to changes in face angle, so it uses five different HoG filters, one each for:

  1. Frontal face
  2. Right side turned face
  3. Left side turned face
  4. Frontal face but rotated right
  5. Frontal face but rotated left

So it can work on slightly non-frontal and rotated faces as well.

Load the Dlib HoG Face Detector

Now we will use the dlib.get_frontal_face_detector() function to load the pre-trained HoG face detector and we will not need to pass the path of the model file for this one as the model is included in the dlib library.

Create a HoG Face Detection Function

Now we will create a function hogDetectFaces() that will perform HoG face detection by inputting the image/frame into the loaded hog_face_detector and will visualize the resultant image along with the original image or return the resultant image along with the output of HoG face detector depending upon the passed arguments.

Function Syntax:

results = hog_face_detector(image, upsample)

Parameters:

  • image – It is the input image containing the faces in RGB format.
  • upsample (optional) – It is the number of times to upsample an image before performing face detection.

Returns:

  • results – It is an array of rectangle objects containing the (x, y) coordinates of the corners of the bounding boxes enclosing the faces in the input image.

Note: The model is trained to detect faces with a minimum size of 80×80, so to detect smaller faces, you will have to upsample the images. Upsampling increases the resolution of the input images, and thus the face size, at the cost of computation speed.
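As a rough way to pick a value for the upsample argument, the sketch below estimates how many upsampling passes a face of a given pixel size would need to reach the 80×80 minimum. It assumes that each pass doubles the image size; upsamples_needed() is a hypothetical helper and not part of dlib.

```python
def upsamples_needed(face_size, min_face_size=80):
    """Count the 2x upsampling passes needed for face_size to reach min_face_size."""
    if face_size <= 0:
        raise ValueError("face_size must be positive")
    count = 0
    while face_size < min_face_size:
        face_size *= 2  # assumption: one pass doubles the image side length
        count += 1
    return count


for size in (100, 40, 25):
    passes = upsamples_needed(size)
    # Under the same assumption, each pass multiplies the pixel count by 4.
    print(f"{size}px face: {passes} pass(es), about {4 ** passes}x more pixels to scan")
```

The growing pixel count is why every extra pass slows detection down so noticeably.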

Now we will utilize the function hogDetectFaces() created above to perform HoG face detection on a few sample images and display the results.

So this too can work in real-time on a CPU. You can also resize the images before passing them to the model, as the smaller the images are, the faster the detection process will be. But this also increases the probability that the faces in the images become smaller than 80×80.

As you can see, it works on slightly rotated faces but will fail on extremely rotated and non-frontal ones, and the bounding box often excludes some parts of the face, like the chin and forehead.

It also works with small occlusions but will fail on massive ones.
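The trade-off of resizing can be checked with simple arithmetic before settling on a target width. This is only a sketch under the assumption that the image is resized uniformly (same aspect ratio); both helper names are hypothetical.

```python
def face_size_after_resize(face_size, image_width, new_width):
    """Scale a face's pixel size by the same ratio as the image width."""
    return face_size * new_width / image_width


def still_detectable(face_size, image_width, new_width, min_face_size=80):
    """Check whether the face stays at or above the detector's minimum size."""
    return face_size_after_resize(face_size, image_width, new_width) >= min_face_size


# A 200px face in a 1920px-wide frame, resized to two candidate widths.
print(still_detectable(200, 1920, 960))  # face becomes 100px
print(still_detectable(200, 1920, 640))  # face becomes about 67px
```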

As mentioned above, it cannot detect faces smaller than 80×80. If you want, you can increase the upsample argument value passed to the loaded hog_face_detector in the function hogDetectFaces() created above to detect the face in the image above, but that will also tremendously increase the time taken by the face detection process.

Algorithm 3: OpenCV Deep Learning based Face Detection

This one is based on a deep learning approach and uses the ResNet-10 architecture to detect multiple faces in a single pass of the image through the network (Single Shot Detector, SSD). It has been included in OpenCV since August 2017, with the official release of version 3.3. It is still not as popular as the OpenCV Haar Cascade Face Detector, but it is far more accurate.

Load the OpenCV Deep Learning based Face Detector

Now to load the face detector, OpenCV provides us with two options: one is in the Caffe framework’s format and takes around 5.1 MB, and the other is in the TensorFlow framework’s format and takes only about 2.7 MB.

To load the first one from the disk, we can use the cv2.dnn.readNetFromCaffe() function and to load the other one we will have to use the cv2.dnn.readNetFromTensorflow() function with appropriate arguments.

Create an OpenCV Deep Learning based Face Detection Function

Now we will create a function cvDnnDetectFaces() that will perform Deep Learning-based face detection using OpenCV. First, we will pre-process the image/frame using the cv2.dnn.blobFromImage() function and then we will set the pre-processed image as an input to the network by using the function opencv_dnn_model.setInput().

After that, we will pass the input through the network by using the opencv_dnn_model.forward() function to get an array containing the bounding box coordinates normalized to [0.0, 1.0] and the detection confidence of each face in the image.
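Turning that output into pixel boxes could look roughly like the sketch below. It assumes each detection row holds seven values with the confidence at index 2 and the normalized corners (x1, y1, x2, y2) at indices 3 to 6, which is the layout commonly seen for this SSD model, so verify it against your own output. The function name and the sample rows are made up for illustration, and plain lists are used so the sketch runs without OpenCV.

```python
def parse_ssd_detections(detections, image_width, image_height, min_confidence=0.5):
    """Convert normalized detection rows to (x1, y1, x2, y2, confidence) in pixels."""
    boxes = []
    for _, _, confidence, x1, y1, x2, y2 in detections:
        # Skip weak detections.
        if confidence < min_confidence:
            continue
        boxes.append((int(x1 * image_width), int(y1 * image_height),
                      int(x2 * image_width), int(y2 * image_height),
                      float(confidence)))
    return boxes


# Two made-up rows: one confident detection and one weak one.
sample_rows = [
    [0, 1, 0.9, 0.25, 0.5, 0.75, 1.0],
    [0, 1, 0.2, 0.10, 0.1, 0.20, 0.2],
]
print(parse_ssd_detections(sample_rows, image_width=400, image_height=200))
```

With a real network output (assumed here to be a 4-dimensional array), the rows would typically be taken as results[0, 0] before being passed to such a function.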

After performing the detection, the function will also visualize the resultant image along with the original image or return the resultant image along with the output of the dnn face detector depending upon the passed arguments.

Note: The higher the face detection confidence score is, the more certain the model is about the detection.

Now we will utilize the function cvDnnDetectFaces() created above to perform OpenCV deep learning-based face detection on a few sample images and display the results.

So it is far more accurate than both of the above and works great even under massive occlusions and on non-frontal faces. The reason for its significantly higher speed is that it can detect faces across various scales, allowing us to resize the images to a smaller size, which decreases computation.

Also, the bounding box encloses the whole face, unlike the HoG Face Detector, making it easier to crop regions of interest (i.e., faces) from the images.

So even the faces with masks are detectable with this one.

Algorithm 4: Dlib Deep Learning based Face Detection

This detector is also based on a deep learning (Convolutional Neural Network) approach and uses the Maximum-Margin Object Detection (MMOD) method to detect faces in images. It is also trained for a minimum face size of 80×80 and provides the option of upsampling the images. It is very slow on a CPU but can be used on an NVIDIA GPU, where it outperforms the other detectors in speed.

Load the Dlib Deep Learning based Face Detector

First, we will use the dlib.cnn_face_detection_model_v1() function to load the pre-trained maximum-margin CNN face detector (around 700 KB), stored on disk in .dat format.

Create a Dlib Deep Learning based Face Detection Function

Now we will create a function dlibDnnDetectFaces() in which we will perform deep Learning-based face detection using dlib by inputting the image/frame and the number of times to upsample the image to the loaded cnn_face_detector as we had done for the HoG face detection.

The only difference is that we are loading a different model, and it will return a list of objects, where each object is a wrapper around a rectangle object (containing the bounding box coordinates) and a detection confidence score. Like all our other functions, this one will also visualize the results or return them depending upon the passed arguments.
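Because of that wrapper, code written for the HoG output needs a small adjustment before it can read the CNN output. The sketch below handles both cases, assuming the wrapper exposes the rectangle as .rect and the score as .confidence, and that rectangles offer left(), top(), right(), and bottom() accessors. The stand-in objects only mimic those accessors so the example runs without dlib; unpack_detection() is a hypothetical helper.

```python
from types import SimpleNamespace


def unpack_detection(detection):
    """Return (x1, y1, x2, y2, confidence) for a plain or wrapped rectangle."""
    # Wrapped (CNN) detections carry the rectangle in .rect;
    # plain (HoG) detections are the rectangle itself.
    rect = getattr(detection, "rect", detection)
    confidence = getattr(detection, "confidence", None)
    return rect.left(), rect.top(), rect.right(), rect.bottom(), confidence


# Stand-ins that mimic the assumed accessors (not real dlib objects).
fake_rect = SimpleNamespace(left=lambda: 40, top=lambda: 30,
                            right=lambda: 140, bottom=lambda: 130)
fake_cnn_detection = SimpleNamespace(rect=fake_rect, confidence=0.98)

print(unpack_detection(fake_rect))
print(unpack_detection(fake_cnn_detection))
```

Returning None as the confidence for plain rectangles keeps the drawing code identical for both detectors.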

Now we will utilize the function dlibDnnDetectFaces() created above to perform dlib deep learning-based face detection on a few sample images and display the results.

Interesting! This one is also far more accurate and robust than the first two and is also capable of detecting faces under occlusion. But as you can see, the time taken by the detection process is very high, so this detector cannot work in real-time on a CPU.

Also, the varying face orientations and lighting do not stop it from detecting faces accurately.

Similar to the HoG face detector, the bounding box for this one is also small and does not enclose the whole face.

Algorithm 5: Mediapipe Deep Learning based Face Detection

The last one is also based on a deep learning approach and uses BlazeFace, a very lightweight and highly accurate face detector inspired by and modified from Single Shot MultiBox Detector (SSD) and MobileNetV2. The detector provided by Mediapipe is capable of running at a speed of 200-1000+ FPS on flagship devices.

Load the Mediapipe Face Detector

To load the model, we first have to initialize the face detection class using the mp.solutions.face_detection syntax and then we will have to call the function mp.solutions.face_detection.FaceDetection() with the arguments explained below:

  • model_selection – It is an integer index (i.e., 0 or 1). When set to 0, a short-range model is selected that works best for faces within 2 meters from the camera, and when set to 1, a full-range model is selected that works best for faces within 5 meters. Its default value is 0.
  • min_detection_confidence – It is the minimum detection confidence, in the range [0.0, 1.0], required to consider the face-detection model’s prediction successful. Its default value is 0.5 (i.e., 50%), which means that all the detections with prediction confidence less than 0.5 are ignored by default.
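As a small illustration of choosing these arguments, the sketch below picks model_selection from the expected distance of the faces, following the 2 meter and 5 meter ranges described above. The helper choose_model_selection() and the example distance are assumptions, not part of Mediapipe.

```python
def choose_model_selection(max_face_distance_m):
    """Pick 0 (short-range) for faces within 2 meters, otherwise 1 (full-range)."""
    if max_face_distance_m < 0:
        raise ValueError("distance cannot be negative")
    return 0 if max_face_distance_m <= 2 else 1


# Keyword arguments for a webcam-style setup where faces stay within about 1.5 m.
detector_args = {
    "model_selection": choose_model_selection(1.5),
    "min_detection_confidence": 0.5,
}
print(detector_args)
```

The resulting dictionary could then be unpacked into the call, e.g. mp.solutions.face_detection.FaceDetection(**detector_args).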

We will also have to initialize the mp.solutions.drawing_utils class which is used to visualize the detection results on the images/frames.

Create a Mediapipe Deep Learning based Face Detection Function

Now we will create a function mpDnnDetectFaces() in which we will use the mediapipe face detector to perform the detection on an image/frame by passing it into the loaded model by using the function mp_face_detector.process() and get a list of a bounding box and six key points for each face in the image. The six key points are on the:

  1. Right Eye
  2. Left Eye
  3. Nose Tip
  4. Mouth Center
  5. Right Ear Tragion
  6. Left Ear Tragion

The bounding boxes are composed of xmin and width (both normalized to [0.0, 1.0] by the image width) and ymin and height (both normalized to [0.0, 1.0] by the image height). Each key point is composed of x and y, which are normalized to [0.0, 1.0] by the image width and height respectively. Like the others, this function works on both images and videos and will display or return the results depending upon the passed arguments.
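If you want to draw or crop using these values yourself, they first have to be scaled back to pixels. The following is a minimal sketch of that conversion in plain Python; both helper names are hypothetical, and clamping to the image bounds is a design choice made here in case a value falls slightly outside [0.0, 1.0].

```python
def to_pixel(x, y, image_width, image_height):
    """Scale a normalized (x, y) point to pixel coordinates inside the image."""
    px = min(max(int(x * image_width), 0), image_width - 1)
    py = min(max(int(y * image_height), 0), image_height - 1)
    return px, py


def relative_box_to_pixels(xmin, ymin, width, height, image_width, image_height):
    """Convert a normalized (xmin, ymin, width, height) box to pixel corners."""
    x1, y1 = to_pixel(xmin, ymin, image_width, image_height)
    x2, y2 = to_pixel(xmin + width, ymin + height, image_width, image_height)
    return x1, y1, x2, y2


# A box covering the central half of a 640x480 frame, and a key point at its center.
print(relative_box_to_pixels(0.25, 0.25, 0.5, 0.5, 640, 480))
print(to_pixel(0.5, 0.5, 640, 480))
```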

Now we will utilize the function mpDnnDetectFaces() created above to perform face detection using Mediapipe’s detector on a few sample images and display the results.

You can get an idea of its super-realtime performance from the time taken by the detection process. After all, this is what differentiates this detector from all the others.

It can detect non-frontal and occluded faces but fails to accurately detect the key points in such scenarios.

The size of the bounding box returned by this detector is also quite appropriate.

By using the short-range model, one can easily ignore the faces in the background, which is normally required in most of the applications out there, like face gesture recognition.

Face Detection on Real-Time Webcam Feed

We have compared the face detection algorithms on the images and discussed the pros and cons of each of them, but now the real test begins, as we will test the algorithms on a real-time webcam feed. First, we will select the algorithm we want to use as one of them will be used at a time. We have designed the code below to switch between different face detection algorithms in real-time, by pressing the key s.
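The switching logic itself can be quite small. Here is a sketch of one possible way to cycle through the detectors on each press of s; the list of names and the next_algorithm() helper are illustrative assumptions, not the original code.

```python
ALGORITHMS = ["Haar Cascade", "Dlib HoG", "OpenCV DNN", "Dlib DNN", "Mediapipe"]


def next_algorithm(index, key):
    """Advance to the next detector when 's' (or 'S') is pressed; wrap at the end."""
    if key in (ord("s"), ord("S")):
        return (index + 1) % len(ALGORITHMS)
    return index


# Simulate a few key presses instead of reading them from a window.
index = 0
for key in (ord("s"), ord("a"), ord("s")):
    index = next_algorithm(index, key)
print(ALGORITHMS[index])
```

In a real loop, the key code would typically come from something like cv2.waitKey(1) & 0xFF on every frame.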

We will utilize the functions created above to perform face detection on the real-time webcam feed using the selected algorithm and will also calculate and display the number of frames being updated in one second to get an idea of whether the algorithms can work in real-time on a CPU or not.
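The frame rate is simply the inverse of the time between two consecutive frames. Below is a minimal sketch of such a counter; the FPSCounter class is a hypothetical helper, and the timestamps are passed in explicitly so the example stays deterministic.

```python
class FPSCounter:
    """Estimate frames per second from the time between consecutive frames."""

    def __init__(self):
        self.previous = None

    def update(self, now):
        """Return the FPS given the current timestamp in seconds."""
        fps = 0.0
        if self.previous is not None and now > self.previous:
            fps = 1.0 / (now - self.previous)
        self.previous = now
        return fps


counter = FPSCounter()
first = counter.update(10.0)   # no previous frame yet
second = counter.update(10.5)  # half a second later
print(first, second)
```

Inside the webcam loop, counter.update(time.perf_counter()) could be called once per frame and the result written onto the frame.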

Output

As expected! All of them can work in real-time on a CPU except for the Dlib Deep Learning-based Face Detector.

Further Resources

Face Detection – OpenCV, Dlib and Deep Learning ( C++ / Python )
Dlib 18.6 released: Make your own object detector!
Easily Create High-Quality Object Detectors with Deep Learning

Bleed Face Detector: It is a Python package that allows using 4 different face detectors (OpenCV Haar Cascade, Dlib HoG, OpenCV Deep Learning-based, and Dlib Deep Learning-based) by just changing a single line of code.


Summary:

In this tutorial, you have learned about five of the most popular and effective face detectors, along with the best tips and suggestions for using them. You should now be able to strike the required balance between accuracy, speed, and efficiency in any given scenario. To summarize:

If you have a low-end device or an embedded device like the Raspberry Pi and are expecting faces under substantial occlusion and with various sizes, orientations, and angles, then I recommend going with the Mediapipe Face Detector, as it is the fastest one and also pretty accurate. In fact, this one has the best trade-off between speed and accuracy and also gives a few facial landmarks (key points).

Otherwise, if you have some environmental restrictions and cannot use the Mediapipe face detector, then the next best option will be OpenCV DNN Face Detector as this one is also pretty accurate but has higher latency.

For applications in which the face size can be controlled (> 80×80), and you want to skip the people (small faces) that are far away from the camera, the Dlib HoG Face Detector can be used, but it is surely not the best option. For flagship devices with an NVIDIA GPU in the same scenario, the Dlib DNN Face Detector can be a good alternative to the HoG Face Detector, but do not try to use it on a CPU.

And if you are only working with frontal faces and want to skip all the non-frontal and rotated faces, then the Haar Cascade detector can be an option, but remember you will have to manually tune the parameters to get rid of false positives.

So generally, you should just go with the Mediapipe Face Detector for super real-time speed and high accuracy.

Let me know in the comments, you can also reach out to me personally for a 1 on 1 Coaching/consultation session in AI/computer vision regarding your project or your career.
