fbpx

Reading Video Sources in OpenCV: IP Camera, Webcam, Videos & GIFS

By Taha Anwar and Rizwan Naeem

On November 20, 2021

Watch Video Here

Processing videos in OpenCV is one of the most common jobs, many people already know how to leverage the VideoCapture function in OpenCV to read from a live camera or video saved on disk. 

But here’s some food for thought, do you know that you can also read other video sources e.g. read a live feed from an IP Camera (Or your phone’s Camera) or even read GIFS.

Yes, you’ll learn all about reading these sources with videoCapture in today’s tutorial and I’ll also be covering some very useful additional things like getting and setting different video properties (height, width, frame count, fps, etc), manually changing current frame position to repeatedly display the same video, and capturing different key events.

This will be an excellent tutorial to help you properly get started with video processing in OpenCV. 

Alright, let’s first rewind a bit and go back to the basics, What is a video? 

Well,  it is just a sequence of multiple still images (aka. frames) that are updated really fast creating the appearance of a motion. Below you can see a combination of different still images of some guy (You know who xD) dancing.

And how fast these still images are updated is measured by a metric called Frames Per Second (FPS). Different videos have different FPS and the higher the FPS, the smoother the video is. Below you can see the visualization of the smoothness in the motion of the higher FPS balls. The ball that is moving at 120 FPS has the smoothest motion, although it’s hard to tell the difference between 60fps and the 120fps ball.

Note: Consider each ball as a separate video clip.

So, a 5-second video with 15 Frames Per Second (FPS) will have a total of 75 (i.e., 15*5) frames in the whole video with each frame being updated after 60 milliseconds. While a 5-second video with 30 FPS will have 150 (i.e., 30*5) frames with each frame being updated after 30 milliseconds. 

So a 30 FPS will display the same frame (still image) only for 30 milliseconds, while a 15 FPS video will display the same frame for 60 milliseconds (longer period) which will make the motion jerkier and slower and in extreme cases (< 10 FPS) may convert a video into a slideshow.

Other than FPS, there are some other properties too which determine the quality of a video like its resolution (i.e., width x height), and bitrate (i.e., amount of information in a given unit of time), etc. The higher the resolution and bitrate of a video are, the better the quality is.

This tutorial also has a video version that you can go and watch for a detailed explanation, although this blog post alone can also suffice.

Alright now we have gone through the required basic theoretical details about videos and their properties, so without further ado,  let’s get started with the code.

Download Code:

Import the Libraries

We will start by importing the required libraries.

Loading a Video

To read a video, first, we will have to initialize the video capture object by using the function cv2.VideoCapture().

Function Syntax:

Parameters:

  • filename – It can be:
    1. Name of video file (eg. video.avi)
    2. or Image sequence (eg. img_%02d.jpg, which will read samples like img_00.jpg, img_01.jpg, img_02.jpg, ...)
    3. or URL of video stream (eg. protocol://host:port/script_name?script_params|auth). You can refer to the documentation of the source stream to know the right URL scheme.
  • index – It is the id of a video capturing device to open. To open the default camera using the default backend, you can just pass 0. In case of multiple cameras connected to the computer, you can select the second camera by passing 1, the third camera by passing 2, and so on.
  • apiPreference – It is the preferred capture API backend to use. Can be used to enforce a specific reader implementation if multiple are available: e.g. cv2.CAP_FFMPEG or cv2.CAP_IMAGES or cv2.CAP_DSHOW. Its default value is cv2.CAP_ANY. Check cv2.VideoCaptureAPIs for details.

Returns:

  • video_reader – It is the video loaded from the source specified.

So to simply put, this cv2.VideoCapture() function opens up a webcam or a video file/images sequence or an IP video stream for video capturing with API Preference. After initializing the object, we will use .isOpened() function to check if the video is accessed successfully. It returns True for success and False for failure.

Reading a Frame

If the video is accessed successfully, then the next step will be to read the frames of the video one by one which can be done using the function .read().

Function Syntax:

ret, frame = cv2.VideoCapture.read()

Returns:

  • ret – It is a boolean value i.e., True if the frame is read successfully otherwise False.
  • frame – It is a frame/image of our video.

Note: Every time we run .read() function, it will give us a new frame i.e., the next frame of the video so we can put .read() in a loop to read all the frames of a video and the ret value is really important in such scenarios since after reading the last frame, from the video this ret will be False indicating that the video has ended.

Get and Set Properties of the Video

Now that we know how to read a video, we will now see how to get and set different properties of a video using the functions:

Here, propId is the Property ID and new_value is the value we want to set for the property.

Property IDEnumeratorProperty
0cv2.CAP_PROP_POS_MSECCurrent position of the video in milliseconds.
1cv2.CAP_PROP_POS_FRAMES0-based index of the frame to be decoded/captured next.
3cv2.CAP_PROP_FRAME_WIDTHWidth of the frames in the video stream.
4cv2.CAP_PROP_FRAME_HEIGHTHeight of the frames in the video stream.
5cv2.CAP_PROP_FPSFrame rate of the video.
7cv2.CAP_PROP_FRAME_COUNTNumber of frames of the video.

I have only mentioned the most commonly used properties with their Property ID and Enumerator. You can check cv2.VideoCaptureProperties for the remaining ones. Now we will try to get the width, height, frame rate, and the number of frames of the loaded video using the .get() function.

Width of the video: 1280.0

Height of the video: 720.0

Frame rate of the video: 29

Total number of frames of the video: 166

Now we will use the .set() function to set a new height and width of the loaded video. The function .set() returns False if the video property is not settable. This can happen when the resolution you are trying to set is not supported by your webcam or the video you are working on. The .set() function sets to the nearest resolution if that resolution is not settable like if I try to set the resolution to 500x500, it might fail to happen and the function set the resolution to something else, like 720x480, which is supported by my webcam.

Failed to set the width!

Failed to set the height!

So we cannot set the width and height to 1920x1080 of the video we are working on. An easy solution to this type of issue can be to use the cv2.resize() function on each frame of the video but it is a little less efficient approach.

Now we will put all this in a loop and read and display all the frames sequentially in a window using the function cv2.imshow(), which will look like we are playing a video, but we will be just displaying frames one after the other. We will use the function cv2.waitKey(milliseconds) to wait for one millisecond before updating a frame with the next one.

We will use the functions .get() and .set() to keep restarting the video when every time we will reach the last frame until the key q is pressed, or the close X button on the opened window is pressed. And finally, in the end, we will release the loaded video using the function cv2.VideoCapture.release() and destroy all of the opened HighGUI windows by using cv2.destroyAllWindows().

You can increase the delay specified in cv2.waitKey(delay) to be higher than 1 ms to control the frames per second.

Join My Upcoming Computer Vision For Building Cutting Edge Applications Course

A Course that goes beyond basic applications and teaches you how to create some next-level apps that utilize physics, deep learning (LSTM + CNN) + classical image processing, hand and body gestures to do a variety of very interesting things.

Summary

In this tutorial, we learned what exactly videos are, how to read them from sources like IP camera, webcam, video files & gif, and display them frame by frame in a similar way an image is displayed. We also learned about the different properties of videos and how to get and set them in OpenCV.

These basic concepts we learned today are essential for many in-demand Computer Vision applications such as intelligent video analytics systems for intruder detection and much more.

Let me know in the comments, you can also reach out to me personally for a 1 on 1 Coaching/consultation session in AI/computer vision regarding your project or your career.

Ready to seriously dive into State of the Art AI & Computer Vision?
Then Sign up for these premium Courses by Bleed AI

0 Comments

Submit a Comment

Your email address will not be published. Required fields are marked *