Image Preprocessing with Open Source Tools

Andi Sama
15 min read · Apr 8, 2020

A Brief Look into Image Preprocessing for doing AI in Computer Vision

Andi Sama CIO, Sinergi Wahana Gemilang with Andrew Widjaja and Cahyati S. Sangaji

#SinergiWahanaGemilang #ArtificialIntelligence #ComputerVision #IBMVisualInsights #GoogleAutoML #ImagePreprocessing #PythonImageLibrary #Matplotlib #Keras #OpenCV

Supporting files (Python Notebook file + images) for this article can be found on GitHub.

When we build a machine learning (or deep learning) model for computer vision in Python, we often work with datasets in the form of images or videos. Without automated tools such as IBM Visual Insights or Google AutoML, most of the time we need to preprocess the data by hand before it can be consumed by our machine learning algorithm.

This article introduces approaches to preprocessing a source image (JPEG or PNG format) by transforming it to a NumPy array and back again. It is implemented in the Python programming language using common open source libraries: Python Image Library (PIL), Matplotlib, Keras, and OpenCV.

The following shows the source image (in ‘data/smurf.jpg’) that we use in this article (Avforums, 2017). The source image file is encoded in RGB sequence — Red, Green, Blue. This sequence matters because in OpenCV, for example, the encoded sequence is BGR (Blue, Green, Red) instead of RGB.

A bunch of Smurfs, the base source image we will work on.

Typically, these approaches (implemented with multiple open source libraries) also apply to video files or to a video stream from a camera (as well as still images from a camera), since a video stream is usually converted to a series of frames first, meaning we process one image at a time anyway.
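For example, a minimal sketch of reading a video stream frame by frame with OpenCV (which we cover later in this article); the file name ‘data/sample.mp4’ is a hypothetical example:

# A minimal sketch: a video stream is just a series of images (frames)
# 'data/sample.mp4' is a hypothetical file name used for illustration only
import cv2

cap = cv2.VideoCapture('data/sample.mp4')   # or cv2.VideoCapture(0) for a camera
while True:
    ret, frame = cap.read()                 # each frame is a NumPy array (BGR order)
    if not ret:
        break
    # ... preprocess 'frame' here, one image at a time ...
cap.release()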

Preparing the Environment

The following shows the environment we are using on the Windows 10 operating system. Previous articles were mostly implemented on Ubuntu or RHEL Linux distributions — either installed locally (on-premises) or in the cloud: IBM IaaS Cloud, GCP (Google Cloud Platform IaaS), or AWS EC2 (Amazon Web Services, Elastic Compute Cloud). This time we are doing something different: everything runs on Windows 10.

import os, platform
print('OS name:', os.name, ', system:', platform.system(), ', release:', platform.release())
import sys
print("Anaconda version:")
!conda list anaconda
print("Python version: ", sys.version)
print("Python version info: ", sys.version_info)
import PIL
print("PIL version: ", PIL.__version__)
import matplotlib
print("Matplotlib version: ", matplotlib.__version__)
import tensorflow as tf
print("Keras version:", tf.keras.__version__)
import cv2
print("OpenCV version: ", cv2.__version__)

The output:

OS name: nt , system: Windows , release: 10
Anaconda version:
# packages in environment at C:\Users\andis\anaconda3:
#
Python version:  3.7.6 (default, Jan  8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)]
Python version info:  sys.version_info(major=3, minor=7, micro=6, releaselevel='final', serial=0)
PIL version:  7.0.0
Matplotlib version:  3.1.3
Keras version: 2.2.4-tf
OpenCV version:  4.2.0
# Name                    Version                   Build  Channel
_anaconda_depends         2019.03                  py37_0
anaconda                  custom                   py37_1
anaconda-client           1.7.2                    py37_0
anaconda-navigator        1.9.12                   py37_0
anaconda-project          0.8.4                      py_0

To experiment with the code in this article, the following needs to be prepared:

First of all, install Anaconda for Windows 10 (Anaconda, 2020), followed by OpenCV (Pranav Sreedhar, 2019). By default, Python, Matplotlib, and Jupyter Notebook (an interactive development environment) are installed with Anaconda.

It may be wise to create a virtual environment before doing anything further, right after installing Anaconda. In the Anaconda Prompt, use the ‘conda create’ command:

(base) C:\Users\andis>conda create -n myenv

The output:

Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: C:\Users\andis\anaconda3\envs\myenv

Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate myenv
#
# To deactivate an active environment, use
#
#     $ conda deactivate

then the ‘conda activate’ command:

(base) C:\Users\andis>conda activate myenv

The rest of the required modules can mostly be installed with ‘conda install package_name’ from the Anaconda Prompt.

(myenv) C:\Users\andis>conda install pip

A selected few packages may need to be installed through pip: ‘pip install package_name’.

(myenv) C:\Users\andis>pip install package_name

Once the environment preparation is done, we can start Jupyter Notebook from Anaconda Navigator and begin working with the Python code.

In general, especially in Python, we need to convert an image into an internal format that we can work with, and typically this is a NumPy array (Justin Johnson, Spring 2020).

An RGB image contains data in 3 dimensions (height, width, channel), e.g. (768, 1024, 3): 786,432 pixels (768 × 1024) and 2,359,296 values in total (768 × 1024 × 3). Each value per channel is 8 bits (1 byte), ranging from 0–255, so each RGB pixel takes 3 bytes (24 bits) of data (1 byte for each channel: R, G, and B), as the sketch below illustrates.
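As a quick check, a minimal sketch with a hypothetical 768 × 1024 RGB image illustrates these sizes:

# A minimal sketch (hypothetical 768x1024 RGB image) illustrating the sizes above
import numpy as np

img = np.zeros((768, 1024, 3), dtype=np.uint8)  # (height, width, channel), one byte per channel value
print(img.shape)                      # (768, 1024, 3)
print(img.shape[0] * img.shape[1])    # 786432 pixels
print(img.size)                       # 2359296 values in total
print(img.nbytes)                     # 2359296 bytes, i.e. 3 bytes (24 bits) per pixel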

When these pixel data are converted to a NumPy array, they become easier to manipulate. That is the purpose of the tools we discuss here: PIL, Matplotlib, Keras, and OpenCV.

Image Preprocessing with PIL — Python Image Library

The following shows how to open and display an image with the Python Image Library (PIL). The image is displayed in a separate window following execution of the ‘img.show()’ function. Note that we also view some basic information about the source image: its format (JPEG) and mode (RGB), although internally PIL seems to convert the image to PNG for displaying it on screen.

# A. IMAGE MANIPULATION with PIL - Python Image Library
# First of all, VIEWing an IMAGE
# - first import necessary libraries
import numpy as np
from PIL import Image

# - set a few generic variables
FilePath = 'data/'
FileName = 'smurf'
FileExt = '.jpg'
ImageFile = FilePath + FileName + FileExt

# - open the image file
img = Image.open(ImageFile)

# - inspect necessary information
print('Image format: ', img.format)
print('Image mode: ', img.mode)

# - display
img.show()

The output:

Image format:  JPEG
Image mode:  RGB

In Python, a package, sub-package, or module containing the functions we want to use is loaded with the import keyword. The code ‘import numpy as np’ loads the NumPy library and lets us refer to it simply as ‘np’ later. The code ‘from PIL import Image’ imports only the specific ‘Image’ module from the PIL library.

We convert the loaded image to an ndarray using NumPy’s asarray() function. The converted image (data) is now in ndarray format with data type <class ‘numpy.ndarray’> and data shape (height: 393, width: 700, channels: 3).

# A.1. CONVERT an IMAGE to NP ARRAY
# - first import necessary libraries
from numpy import asarray

# - convert image format to numpy array
data = asarray(img)

# - inspect necessary information
print('data type: ', type(data))
print('data shape: ', data.shape)

The output:

data type:  <class 'numpy.ndarray'>
data shape:  (393, 700, 3)

And then

print(data)

The output:

[[[ 82  99 143]
  [ 93 107 156]
  [ 99 105 165]
  ...
  [215 234 248]
  [215 234 248]
  [215 234 248]]

 ...

 [[116 160 223]
  [ 98 142 205]
  [103 147 208]
  ...
  [ 69  54 139]
  [ 72  54 140]
  [ 74  54 139]]]

Once the data is in ndarray format, we can apply any operations to it — treating the data as an n-dimensional matrix, for example, as in the sketch below.
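For example, a minimal sketch of a few such operations, assuming ‘data’ is the (393, 700, 3) ndarray obtained above:

# A minimal sketch of treating the image as an n-dimensional array,
# assuming 'data' is the (393, 700, 3) ndarray obtained above
import numpy as np

brighter = np.clip(data.astype(np.int16) + 40, 0, 255).astype(np.uint8)  # brighten and clamp to 0-255
flipped = data[:, ::-1, :]        # horizontal flip by reversing the width axis
bgr = data[:, :, ::-1]            # swap the channel order (RGB -> BGR)
top_left = data[:100, :100, :]    # crop the top-left 100x100 region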

When we are done working with the data (after data cleansing and transformation, for example), we can convert the data (in ndarray format) back to image format. PIL’s Image.fromarray() is the function we use for exactly this purpose.

# A.2. CONVERT NP ARRAY back to IMAGE format
# - convert numpy array to image format
img_fromnparray = Image.fromarray(data)
img_fromnparray.show()

# - inspect necessary information
print('Image size: ', img_fromnparray.size)
print('Image format: ', img_fromnparray.format)
print('Image mode: ', img_fromnparray.mode)

The output:

Image size:  (700, 393)
Image format:  None
Image mode:  RGB

Let’s see how we can save the loaded RGB image (in ndarray format) back to an external file, this time changing the image format to PNG (the file name gets the suffix ‘_pil-rgb’). Then we convert the loaded RGB image to greyscale with the function ‘convert(‘L’)’ and save it to a file with the suffix ‘_pil-grey’ added to its name, still in PNG. See how we do this below.

# A.3. SAVE the IMAGE to ANOTHER FORMAT, e.g. PNG (source was JPG before)
# - 1st reopen image, and save it as PNG
data = asarray(Image.open(ImageFile))
print("Saving as PNG (RGB)...", FilePath + FileName)
print(type(data), data.shape)
img_png = Image.fromarray(data).save(FilePath + FileName + '_pil-rgb' + '.png')

# - 2nd reopen image, convert it to grey, and save it as PNG
data = asarray(Image.open(ImageFile).convert('L'))
print("Saving as PNG (Grey)...", FilePath + FileName)
print(type(data), data.shape)
img_png = Image.fromarray(data).save(FilePath + FileName + '_pil-grey' + '.png')

The output:

Saving as PNG (RGB)... data/smurf
<class 'numpy.ndarray'> (393, 700, 3)
Saving as PNG (Grey)... data/smurf
<class 'numpy.ndarray'> (393, 700)

The following shows the generated PNG-format image files from these steps.

Generated image files (smurf_pil-grey.png and smurf_pil-rgb.png) in ‘data/’ directory.

Image Preprocessing with Matplotlib

One of the quickest ways to display an inline image within a Jupyter Notebook is matplotlib’s imshow() function. Just use ‘image.imread()’ to read the image, then ‘pyplot.imshow()’ to show the loaded image. An inline image is an image displayed within the Jupyter Notebook as the result of executing ‘pyplot.imshow()’.

# B. IMAGE MANIPULATION with Matplotlib
# - first import necessary libraries
from matplotlib import image, pyplot

img = image.imread(ImageFile)

# - inspect necessary information
print('Image dtype: ', img.dtype)
print('Image shape: ', img.shape)

# - then show the image
pyplot.imshow(img)
# pyplot.show()

The output:

An inline image displayed within the Jupyter Notebook as the result of executing matplotlib’s pyplot.imshow() function.
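Matplotlib can also write the array back to an image file; a minimal sketch (the ‘_mpl’ file-name suffix is just an illustrative choice, not from the original notebook):

# A minimal sketch of saving the loaded array back to a file with Matplotlib
# (the '_mpl' suffix in the output file name is an illustrative assumption)
pyplot.imsave(FilePath + FileName + '_mpl' + '.png', img)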

Image Preprocessing with PIL (continued)

There is still much more we can do with PIL (Fredrik Lundh, 1995–2011 and Alex Clark and Contributors, 2010–2020). Among others are image transformations, as shown below (rotation, crop).

# A.4 Continuing IMAGE MANIPULATION with PIL
#     Rotate, Crop, Contrast & Brightness

# - Image rotation with PIL
Rotation_Degree = 30
im = Image.open(ImageFile)
im.rotate(Rotation_Degree).show()

# - Image crop with PIL
#   Looking at the pixel locations in the image above, we can define the coordinates (box) to crop
box = (350, 100, 500, 200)  # left, upper, right, lower
im.crop(box).show()

The output:

Image rotation (30 degrees) and image cropping using PIL (based on the defined box area). Both operations are applied to the original image.

Then, image transformations (contrast, brightness) as shown below. Sharpness and color enhancement, as well as image filtering (e.g. blur, emboss, sharpen, smooth, edge enhancement), are additional PIL functions we can use (see the sketch after the output below).

# - Image Enhancements with PIL: Contrast
from PIL import ImageEnhance

enh = ImageEnhance.Contrast(im)
enh.enhance(1.75).show("75% more contrast")

# - Image Enhancements with PIL: Brightness
enh = ImageEnhance.Brightness(im)
enh.enhance(1.5).show("50% more brightness")

The output:

Image enhancements (contrast and brightness). Contrast and brightness are set to fairly large values so the effect of the enhancements is clearly visible. Both operations are applied to the original image.
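The sharpness, color, and filtering functions mentioned above can be used along the same lines; a minimal sketch (the enhancement factors are illustrative assumptions):

# A sketch of additional PIL enhancements and filters (factor values are illustrative)
from PIL import ImageEnhance, ImageFilter

ImageEnhance.Sharpness(im).enhance(2.0).show()   # sharpen via an enhancement factor
ImageEnhance.Color(im).enhance(1.5).show()       # boost color saturation

im.filter(ImageFilter.BLUR).show()               # blur filter
im.filter(ImageFilter.EMBOSS).show()             # emboss filter
im.filter(ImageFilter.EDGE_ENHANCE).show()       # edge enhancement filter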

Image Preprocessing with Keras

Keras is a high-level framework for deep learning. It is built on top of TensorFlow, one of the best-known deep learning libraries (another is PyTorch).

The following command installs Keras within the conda environment; TensorFlow, the supporting library, should be installed automatically as well.

(myenv) C:\Users\andis>conda install -c anaconda keras

Note that Keras uses the TensorFlow library as its backend (TensorFlow will use a GPU — Graphics Processing Unit — to speed up processing if one is installed). The following shows Keras’ functions to load and view the source image.

# C. IMAGE MANIPULATION with Keras (built on top of Tensorflow)
# - first import necessary libraries
from keras.preprocessing.image import load_img
import warnings

# - load and view the image
img = load_img(ImageFile)
print('Image type:', type(img), ', image format:', img.format, ', image mode:', img.mode, ', image size:', img.size)
img.show()

The output:

Using TensorFlow backend.
Image type: <class 'PIL.JpegImagePlugin.JpegImageFile'> , image format: JPEG , image mode: RGB , image size: (700, 393)

Then:

from keras.preprocessing.image import img_to_array, array_to_img

print('Original type of image:', type(img))

# - convert the image to numpy array
img_nparray = img_to_array(img)
print('Numpy array info:', type(img_nparray))
print('type:', type(img_nparray.dtype))
print('shape:', type(img_nparray.shape))

The output:

Original type of image: <class 'PIL.JpegImagePlugin.JpegImageFile'>
Numpy array info: <class 'numpy.ndarray'>
type: <class 'numpy.dtype'>
shape: <class 'tuple'>

And then:

print(img_nparray)

The output:

[[[ 82.  99. 143.]
  [ 93. 107. 156.]
  [ 99. 105. 165.]
  ...
  [215. 234. 248.]
  [215. 234. 248.]
  [215. 234. 248.]]

 ...

 [[116. 160. 223.]
  [ 98. 142. 205.]
  [103. 147. 208.]
  ...
  [ 69.  54. 139.]
  [ 72.  54. 140.]
  [ 74.  54. 139.]]]


Keras supports functions for loading, converting, and saving image data. The output above shows that the loaded image is in PIL format (class ‘PIL.JpegImagePlugin.JpegImageFile’), a JPEG in the RGB color space. When it is converted with Keras’ img_to_array() function, the format changes to NumPy’s ndarray (class ‘numpy.ndarray’). Next, we convert the array back to an image, then save and reload it with Keras.

img_pil = array_to_img(img_nparray)
print("Converted (back) type of the image:", type(img_pil))

# - saving an image with keras
from keras.preprocessing.image import load_img, save_img

ImageFile = FilePath + FileName + '_keras' + '.png'
print("Saving Image File with Keras...", FilePath + FileName)
save_img(ImageFile, img_nparray)

# - loading an image with keras
print("Loading Image File with Keras...", FilePath + FileName)
img = load_img(ImageFile)
print('Image type:', type(img), ', image format:', img.format, ', image mode:', img.mode, ', image size:', img.size)
img.show()

The output:

Converted (back) type of the image: <class 'PIL.Image.Image'>
Saving Image File with Keras... data/smurf
Loading Image File with Keras... data/smurf
Image type: <class 'PIL.PngImagePlugin.PngImageFile'> , image format: PNG , image mode: RGB , image size: (700, 393)

To go back from the ndarray, Keras’ array_to_img() function converts it to image format, and save_img() writes the ndarray to an image file. The following shows the image file generated by Keras’ save_img() function.

A PNG file generated by Keras, converted from the source image.

Keras, as one of the leading deep learning frameworks for computer vision, can perform many more variations of image preprocessing. Let’s look at a few examples using the ImageDataGenerator class, which can generate batches of tensor image data with real-time data augmentation.

Data augmentation is a very important feature when preparing images for training a deep learning model. Rotation, horizontal/vertical flips, brightness adjustment, rescaling, and zooming are some of the operations we can do with ImageDataGenerator; a sketch of a typical training-time configuration follows below.
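For context, this is a sketch of how ImageDataGenerator is often configured to generate augmented batches during training; the parameter values below are illustrative assumptions, not settings from this article:

# A sketch of a typical training-time augmentation setup (values are illustrative)
from keras.preprocessing.image import ImageDataGenerator
import numpy as np

datagen = ImageDataGenerator(
    rotation_range=20,        # random rotations up to +/- 20 degrees
    width_shift_range=0.1,    # horizontal shifts up to 10% of the width
    height_shift_range=0.1,   # vertical shifts up to 10% of the height
    zoom_range=0.2,           # random zoom in/out
    horizontal_flip=True,     # random horizontal flips
    rescale=1./255)           # scale pixel values to [0, 1]

# flow() expects a 4D array (samples, height, width, channels); here a batch of one image
batch_iter = datagen.flow(np.expand_dims(img_nparray, axis=0), batch_size=1)
augmented = next(batch_iter)[0]   # one randomly augmented image as an ndarray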

For the individual examples in this article, we initialize the ImageDataGenerator class with ‘Datagen = ImageDataGenerator()’, then use its apply_transform() function to perform various preprocessing operations one at a time.

The apply_transform() function takes 2 arguments (Chollet, Francois and others, 2015–2020). The first is x, the input image (which must be a 3D tensor). The second is transform_parameters, whose keys can include ‘theta’ (rotation angle in degrees), ‘tx’ (shift in the x direction), ‘ty’ (shift in the y direction), ‘shear’ (shear angle in degrees), ‘zx’ (zoom in the x direction), ‘zy’ (zoom in the y direction), ‘flip_horizontal’ (horizontal flip), ‘flip_vertical’ (vertical flip), ‘channel_shift_intensity’ (channel shift intensity), and ‘brightness’ (brightness shift intensity).

# - do image flip (horizontal)
from keras.preprocessing.image import ImageDataGenerator

img = load_img(ImageFile)
img_nparray = img_to_array(img)

Datagen = ImageDataGenerator()
flip_horizontal = Datagen.apply_transform(x=img_nparray, transform_parameters={'flip_horizontal': True})
array_to_img(flip_horizontal).show()

# - do image flip (vertical)
flip_vertical = Datagen.apply_transform(x=img_nparray, transform_parameters={'flip_vertical': True})
array_to_img(flip_vertical).show()

The following shows the results of the horizontal and vertical flips.

Doing Image Horizontal Flip and Vertical Flip with ImageDataGenerator apply_transform() function in Keras.

The following shows how to perform image rotation, zoom-out, and zoom-in. Again, all preprocessing operations are applied to the original source image.

# - do image rotation
rotate = Datagen.apply_transform(x=img_nparray, transform_parameters={'theta': -25})
array_to_img(rotate).show()

# - zoom-out in x and y direction
zoom_out = Datagen.apply_transform(x=img_nparray, transform_parameters={'zx': 2, 'zy': 2})
array_to_img(zoom_out).show()

# - zoom-in in x and y direction
zoom_in = Datagen.apply_transform(x=img_nparray, transform_parameters={'zx': .5, 'zy': .5})
array_to_img(zoom_in).show()

The output:

Doing Image Rotation, Zoom-Out and Zoom-In with ImageDataGenerator apply_transform() function in Keras.
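A few of the other transform_parameters listed earlier (shift, shear, brightness) can be applied in the same way; a sketch with illustrative values (the specific numbers are assumptions, not from the article):

# A sketch of a few more transform_parameters (values are illustrative)
shift = Datagen.apply_transform(x=img_nparray, transform_parameters={'tx': 50, 'ty': 30})  # shift in x and y
array_to_img(shift).show()

shear = Datagen.apply_transform(x=img_nparray, transform_parameters={'shear': 20})         # shear angle in degrees
array_to_img(shear).show()

bright = Datagen.apply_transform(x=img_nparray, transform_parameters={'brightness': 0.7})  # brightness shift
array_to_img(bright).show()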

Image Preprocessing with OpenCV

Finally, we discuss image preprocessing with OpenCV — the open source computer vision and machine learning library. OpenCV provides a common infrastructure for computer vision applications and accelerates the use of machine perception in commercial products (OpenCV Team, 2020).

In artificial intelligence, especially deep learning, OpenCV has extensively supported the implementation of many real-world computer vision applications, whether based on TensorFlow or PyTorch. Face recognition is one of its common use cases.

In Python, we use OpenCV by importing cv2. Basic image manipulations with OpenCV are shown below. We start by reading the RGB source image file and saving it to a PNG file. After that, we convert the color space of the image to grey and to HSV using the cvtColor() function. The HSV color space is commonly used for object detection and object tracking, as implemented in self-driving cars, for example.

# *********************************
# D. IMAGE MANIPULATION with OpenCV
# *********************************
# - first import necessary libraries
# import cv2

# - set ImageFile to our original image, then read the image
ImageFile = FilePath + FileName + FileExt
im = cv2.imread(ImageFile)
# cv2.imshow("my image", im)

# - save the file as PNG before conversion (internally it's in BGR order)
cv2.imwrite(FilePath + FileName + '_opencvRGB' + '.png', im)
print(type(im))

# - convert to GRAY color space
img = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
# - save the file as PNG after conversion
cv2.imwrite(FilePath + FileName + '_opencvGREY' + '.png', img)

# - convert to HSV color space (typical for object tracking)
img = cv2.cvtColor(im, cv2.COLOR_BGR2HSV)
# - save the file as PNG after conversion
cv2.imwrite(FilePath + FileName + '_opencvHSV' + '.png', img)

The output:

<class 'numpy.ndarray'>
Using cvtColor() function in OpenCV to do basic color space manipulations.
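Since cv2.imread() returns pixel data in BGR order (as noted at the beginning of this article), a common extra step when displaying an OpenCV image inline with Matplotlib is converting it to RGB first; a minimal sketch using the pyplot module imported earlier:

# A minimal sketch: convert BGR (OpenCV) to RGB before displaying with Matplotlib
img_rgb = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
pyplot.imshow(img_rgb)   # colors now appear correctly in the notebook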

In the following, we perform two more image manipulations: blur and edge detection, by calling the medianBlur() and Canny() functions respectively (Mokhtar Ali Ebrahim, 2019).

# - apply 50% blur
#   alternatively we can use cv2.imshow() to display the image instead of writing it to a file
#   e.g.:
#     cv2.imshow("OpenCV: apply 50% blur", img)
#     cv2.waitKey(0)
img = cv2.medianBlur(im, 5)
cv2.imwrite(FilePath + FileName + '_opencvBLUR' + '.png', img)

# - apply Edge Detection
img = cv2.Canny(im, 100, 200)
cv2.imwrite(FilePath + FileName + '_opencvEdgeDetect' + '.png', img)

The output:

More image manipulations with OpenCV — performing Blur and Edge Detection on the source image.

The following shows an enlarged view of the file on edge detection.

An enlarged view of Edge Detection with OpenCV.

Note that we keep using the cv2.imwrite() function to write each result to an external image file. Alternatively, we can use cv2.imshow() followed by cv2.waitKey() to display the image instead of writing it to a file, as in the sketch below.
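A minimal sketch of that display pattern (the window title ‘preview’ is an arbitrary choice):

# A minimal sketch of displaying a result in a window instead of writing a file
import cv2

edges = cv2.Canny(im, 100, 200)   # 'im' is the BGR image read earlier with cv2.imread()
cv2.imshow("preview", edges)
cv2.waitKey(0)                    # wait for any key press
cv2.destroyAllWindows()           # close the preview window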

What’s next then?

We have quickly touched on the Python Image Library (PIL), Matplotlib, Keras, and OpenCV with the Python programming language to illustrate their basic image preprocessing functionality. These tools have much more to offer than what is discussed in this article, just waiting for us to explore.

Being a Data Scientist (or just an aspiring one) requires continuous, life-long learning: exploring and experimenting with datasets, and having plenty of curiosity to stay up to date on the algorithms and tools used in multi-industry use cases.

Like scientists in other disciplines who dedicate their lifetimes to their area of research in order to stay current and push past the state of the art, the goals we define at the beginning of the journey may seem almost impossible to achieve. However, with strong persistence and a lot of patience, at the end of the road — though not always — all the effort we have put in will be worth it.

Becoming handy with practical tools for preprocessing datasets (in this case, working with and manipulating images or video streams) before doing any analytics or machine learning / deep learning modeling, and keeping that knowledge and experience up to date, is an invaluable skill for a Data Scientist.

Well, let’s get started by doing something. And the right time is now!

References

Anaconda, 2020, “Anaconda Individual Edition: Anaconda 2020.02 for Windows Installer, Python 3.7 version”.

Andi Sama et al., 2020, A brief look into Data Wrangling: An Essential Supporting Skill for doing Analytics in Data Science.

Andi Sama et al., 2019, Think like a Data Scientist.

Avforums, 2017, “Smurfs The Lost Village 4K Ultra HD Blu-ray Disc Review”.

Francois Chollet and others, 2015–2020, “ImageDataGenerator methods: apply_transform”, Keras Documentation.

Fredrik Lundh, 1995–2011 and Alex Clark and Contributors, 2010–2020, “Pillow Tutorial: Using the Image class”.

Gaurav Singhal, 2020, “Importing Image Data into NumPy Arrays”.

Justin Johnson, Spring 2020, “CS231n Convolutional Neural Networks for Visual Recognition: Python Numpy Tutorial”, Stanford University.

Mokhtar Ali Ebrahim, 2019, “Python Image Processing Tutorial (Using OpenCV)”.

Pranav Sreedhar, 2019, “Installing OpenCV for Python on Windows using Anaconda or WinPython”.

OpenCV Team, 2020, “Open Source Computer Vision Library”, OpenCV.org.

Sinergi Wahana Gemilang, 2017–2020, Multiple articles on Internet of Things (IoT), Artificial Intelligence (AI) and Quantum Computing, SWG Insight quarterly — multiple editions.
