Image Enhancement in Machine Learning: the Ultimate Guide

Image enhancement improves an image’s visual quality by adjusting its features like brightness, contrast, sharpness, color, etc. The main goal of image enhancement is to make the image more visually appealing and easier to interpret - both for humans and machine learning models. This article serves as an ultimate guide for image enhancement in 2023.

There are two main methods for performing image enhancement:
Spatial Domain Methods
Frequency Domain Methods

Spatial Domain Methods operate directly on the pixels of an image. Frequency Domain Methods instead compute the Fourier transform of the image, modify it in the frequency domain, and then apply the inverse Fourier transform to obtain the final image.
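As a rough illustration of the frequency-domain route, the sketch below applies a simple low-pass filter with NumPy's FFT routines. The function name `fft_low_pass`, the circular mask, and the `radius` parameter are illustrative assumptions, not something prescribed by this article.

```python
import numpy as np

def fft_low_pass(image: np.ndarray, radius: int) -> np.ndarray:
    """Keep only the frequencies within `radius` of the spectrum centre."""
    # Forward Fourier transform, with the zero frequency shifted to the centre.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    rows, cols = image.shape
    y, x = np.ogrid[:rows, :cols]
    mask = (y - rows // 2) ** 2 + (x - cols // 2) ** 2 <= radius ** 2
    # The inverse transform brings the filtered spectrum back to pixel space.
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return np.real(filtered)

# Hypothetical test image: a flat grey patch with Gaussian noise on top.
rng = np.random.default_rng(0)
noisy = np.full((64, 64), 100.0) + rng.normal(0, 20, (64, 64))
smooth = fft_low_pass(noisy, radius=8)
```

Removing the high frequencies smooths the noise while leaving the average brightness unchanged.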

As previously mentioned, image enhancement changes the attributes of an image, such as contrast and brightness. Hence, changing various image characteristics leads to new pixel values resulting in a different picture. For example, given a dark image, you can make it clearer by stretching the grayscale at the dark levels.

Besides the traditional image enhancement techniques, enhancement can also be done using machine learning (which is why you might be reading this now). Traditional methods can take a long time, while machine learning techniques can achieve reasonably good results in a shorter period.

This article will explore various image enhancement techniques and how they have been implemented using machine learning.

Let’s dive in!

What is image augmentation?

Image augmentation is a technique in computer vision to supplement the dataset with artificial variations of existing images. The goal of image augmentation is to prevent overfitting, increase the diversity of the data, and make the model more robust to different types of data. Augmentation can be conducted with transformations such as rotation, scaling, flipping, cropping, or adding noise to the images. The newly synthesized images are used to train the machine learning models, leading to better generalization and improved performance on unseen data.

For example, consider a deep learning model trained to recognize handwritten digits. Suppose, like in the MNIST dataset, the training data consists of only 28x28 pixel grayscale images of digits centered in the picture. In that case, the model may only be able to recognize digits that appear in a similar fashion. To reduce the likelihood of this happening, the training data can be augmented by rotating, scaling, or skewing the images, adding random noise, or blurring them to simulate different capture conditions. This helps the ML model be more robust and recognize handwritten digits in various formats. That is why, specifically for MNIST, a handful of extended or alternative datasets such as EMNIST and notMNIST were introduced.

Examples of image augmentation include:

Flipping the image
Random cropping
Reducing the contrast or brightness
Shearing the image
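
A few of the augmentations listed above can be sketched with plain NumPy. This is a minimal illustration on a made-up 28x28 array; the helper names `random_crop` and `augment`, the crop size, and the brightness factor are assumptions chosen for the example. Shearing is left out to keep the sketch short.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def random_crop(image: np.ndarray, crop_h: int, crop_w: int) -> np.ndarray:
    """Cut a crop_h x crop_w window at a random position."""
    top = rng.integers(0, image.shape[0] - crop_h + 1)
    left = rng.integers(0, image.shape[1] - crop_w + 1)
    return image[top:top + crop_h, left:left + crop_w]

def augment(image: np.ndarray) -> list:
    """Return three artificial variations of one image."""
    flipped = image[:, ::-1]                                     # horizontal flip
    cropped = random_crop(image, 24, 24)                         # random crop
    darker = (image.astype(np.float32) * 0.6).astype(np.uint8)   # reduced brightness
    return [flipped, cropped, darker]

# Stand-in for a 28x28 grayscale training image.
digit = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)
variants = augment(digit)
```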

Image augmentation is also widely applied when the amount of training data is not enough.

What is image enhancement?

Image enhancement encompasses a wide range of techniques aimed at improving the quality and visual appeal of an image. It involves the manipulation of either the entire image or specific attributes within it. For example, adjusting the contrast and brightness can bring out details and make the image more vibrant, while fine-tuning the range of the RGB color pattern can enhance the color accuracy and overall tonality.

The nature and extent of these image modifications can vary significantly depending on the specific goal of the enhancement. Image enhancement is, to some extent, a subjective process as the desired outcome heavily relies on the intended purpose of the image. For instance, in the realm of medical imaging, the objective may be to highlight and emphasize specific structures or anomalies within the body. In contrast, when it comes to headshot photography, the primary focus might be on accentuating facial features while fine-tuning brightness and contrast levels to achieve a pleasing aesthetic.

Traditionally, image enhancement has been accomplished through the use of specialized image editing software like Photoshop and Affinity. However, advancements in generative AI technology have opened up new possibilities. Innovative products such as Midjourney, Stable Diffusion (which was trained on LAION data), DreamStudio, and Adobe’s Firefly have emerged, enabling image enhancement through text-guided generative models. While these AI-based approaches are gaining traction, the predominant methods of image enhancement currently involve programmatically manipulating images using libraries and tools such as Pillow, OpenCV, and machine learning algorithms.

Moreover, image enhancement technology has proven to be invaluable in the field of medical imaging. By applying enhancement techniques to radiographs, CT scans, and other medical images, the aim is to facilitate the interpretation of results by medical professionals. These enhancements can improve the visibility of critical structures, highlight anomalies or abnormalities, and enhance the overall diagnostic accuracy, ultimately aiding in better patient care.

What are some examples of image enhancement?

Histogram Equalization: Histogram equalization is a method of image enhancement that stretches the contrast of an image by spreading out its most frequent intensity values. In code, this is most often achieved using the OpenCV or Pillow libraries with the following functions: cv2.equalizeHist() or PIL.ImageOps.equalize().
Gamma Correction: Gamma correction brightens or darkens an image by applying a power-law curve to its pixel values. In code, this can be accomplished by passing a precomputed gamma lookup table to cv2.LUT() in OpenCV or to Image.point() in Pillow. PIL.ImageEnhance.Brightness() is a simpler alternative that adjusts brightness without applying a gamma curve.
Contrast Stretching: Contrast stretching is a technique that stretches the contrast of an image in order to increase its visibility. In code, contrast stretching is most often done using the OpenCV or Pillow libraries with the following functions: cv2.normalize() or PIL.ImageOps.autocontrast().
Sharpening: Sharpening is a technique used to enhance edges and fine details in an image. In code, this can be achieved using the OpenCV or Pillow libraries with the following functions: cv2.filter2D() or PIL.ImageEnhance.Sharpness().
Noise Reduction: Noise reduction is an image enhancement technique used to reduce the amount of noise in an image. In code, this can be accomplished using the OpenCV or Pillow libraries with the following functions: cv2.blur() or PIL.ImageFilter.GaussianBlur().
Image Dehazing: Image dehazing is a common image enhancement technique used to reduce the amount of haze in an image. Pillow has no built-in dehazing function, and cv2.bilateralFilter() on its own only performs edge-preserving smoothing. Dehazing is typically implemented with a dedicated algorithm, such as the dark channel prior, for which these libraries provide the building blocks.
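
To make the first two techniques more concrete, here is a minimal NumPy sketch of what histogram equalization and gamma correction compute. It is not the OpenCV or Pillow implementation. The function names and the test image are assumptions for illustration, and the input is assumed to be an 8-bit grayscale array.

```python
import numpy as np

def equalize_histogram(gray: np.ndarray) -> np.ndarray:
    """Histogram equalization for an 8-bit grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    # Map each grey level through the normalised cumulative histogram.
    lut = np.round((cdf - cdf_min) / max(cdf[-1] - cdf_min, 1) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]

def gamma_correct(gray: np.ndarray, gamma: float) -> np.ndarray:
    """Power-law mapping via a lookup table; gamma < 1 brightens, gamma > 1 darkens."""
    lut = np.round(255.0 * (np.arange(256) / 255.0) ** gamma).astype(np.uint8)
    return lut[gray]

# Made-up dark image that only uses grey levels 0..63.
dark = np.tile(np.arange(64, dtype=np.uint8), (4, 1))
equalized = equalize_histogram(dark)
brightened = gamma_correct(dark, gamma=0.5)
```

After equalization the image spans the full 0 to 255 range, and the gamma-corrected image is brighter on average than the input.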

What is the difference between image enhancement and image augmentation?

While image enhancement is mainly focused on changing the visual appeal of a single image, image augmentation aims to generate additional image files to expose a deep learning model to more aspects of the same image.

Advanced examples of image enhancement

Image quality enhancement techniques such as super-resolution and low-light image enhancement are used to recover as much detail as possible from low-resolution or poorly lit images. Accurate Image Super-Resolution Using Very Deep Convolutional Networks (VDSR) applies a very deep convolutional network to upscale images, while Zero-Reference Deep Curve Estimation for Low-Light Image Enhancement (Zero-DCE) uses a deep network to estimate brightness-adjustment curves for an image. Such methods reduce noise and increase contrast and brightness, which makes them useful in applications such as security cameras and object detection. LIME: A Method for Low-Light Image Enhancement also tackles the low-light problem, by creating an illumination map from the R, G, and B channels. It does not require paired or unpaired training data, which makes it less prone to overfitting and usable in applications such as security cameras, object detection, and scene understanding.
Low-light examples
Low-light enhancement

Joint image filtering

Image filtering with guidance signals is known as joint or guided filtering. Joint image filtering involves transferring critical structural details from the guidance image to the target image. It aims at enhancing the target image without transferring structures that do not exist in the target image. It is typically applied in various computer vision tasks such as:

Structure-texture separation
Joint upsampling
Cross-modality noise reduction
Depth map enhancement

The paper Joint Image Filtering with Deep Convolutional Networks proposes a CNN with three sub-networks and skip connections. The first two sub-networks extract informative features from the guidance image and the input image. These features are concatenated and used as input to the third sub-network. Skip connections are used so that the network learns to predict residuals between the target image and the ground truth output. The parameters of the three sub-networks are updated simultaneously during training.
Joint image filtering example

Image shifting

Image shifting involves shifting the pixels in an image to new locations. For example, you can shift an image vertically or horizontally. When shifting an image, all the image’s pixels are moved to a new position while the dimensions of the image are maintained.
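
A minimal NumPy sketch of such a shift is shown below. The function name `shift_image` and the choice to fill vacated pixels with a constant are assumptions for the example.

```python
import numpy as np

def shift_image(image: np.ndarray, dx: int, dy: int, fill: int = 0) -> np.ndarray:
    """Shift right by dx and down by dy pixels (negative values shift left/up).

    The output keeps the input dimensions; vacated pixels are set to `fill`.
    """
    shifted = np.full_like(image, fill)
    h, w = image.shape[:2]
    if abs(dx) >= w or abs(dy) >= h:
        return shifted  # everything was shifted out of the frame
    # Source and destination windows that still overlap after the shift.
    src_y = slice(max(0, -dy), min(h, h - dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_y = slice(max(0, dy), min(h, h + dy))
    dst_x = slice(max(0, dx), min(w, w + dx))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    return shifted

img = np.arange(1, 13).reshape(3, 4)
right = shift_image(img, dx=1, dy=0)   # one pixel to the right
up = shift_image(img, dx=0, dy=-1)     # one pixel up
```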

Changing colors/recoloring

Colorization is commonly applied in changing black and white images to colored images. This can be achieved using photo editing tools or neural networks. It is applicable in photography, where one wants to color old images and videos.

Coloring images manually is a cumbersome, expensive, and time-consuming process. Fortunately, image recoloring can be automated using deep learning. This is done using a Convolutional Neural Network that takes in an input image and predicts the colors to fill the image.

Image recoloring can also be applied to colored images where you want to change the colors in an image. Recoloring images can be used to achieve various artistic goals. Recoloring is also applied in photography to ensure an image looks professional. The example below shows recoloring a flower to purple, green, red, and orange using Generative Adversarial Networks (GANs).
Image recoloring

Image denoising

The denoising convolutional neural network (DnCNN) is a technique for reducing image noise using feedforward neural networks. It uses residual learning and batch normalization to speed up training and improve denoising performance. The method tackles denoising with unknown levels of Gaussian noise, also called blind Gaussian denoising. The first image below shows a noisy image, while the second one shows the result generated by DnCNN.
Image denoising example
Image denoising cleaned example

Image demosaicing

Color image demosaicing involves interpolating missing color values using nearby pixels in raw images captured by cameras. The images are typically captured with 12 bits per pixel, and Color Filter Arrays (CFA) are placed in front of the camera’s image sensor array. CFAs allow different wavelengths of light to pass through the camera, and one common type is the Bayer pattern, which captures information at red, green, and blue wavelengths.

The output of the Bayer pattern is a mosaiced image, which needs to be converted to a standard RGB image through the process of color image demosaicing. During image capture, a camera runs a demosaicing algorithm to generate the full RGB image. Bilinear interpolation is a commonly used method in demosaicing, where missing pixels are assumed to be similar to adjacent ones, and the missing values are replaced by the average of neighboring pixels. However, demosaicing itself does not remove noise from the image, so additional denoising may be required, either through a separate denoising algorithm or by using an algorithm that combines denoising and demosaicing.

Traditional demosaicing methods include bilinear interpolation and Malvar interpolation. In addition to these conventional techniques, machine learning methods can also be applied to demosaicing. Demosaicing is similar to super-resolution in that both involve filling in missing pixel information, so super-resolution techniques can be used in demosaicing as well. Some machine learning methods suitable for image demosaicing are k-Nearest Neighbors, Support Vector Regression, and the Super-Resolution Convolutional Neural Network.

Nearest neighbor Bayer image demosaicing can be performed in the following steps:

Split the Bayer-pattern image into three color channels.
Fill in missing pixel information for each color channel using bilinear interpolation.
Merge the interpolation result with the detailed information predicted by the k-Nearest Neighbors model.
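
The first two steps can be sketched in NumPy as follows. This is an illustrative sketch only: it assumes an RGGB Bayer layout, the helper names are made up, and the third step (merging with the k-Nearest Neighbors prediction) is not shown.

```python
import numpy as np

def conv3x3(img: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """3x3 filtering with reflect padding, which preserves the Bayer parity."""
    padded = np.pad(img, 1, mode="reflect")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def demosaic_bilinear(bayer: np.ndarray) -> np.ndarray:
    """Bilinear demosaicing of an RGGB mosaic (the layout is an assumption)."""
    h, w = bayer.shape
    masks = np.zeros((3, h, w))
    masks[0, 0::2, 0::2] = 1   # red samples
    masks[1, 0::2, 1::2] = 1   # green samples on red rows
    masks[1, 1::2, 0::2] = 1   # green samples on blue rows
    masks[2, 1::2, 1::2] = 1   # blue samples
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=np.float64)
    k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]], dtype=np.float64)
    channels = []
    for mask, kernel in zip(masks, (k_rb, k_g, k_rb)):
        sparse = bayer * mask  # step 1: one sparse plane per colour
        # Step 2: each missing value becomes the average of its known neighbours.
        channels.append(conv3x3(sparse, kernel) / conv3x3(mask, kernel))
    return np.stack(channels, axis=-1)

rng = np.random.default_rng(0)
mosaic = rng.integers(0, 4096, size=(8, 8)).astype(np.float64)  # 12-bit raw values
rgb = demosaic_bilinear(mosaic)
```

Pixels that were actually measured keep their values; only the missing samples in each channel are interpolated.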

Watermark Removal

Adding watermarks to images serves various purposes, such as preventing unauthorized copying or reproduction. However, watermarks can sometimes obscure details in the image. In such cases, it becomes necessary to remove the watermark to reveal those hidden details.

Defading

Defading is a process used to recover information from faint or degraded images. For instance, the ink on old documents can wear off over time, making the text difficult to read. Fading can also occur due to overexposure during the digitization of documents. Defading methods aim to restore a more visible and legible version of the image by mitigating the effects of fading.
Defading example

Unblur

Unblur is the technique employed to remove blur from an image, enhancing its clarity. Blur in images can be caused by subject movement during capture, camera focus issues, or camera shake. Various methods can be used to deblur images, including prior-based methods that estimate blur kernels and parameters, as well as learning-based methods that utilize deep learning algorithms to learn deblurring models.
Unblurring example

Binarization

Binarization is a process that separates an image into foreground and background, resulting in a binary image with black-and-white regions. It is commonly used to eliminate degradations like noise and is crucial for tasks such as optical character recognition and document layout analysis. Binarization can be performed using different approaches, including global thresholding and local thresholding based on neighboring pixels.
Binarization example
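
The difference between the two thresholding approaches can be sketched with NumPy. The function names, the window size, the offset, and the synthetic page below are all assumptions for illustration. The local variant compares each pixel with the mean of its neighborhood.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def global_threshold(gray: np.ndarray, threshold: float = 128) -> np.ndarray:
    """One threshold for the whole image."""
    return np.where(gray >= threshold, 255, 0).astype(np.uint8)

def local_threshold(gray: np.ndarray, window: int = 15, offset: float = 5) -> np.ndarray:
    """Compare each pixel with the mean of its (odd-sized) neighbourhood."""
    pad = window // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    local_mean = sliding_window_view(padded, (window, window)).mean(axis=(-2, -1))
    return np.where(gray >= local_mean - offset, 255, 0).astype(np.uint8)

# Synthetic page: unevenly lit background with a darker strip of "text".
page = np.tile(np.linspace(100, 200, 64), (64, 1))
page[20:24, 10:50] -= 60
binary_global = global_threshold(page)
binary_local = local_threshold(page)
```

On this unevenly lit page the global threshold marks part of the dim background as foreground, while the local threshold isolates the text strip.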

Histogram Matching

Histogram matching, also known as histogram specification, is an image processing technique that transforms an image so that its histogram matches a specified histogram. It involves obtaining histograms for both the image being adjusted and a reference image, computing their cumulative distribution functions, and applying the resulting mapping to each pixel of the image being adjusted. Histogram matching is employed to normalize images taken under different conditions and can be performed using libraries like skimage, OpenCV, or deep learning techniques.
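
The procedure can be sketched in NumPy as below. This is a minimal illustration rather than the implementation used by any particular library; the function name and the random test images are assumptions.

```python
import numpy as np

def match_histogram(source: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Remap the grey levels of `source` so its histogram resembles `reference`."""
    src_values, src_idx, src_counts = np.unique(
        source.ravel(), return_inverse=True, return_counts=True
    )
    ref_values, ref_counts = np.unique(reference.ravel(), return_counts=True)
    # Cumulative distribution functions of both images.
    src_cdf = np.cumsum(src_counts) / source.size
    ref_cdf = np.cumsum(ref_counts) / reference.size
    # For each source grey level, pick the reference level with the same CDF value.
    mapped = np.interp(src_cdf, ref_cdf, ref_values)
    return mapped[src_idx].reshape(source.shape)

rng = np.random.default_rng(0)
dark_image = rng.integers(0, 100, size=(64, 64))       # made-up dark image
bright_image = rng.integers(150, 250, size=(64, 64))   # made-up bright reference
matched = match_histogram(dark_image, bright_image)
```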

Contrast-Limited Adaptive Histogram Equalization (CLAHE)

CLAHE is a method for contrast enhancement that works on small regions or tiles of an image. It improves local contrast, which is particularly useful in areas such as microscopic imaging, X-ray imaging, medical image analysis, and high-definition television (HDTV). CLAHE utilizes parameters like the number of tiles and clip limit to control noise amplification and enhance visibility in various applications.

Wiener Filter

The Wiener filter is an image processing technique that removes additive noise and reverses blurring at the same time. It outperforms the inverse filter by accounting for both the degradation function and the statistical properties of the noise, resulting in a better restoration of the original image. Wiener filtering minimizes the mean squared error across noise smoothing and inverse filtering, providing a linear estimate of the original image.
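
A frequency-domain sketch of Wiener deconvolution is shown below. It makes several simplifying assumptions that are not from this article: the blur kernel is known, the blur is circular, and the noise-to-signal power ratio is a single constant (`noise_to_signal`). The function name and test data are also illustrative.

```python
import numpy as np

def wiener_deconvolve(blurred: np.ndarray, kernel: np.ndarray,
                      noise_to_signal: float = 0.01) -> np.ndarray:
    """Restore an image blurred by a known kernel (circular blur assumed)."""
    H = np.fft.fft2(kernel, s=blurred.shape)   # transfer function of the blur
    G = np.fft.fft2(blurred)                   # spectrum of the degraded image
    # Wiener filter: conj(H) / (|H|^2 + K). With K = 0 this reduces to the
    # plain inverse filter 1 / H, which amplifies noise where H is small.
    W = np.conj(H) / (np.abs(H) ** 2 + noise_to_signal)
    return np.real(np.fft.ifft2(W * G))

rng = np.random.default_rng(0)
sharp = rng.random((64, 64))                   # made-up test image
kernel = np.ones((3, 3)) / 9.0                 # 3x3 box blur
blurred = np.real(np.fft.ifft2(np.fft.fft2(sharp) * np.fft.fft2(kernel, s=sharp.shape)))
blurred += rng.normal(0, 0.001, sharp.shape)   # small additive noise
restored = wiener_deconvolve(blurred, kernel, noise_to_signal=1e-3)
```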

Median Filter

Median filter is a non-linear image processing technique used to remove noise while preserving edges. It replaces each pixel in an image with the median value of neighboring pixels within a window or neighborhood. The median filter is particularly effective in removing salt and pepper noise, producing better results than the mean filter due to its robustness and preservation of edge information.
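
The operation itself is simple enough to sketch in NumPy. The function name and the edge-replication padding are assumptions for this example; cv2.medianBlur() is the OpenCV counterpart used later in this article.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter(gray: np.ndarray, size: int = 3) -> np.ndarray:
    """Replace each pixel with the median of its size x size neighbourhood."""
    pad = size // 2
    padded = np.pad(gray, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1)).astype(gray.dtype)

# Flat grey image with a few salt (255) and pepper (0) pixels.
noisy = np.full((32, 32), 100, dtype=np.uint8)
noisy[5, 5] = 255
noisy[10, 20] = 0
noisy[25, 7] = 255
cleaned = median_filter(noisy, size=3)
```

Because each outlier is a minority in its window, the median ignores it entirely, whereas a mean filter would smear it over its neighbors.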

Linear Contrast Enhancement

Linear contrast enhancement, also known as contrast stretching, is a method for stretching the pixel values in an image to a new distribution. It can be achieved through different approaches such as Min-Max Linear Contrast Stretch, Percentage Linear Contrast Stretch, or Piecewise Linear Contrast Stretch. Linear contrast enhancement improves the visibility of attributes in an image by expanding the range of contrast.
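
Two of these variants can be sketched in NumPy as follows. The function names and the 2nd/98th percentile defaults are assumptions for illustration, and the output is assumed to be 8-bit.

```python
import numpy as np

def min_max_stretch(gray: np.ndarray) -> np.ndarray:
    """Min-max linear stretch: map [min, max] onto [0, 255]."""
    lo, hi = float(gray.min()), float(gray.max())
    if hi == lo:
        return gray.astype(np.uint8)
    return ((gray - lo) / (hi - lo) * 255).round().astype(np.uint8)

def percentile_stretch(gray: np.ndarray, low_pct: float = 2,
                       high_pct: float = 98) -> np.ndarray:
    """Percentage linear stretch: clip the tails, then stretch the rest."""
    lo, hi = np.percentile(gray, [low_pct, high_pct])
    if hi == lo:
        return gray.astype(np.uint8)
    scaled = (gray.astype(np.float64) - lo) / (hi - lo)
    return (np.clip(scaled, 0, 1) * 255).round().astype(np.uint8)

# Made-up low-contrast image that only spans grey levels 80..120.
flat = np.tile(np.linspace(80, 120, 64), (64, 1)).astype(np.uint8)
stretched = min_max_stretch(flat)
stretched_pct = percentile_stretch(flat)
```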

Unsharp Mask Filtering

Unsharp mask filtering is an image processing technique used to enhance image sharpness and reveal details that may not be clear in the original image. It involves removing low-frequency spatial information by creating an unsharp mask through the application of a Gaussian low-pass filter. Combining the unsharp mask with the original image results in a less blurry image, although it may increase noise.
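
A NumPy sketch of the idea is shown below. The helper names and the `sigma` and `amount` parameters are assumptions for this example, and the Gaussian blur is a simple separable implementation rather than a library call.

```python
import numpy as np

def gaussian_blur(gray: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    """Separable Gaussian low-pass filter with reflect padding."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(gray.astype(np.float64), radius, mode="reflect")
    # Filter along rows first, then along columns.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def unsharp_mask(gray: np.ndarray, sigma: float = 2.0, amount: float = 1.0) -> np.ndarray:
    """Add the high-frequency detail (original minus blurred) back to the image."""
    blurred = gaussian_blur(gray, sigma)
    detail = gray - blurred                 # what the low-pass filter removed
    sharpened = gray + amount * detail
    return np.clip(sharpened, 0, 255).round().astype(np.uint8)

# Made-up image with one vertical edge between grey levels 50 and 200.
edge = np.full((32, 32), 50, dtype=np.uint8)
edge[:, 16:] = 200
sharpened = unsharp_mask(edge)
```

Flat regions are left unchanged, while the contrast right at the edge is exaggerated, which is what makes the result look sharper.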

Using Deep Lake + Pillow for image enhancement

Finding suitable datasets for image enhancement tasks can be challenging. However, several datasets are available for specific purposes, such as the LOL (Low-Light) dataset for low-light image enhancement, ARID dataset for activity recognition in videos, Bickley diary dataset containing images of a diary, and NoisyOffice dataset with noisy images of an office. These datasets can be used for training machine learning models or evaluating image enhancement algorithms.

The LOL (Low-Light) dataset can be easily downloaded using deeplake from Activeloop.

 
      
    import deeplake
    ds = deeplake.load('hub://activeloop/lowlight-train')

Next, let’s look at a summary of the data.

ds.summary()

Activeloop also makes it easy to visualize the data using the visualize function.

ds.visualize()

With some datasets at hand and having learned various image enhancement techniques, let’s now look at how to implement them in Python using Pillow and OpenCV.

First, copy the dataset from Activeloop to gain write access. This is done using the deepcopy function, which expects:

The data source
The dataset destination, that is, your personal Activeloop account
Your API token

An example call is shown below:

 
      
    ds = deeplake.deepcopy('hub://activeloop/lowlight-train', dest="hub://mwitiderrick/lowlight-data", token="YOUR_TOKEN")

Next, load your version of the low-light dataset. Passing your API token and read_only=False gives you write access to the dataset.

 
      
    ds = deeplake.load('hub://mwitiderrick/lowlight-data', token="YOUR_TOKEN", read_only=False)

Make images sharper with Pillow

Let’s start by making all the images sharper. First, check out a new branch. You can then commit the changes you make to this branch and later merge it into main if you want to.

 
      
    from PIL import Image, ImageEnhance, ImageFilter
    import numpy as np

    # Create the branch and switch to it
    ds.checkout('sharp_image', create=True)

Next, loop through the images, make them sharp using Pillow, and commit the changes.

 
      
    with ds:
        for i, sample in enumerate(ds):
            image = ds['highlight_images'][i].numpy()
            im = Image.fromarray(image).convert('RGB')
            # A factor above 1.0 sharpens; 1.0 returns the original image
            enhancer = ImageEnhance.Sharpness(im)
            ds.highlight_images[i] = np.asarray(enhancer.enhance(10.0))
        sharpen_commit_id = ds.commit('Sharpen the images')

Sharpen commit
Head back to your Activeloop account to see the new images. Click Version and switch to the sharp_image branch.

Increase the brightness of images

You can increase the brightness of all the images in a similar way. The steps are:

Create a new branch for the new images
Brighten the images
Commit the changes

This can be done with the following code.

 
      
    ds.checkout('bright_image', create=True)
    with ds:
        for i, sample in enumerate(ds):
            image = ds['highlight_images'][i].numpy()
            im = Image.fromarray(image).convert('RGB')
            # ImageEnhance.Brightness changes brightness; a factor above 1 brightens
            enhancer = ImageEnhance.Brightness(im)
            ds.highlight_images[i] = np.asarray(enhancer.enhance(3))
    bright_commit_id = ds.commit('Increase brightness')

Sharpened example
Switch to the bright_image branch on Activeloop to see the brightened images.

Find edges in an image

Finally, let’s look at how you can use Pillow to find edges in the images. Create a branch for the new images and commit the changes there.

 
      
    ds.checkout('find_edges_image', create=True)

    with ds:
        for i, sample in enumerate(ds):
            image = ds['highlight_images'][i].numpy()
            im = Image.fromarray(image).convert('RGB')
            # FIND_EDGES is a built-in Pillow convolution filter
            ds.highlight_images[i] = np.asarray(im.filter(ImageFilter.FIND_EDGES))
    edges_commit_id = ds.commit('find the edges')

Using Deep Lake + OpenCV for image enhancement

OpenCV is another Python package for processing images. Let’s apply the median filter to all images using OpenCV. To do that, call the medianBlur function. It expects the image and the kernel size, which must be an odd integer greater than 1.

 
      
    import cv2

    ds.checkout('median_filter', create=True)

    with ds:
        for i, sample in enumerate(ds):
            image = ds['highlight_images'][i].numpy()
            # 11 is the kernel size (an 11x11 neighborhood)
            ds.highlight_images[i] = cv2.medianBlur(image, 11)
    median_commit_id = ds.commit('Median filter')

Sharpening images with OpenCV

Sharpening an image in OpenCV involves creating a sharpening kernel and passing the kernel and the image to the filter2D function.

 
      
    ds.checkout('cv_sharpen', create=True)

    with ds:
        # The weights sum to 1 (8 * -1 + 9), so overall brightness is preserved
        sharp_kernel = np.array([[-1, -1, -1], [-1, 9, -1], [-1, -1, -1]])
        for i, sample in enumerate(ds):
            image = ds['highlight_images'][i].numpy()
            # ddepth=-1 keeps the output depth the same as the input
            ds.highlight_images[i] = cv2.filter2D(image, -1, sharp_kernel)
        sharp_commit_id = ds.commit('Sharpen image')

Apart from viewing the changes from the Activeloop web UI, you can also check out to previous branches in your coding environment and visualize the image. For example, let’s take a look at a sample image that has been sharpened.

`Image.fromarray(ds.highlight_images[1].numpy())`

Check out the main branch and compare this image with the original one.

 
      
    ds.checkout('main')
    Image.fromarray(ds.highlight_images[1].numpy())

You can clearly see the difference between the original and the modified image.

Blurring images with OpenCV

Blurring an image in OpenCV is done by creating a blurring kernel and applying it to the image with the filter2D function.

 
      
    ds.checkout('blurring', create=True)

    with ds:
        # 4x4 averaging kernel: 16 equal weights that sum to 1
        kernel_4x4 = np.ones((4, 4), np.float32) / 16
        for i, sample in enumerate(ds):
            image = ds['highlight_images'][i].numpy()
            ds.highlight_images[i] = cv2.filter2D(image, -1, kernel_4x4)
        # Commit once, after every image has been blurred
        blurred_commit_id = ds.commit('blur images')

Deep learning for image enhancement

Various papers have proposed deep learning methods for enhancing images. They include:

Learning Enriched Features for Real Image Restoration and Enhancement
Uformer: A General U-Shaped Transformer for Image Restoration
Learning to Enhance Low-Light Image via Zero-Reference Deep Curve Estimation
EnlightenGAN: Deep Light Enhancement without Paired Supervision
Zero-Reference Deep Curve Estimation for Low-Light Image Enhancement
Accurate Image Super-Resolution Using Very Deep Convolutional Networks

The implementations and pre-trained models for some of these papers are available on GitHub. For example, using torch-enhance, you can perform super-resolution. Note that the code below uses the library’s SRResNet model rather than the network from the last paper in the list.

To perform super-resolution with torch-enhance, you:

Fetch the pre-trained model.
Initialize the model with the desired configuration, in this case, doubling the resolution of the image.
Apply the model to the image.
Convert the output to an image.
Save or display the image.

And here is the code to do this.

 
      
    import numpy as np
    import torch
    import torch_enhance
    import torchvision
    import torchvision.transforms as T
    from PIL import Image

    ds.checkout('super_resolution', create=True)

    i = 234
    image = ds['highlight_images'][i].numpy()

    # Convert to a float tensor in [0, 1] with shape (C, H, W)
    lr = torchvision.transforms.functional.pil_to_tensor(Image.fromarray(image).convert('RGB')) / 255.0
    model = torch_enhance.models.SRResNet(scale_factor=2, channels=3)
    model.eval()
    with torch.no_grad():  # inference only, no gradients needed
        sr = model(lr.unsqueeze(0))

    # Clamp to the valid range before converting back to an image
    transform = T.ToPILImage()
    img = transform(sr.squeeze(0).clamp(0, 1))
    ds.highlight_images[i] = np.asarray(img)

    super_commit_id = ds.commit('super resolution')

Inspecting the image shows that the size has increased and the dimensions have doubled.

Image enhancement metrics and loss functions

Machine learning models for image enhancement are trained with loss functions and evaluated with metrics. Some of the losses used include:

Perceptual loss
Mean Squared Error

Metrics used to track the performance of image enhancement methods include:

Mean absolute error (MAE)
Mean squared error (MSE)
Peak signal-to-noise ratio (PSNR)
Structural Similarity Index (SSIM)
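
The first three metrics are straightforward to compute with NumPy, as in the sketch below. The function names and the 8-bit `max_value` default are assumptions. SSIM is more involved and is left out here; scikit-image provides it as skimage.metrics.structural_similarity.

```python
import numpy as np

def mae(reference: np.ndarray, output: np.ndarray) -> float:
    """Mean absolute error between two images of the same shape."""
    diff = reference.astype(np.float64) - output.astype(np.float64)
    return float(np.mean(np.abs(diff)))

def mse(reference: np.ndarray, output: np.ndarray) -> float:
    """Mean squared error between two images of the same shape."""
    diff = reference.astype(np.float64) - output.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(reference: np.ndarray, output: np.ndarray, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in decibels; higher means closer to the reference."""
    error = mse(reference, output)
    if error == 0:
        return float("inf")
    return float(10 * np.log10(max_value ** 2 / error))

# Made-up example: every pixel of the output is off by 10 grey levels.
reference = np.zeros((4, 4), dtype=np.uint8)
output = reference + 10
scores = (mae(reference, output), mse(reference, output), psnr(reference, output))
```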

Final thoughts

There are many image enhancement techniques. The choice of method depends on the subsequent task, such as object detection or optical character recognition. Generally, enhancing images increases the probability of success in the tasks that follow. You have now added these techniques and their implementation to your machine learning repertoire. More specifically, you have learned:

What is image enhancement?
Image enhancement techniques in Python.
How to implement image enhancement in Pillow.
Image enhancement in OpenCV.
How to perform image enhancement in PyTorch.

Image Enhancement FAQs

What are some techniques used for image quality enhancement?

Super-resolution and low-light image enhancement are some of the techniques used for improving image quality. These methods aim to recover as much detail as possible from low-resolution or poorly lit images.

What methods do Accurate Image Super-Resolution and Zero-DCE use for image enhancement?

Accurate Image Super-Resolution uses very deep convolutional networks to upscale images, while Zero-DCE (Zero-Reference Deep Curve Estimation for Low-Light Image Enhancement) uses a deep network to estimate brightness-adjustment curves for low-light images.

What are the benefits of using these image enhancement techniques?

These techniques help reduce noise, increase contrast and brightness in images, making them suitable for various applications such as security cameras and object detection.

How does the LIME technique enhance low-light images?

LIME (A Method for Low-Light Image Enhancement) enhances low-light images by creating an illumination map from the Red, Green, and Blue channels of the image.

Does LIME require paired or unpaired data for image enhancement?

No, LIME does not require paired or unpaired data for image enhancement. This feature makes it less prone to overfitting and versatile for various applications including security cameras, object detection, and scene understanding.

What are some practical applications of these image enhancement techniques?

Some practical applications of these image enhancement techniques include security cameras, object detection, and scene understanding among others.