Convolution

Convolution is a mathematical operation that combines two functions to produce a third function. It is widely used in image processing, particularly for applying filters to images. I'm interested in convolution because it is the operation that gives convolutional neural networks (CNNs) their name.

$$\text{Convolution}(f, g) = \int_{-\infty}^{\infty} f(t)\, g(\tau - t)\, dt$$

Its discrete two-dimensional form, the one used for images, is:

$$\text{Convolution}(f, g) = \sum_{i=0}^{m} \sum_{j=0}^{n} f(i, j)\, g(x - i, y - j)$$

Where $f$ is the input image, $g$ is the filter (or kernel), and $(x, y)$ are the coordinates of the output pixel. In a neural network the convolution kernel is learned from the data, but here I will use some common filters to build intuition for the concept.
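To make the double sum concrete, here is a minimal pure-Python sketch of it. The function name `conv2d_full` and the tiny test arrays are my own illustrative choices, and I assume the "full" output size (image size plus kernel size minus one along each axis); a real pipeline would normally call a library routine instead.

```python
def conv2d_full(f, g):
    """Full 2D convolution: out[x][y] = sum_i sum_j f[i][j] * g[x - i][y - j]."""
    fm, fn = len(f), len(f[0])
    gm, gn = len(g), len(g[0])
    out = [[0.0] * (fn + gn - 1) for _ in range(fm + gm - 1)]
    for i in range(fm):
        for j in range(fn):
            for a in range(gm):
                for b in range(gn):
                    # The kernel index (a, b) plays the role of (x - i, y - j),
                    # so this term lands at x = i + a, y = j + b.
                    out[i + a][j + b] += f[i][j] * g[a][b]
    return out

# A single bright pixel (an impulse) stamps a copy of the kernel into the output.
impulse = [[0, 0, 0],
           [0, 1, 0],
           [0, 0, 0]]
kernel = [[1, 2],
          [3, 4]]
result = conv2d_full(impulse, kernel)
for row in result:
    print(row)
```

Swapping the two arguments gives the same output, which is the commutativity that the $g(x - i, y - j)$ index flip buys.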

Convolution for curves

We draw the function

$$f(x) = 1 - \exp(-c |x|)$$

(shown in blue) and convolve it with a Gaussian kernel:

$$G(x) = \frac{1}{\sigma \sqrt{2 \pi}} \exp\left(-\frac{x^2}{2 \sigma^2}\right)$$

and the convolved result is shown in red.
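The effect on the curve can be checked numerically. The sketch below approximates the convolution integral with a Riemann sum; the values `c = 1.0`, `sigma = 0.5`, the step `dx`, the 4-sigma cutoff, and the helper name `smooth_at` are all assumptions of mine, not values from the plot.

```python
import math

def f(x, c=1.0):
    # The curve being smoothed: 1 - exp(-c |x|), with a sharp cusp at x = 0.
    return 1.0 - math.exp(-c * abs(x))

def gaussian(x, sigma=0.5):
    # Normalised Gaussian kernel (integrates to 1).
    return math.exp(-x * x / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def smooth_at(x, sigma=0.5, dx=0.01, half_width=4.0):
    """Riemann-sum approximation of (f * G)(x) = integral of f(x - u) G(u) du,
    truncated to |u| <= half_width * sigma."""
    n = int(round(half_width * sigma / dx))
    return sum(f(x - k * dx) * gaussian(k * dx, sigma) for k in range(-n, n + 1)) * dx

cusp_raw, cusp_smooth = f(0.0), smooth_at(0.0)
far_raw, far_smooth = f(5.0), smooth_at(5.0)
print(cusp_raw, cusp_smooth)  # the cusp is lifted and rounded off
print(far_raw, far_smooth)    # the flat tail barely moves
```

The kernel averages each point with its neighbours, so the sharp minimum at zero is pulled up toward the surrounding values while the nearly flat tail is left almost unchanged.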

Convolution for noisy data

The true curve is $\sin(x)$, and the noisy curve is the true curve plus some random noise. We convolve the noisy curve with a Gaussian kernel

$$G(x) = \exp\left(-\frac{x^2}{2 \sigma^2}\right)$$

normalised so that its weights sum to one, to smooth it out. The convolved result is shown in red.
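A rough sketch of this smoothing step follows. The sample count, spacing, noise level, `sigma`, the 3-sigma kernel cutoff, and the edge handling in `smooth` are assumptions I picked for illustration; the post does not state the values behind the plot.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable

n, dx, sigma = 400, 0.05, 0.3  # illustrative values, not from the post
xs = [i * dx for i in range(n)]
clean = [math.sin(x) for x in xs]
noisy = [v + random.gauss(0.0, 0.3) for v in clean]

# Sample exp(-x^2 / (2 sigma^2)) out to 3 sigma, then normalise the weights to sum to 1.
half = int(round(3 * sigma / dx))
weights = [math.exp(-((k * dx) ** 2) / (2 * sigma ** 2)) for k in range(-half, half + 1)]
total = sum(weights)
kernel = [w / total for w in weights]

def smooth(signal, kernel):
    """Same-size convolution. Taps that fall off either end are skipped and the
    remaining weights renormalised, so the ends are not dragged toward zero."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = wsum = 0.0
        for k, w in enumerate(kernel):
            j = i - (k - half)  # convolution index: signal[i - offset]
            if 0 <= j < len(signal):
                acc += w * signal[j]
                wsum += w
        out.append(acc / wsum)
    return out

def rmse(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

smoothed = smooth(noisy, kernel)
err_noisy, err_smooth = rmse(noisy, clean), rmse(smoothed, clean)
print(err_noisy, err_smooth)  # the smoothed curve sits closer to sin(x)
```

Dividing by the sum of the sampled weights is what "normalised" means here: the kernel becomes a weighted average, so it removes noise without rescaling the signal.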

Convolution for Images

Original Image

This image is taken from the Fashion-MNIST dataset. It is a 28x28 pixel grayscale image of a clothing item. The image is represented as a 2D array of pixel values, where each pixel value is a number between 0 and 1, representing the intensity of the pixel (0 = white, 1 = black).

Gaussian Filter (3x3)

Mild low-pass filter that smooths fine-grained noise without overly smearing edges.
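The post does not list the kernel values, so the sketch below assumes the common 3x3 binomial approximation to a Gaussian, built as the outer product of `[1, 2, 1]` with itself and divided by 16. The helper `response` is a hypothetical name; it evaluates the convolution at the centre of a single 3x3 patch.

```python
# Assumed kernel: the usual 3x3 binomial approximation to a Gaussian.
row = [1, 2, 1]
gauss3 = [[a * b / 16 for b in row] for a in row]

def response(patch, kernel):
    """Convolution output at the centre of a 3x3 patch (kernel flipped in both axes)."""
    return sum(kernel[i][j] * patch[2 - i][2 - j] for i in range(3) for j in range(3))

flat = [[0.5] * 3 for _ in range(3)]   # uniform mid-intensity region
speck = [[0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]              # one isolated noisy pixel

total_weight = sum(map(sum, gauss3))   # 1.0: overall brightness is preserved
flat_out = response(flat, gauss3)      # 0.5: flat regions pass through unchanged
speck_out = response(speck, gauss3)    # 0.25: the speck is spread over its neighbours
print(total_weight, flat_out, speck_out)
```

Because the weights sum to one, the filter only redistributes intensity locally, which is why it damps pixel-level noise while leaving smooth regions alone.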

Sobel X

Computes the horizontal gradient, lighting up vertical edges where brightness changes left-to-right. Sign of the response encodes edge direction.

Like shining a light on the image from the left, revealing vertical features.
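As a small check of the sign behaviour, the sketch below uses the standard horizontal Sobel kernel (assumed here, since the post does not print it) and evaluates the convolution at the centre of three hand-made patches. `response` is a hypothetical helper that applies the kernel flip from the definition above.

```python
# Assumed kernel: the standard horizontal Sobel operator.
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

def response(patch, kernel):
    """Convolution output at the centre of a 3x3 patch (kernel flipped in both axes)."""
    return sum(kernel[i][j] * patch[2 - i][2 - j] for i in range(3) for j in range(3))

dark_to_bright = [[0, 0, 1]] * 3   # vertical edge, brighter on the right
bright_to_dark = [[1, 0, 0]] * 3   # the same edge, mirrored
horizontal_edge = [[0, 0, 0],
                   [0, 0, 0],
                   [1, 1, 1]]      # brightness changes top-to-bottom only

r_rising = response(dark_to_bright, sobel_x)        # -4
r_falling = response(bright_to_dark, sobel_x)       # 4
r_horizontal = response(horizontal_edge, sobel_x)   # 0
print(r_rising, r_falling, r_horizontal)
```

The two vertical edges give equal and opposite responses, and the horizontal edge gives none. Note that the signs depend on the convention: routines that compute cross-correlation (no kernel flip) would report the two vertical-edge values with their signs swapped.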

Sobel Y

Computes the vertical gradient, highlighting horizontal edges where brightness changes top-to-bottom. Like shining a light on the image from above, revealing horizontal features.
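The vertical kernel is just the horizontal one transposed. The sketch below assumes the standard Sobel values and a hypothetical `response` helper that evaluates the convolution at the centre of a 3x3 patch.

```python
# Assumed kernels: the standard Sobel pair; sobel_y is the transpose of sobel_x.
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
sobel_y = [list(col) for col in zip(*sobel_x)]

def response(patch, kernel):
    """Convolution output at the centre of a 3x3 patch (kernel flipped in both axes)."""
    return sum(kernel[i][j] * patch[2 - i][2 - j] for i in range(3) for j in range(3))

horizontal_edge = [[0, 0, 0],
                   [0, 0, 0],
                   [1, 1, 1]]      # brighter at the bottom
vertical_edge = [[0, 0, 1]] * 3    # brighter on the right

r_horizontal = response(horizontal_edge, sobel_y)   # -4
r_vertical = response(vertical_edge, sobel_y)       # 0
print(sobel_y, r_horizontal, r_vertical)
```

The roles are exactly reversed relative to the horizontal kernel: the horizontal edge now produces the response and the vertical edge is ignored.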

Laplacian

Isotropic second-derivative operator that fires on any abrupt intensity change, regardless of direction. Excellent for edge maps but very sensitive to noise, so a prior blur is common.
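A quick way to see both the second-derivative behaviour and the noise sensitivity is to feed a few patches to the 4-neighbour Laplacian kernel. That kernel choice is my assumption (an 8-neighbour variant is also common), and `response` is a hypothetical helper.

```python
# Assumed kernel: the 4-neighbour discrete Laplacian.
laplacian = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]

def response(patch, kernel):
    """Convolution output at the centre of a 3x3 patch (kernel flipped in both axes)."""
    return sum(kernel[i][j] * patch[2 - i][2 - j] for i in range(3) for j in range(3))

flat = [[1, 1, 1]] * 3     # constant intensity
ramp = [[0, 1, 2]] * 3     # steady linear gradient
edge = [[0, 0, 1]] * 3     # abrupt step just right of the centre
speck = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]        # single noisy pixel

r_flat = response(flat, laplacian)     # 0
r_ramp = response(ramp, laplacian)     # 0: a constant slope has no second derivative
r_edge = response(edge, laplacian)     # 1
r_speck = response(speck, laplacian)   # -4: noise outweighs the real edge
print(r_flat, r_ramp, r_edge, r_speck)
```

The isolated pixel produces a response four times larger in magnitude than the genuine step, which is the reason for blurring first.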

Sharpen

Boosts high-frequency detail while keeping overall brightness. Enhances texture and clarity but can exaggerate noise and ringing.
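The sketch below assumes a common 3x3 sharpening kernel (centre 5, four neighbours at -1) and a hypothetical `response` helper. It checks that flat regions keep their brightness and shows the overshoot on either side of a step edge.

```python
# Assumed kernel: a common 3x3 sharpen filter; its weights sum to 1.
sharpen = [[0, -1, 0],
           [-1, 5, -1],
           [0, -1, 0]]

def response(patch, kernel):
    """Convolution output at the centre of a 3x3 patch (kernel flipped in both axes)."""
    return sum(kernel[i][j] * patch[2 - i][2 - j] for i in range(3) for j in range(3))

flat = [[0.5, 0.5, 0.5]] * 3          # uniform region
dark_side = [[0.2, 0.2, 0.8]] * 3     # centre pixel on the dark side of an edge
bright_side = [[0.2, 0.8, 0.8]] * 3   # centre pixel on the bright side

flat_out = response(flat, sharpen)          # stays at 0.5
dark_out = response(dark_side, sharpen)     # pushed below 0.2
bright_out = response(bright_side, sharpen) # pushed above 0.8
print(flat_out, dark_out, bright_out)
```

The dark side gets darker and the bright side gets brighter, which increases local contrast. The same overshoot is what shows up as halos or amplified noise when the input is not clean, and the outputs can leave the 0 to 1 range, so clipping is usually needed before display.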

Emboss

Offsets a first-derivative kernel to accentuate one lighting direction, making foreground features appear raised. Produces a grayish bas-relief look useful for stylistic effects.
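One way to read "offsets a first-derivative kernel" is as a diagonal derivative kernel plus the identity. The sketch below assumes that reading and a commonly used set of emboss weights; the kernel values and the `response` helper are my assumptions, not taken from the post.

```python
# Assumed construction: diagonal first-derivative kernel + identity (centre tap of 1).
derivative = [[-2, -1, 0],
              [-1, 0, 1],
              [0, 1, 2]]
identity = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]
emboss = [[d + e for d, e in zip(drow, erow)] for drow, erow in zip(derivative, identity)]

def response(patch, kernel):
    """Convolution output at the centre of a 3x3 patch (kernel flipped in both axes)."""
    return sum(kernel[i][j] * patch[2 - i][2 - j] for i in range(3) for j in range(3))

flat = [[3, 3, 3]] * 3
rising_to_bottom_right = [[r + c for c in range(3)] for r in range(3)]   # centre value 2
rising_to_top_left = [[4 - r - c for c in range(3)] for r in range(3)]   # centre value 2

flat_out = response(flat, emboss)                   # 3: flat areas are untouched
out_br = response(rising_to_bottom_right, emboss)   # -10: slope rendered dark
out_tl = response(rising_to_top_left, emboss)       # 14: slope rendered bright
print(emboss, flat_out, out_br, out_tl)
```

Two slopes with the same centre intensity come out on opposite sides of it, depending on which way they face, and that asymmetry is what reads as a raised surface lit from one corner. Under cross-correlation (no kernel flip) the apparent light direction would be reversed.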