Convolutional Neural Networks (CNN): Image Processing

Convolution & Feature Maps

Why CNNs for Images?

Traditional dense layers treat an image as a flat vector:

Image: 32×32 pixels
↓
Flatten: 1024 values
↓
Dense layer: 1024 × 512 parameters

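Problem: Flattening loses spatial structure!
- Every pixel connects to every neuron, so the pixel at (0,0) is treated no differently relative to the pixel at (31,31) than relative to its direct neighbor
- No local structure is exploited
- The parameter count grows with the number of pixels, so large images become slow and memory-hungry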
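To make the cost of flattening concrete, here is a minimal sketch in plain Python. The helper names `flat_index` and `dense_weights` are made up for illustration, and the 224×224 RGB image is an assumed example size that does not come from the lesson. It shows that vertically adjacent pixels end up 32 positions apart in the flat vector, and how the weight count grows with image size.

```python
def flat_index(row, col, width):
    """Position of pixel (row, col) after row-major flattening."""
    return row * width + col

def dense_weights(height, width, channels, units):
    """Number of weights (biases ignored) in a dense layer fed by a flattened image."""
    return height * width * channels * units

# Pixels (0,0) and (1,0) touch in the image but sit 32 slots apart once flattened.
print(flat_index(0, 0, 32), flat_index(1, 0, 32))  # 0 32

# The lesson's example: 32x32 grayscale image into 512 units.
print(dense_weights(32, 32, 1, 512))    # 524288

# Assumed larger example: 224x224 RGB image into the same 512 units.
print(dense_weights(224, 224, 3, 512))  # 77070336
```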

CNNs preserve spatial structure:

  • Learn local features (edges, corners, shapes)
  • Share weights (same filter applied to all locations)
  • Reduce parameters dramatically
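A rough way to see the effect of weight sharing is to count parameters. The sketch below assumes a 3×3 filter size, 16 filters, and one bias per filter or unit; none of these numbers are fixed by the text above, and the function names are illustrative only.

```python
def conv_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter; independent of the image size."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

def dense_params(height, width, channels, units):
    """Weights plus one bias per unit for a dense layer on a flattened image."""
    return (height * width * channels + 1) * units

# 16 filters of size 3x3 on a 1-channel image (assumed sizes).
print(conv_params(3, 1, 16))         # 160

# Dense layer on a flattened 32x32 image with 512 units.
print(dense_params(32, 32, 1, 512))  # 524800
```

Because the same filter is reused at every location, the convolutional count stays at 160 whether the image is 32×32 or much larger, while the dense count scales with the number of pixels.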

The Convolution Operation

A filter (kernel) slides over the image, computing dot products:

Image:
[1 2 3]
[4 5 6]
[7 8 9]

Filter (3×3):
[1 0 -1]
[2 0 -2]
[1 0 -1]

Convolution at position (0,0):
(1×1 + 2×0 + 3×(-1) + 4×2 + 5×0 + 6×(-2) + 7×1 + 8×0 + 9×(-1))
= 1 + 0 - 3 + 8 + 0 - 12 + 7 + 0 - 9
= Output: -8

Result: Feature map (one number per position)
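The sliding dot product can be sketched in a few lines of plain Python. This is a minimal version assuming no padding and a stride of 1; the function name `conv2d_valid` is chosen here for illustration. Like most deep-learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and return the feature map."""
    image_h, image_w = len(image), len(image[0])
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    feature_map = []
    for r in range(image_h - kernel_h + 1):
        row = []
        for c in range(image_w - kernel_w + 1):
            # Dot product between the kernel and the window whose top-left corner is (r, c).
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0, -1],
          [2, 0, -2],
          [1, 0, -1]]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[-8]]
```

A 3×3 filter fits a 3×3 image in exactly one position, so the feature map here holds a single number. A larger image would give one number for each position the filter can occupy.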

Feature Detection

Different filters detect different features:

Filter                      Detects                     Example
[1 0 -1; 2 0 -2; 1 0 -1]    Vertical edges              Line pattern
[1 2 1; 0 0 0; -1 -2 -1]    Horizontal edges            Line pattern
Learned                     Corners, textures, shapes   Complex patterns

Key insight: CNNs learn these filter values automatically during training!
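To see the two hand-written filters respond differently, the sketch below applies both to a small made-up image whose left half is bright and right half is dark, which gives a single vertical edge. The 4×4 test image and the helper `conv2d_valid` (no padding, stride 1) are assumptions for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    out_h = len(image) - len(kernel) + 1
    out_w = len(image[0]) - len(kernel[0]) + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(len(kernel))
                for j in range(len(kernel[0]))
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Bright left half, dark right half: one vertical edge down the middle.
image = [[10, 10, 0, 0],
         [10, 10, 0, 0],
         [10, 10, 0, 0],
         [10, 10, 0, 0]]

vertical_filter = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
horizontal_filter = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]

vertical_response = conv2d_valid(image, vertical_filter)
horizontal_response = conv2d_valid(image, horizontal_filter)

print(vertical_response)    # [[40, 40], [40, 40]]
print(horizontal_response)  # [[0, 0], [0, 0]]
```

The vertical-edge filter gives a strong response at every position that overlaps the edge, while the horizontal-edge filter outputs zero because every row of the image is identical.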

Stacking Layers

Input Image (32×32×3)
      ↓
Conv1 (16 filters) → 32×32×16  (detect low-level: edges)
      ↓
Conv2 (32 filters) → 16×16×32  (detect mid-level: shapes)
      ↓
Conv3 (64 filters) → 8×8×64    (detect high-level: objects)
      ↓
Global Average Pool → 64
      ↓
Dense → 10 classes (output)

Hierarchy:

  • Early layers: edges, colors
  • Middle layers: shapes, textures
  • Late layers: whole objects
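The shapes in the stack can be reproduced with the standard output-size formula, floor((n + 2p - k) / s) + 1. The lesson does not state kernel sizes, padding, or strides, so the sketch below assumes 3×3 kernels with padding 1, stride 1 for Conv1, and stride 2 for Conv2 and Conv3 (a pooling layer would halve the size just as well). It also shows global average pooling on a tiny made-up input.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def global_average_pool(feature_maps):
    """Collapse each channel's 2D feature map into its mean value."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]

# (name, number of filters, assumed stride)
layers = [("Conv1", 16, 1), ("Conv2", 32, 2), ("Conv3", 64, 2)]

size, channels = 32, 3  # input image: 32x32x3
shapes = []
for name, filters, stride in layers:
    size = conv_output_size(size, kernel=3, stride=stride, padding=1)
    channels = filters  # one output channel per filter
    shapes.append((name, size, size, channels))
    print(f"{name}: {size}x{size}x{channels}")

# Global average pooling keeps one value per channel, so 8x8x64 becomes 64 values.
pooled = global_average_pool([[[1, 2], [3, 4]], [[0, 0], [0, 8]]])
print(pooled)  # [2.5, 2.0]
```

Under these assumptions the loop prints 32x32x16, 16x16x32, and 8x8x64, matching the diagram: spatial size shrinks while the number of channels grows.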