Convolutional Neural Networks (CNN): Image Processing
Convolution & Feature Maps
Convolutional Neural Networks (CNN)
Why CNNs for Images?
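Traditional dense layers treat an image as a flat vector:
Image: 32×32 pixels
↓
Flatten: 1024 values
↓
Dense layer: 1024 × 512 = 524,288 weights
Problem: Flattening loses spatial structure!
- Every unit connects to every pixel, so the pixel at (0,0) and the distant pixel at (31,31) are treated like neighbors
- No local structure is exploited
- The weight count grows with the number of pixels, so large images become very slow to train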
CNNs preserve spatial structure:
- Learn local features (edges, corners, shapes)
- Share weights (same filter applied to all locations)
- Reduce parameters dramatically
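To put a number on "dramatically", the sketch below counts trainable parameters for the dense layer described earlier and for a single convolutional layer. The 32×32 input and 512 units come from the text; the 3×3 kernel, the 16 filters, the single input channel, and the 224×224 comparison size are assumptions chosen for illustration.

```python
# Parameter counts: dense layer on a flattened image vs. one conv layer.
# Kernel size, filter count and the 224x224 image are illustrative assumptions.

def dense_params(n_inputs, n_units):
    """One weight per (input, unit) pair plus one bias per unit."""
    return n_inputs * n_units + n_units

def conv_params(kernel_h, kernel_w, in_channels, n_filters):
    """Each filter has kernel_h * kernel_w * in_channels weights plus one bias."""
    return (kernel_h * kernel_w * in_channels + 1) * n_filters

dense_small = dense_params(32 * 32, 512)     # 32x32 grayscale image
dense_large = dense_params(224 * 224, 512)   # same layer on a larger image
conv_any_size = conv_params(3, 3, 1, 16)     # independent of image size

print(dense_small)    # 524800
print(dense_large)    # 25690624
print(conv_any_size)  # 160
```

The convolutional count does not depend on the image size at all, because the same small filter is reused at every location.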
The Convolution Operation
A filter (kernel) slides over the image, computing dot products:
Image:
[1 2 3]
[4 5 6]
[7 8 9]
Filter (3×3):
[1 0 -1]
[2 0 -2]
[1 0 -1]
Convolution at position (0,0):
(1×1 + 2×0 + 3×(-1) + 4×2 + 5×0 + 6×(-2) + 7×1 + 8×0 + 9×(-1))
= 1 + 0 - 3 + 8 + 0 - 12 + 7 + 0 - 9
= Output: -8
Result: Feature map (one number per position)
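The sliding dot product can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the function name `conv2d_valid` is hypothetical, it assumes no padding and a stride of 1, and, like most deep learning libraries, it does not flip the kernel (strictly speaking this is cross-correlation). It uses the image and filter listed above, with every entry in the filter's right column negative (bottom-right weight -1).

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product between the kernel and the window whose top-left is (i, j)
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0, -1],
          [2, 0, -2],
          [1, 0, -1]]

print(conv2d_valid(image, kernel))  # [[-8]]

# A 4x4 image leaves room for the 3x3 filter at 2x2 positions.
bigger = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12],
          [13, 14, 15, 16]]
print(conv2d_valid(bigger, kernel))  # [[-8, -8], [-8, -8]]
```

A 3×3 image with a 3×3 filter has only one valid position, so the feature map is a single number; larger images produce one output per position the filter can occupy.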
Feature Detection
Different filters detect different features:
| Filter | Detects | Example |
|---|---|---|
| [1 0 -1; 2 0 -2; 1 0 -1] | Vertical edges | Brightness change from left to right |
| [1 2 1; 0 0 0; -1 -2 -1] | Horizontal edges | Brightness change from top to bottom |
| Learned | Corners, textures, shapes | Complex patterns |
Key insight: CNNs automatically learn these filters!
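To see why the two hand-written filters in the table respond to different edges, the sketch below applies both to a toy image containing a single vertical edge. The image and the helper name `cross_correlate` are assumptions made for illustration; the filters are the ones from the table.

```python
def cross_correlate(image, kernel):
    """Valid (no padding), stride-1 sliding dot product; returns a 2D list."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(len(image[0]) - kw + 1)
        ]
        for i in range(len(image) - kh + 1)
    ]

vertical_filter = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
horizontal_filter = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]

# Toy image (illustrative assumption): dark left half, bright right half,
# i.e. one vertical edge down the middle.
edge_image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

vertical_response = cross_correlate(edge_image, vertical_filter)
horizontal_response = cross_correlate(edge_image, horizontal_filter)

print(vertical_response)    # [[0, -4, -4, 0]]
print(horizontal_response)  # [[0, 0, 0, 0]]
```

The vertical-edge filter is nonzero only where its window straddles the edge, while the horizontal-edge filter stays at zero everywhere because every row of the image is identical. A trained CNN arrives at filters with this kind of selectivity by adjusting the weights during training rather than having them written by hand.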
Stacking Layers
Input Image (32×32×3)
↓
Conv1 (16 filters) → 32×32×16 (detect low-level: edges)
↓
Conv2 (32 filters) → 16×16×32 (detect mid-level: shapes)
↓
Conv3 (64 filters) → 8×8×64 (detect high-level: objects)
↓
Global Average Pool → 64
↓
Dense → 10 classes (output)
Hierarchy:
- Early layers: edges, colors
- Middle layers: shapes, textures
- Late layers: whole objects
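The shapes in the stack can be traced with the standard output-size formula, floor((size + 2·padding - kernel) / stride) + 1. The text does not say how the resolution drops from 32 to 16 to 8, so this sketch assumes 3×3 kernels with padding 1, stride 1 for Conv1 and stride 2 for Conv2 and Conv3; a stride-1 convolution followed by 2×2 pooling would give the same shapes.

```python
def conv_output_size(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

# (name, number of filters, stride); the strides are assumptions chosen
# to reproduce the shapes listed in the diagram.
layers = [("Conv1", 16, 1), ("Conv2", 32, 2), ("Conv3", 64, 2)]

height, width, channels = 32, 32, 3  # input image
shapes = []
for name, n_filters, stride in layers:
    height = conv_output_size(height, stride=stride)
    width = conv_output_size(width, stride=stride)
    channels = n_filters  # one output channel per filter
    shapes.append((height, width, channels))
    print(name, (height, width, channels))

# Global average pooling averages each channel's map down to a single number.
pooled_features = channels
print("Global Average Pool", pooled_features)
```

Each filter produces one output channel, which is why the depth grows from 3 to 16 to 32 to 64 while the spatial size shrinks.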