
    Understanding Convolutional Neural Networks (CNNs) Through Excel

    By ProfitlyAI, November 17, 2025 (13 min read)


    Deep learning is often seen as a black box. We know that it learns from data, but the question is how it really learns.

    In this article, we will build a tiny Convolutional Neural Network (CNN) directly in Excel to understand, step by step, how a CNN actually works for images.

    We will open this black box and watch each step happen right before our eyes. We will understand all the calculations that are the foundation of what we call “deep learning”.

    This article is part of a series of articles about implementing machine learning and deep learning algorithms in Excel. You can find all the Excel files on this Kofi link.

    1. How Images are Seen by Machines

    1.1 Two Ways to Detect Something in an Image

    When we try to detect an object in a picture, like a cat, there are two main ways: the deterministic approach and the machine learning approach. Let’s see how these two approaches work for this example of recognizing a cat in a picture.

    The deterministic approach means writing rules by hand.

    For example, we can say that a cat has a round face, two triangular ears, a body, a tail, and so on. So the developer does all the work to define the rules.

    Then the computer runs all these rules and gives a similarity score.

    Deterministic approach to detect a cat in a picture (image by author)

    The machine learning approach means that we do not write the rules ourselves.

    Instead, we give the computer many examples: pictures with cats and pictures without cats. Then it learns by itself what makes a cat a cat.

    Machine learning approach to detect a cat in a picture (image by author; cats are generated by AI)

    That is where things may become mysterious.

    We usually say that the machine will figure it out by itself, but the real question is how.

    In fact, we still have to tell the machine how to create these rules, and the rules have to be learnable. So the key point is: how can we define the kind of rules that will be used?

    To understand how to define rules, we first have to understand what an image is.

    1.2 Understanding What an Image Is

    A cat is a complex shape, so let’s take a simple and clean example: recognizing handwritten digits from the MNIST dataset.

    First, what is an image?

    A digital image can be seen as a grid of pixels. Each pixel is a number from 0 to 255 that encodes its shade of gray. In the MNIST convention used here, 0 is the white background and 255 is the black ink, which is the reverse of the usual display convention where 0 is black.

    In Excel, we can represent this grid with a table where each cell corresponds to one pixel.

    MNIST handwritten digits (image from the MNIST dataset, https://en.wikipedia.org/wiki/MNIST_database, CC BY-SA 3.0)

    The original size of the digits is 28×28. But to keep things simple, we will use a 10×10 table. It is small enough for quick calculations but still large enough to show the general shape.

    So we will reduce the dimension.

    For example, the handwritten number “1” can be represented by a 10×10 grid in Excel, as below.

    An image is a grid of numbers (image by author)
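
To make the "grid of numbers" idea concrete outside Excel, here is a minimal Python sketch. The pixel values are invented for illustration (they are not taken from MNIST or from the author's workbook), and `render` is a hypothetical helper that previews the grid as text.

```python
# A hypothetical 10x10 grid for a handwritten "1" (values invented for illustration).
# Convention used in the article: 0 = white background, 255 = black ink.
image = [[0] * 10 for _ in range(10)]
for row in range(1, 9):
    image[row][5] = 255  # main vertical stroke
    image[row][4] = 128  # lighter edge of the stroke


def render(grid):
    """Return a text preview: '#' for dark pixels, '+' for mid-gray, '.' for white."""
    def symbol(value):
        return "#" if value >= 200 else "+" if value >= 100 else "."
    return "\n".join("".join(symbol(v) for v in row) for row in grid)


print(render(image))
```

Each inner list plays the role of one spreadsheet row, and each number plays the role of one cell.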

    1.3 Before Deep Learning: Classic Machine Learning for Images

    Before using CNNs or any deep learning method, we can already recognize simple images with classic machine learning algorithms such as logistic regression or decision trees.

    In this approach, each pixel becomes one feature. For example, a 10×10 image has 100 pixels, so there are 100 features as input.

    The algorithm then learns to associate patterns of pixel values with labels such as “0”, “1”, or “2”.

    Classic ML for image recognition (image by author)
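
As a small illustration of "each pixel becomes one feature", the sketch below flattens a 10×10 grid into a list of 100 values, row by row. The function name `flatten` and the sample pixel are assumptions made for this example.

```python
def flatten(grid):
    """Turn a 2-D grid of pixels into a flat feature list, row by row."""
    return [value for row in grid for value in row]


# Hypothetical image: all white except two vertically adjacent dark pixels.
image = [[0] * 10 for _ in range(10)]
image[2][5] = 255
image[3][5] = 254

features = flatten(image)
print(len(features))        # 100 features for a 10x10 image
print(features.index(255))  # pixel (row 2, column 5) becomes feature 25
print(features.index(254))  # its neighbor just below becomes feature 35
```

Two pixels that touch in the image end up 10 positions apart in the feature list, and a classic model has no built-in knowledge that features 25 and 35 are neighbors.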

    In fact, with this simple machine learning approach, logistic regression can achieve quite good results, with an accuracy of around 90%.

    This shows that classic models are able to learn useful information from raw pixel values.

    However, they have a major limitation. They treat each pixel as an independent value, without considering its neighbors. As a result, they cannot capture spatial relationships between pixels.

    So intuitively, we know that the performance will not be good for complex images, and that this method does not scale.

    Now, if you already know how classic machine learning works, you know that there is no magic. And in fact, you already know what to do: improve the feature engineering step and transform the features to get more meaningful information out of the pixels.

    2. Building a CNN Step by Step in Excel

    2.1 From complex CNNs to a simple one in Excel

    When we talk about Convolutional Neural Networks, we often see very deep and complex architectures, like VGG-16. With many layers, millions of parameters, and lots of operations, it seems very complex, and people say that it is impossible to understand exactly how it works.

    VGG16 architecture (image by author)

    The main idea behind the layers is to detect patterns step by step.

    With the example of handwritten digits, let’s ask a question: what could be the simplest possible CNN architecture?

    First, the hidden layers: before stacking many layers, let’s reduce the number. How many? Let’s do one. That’s right: only one.

    As for the filters, what about their dimensions? In real CNN layers, we usually use 3×3 filters to detect small patterns. But let’s begin with big ones.

    How big? 10×10!

    Yes, why not?

    This also means that we do not have to slide the filter across the image. We can directly compare the input image with the filter and see how well they match.

    This simple case is not about performance, but about clarity.
    It will show how CNNs detect patterns step by step.

    Now, we have to define the number of filters. We will say 10, which is the minimum. Why? Because there are 10 digits, so we need at least one filter per digit. We will see how they can be found in the next section.

    In the image below, you have the diagram of this simplest CNN architecture:

    The simplest CNN architecture (image by author)

    2.2 Training the Filters (or Designing Them Ourselves)

    In a real CNN, the filters are not written by hand. They are learned during training.

    The neural network adjusts the values inside each filter to detect the patterns that best help to recognize the images.

    In our simple Excel example, we will not train the filters.

    Instead, we will create them ourselves to understand what they represent.

    Since we already know the shapes of handwritten digits, we can design filters that look like each digit.

    For example, we can draw a filter that matches the shape of 0, another for 1, and so on.

    Another option is to take the average image of all examples for each digit and use that as the filter.

    Each filter will then represent the “average shape” of a number.

    This is where the frontier between human and machine becomes visible again. We can either let the machine discover the filters, or we can use our own knowledge to build them manually.

    That’s right: machines do not define the nature of the operations. Machine learning researchers define them. Machines are only good at running loops to find the optimal values for these defined rules. And in simple cases, humans are always better than machines.

    So, if there are only 10 filters to define, we know that we can directly define the 10 digits. So we know, intuitively, the nature of these filters. But there are other options, of course.

    Now, to define the numerical values of these filters, we can directly use our knowledge. And we can also use the training dataset.

    Below you can see the 10 filters created by averaging all the images of each handwritten digit. Each one shows the typical pattern that defines a number.

    Average values as filters (image by author)
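
The averaging described above can be sketched in a few lines of Python. This is only an illustration: `average_image` is a hypothetical helper, and the two 2×2 "samples" are tiny invented stand-ins for the 10×10 training images of one digit.

```python
def average_image(images):
    """Pixel-wise mean of a list of equally sized grids."""
    count = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [
        [sum(img[r][c] for img in images) / count for c in range(cols)]
        for r in range(rows)
    ]


# Two invented 2x2 samples of the same "digit" (stand-ins for 10x10 images).
samples = [
    [[0, 255], [0, 255]],
    [[0, 155], [100, 255]],
]

digit_filter = average_image(samples)
print(digit_filter)  # [[0.0, 205.0], [50.0, 255.0]]
```

In Excel, the same result comes from an AVERAGE over the matching cell of every sample image. Repeating this for each digit from 0 to 9 gives the ten filters.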

    2.3 How a CNN Detects Patterns

    Now that we have the filters, we have to compare the input image to these filters.

    The central operation in a CNN is called cross-correlation. It is the key mechanism that allows the computer to match patterns in an image.

    It works in two simple steps:

    1. Multiply values: we take each pixel in the input image and multiply it by the pixel at the same position in the filter. This means that the filter “looks” at each pixel of the image and measures how similar it is to the pattern stored in the filter. Yes, if the two values are large, then the result is large.
    2. Add results: the products of these multiplications are then added together to produce a single number. This number expresses how strongly the input image matches the filter. Taken together, the two steps are a dot product between the image and the filter.
    Example of cross-correlation for one image (image by author)
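
The two steps above (multiply, then sum) can be sketched as a single Python function. This is a minimal illustration for the case where the filter has the same size as the image; the function name and the tiny 2×2 grids are assumptions made for the example. In Excel, the equivalent is a SUMPRODUCT over the two ranges.

```python
def cross_correlation(image, filt):
    """Multiply each pixel by the filter value at the same position, then sum."""
    return sum(
        pixel * weight
        for image_row, filter_row in zip(image, filt)
        for pixel, weight in zip(image_row, filter_row)
    )


# Invented 2x2 example: the image has ink in its right-hand column.
image = [[0, 9], [0, 8]]
vertical_filter = [[0, 1], [0, 1]]    # pattern: right-hand column
horizontal_filter = [[1, 1], [0, 0]]  # pattern: top row

print(cross_correlation(image, vertical_filter))    # 0*0 + 9*1 + 0*0 + 8*1 = 17
print(cross_correlation(image, horizontal_filter))  # 0*1 + 9*1 + 0*0 + 8*0 = 9
```

The filter whose pattern overlaps the ink gets the higher score.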

    In our simplified architecture, the filter has the same size as the input image (10×10).

    Because of this, the filter does not need to move across the image.
    Instead, the cross-correlation is applied once, comparing the whole image with the filter directly.

    The resulting number represents how well the image matches the pattern inside the filter.

    If the filter looks like the average shape of a handwritten “5”, a high value means that the image is probably a “5”.

    By repeating this operation with all filters, one per digit, we can see which pattern gives the highest match.

    2.4 Building a Simple CNN in Excel

    We can now create a small CNN from end to end to see how the full process works in practice.

    1. Input: A 10×10 matrix represents the image to classify.
    2. Filters: We define ten filters of size 10×10, each one representing the average image of a handwritten digit from 0 to 9. These filters act as pattern detectors for each number.
    3. Cross-correlation: Each filter is applied to the input image, producing a single score that measures how well the image matches that filter’s pattern.
    4. Decision: The filter with the highest score gives the predicted digit. In deep learning frameworks, this step is often handled by a Softmax function, which converts all scores into probabilities.
      In our simple Excel version, taking the maximum score is enough to determine which digit the image most likely represents.
    Each 10×10 filter represents the average shape of a handwritten digit (0–9). The input image is compared with all filters using cross-correlation, and the filter that produces the highest score, after normalization with Softmax, corresponds to the detected digit.
    Cross-correlation of the input digit with ten average digit filters. The highest score, normalized by Softmax, identifies the input as “6” (image by author).
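
The four steps above can be sketched end to end in Python. This is an illustrative sketch, not the author's workbook: the helper names are hypothetical, and three invented 2×2 filters stand in for the ten 10×10 digit filters. The softmax subtracts the largest score before exponentiating, a common trick to avoid overflow with large scores.

```python
import math


def cross_correlation(image, filt):
    """Multiply matching pixels and sum the products."""
    return sum(
        pixel * weight
        for image_row, filter_row in zip(image, filt)
        for pixel, weight in zip(image_row, filter_row)
    )


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(image, filters):
    """Score the image against every filter and return the best match."""
    scores = [cross_correlation(image, f) for f in filters]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores, softmax(scores)


# Invented 2x2 stand-ins for the digit filters (class 0, 1 and 2).
filters = [
    [[0.5, 0.5], [0.5, 0.5]],  # class 0: ink everywhere
    [[0, 1], [0, 1]],          # class 1: right-hand column
    [[1, 1], [0, 0]],          # class 2: top row
]
image = [[0, 1], [0, 1]]

best, scores, probabilities = predict(image, filters)
print(best)           # index of the winning filter
print(scores)         # raw cross-correlation scores
print(probabilities)  # the same ranking, expressed as probabilities
```

Because Softmax preserves the ordering of the scores, the winning filter is the same with or without it, which is why taking the maximum is enough in the Excel version.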

    2.5 Convolution or Cross-Correlation?

    At this point, you might wonder why we call it a Convolutional Neural Network when the operation we described is actually cross-correlation.

    The difference is subtle but simple:

    • Convolution means flipping the filter both horizontally and vertically before sliding it over the image.
    • Cross-correlation means applying the filter directly, without flipping.

    For more information, you can read this article:

    For historical reasons, the term convolution stuck, while the operation that is actually performed in a CNN is cross-correlation.

    In fact, most deep learning frameworks, such as PyTorch or TensorFlow, actually use cross-correlation when performing “convolutions”.

    Cross-correlation and convolution (image by author)
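
The difference between the two operations fits in a short Python sketch, again for a filter that has the same size as the image. The helper names and the 2×2 values are assumptions chosen so that the flip visibly changes the result.

```python
def cross_correlation(image, filt):
    """Apply the filter directly: multiply matching positions and sum."""
    return sum(
        pixel * weight
        for image_row, filter_row in zip(image, filt)
        for pixel, weight in zip(image_row, filter_row)
    )


def flip(filt):
    """Flip a filter both vertically and horizontally (a 180 degree rotation)."""
    return [row[::-1] for row in filt[::-1]]


def convolution(image, filt):
    """True convolution: flip the filter first, then multiply and sum."""
    return cross_correlation(image, flip(filt))


image = [[1, 2], [3, 4]]
filt = [[1, 0], [0, 0]]  # picks the top-left pixel when applied directly

print(flip(filt))                      # [[0, 0], [0, 1]]
print(cross_correlation(image, filt))  # 1 (top-left pixel)
print(convolution(image, filt))        # 4 (bottom-right pixel)
```

When filters are learned from data, the flip makes no practical difference: the network would simply learn the flipped values.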

    In short:

    CNNs are “convolutional” in name, but “cross-correlational” in practice.

    3. Building More Complex Architectures

    3.1 Small filters to detect more detailed patterns

    In the previous example, we used a single 10×10 filter per digit to compare the whole image with one pattern.

    This was enough to understand the principle of cross-correlation and how a CNN detects similarity between an image and a filter.

    Now we can take one step further.

    Instead of one global filter, we will use several smaller filters, each of size 5×5. These filters look at smaller regions of the image, detecting local details instead of the entire shape.

    Let’s take an example with four 5×5 filters applied to a handwritten digit.

    The input image can be cut into 4 smaller parts of 5×5 pixels each.

    We can still use the average value of all the digits to begin with. So each digit pattern now gives 4 values, one per region, instead of one.

    Smaller filters in a CNN for digit recognition (image by author)

    At the end, we can apply a Softmax function to get the final prediction.

    But in this simple case, it is also possible to simply sum all the values.
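
Here is a minimal Python sketch of the four-region idea. It assumes that the four 5×5 filters are simply the four quadrants of a 10×10 average-digit filter, which is one reading of the text above. The helper names are hypothetical and the pixel values are arbitrary numbers generated only to have something to compute with.

```python
def cross_correlation(image, filt):
    """Multiply matching pixels and sum the products."""
    return sum(
        pixel * weight
        for image_row, filter_row in zip(image, filt)
        for pixel, weight in zip(image_row, filter_row)
    )


def split_blocks(grid, size):
    """Cut a square grid into non-overlapping size x size blocks, row-major."""
    n = len(grid)
    return [
        [row[c:c + size] for row in grid[r:r + size]]
        for r in range(0, n, size)
        for c in range(0, n, size)
    ]


# Arbitrary 10x10 integer grids standing in for an input image and a digit filter.
image = [[(r * 10 + c) % 7 for c in range(10)] for r in range(10)]
full_filter = [[(r + c) % 3 for c in range(10)] for r in range(10)]

image_parts = split_blocks(image, 5)         # four 5x5 regions of the image
small_filters = split_blocks(full_filter, 5) # four 5x5 filters

local_scores = [
    cross_correlation(part, small)
    for part, small in zip(image_parts, small_filters)
]
print(local_scores)       # four values instead of one
print(sum(local_scores))  # equals the full 10x10 cross-correlation
```

Under this assumption, summing the four local scores gives back exactly the 10×10 score. The benefit of small filters appears once the four values are kept separate or combined differently.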

    3.2 What if the digit is not in the center of the image

    In the previous examples, I compared the filters to fixed regions of the image. One intuitive question is: what if the object is not centered? Indeed, it can be at any position in an image.

    The solution is unfortunately very basic: you slide the filter across the image.

    Let’s take a simple example again: the dimension of the input image is 10×14. The height is unchanged, and the width is now 14.

    The filter is still 10×10, and it slides horizontally across the image. We then get 5 cross-correlation values, one per position (14 − 10 + 1 = 5).

    We do not know where the digit is, but that is not a problem, because we can simply take the max value of the 5 cross-correlations.

    This is what we call a max pooling layer, applied here over all positions of the filter.

    Max pooling in a simple CNN (image by author)
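
The sliding and the max can be sketched as follows. The function names are hypothetical, and both the image and the filter are invented: a single vertical stroke of ones, placed off-center in the 10×14 image.

```python
def cross_correlation(window, filt):
    """Multiply matching pixels and sum the products."""
    return sum(
        pixel * weight
        for window_row, filter_row in zip(window, filt)
        for pixel, weight in zip(window_row, filter_row)
    )


def slide_horizontally(image, filt):
    """Cross-correlate the filter with every horizontal position of the image."""
    width = len(filt[0])
    positions = len(image[0]) - width + 1
    scores = []
    for x in range(positions):
        window = [row[x:x + width] for row in image]  # 10x10 slice starting at column x
        scores.append(cross_correlation(window, filt))
    return scores


# Invented data: the stroke sits in column 8 of the image, column 5 of the filter.
image = [[1 if c == 8 else 0 for c in range(14)] for _ in range(10)]
filt = [[1 if c == 5 else 0 for c in range(10)] for _ in range(10)]

scores = slide_horizontally(image, filt)
print(scores)       # [0, 0, 0, 10, 0]
print(max(scores))  # 10: the max keeps the best match, wherever it is
```

The filter matches only when it is shifted by 3 columns, and taking the max returns that score without needing to know the position in advance.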

    3.3 Other Operations Used in CNNs

    We have tried to explain why each component is useful in a CNN.

    The most important component is the cross-correlation between the input and the filters. We also explained that small filters can be useful, and how max pooling handles objects that can be anywhere in an image.

    There are also other steps commonly used in CNNs, such as using several layers in a row or applying non-linear activation functions.

    These steps make the model more flexible, more robust, and able to learn richer patterns.

    Why are they useful exactly?

    I will leave this question to you as an exercise.

    Now that you understand the core idea, try to think about how each of these steps helps a CNN go further, and try to come up with some concrete examples in Excel.

    Conclusion

    Simulating a CNN in Excel is a fun and practical way to see how machines recognize images.

    By working with small matrices and simple filters, we can understand the main steps of a CNN.

    I hope this article gave you some food for thought about what deep learning really is. The difference between machine learning and deep learning is not only about how deep the model is, but about how it works with representations of images and data.


