    The Machine Learning “Advent Calendar” Day 7: Decision Tree Classifier

By ProfitlyAI · December 7, 2025


Yesterday, we explored how a Decision Tree Regressor chooses its optimal split by minimizing the Mean Squared Error (MSE).

Today, for Day 7 of the Machine Learning “Advent Calendar”, we continue with the same approach but with a Decision Tree Classifier, the classification counterpart of yesterday’s model.

A quick intuition experiment with two simple datasets

Let us begin with a very small toy dataset that I generated, with one numerical feature and one target variable with two classes: 0 and 1.

The idea is to cut the dataset into two parts, based on one rule. But the question is: what should this rule be? What is the criterion that tells us which split is better?

Now, even if we do not know the mathematics yet, we can already look at the data and guess possible split points.

And visually, it could be 8 or 12, right?

But the question is which one is more appropriate numerically.

Decision Tree Classifier in Excel – image by author

If we think intuitively:

• With a split at 8:
  • left side: no misclassification
  • right side: one misclassification
• With a split at 12:
  • right side: no misclassification
  • left side: two misclassifications

So clearly, the split at 8 feels better. We can verify this counting logic in code, as in the sketch below.
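
To check this counting logic outside of Excel, here is a minimal Python sketch. The dataset is hypothetical (the article's exact values are not reproduced), chosen so that the two candidate splits behave as described:

```python
# Hypothetical toy data: (x, label) pairs with two classes, 0 and 1
data = [(2, 0), (4, 0), (6, 0), (9, 1), (10, 0), (11, 1), (13, 1), (15, 1)]

def misclassified(data, split):
    """Count errors when each side predicts its majority class."""
    errors = 0
    for side in (lambda x: x < split, lambda x: x >= split):
        labels = [y for x, y in data if side(x)]
        if labels:
            majority = max(set(labels), key=labels.count)
            errors += sum(y != majority for y in labels)
    return errors

print(misclassified(data, 8))   # 1 error: one class-0 point on the right
print(misclassified(data, 12))  # 2 errors: two class-1 points on the left
```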

Now, let us look at an example with three classes. I added some more random data and made 3 classes.

Here I label them 0, 1, 3, and I plot them vertically.

But we must be careful: these numbers are just class names, not numeric values. They should not be interpreted as “ordered”.

So the intuition is always: how homogeneous is each region after the split?

But it is harder to visually determine the best split.

Now, we need a mathematical way to express this idea.

That is exactly the topic of the next chapter.

Impurity measure as the criterion for the split

In the Decision Tree Regressor, we already know:

• The prediction for a region is the average of the target.
• The quality of a split is measured by the MSE.

In the Decision Tree Classifier:

• The prediction for a region is the majority class of the region.
• The quality of a split is measured by an impurity measure: Gini impurity or Entropy.

Both are standard in textbooks, and both are available in scikit-learn. Gini is used by default.

BUT, what is this impurity measure, really?

If you look at the curves of Gini and Entropy, they both behave the same way:

• They are 0 when the node is pure (all samples have the same class).
• They reach their maximum when the classes are evenly mixed (50% / 50%).
• The curve is smooth, symmetric, and increases with disorder.

This is the essential property of any impurity measure:

Impurity is low when groups are clean, and high when groups are mixed.

Decision Tree Classifier in Excel – Gini and entropy – image by author

So we will use these measures to decide which split to create.
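
As a quick sanity check of these properties, here is a small Python sketch (mine, not the article's Excel formulas) that evaluates both measures for a binary node where class 1 has proportion p:

```python
import math

def gini_binary(p):
    """Binary Gini impurity: 1 - p^2 - (1-p)^2, which simplifies to 2p(1-p)."""
    return 2 * p * (1 - p)

def entropy_binary(p):
    """Binary entropy in bits: -p*log2(p) - (1-p)*log2(1-p)."""
    if p in (0.0, 1.0):
        return 0.0  # a pure node has zero entropy
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.0, 0.1, 0.3, 0.5):
    print(f"p={p}: gini={gini_binary(p):.3f}, entropy={entropy_binary(p):.3f}")
# Both measures are 0 at p=0 and maximal at p=0.5 (Gini = 0.5, entropy = 1.0)
```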

Split with One Continuous Feature

Just like for the Decision Tree Regressor, we will follow the same structure.

List all possible splits

Exactly like the regressor version, with one numerical feature, the only splits we need to test are the midpoints between consecutive sorted x values.
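
With the hypothetical x values from the earlier sketch, the candidates are easy to enumerate:

```python
# Sorted, de-duplicated feature values (hypothetical data from above)
xs = sorted({2, 4, 6, 9, 10, 11, 13, 15})

# Candidate splits are the midpoints between consecutive sorted values
candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
print(candidates)  # [3.0, 5.0, 7.5, 9.5, 10.5, 12.0, 14.0]
```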

For each split, compute impurity on each side

Let us take a split value, for example, x = 5.5.

We separate the dataset into two regions:

• Region L: x < 5.5
• Region R: x ≥ 5.5

For each region:

1. We count the total number of observations
2. We compute the Gini impurity
3. Finally, we compute the weighted impurity of the split, as sketched below

Decision Tree Classifier in Excel – image by author
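
The same three steps fit in a short Python sketch, again on the hypothetical data:

```python
def gini(labels):
    """Gini impurity of a region, from its list of class labels."""
    n = len(labels)
    return 1 - sum((labels.count(c) / n) ** 2 for c in set(labels))

data = [(2, 0), (4, 0), (6, 0), (9, 1), (10, 0), (11, 1), (13, 1), (15, 1)]
left  = [y for x, y in data if x < 5.5]   # Region L
right = [y for x, y in data if x >= 5.5]  # Region R

# Weighted impurity: each region's Gini weighted by its share of the points
n = len(data)
weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
print(gini(left), round(gini(right), 3), round(weighted, 3))  # 0.0 0.444 0.333
```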

Select the split with the lowest impurity

Like in the regressor case:

• List all possible splits
• Compute the impurity for each
• The optimal split is the one with the minimal impurity

Decision Tree Classifier in Excel – image by author

Synthetic Table of All Splits

To make everything automatic in Excel, we organize all calculations in one table, where:

• each row corresponds to one candidate split,
• for each row, we compute:
  • Gini of the left region,
  • Gini of the right region,
  • and the overall weighted Gini of the split.

This table gives a clean, compact overview of every possible split, and the best split is simply the one with the lowest value in the final column.

Decision Tree Classifier in Excel – image by author
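
The same table can be rebuilt in a few lines of Python with pandas (a sketch on the same hypothetical data, not the article's spreadsheet):

```python
import pandas as pd

def gini(labels):
    """Gini impurity of a region, from its list of class labels."""
    n = len(labels)
    return 1 - sum((labels.count(c) / n) ** 2 for c in set(labels))

data = [(2, 0), (4, 0), (6, 0), (9, 1), (10, 0), (11, 1), (13, 1), (15, 1)]
xs = sorted({x for x, _ in data})

rows = []
for s in [(a + b) / 2 for a, b in zip(xs, xs[1:])]:  # one row per candidate split
    left  = [y for x, y in data if x < s]
    right = [y for x, y in data if x >= s]
    w = (len(left) * gini(left) + len(right) * gini(right)) / len(data)
    rows.append({"split": s, "gini_left": gini(left),
                 "gini_right": gini(right), "weighted_gini": w})

table = pd.DataFrame(rows)
print(table)
print("best split:", table.loc[table["weighted_gini"].idxmin(), "split"])  # 7.5
```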

Multi-class classification

Until now, we worked with two classes. But the Gini impurity extends naturally to three classes, and the logic of the split stays exactly the same.

Nothing changes in the structure of the algorithm:

• we list all possible splits,
• we compute the impurity on each side,
• we take the weighted average,
• we select the split with the lowest impurity.

Only the formula for the Gini impurity becomes slightly longer.

Gini impurity with three classes

If a region contains proportions p1, p2, p3 for the three classes, then the Gini impurity is:

Gini = 1 − p1² − p2² − p3²

The same idea as before: a region is “pure” when one class dominates, and the impurity becomes large when classes are mixed.

Left and Right regions

For each split:

• Region L contains some observations of classes 1, 2, and 3
• Region R contains the remaining observations

For each region:

1. count how many points belong to each class
2. compute the proportions p1, p2, p3
3. compute the Gini impurity using the formula above

Everything is exactly the same as in the binary case, just with one more term.
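
As a tiny worked example with hypothetical counts, take a region holding 4, 3, and 1 points of the three classes:

```python
counts = [4, 3, 1]                     # points of class 1, 2, 3 in the region
n = sum(counts)                        # 8 observations in total
props = [c / n for c in counts]        # p1 = 0.5, p2 = 0.375, p3 = 0.125
gini = 1 - sum(p ** 2 for p in props)  # 1 - (0.25 + 0.140625 + 0.015625)
print(round(gini, 4))                  # 0.5938
```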

Summary Table for 3-class splits

Just like before, we collect all computations in a single table:

• each row is one possible split
• we count class 1, class 2, class 3 on the left
• we count class 1, class 2, class 3 on the right
• we compute Gini (Left), Gini (Right), and the weighted Gini

The split with the smallest weighted impurity is the one chosen by the decision tree.

Decision Tree Classifier in Excel – image by author

We can easily generalize the algorithm to K classes, using the following formulas to calculate Gini or Entropy:

Gini = 1 − Σ pk²
Entropy = − Σ pk · log2(pk)

(where the sums run over the K classes, and pk is the proportion of class k)

Decision Tree Classifier in Excel – image by author
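
In code, the K-class versions are one-liners over the class counts. A minimal sketch:

```python
import math

def gini(counts):
    """Gini impurity of a region: 1 - sum(p_k^2), from class counts."""
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

def entropy(counts):
    """Entropy in bits: -sum(p_k * log2(p_k)), skipping empty classes."""
    n = sum(counts)
    return sum(-(c / n) * math.log2(c / n) for c in counts if c > 0)

print(gini([5, 5]), entropy([5, 5]))    # 0.5 1.0 -> maximally mixed node
print(gini([10, 0]), entropy([10, 0]))  # 0.0 0.0 -> pure node
```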

How Different Are Impurity Measures, Really?

Now, we always mention Gini or Entropy as the criterion, but do they really differ? Looking at the mathematical formulas, some might say they do.

The answer is: not that much.

In fact, in almost all practical situations:

• Gini and Entropy select the same split
• The tree structure is practically identical
• The predictions are the same

Why?

Because their curves look extremely similar.

They both peak at 50% mixing and drop to zero at purity.

The only difference is the shape of the curve:

• Gini is a quadratic function. It penalizes misclassification more linearly.
• Entropy is a logarithmic function, so it penalizes uncertainty a bit more strongly near 0.5.

But the difference is tiny in practice, and you can check it in Excel!
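
You can also check this empirically with scikit-learn by fitting the same tree under both criteria. On a simple dataset such as iris the two usually agree point for point (a sketch, not a guarantee for every dataset):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
gini_tree = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)
entropy_tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

# Fraction of samples where the two trees make the same prediction
agreement = (gini_tree.predict(X) == entropy_tree.predict(X)).mean()
print(agreement)  # typically 1.0 here
```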

Other impurity measures?

Another natural question: is it possible to invent/use other measures?

Yes, you could invent your own function, as long as:

• It is 0 when the node is pure
• It is maximal when classes are mixed
• It is smooth and strictly increasing in “disorder”

For example: Impurity = 4 * p0 * p1

This is another valid impurity measure. And it is actually equal to Gini multiplied by a constant when there are only two classes.

So again, it gives the same splits. If you are not convinced, you can verify it yourself, as in the sketch below.
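
Here is a quick numeric check of that constant factor. For two classes, Gini = 2·p·(1−p), so 4·p0·p1 = 4·p·(1−p) is exactly twice the Gini, and multiplying a criterion by a constant never changes which split attains the minimum:

```python
for p in (0.1, 0.25, 0.5, 0.9):  # p = proportion of class 1, p0 = 1 - p
    gini = 2 * p * (1 - p)       # binary Gini impurity
    custom = 4 * p * (1 - p)     # the measure 4 * p0 * p1
    print(p, custom / gini)      # the ratio is always exactly 2
```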

Here are some other measures that can also be used.

Decision Tree Classifier in Excel – many impurity measures – image by author

Exercises in Excel

Tests with other parameters and features

Once you build the first split, you can extend your file:

• Try Entropy instead of Gini
• Try adding categorical features
• Try building the next split
• Try changing the max depth and observe under- and over-fitting
• Try making a confusion matrix for the predictions

These simple tests already give you intuition for how real decision trees behave.
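
If you later want a reference to compare your Excel results against, scikit-learn covers the last two exercises in a few lines. A sketch using the iris dataset as a stand-in:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for depth in (1, 3, None):  # too shallow underfits; unlimited depth can overfit
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
    print(depth, tree.score(X_tr, y_tr), tree.score(X_te, y_te))

print(confusion_matrix(y_te, tree.predict(X_te)))  # rows: true, columns: predicted
```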

Implementing the rules for the Titanic Survival Dataset

A natural follow-up exercise is to recreate decision rules for the well-known Titanic Survival Dataset (CC0 / Public Domain).

First, we can start with only two features: sex and age.

Implementing the rules in Excel is long and a bit tedious, but that is exactly the point: it makes you realize what decision rules really look like.

They are nothing more than a chain of IF / ELSE statements, repeated again and again.

That is the true nature of a decision tree: simple rules, stacked on top of one another.

Decision Tree Classifier in Excel for the Titanic Survival Dataset (CC0 / Public Domain) – image by author
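
To make this concrete, here is what such a rule set looks like as code. The thresholds are hypothetical and only illustrate the shape of the rules, not the actual tree fitted on the Titanic data:

```python
def predict_survival(sex, age):
    """Hypothetical decision rules using only two features: sex and age."""
    if sex == "female":
        return 1          # predict survived
    else:
        if age < 10:
            return 1      # predict survived (young boys)
        else:
            return 0      # predict did not survive

print(predict_survival("female", 30))  # 1
print(predict_survival("male", 40))    # 0
```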

Conclusion

Implementing a Decision Tree Classifier in Excel is surprisingly accessible.

With a few formulas, you discover the heart of the algorithm:

• list the possible splits
• compute the impurity
• choose the cleanest split

Decision Tree Classifier in Excel – image by author

This simple mechanism is the foundation of more advanced ensemble models like Gradient Boosted Trees, which we will discuss later in this series.

And stay tuned for Day 8 tomorrow!



