
    The Machine Learning “Advent Calendar” Day 6: Decision Tree Regressor

By ProfitlyAI · December 6, 2025 · 7 min read


For 5 days of this Machine Learning "Advent Calendar", we explored 5 models (or algorithms) that are all based on distances (local Euclidean distance, or global Mahalanobis distance).

So it's time to change the approach, right? We'll come back to the notion of distance later.

For today, we'll see something completely different: Decision Trees!

Introduction with a Simple Dataset

Let's use a simple dataset with just one continuous feature.

As always, the idea is that you can visualize the results yourself. Then you have to think about how to make the computer do it.

Decision tree regression in Excel: simple dataset (generated by the author) — image by author

We can visually guess that for the first split, there are two possible values, one around 5.5 and the other around 12.

Now the question is, which one do we choose?

That is exactly what we are going to find out: how do we determine the value of the first split, with an implementation in Excel?

Once we determine the value of the first split, we can apply the same process for the following splits.

That is why we will only implement the first split in Excel.

The Algorithmic Principle of Decision Tree Regressors

I wrote an article about always distinguishing the three steps of machine learning in order to learn it effectively, so let's apply that principle to Decision Tree Regressors.

For the first time, we have a "true" machine learning model, with non-trivial steps for all three.

What is the model?

The model here is a set of rules that partition the dataset, and for each partition, we assign a value. Which one? The average value of y over all the observations in the same group.

So while k-NN predicts with the average value of the nearest neighbors (similar observations in terms of the feature variables), the Decision Tree Regressor predicts with the average value of a group of observations (similar in terms of the feature variable).
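As a minimal sketch of this idea (using a made-up toy dataset, not the article's actual sheet): a single rule partitions the observations into two groups, and each group predicts the average of its y values.

```python
import numpy as np

# Hypothetical toy data: one continuous feature x and a target y.
x = np.array([1, 3, 4, 5, 6, 8, 10, 11, 13, 14], dtype=float)
y = np.array([2, 1, 2, 1, 6, 7, 6, 7, 12, 13], dtype=float)

# A hand-picked rule partitions the data; each region predicts
# the average y of the observations it contains.
threshold = 5.5
left_mean = y[x < threshold].mean()    # prediction for region x < 5.5
right_mean = y[x >= threshold].mean()  # prediction for region x >= 5.5

def predict(value):
    """Apply the one-rule 'tree': route the input to a region, return its mean."""
    return left_mean if value < threshold else right_mean

print(predict(2.0), predict(9.0))
```

Every input falling into the same region gets the same prediction, which is what makes the tree's prediction function piecewise constant.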

Model fitting or training process

For a decision tree, we also call this process fully growing a tree. In the case of a Decision Tree Regressor, the fully grown tree has leaves that each contain only one observation, and thus an MSE of zero.

Growing a tree consists of recursively partitioning the input data into smaller and smaller chunks or regions. For each region, a prediction can be calculated.

In the case of regression, the prediction is the average of the target variable over the region.

At each step of the building process, the algorithm selects the feature and the split value that optimize a given criterion; in the case of a regressor, it is usually the Mean Squared Error (MSE) between the actual values and the predictions, which the split should minimize.

Model Tuning or Pruning

For a decision tree, the general term "model tuning" is also called pruning, because it can be seen as dropping nodes and leaves from a fully grown tree.

Equivalently, we can say that the building process stops when a criterion is met, such as a maximum depth or a minimum number of samples in each leaf node. These are the hyperparameters that can be optimized during the tuning process.
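With scikit-learn, these stopping criteria are constructor arguments. A quick sketch on a hypothetical toy dataset (the parameter values here are illustrative, not tuned):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data (not the article's exact sheet).
X = np.array([[1], [3], [4], [5], [6], [8], [10], [11], [13], [14]], dtype=float)
y = np.array([2, 1, 2, 1, 6, 7, 6, 7, 12, 13], dtype=float)

# Fully grown: splitting continues until every leaf is pure,
# so the in-sample MSE is zero.
full = DecisionTreeRegressor(random_state=0).fit(X, y)

# "Pre-pruned": max_depth and min_samples_leaf stop the growth early.
pruned = DecisionTreeRegressor(
    max_depth=2, min_samples_leaf=2, random_state=0
).fit(X, y)

print(full.get_depth(), pruned.get_depth())
```

In practice these hyperparameters would be chosen by cross-validation rather than fixed by hand.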

Inference process

Once the decision tree regressor is built, it can be used to predict the target variable for new input instances by applying the rules and traversing the tree from the root node to the leaf node that corresponds to the input's feature values.

The predicted target value for the input instance is then the mean of the target values of the training samples that fall into the same leaf node.
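The traversal itself is handled by `predict`. A sketch with a depth-1 tree on a hypothetical toy dataset: each new point is routed to one of the two leaves, and it receives the mean of the training targets in that leaf.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data.
X = np.array([[1], [3], [4], [5], [6], [8], [10], [11], [13], [14]], dtype=float)
y = np.array([2, 1, 2, 1, 6, 7, 6, 7, 12, 13], dtype=float)

# A depth-1 tree: a single split at the root, two leaf regions.
tree = DecisionTreeRegressor(max_depth=1, random_state=0).fit(X, y)

# Inference: each new input is routed from the root to a leaf,
# and the prediction is the mean of the training targets in that leaf.
new_points = np.array([[2.0], [9.0]])
preds = tree.predict(new_points)
print(preds)  # one leaf-mean per input
```

Note that any two inputs landing in the same leaf get exactly the same prediction.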

Implementation in Excel of the First Split

Here are the steps we will follow:

• List all possible splits
• For each split, calculate the MSE (Mean Squared Error)
• Select the split that minimizes the MSE as the optimal next split

All possible splits

First, we have to list all the possible splits, which are the average values of two consecutive values. There is no need to test more values.
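The midpoint rule can be written in one line. A sketch with hypothetical feature values (stand-ins for the article's sheet):

```python
import numpy as np

# Hypothetical sorted feature values.
x = np.array([1, 3, 4, 5, 6, 8, 10, 11, 13, 14], dtype=float)

# Candidate splits: midpoints between consecutive values.
# Any threshold strictly between two neighbors produces the same
# partition of the data, so only these midpoints need to be tested.
candidates = (x[:-1] + x[1:]) / 2
print(candidates)  # midpoints: 2.0, 3.5, 4.5, 5.5, 7.0, 9.0, 10.5, 12.0, 13.5
```

For n distinct values there are only n − 1 candidates to evaluate.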

Decision tree regression in Excel with possible splits — image by author

MSE calculation for each possible split

As a starting point, we can calculate the MSE before any splits. This means that the prediction is just the average value of y, and the MSE is then equal to the variance of y.

Now, the idea is to find a split such that the MSE after the split is lower than before. It is possible that no split significantly improves the performance (that is, lowers the MSE); in that case, the final tree would be trivial: just the average value of y.

For each possible split, we can then calculate the MSE (Mean Squared Error). The image below shows the calculation for the first possible split, which is x = 2.

Decision tree regression in Excel: MSE for all possible splits — image by author

We can see the details of the calculation:

1. Cut the dataset into two regions: with the value x = 2, we get two possibilities, x < 2 or x > 2, so the x-axis is cut into two parts.
2. Calculate the prediction: for each part, we calculate the average of y. That is the potential prediction for y.
3. Calculate the error: then we compare the prediction to the actual value of y.
4. Calculate the squared error: for each observation, we can calculate the squared error.

Decision tree regression in Excel with all possible splits — image by author
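The four steps above can be sketched as a small function (again on a hypothetical toy dataset, not the article's sheet):

```python
import numpy as np

# Hypothetical toy data.
x = np.array([1, 3, 4, 5, 6, 8, 10, 11, 13, 14], dtype=float)
y = np.array([2, 1, 2, 1, 6, 7, 6, 7, 12, 13], dtype=float)

def split_mse(threshold):
    # 1. Cut the dataset into two regions.
    left, right = y[x < threshold], y[x >= threshold]
    # 2. The prediction in each region is its average of y.
    pred = np.where(x < threshold, left.mean(), right.mean())
    # 3. + 4. Error per observation, squared, then averaged over the dataset.
    return ((y - pred) ** 2).mean()

print(split_mse(2.0))  # MSE for the first candidate split
```

This is exactly the column-by-column computation the Excel sheet performs, just collapsed into vectorized operations.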

Optimal split

For each possible split, we do the same to obtain the MSE. In Excel, we can copy and paste the formulas, and the only value that changes is the candidate split value for x.

Decision tree regression in Excel: splits — image by author

Then we can plot the MSE on the y-axis and the possible splits on the x-axis, and we can see that there is a minimum of the MSE at x = 5.5; this is exactly the result obtained with Python code.

Decision tree regression in Excel: minimization of MSE — image by author
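The same scan, and the cross-check against scikit-learn, can be sketched as follows (with a hypothetical toy dataset constructed so that the optimal split also falls at 5.5; the article's own sheet data may differ):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data.
x = np.array([1, 3, 4, 5, 6, 8, 10, 11, 13, 14], dtype=float)
y = np.array([2, 1, 2, 1, 6, 7, 6, 7, 12, 13], dtype=float)

def split_mse(t):
    """MSE of the two-region prediction induced by threshold t."""
    pred = np.where(x < t, y[x < t].mean(), y[x >= t].mean())
    return ((y - pred) ** 2).mean()

# Scan every candidate midpoint and keep the MSE minimizer.
candidates = (x[:-1] + x[1:]) / 2
best = min(candidates, key=split_mse)

# A depth-1 scikit-learn tree (a "stump") finds the same threshold.
stump = DecisionTreeRegressor(max_depth=1).fit(x.reshape(-1, 1), y)
print(best, stump.tree_.threshold[0])
```

The `tree_.threshold` attribute exposes the learned split values, so the brute-force scan and the library agree on the root split.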

An exercise you can try out

Now, you can play with the Google Sheet:

• You can modify the dataset; the MSE will be updated, and you will see the optimal cut
• You can introduce a categorical feature
• You can try to find the next split
• You can change the criterion: instead of MSE, you can use absolute error, Poisson, or friedman_mse, as indicated in the documentation of DecisionTreeRegressor
• You can change the target variable to a binary variable. Normally, this becomes a classification task, but 0 and 1 are also numbers, so the MSE criterion can still be applied. However, if you want to create a proper classifier, you have to apply the usual criteria, Entropy or Gini. That is for the next article.

Conclusion

Using Excel, it is possible to implement one split to gain more insight into how Decision Tree Regressors work. Even though we did not create a full tree, it is still instructive, since the most important part is finding the optimal split among all possible splits.

One more thing

Have you noticed something interesting about how features are handled by distance-based models compared with decision trees?

For distance-based models, everything must be numeric. Continuous features stay continuous, and categorical features must be transformed into numbers. The model compares points in space, so everything has to live on a numeric axis.

Decision Trees do the inverse: they cut features into groups. A continuous feature becomes intervals. A categorical feature stays categorical.

And a missing value? It simply becomes another category. There is no need to impute first. The tree can naturally handle it by sending all "missing" values to one branch, just like any other group.


