
    Is Your Model Time-Blind? The Case for Cyclical Feature Encoding

By ProfitlyAI | December 24, 2025 | 8 min read


The Midnight Paradox

Think about this. You’re building a model to predict electricity demand or taxi pickups. So you feed it time (say, minutes) starting at midnight. Clean and simple. Right?

Now your model sees 23:59 (minute 1439 of the day) and 00:01 (minute 1 of the day). To you, they’re two minutes apart. To your model, they’re very far apart. That’s the midnight paradox. And yes, your model is probably time-blind.

Why does this happen?

Because most machine learning models treat numbers as straight lines, not circles.

Linear regression, KNN, SVMs, and even neural networks take numbers at face value, assuming larger numbers are “more” than smaller ones. They don’t know that time wraps around. Midnight is the edge case they never forgive.

If you’ve ever added hourly features to your model without success, wondering later why it struggles around day boundaries, this is likely why.
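The gap is easy to quantify. A minimal sketch (plain Python, no library assumptions) comparing the distance a linear model perceives between 23:59 and 00:01 against the true wrap-around distance on a 1440-minute day:

```python
# Minutes since midnight for two moments that are really 2 minutes apart
a, b = 1439, 1  # 23:59 and 00:01

# Linear view: the model sees a huge gap
linear_gap = abs(a - b)  # 1438

# Cyclic view: wrap around the 1440-minute day
cyclic_gap = min(abs(a - b), 1440 - abs(a - b))  # 2

print(linear_gap, cyclic_gap)
```

A distance-based model working on the raw minute column uses the first number; the second is the one that reflects reality.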

The Failure of Standard Encoding

Let’s talk about the classic approaches. You’ve probably used at least one of them.

You encode hours as numbers from 0 to 23. Now there’s an artificial cliff between hour 23 and hour 0. The model thinks midnight is the biggest jump of the day. But is midnight really more different from 11 PM than 10 PM is from 9 PM?

Of course not. But your model doesn’t know that.

Here’s how the hours look in “linear” mode.

# Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Generate data
date_today = pd.to_datetime('today').normalize()
datetime_24_hours = pd.date_range(start=date_today, periods=24, freq='h')
df = pd.DataFrame({'dt': datetime_24_hours})
df['hour'] = df['dt'].dt.hour

# Calculate sine and cosine
df["hour_sin"] = np.sin(2 * np.pi * df["hour"] / 24)
df["hour_cos"] = np.cos(2 * np.pi * df["hour"] / 24)

# Plot the hours in linear mode
plt.figure(figsize=(15, 5))
plt.plot(df['hour'], [1]*24, linewidth=3)
plt.title('Hours in Linear Mode')
plt.xlabel('Hour')
plt.xticks(np.arange(0, 24, 1))
plt.ylabel('Value')
plt.show()
Hours in linear mode. Image by the author.

What if we one-hot encode the hours? Twenty-four binary columns. Problem solved, right? Well… partially. You fixed the artificial gap, but you lost proximity: 2 AM is no longer closer to 3 AM than to 10 PM.
You also exploded dimensionality. For trees, that’s annoying. For linear models, it’s probably inefficient.
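You can see the lost proximity directly: with one-hot encoding, every pair of distinct hours is exactly the same distance apart. A minimal sketch:

```python
import numpy as np

one_hot = np.eye(24)  # each hour becomes a 24-dim indicator vector

# Euclidean distances between one-hot encodings
d_2am_3am = np.linalg.norm(one_hot[2] - one_hot[3])
d_2am_10pm = np.linalg.norm(one_hot[2] - one_hot[22])

print(d_2am_3am, d_2am_10pm)  # both sqrt(2): proximity is lost
```

Adjacent hours and opposite hours end up equally far apart, so the model gets no notion of "near in time" at all.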

So, let’s move on to a better alternative.

The Solution: Trigonometric Mapping

Here’s the mindset shift:

Stop thinking about time as a line. Think about it as a circle.

A 24-hour day loops back to itself, so your encoding should loop too. Each hour becomes an evenly spaced point on a circle. And to represent a point on a circle, you don’t use one number; you use two coordinates: x and y.

That’s where sine and cosine come in.

    The geometry behind it

Every angle on a circle can be mapped to a unique point using sine and cosine. This gives your model a smooth, continuous representation of time.

plt.figure(figsize=(5, 5))
plt.scatter(df['hour_sin'], df['hour_cos'], linewidth=3)
plt.title('Hours in Cyclical Mode')
plt.xlabel('hour_sin')
plt.ylabel('hour_cos')
plt.show()
Hours in cyclical mode after sine and cosine. Image by the author.

Here’s the math for encoding the hours of the day as a cycle:

• First, 2 * π * hour / 24 converts each hour into an angle. Midnight and 11 PM end up almost at the same place on the circle.
• Then sine and cosine project that angle into two coordinates.
• Together, these two values uniquely define the hour. Now 23:00 and 00:00 are close in feature space. Exactly what you wanted all along.

The same idea works for minutes, days of the week, or months of the year.
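Since only the period changes between these cases, the transformation generalizes naturally. A minimal sketch of a reusable helper (the name `add_cyclical` is my own, not from the article's repo):

```python
import numpy as np
import pandas as pd

def add_cyclical(frame, col, period):
    """Append sin/cos columns for a cyclical feature with the given period."""
    angle = 2 * np.pi * frame[col] / period
    frame[f"{col}_sin"] = np.sin(angle)
    frame[f"{col}_cos"] = np.cos(angle)
    return frame

demo = pd.DataFrame({"month": np.arange(1, 13)})
demo = add_cyclical(demo, "month", 12)  # month 12 lands next to month 1
```

The same call works with `period=7` for day of week, `period=60` for minutes, or `period=360` for angles in degrees.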

    Code

Let’s experiment with the Appliances Energy Prediction dataset [4]. We’ll try to improve the prediction using a Random Forest Regressor (a tree-based model).

Candanedo, L. (2017). Appliances Energy Prediction [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5VC8G. Creative Commons 4.0 License.

# Imports
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import root_mean_squared_error
from ucimlrepo import fetch_ucirepo

Get the data.

# Fetch dataset
appliances_energy_prediction = fetch_ucirepo(id=374)

# Data (as pandas dataframes)
X = appliances_energy_prediction.data.features
y = appliances_energy_prediction.data.targets

# To pandas
df = pd.concat([X, y], axis=1)
df['date'] = df['date'].apply(lambda x: x[:10] + ' ' + x[11:])
df['date'] = pd.to_datetime(df['date'])
df['month'] = df['date'].dt.month
df['day'] = df['date'].dt.day
df['hour'] = df['date'].dt.hour
df.head(3)

Let’s first create a quick model with linear time, as our baseline for comparison.

# X and y
# X = df.drop(['Appliances', 'rv1', 'rv2', 'date'], axis=1)
X = df[['hour', 'day', 'T1', 'RH_1', 'T_out', 'Press_mm_hg', 'RH_out', 'Windspeed', 'Visibility', 'Tdewpoint']]
y = df['Appliances']

# Train test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the model
rf = RandomForestRegressor().fit(X_train, y_train)

# Score
print(f'Score: {rf.score(X_train, y_train)}')

# Test RMSE
y_pred = rf.predict(X_test)
rmse = root_mean_squared_error(y_test, y_pred)
print(f'RMSE: {rmse}')

Here are the results.

Score: 0.9395797670166536
RMSE: 63.60964667197874

Next, we’ll encode the cyclical time components (day and hour) and retrain the model.

# Add cyclical sine and cosine for hour and day
df['hour_sin'] = np.sin(2 * np.pi * df['hour'] / 24)
df['hour_cos'] = np.cos(2 * np.pi * df['hour'] / 24)
df['day_sin'] = np.sin(2 * np.pi * df['day'] / 31)
df['day_cos'] = np.cos(2 * np.pi * df['day'] / 31)

# X and y
X = df[['hour_sin', 'hour_cos', 'day_sin', 'day_cos', 'T1', 'RH_1', 'T_out', 'Press_mm_hg', 'RH_out', 'Windspeed', 'Visibility', 'Tdewpoint']]
y = df['Appliances']

# Train test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the model
rf_cycle = RandomForestRegressor().fit(X_train, y_train)

# Score
print(f'Score: {rf_cycle.score(X_train, y_train)}')

# Test RMSE
y_pred = rf_cycle.predict(X_test)
rmse = root_mean_squared_error(y_test, y_pred)
print(f'RMSE: {rmse}')

And the results: we see an improvement of about 1% in the score and one point in the RMSE.

Score: 0.9416365489096074
RMSE: 62.87008070927842

I know this doesn’t look like much, but remember that this toy example uses a simple out-of-the-box model without any data treatment or cleanup. We’re seeing mostly the effect of the sine and cosine transformation.

What’s really happening here is that, in real life, electricity demand doesn’t reset at midnight. And now your model finally sees that continuity.

Why You Need Both Sine and Cosine

Don’t fall into the temptation of using only sine because it feels like enough. One column instead of two. Cleaner, right?

Unfortunately, it breaks uniqueness. On a 24-hour clock, 3 AM and 9 AM produce the same sine value. Different times with identical encodings are bad: the model has no way to tell those hours apart. Not ideal, unless you enjoy confused predictions.

Using both sine and cosine fixes this. Together, they give every hour a unique fingerprint on the circle. Think of it like latitude and longitude: you need both to know where you are.
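A minimal sketch of the ambiguity: sine alone collides for hours mirrored around the vertical axis of the clock circle (3 AM and 9 AM both map to sin(π/4) = sin(3π/4)), while cosine breaks the tie:

```python
import numpy as np

def enc(hour):
    """Map an hour of the day to its (sin, cos) coordinates."""
    angle = 2 * np.pi * hour / 24
    return np.sin(angle), np.cos(angle)

s3, c3 = enc(3)
s9, c9 = enc(9)

# Same sine -> ambiguous if you use sine alone
print(round(s3, 3), round(s9, 3))  # 0.707 0.707
# Cosine disambiguates
print(round(c3, 3), round(c9, 3))  # 0.707 -0.707
```

With both coordinates, every hour gets a distinct point on the unit circle, so no two times share an encoding.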

Real-World Impact & Results

So, does this actually help models? Yes. Especially certain ones.

Distance-based models

KNN and SVMs rely heavily on distance calculations. Cyclical encoding prevents fake “long distances” at boundaries. Your neighbors actually become neighbors again.
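A minimal sketch of why this matters for distance-based models: in the (sin, cos) plane, the distance between hour 23 and hour 0 is exactly the same as between any other pair of adjacent hours, instead of the inflated gap of 23 the raw column implies:

```python
import numpy as np

def enc(hour):
    """Map an hour of the day to a point on the unit circle."""
    angle = 2 * np.pi * hour / 24
    return np.array([np.sin(angle), np.cos(angle)])

# Raw hour column: a fake "long distance" of 23 at the day boundary
print(abs(23 - 0))

# Cyclical feature space: both adjacent pairs are equally close
d_23_0 = np.linalg.norm(enc(23) - enc(0))
d_11_12 = np.linalg.norm(enc(11) - enc(12))
print(round(d_23_0, 3), round(d_11_12, 3))
```

Any adjacent pair of hours ends up at the same chord length (2·sin(π/24) ≈ 0.26), so a KNN model treats 11 PM and midnight as the neighbors they really are.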

Neural networks

Neural networks learn faster with smooth feature spaces. Cyclical encoding removes the sharp discontinuity at midnight. That usually means faster convergence and better stability.

Tree-based models

Gradient boosted trees like XGBoost or LightGBM can eventually learn these patterns on their own, but cyclical encoding gives them a head start. If you care about performance and interpretability, it’s worth it.

When Should You Use This?

Always ask yourself the question: does this feature repeat in a cycle? If yes, consider cyclical encoding.

Common examples are:

• Hour of day
• Day of week
• Month of year
• Wind direction (degrees)

If it loops, you might try encoding it like a loop.
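Wind direction is a nice non-time example from the list above: 359° and 0° are almost the same direction, and the encoding should say so. A minimal sketch with a period of 360:

```python
import numpy as np

degrees = np.array([0, 90, 180, 270, 359])  # wind directions
radians = np.deg2rad(degrees)               # period is 360 degrees
wind_sin, wind_cos = np.sin(radians), np.cos(radians)

# 359 degrees and 0 degrees land nearly on top of each other
gap = np.hypot(wind_sin[-1] - wind_sin[0], wind_cos[-1] - wind_cos[0])
print(round(gap, 4))
```

On the raw degree scale those two directions look 359 apart; on the circle their distance is tiny.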

Before You Go

Time isn’t just a number. It’s a coordinate on a circle.

If you treat it like a straight line, your model can stumble at boundaries and struggle to understand that variable as a cycle, something that repeats and has a pattern.

Cyclical encoding with sine and cosine fixes this elegantly, preserving proximity, reducing artifacts, and helping models learn faster.

So next time your predictions look weird around day changes, try this tool, and let your model shine as it should.

If you liked this content, find more of my work and my contacts at my website.

https://gustavorsantos.me

GitHub Repository

Here’s the complete code for this exercise.

https://github.com/gurezende/Time-Series/tree/main/Sine%20Cosine%20Time%20Encode

References & Further Reading

[1] Encoding cyclical features (minutes and hours) — Stack Exchange: https://stats.stackexchange.com/questions/451295/encoding-cyclical-feature-minutes-and-hours

[2] NumPy trigonometric functions: https://numpy.org/doc/stable/reference/routines.math.html

[3] Encoding cyclical features for deep learning — Kaggle: https://www.kaggle.com/code/avanwyk/encoding-cyclical-features-for-deep-learning

[4] Appliances Energy Prediction Dataset — UCI Machine Learning Repository: https://archive.ics.uci.edu/dataset/374/appliances+energy+prediction


