    Retrieval for Time-Series: How Looking Back Improves Forecasts

    By ProfitlyAI | January 8, 2026



    Everyone knows how it goes: time-series data is hard.

    Traditional forecasting models are unprepared for incidents like sudden market crashes, black swan events, or unusual weather patterns.

    Even large, fancy models like Chronos often struggle because they haven't dealt with that kind of pattern before.

    We can mitigate this with retrieval. With retrieval, we can ask "Has anything like this happened before?" and then use those past examples to guide the forecast.

    In natural language processing (NLP), this idea is called Retrieval-Augmented Generation (RAG). It is becoming popular in the time-series forecasting world too.

    The model considers past situations that look similar to the current one, and from there it can make more reliable predictions.

    How is retrieval-augmented forecasting (RAF) different from traditional time-series forecasting? Retrieval adds an explicit memory-access step.

    Instead of:

    Past -> parameters -> forecast

    With retrieval we have:

    Current situation -> similarity search -> concrete past episodes -> forecast

    Retrieval-Augmented Forecasting Cycle. Image by Author | Napkin AI.

    Instead of just using what the model learned during training, the idea is to give it access to a range of relevant situations.

    It's like letting a weather model check, "What did past winters like this one look like?"


    Hey there, I'm Sara Nóbrega, an AI Engineer. If you're working on similar problems or want feedback on applying these ideas, I collect my writing, resources, and mentoring links here.


    In this article, I explore retrieval-augmented forecasting from first principles and show, with concrete examples and code, how retrieval can be used in real forecasting pipelines.

    What Is Retrieval-Augmented Forecasting (RAF)?

    What is RAF? At a very high level, instead of leaning only on what a model learned in training, RAF lets the model actively look up concrete past situations similar to the current one and use their outcomes to guide its prediction.

    Let's look at it in more detail:

    • You convert the current situation (e.g., the past few weeks of a stock time series) into a query.
    • This query is then used to search a database of historical time-series segments and find the most similar patterns.
    • These matches don't need to come from the same stock; the system can also surface similar movements from other stocks or financial products.

    It retrieves those patterns along with what happened afterwards.

    This information is then fed to the forecasting model to help it make better predictions, as in the small sketch below.
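
    To make the loop concrete, here is a minimal, self-contained sketch on invented data. It is only a toy: the "retrieval" is a plain Euclidean nearest-neighbor search over raw windows, the "database" is the series' own history, and the "forecast" just blends the retrieved follow-up with a naive repeat-last-value baseline. Real systems use learned embeddings, approximate search, and a proper forecasting model, as we will see below.

    import numpy as np

    # Toy series: a noisy daily-seasonal signal (invented data)
    rng = np.random.default_rng(0)
    t = np.arange(200)
    series = 10 + 5 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 0.5, size=200)

    window, horizon = 24, 6
    query = series[-window:]  # the current situation

    # Retrieval: scan historical windows for the closest match (Euclidean distance),
    # leaving room for a "future" and excluding the query window itself
    best_dist, best_start = np.inf, None
    for start in range(len(series) - 2 * window - horizon):
        candidate = series[start:start + window]
        dist = np.linalg.norm(candidate - query)
        if dist < best_dist:
            best_dist, best_start = dist, start

    # Retrieve what happened right after the best-matching window...
    retrieved_future = series[best_start + window : best_start + window + horizon]

    # ...and use it to guide a (deliberately simple) forecast
    naive = np.full(horizon, query[-1])
    forecast = 0.5 * naive + 0.5 * retrieved_future

    print("Best-matching episode starts at t =", best_start)
    print("Retrieval-guided forecast:", forecast.round(2))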

    This approach is powerful in:

    • Zero-shot scenarios: when the model faces something it wasn't trained on.
    • Rare or anomalous events: like COVID, sudden financial crashes, and so on.
    • Evolving seasonal trends: where past data contains useful patterns, but they shift over time.

    RAF doesn't replace your forecasting model; it augments it by giving it extra hints and grounding it in relevant historical examples.

    Another example: let's say you want to forecast energy consumption during an unusually hot week.

    Instead of hoping your model remembers how heatwaves affect usage, retrieval finds similar past heatwaves and lets the model take into account what happened then.

    What Do These Models Actually Retrieve?

    The retrieved "knowledge" isn't only raw data. It's context that gives the model clues.

    Here are some common examples:

    Examples of Data Retrieval. Image by Author | Napkin AI.

    As you can see, retrieval focuses on meaningful historical situations, like rare shocks, seasonal effects, and patterns with similar structure. These give actionable context for the current forecast.

    How Do These Models Retrieve?

    To find similar patterns from the past, these models use structured mechanisms that represent the current situation in a way that makes it easy to search large databases and find the closest matches.

    The code snippets in this section are simplified illustrations meant to build intuition; they are not production code.

    Retrieval methods for time-series forecasting. Image by Author | Napkin AI.

    Some of these methods are:

    Embedding-Based Similarity

    This approach converts time series (or patches/windows of a series) into compact vectors, then compares them with distance metrics like Euclidean distance or cosine similarity.

    In simple terms: the model turns chunks of time-series data into short summaries and then checks which past summaries look most similar to what's happening now.

    Some retrieval-augmented forecasters (e.g., RAFT) retrieve the most similar historical patches from the training data / entire series and then aggregate the retrieved values with attention-like weights.

    In simple terms: it finds similar situations from the past and averages them, paying more attention to the best matches.

    import numpy as np

    # Example: embedding-based retrieval for time-series patches
    # This is a toy example to show the *idea* behind retrieval.
    # In practice:
    # - embeddings are learned by neural networks
    # - similarity search runs over millions of vectors
    # - this logic lives inside a larger forecasting pipeline


    def embed_patch(patch: np.ndarray) -> np.ndarray:
        """
        Convert a short time-series window ("patch") into a compact vector.

        Here we use simple statistics (mean, std, min, max) purely for illustration.
        Real-world systems might use:
          - a trained encoder network
          - shape-based representations
          - frequency-domain features
          - latent vectors from a forecasting backbone
        """
        return np.array([
            patch.mean(),   # average level
            patch.std(),    # volatility
            patch.min(),    # lowest point
            patch.max()     # highest point
        ])


    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        """
        Measure how similar two vectors are.
        Cosine similarity focuses on *direction* rather than magnitude,
        which is often useful for comparing patterns or shapes.
        """
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)


    # Step 1: Represent the current situation

    # A short window representing the current time-series behavior
    query_patch = np.array([10, 12, 18, 25, 14, 11])

    # Turn it into an embedding
    query_embedding = embed_patch(query_patch)


    # Step 2: Represent historical situations

    # Past windows extracted from historical data
    historical_patches = [
        np.array([9, 11, 17, 24, 13, 10]),   # looks similar
        np.array([2, 2, 2, 2, 2, 2]),        # flat, unrelated
        np.array([10, 13, 19, 26, 15, 12])   # very similar
    ]

    # Convert all historical patches into embeddings
    historical_embeddings = [
        embed_patch(patch) for patch in historical_patches
    ]

    # Step 3: Compare and retrieve the most similar past cases

    # Compute similarity scores between the current situation
    # and each historical example
    similarities = [
        cosine_similarity(query_embedding, hist_emb)
        for hist_emb in historical_embeddings
    ]

    # Rank historical patches by similarity
    top_k_indices = np.argsort(similarities)[::-1][:2]

    print("Most similar historical patches:", top_k_indices)

    # Step 4 (conceptual):
    # In a retrieval-augmented forecaster, the model would now:
    # - retrieve the *future outcomes* of those similar patches
    # - weight them by similarity (attention-like weighting)
    # - use them to guide the final forecast
    # This integration step is model-specific and not shown here.
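
    For intuition only, here is one way that last step could look if we follow a RAFT-style recipe: take the futures that followed each retrieved patch and average them with similarity-based (softmax) weights. The `historical_futures` values and the temperature are invented for illustration; real models learn the weighting and fuse the result with a base forecaster.

    import numpy as np

    # Hypothetical futures: what happened right after each historical patch above
    # (same order as historical_patches; values invented for illustration)
    historical_futures = np.array([
        [13.0, 15.0, 16.0],   # future following patch 0
        [2.0, 2.0, 2.0],      # future following patch 1
        [14.0, 16.0, 18.0],   # future following patch 2
    ])

    # Similarity scores for the three patches (e.g., cosine scores from Step 3)
    similarities = np.array([0.97, 0.10, 0.99])

    # Attention-like weighting: softmax over similarities
    # (a lower temperature favors the best matches more sharply)
    temperature = 0.1
    weights = np.exp(similarities / temperature)
    weights /= weights.sum()

    # Retrieval-based suggestion for the next 3 steps:
    # a similarity-weighted average of the retrieved futures
    retrieval_forecast = weights @ historical_futures

    print("Retrieval-based forecast:", retrieval_forecast.round(2))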
    

    Retrieval Tools and Libraries

    1. FAISS
    FAISS is a very fast, GPU-friendly library for similarity search over dense vectors. It works best with large, in-memory datasets, though its structure makes real-time updates harder to implement.

    import faiss
    import numpy as np

    # Suppose we already have embeddings for historical windows
    d = 128  # embedding dimension
    xb = np.random.randn(100_000, d).astype("float32")  # historical embeddings
    xq = np.random.randn(1, d).astype("float32")        # query embedding

    index = faiss.IndexFlatIP(d)   # inner product (often used with normalized vectors for cosine-like behavior)
    index.add(xb)

    k = 5
    scores, ids = index.search(xq, k)
    print("Nearest neighbors (ids):", ids)
    print("Similarity scores:", scores)

    # Some FAISS indexes/algorithms can run on GPU.
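
    One small addition to the snippet above: IndexFlatIP scores by raw inner product, so if you want cosine-style similarity you would typically L2-normalize both sides first (i.e., call these before index.add(xb) and before searching with xq):

    # Normalize in place so that inner product == cosine similarity
    faiss.normalize_L2(xb)
    faiss.normalize_L2(xq)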
    

    2. Nearest-neighbor lookup (Annoy)
    The Annoy library is relatively lightweight and easy to work with.

    It is best suited to historical datasets that stay largely static, since any modification to the dataset requires rebuilding the index.

    from annoy import AnnoyIndex
    import numpy as np

    # Number of values in each embedding vector.
    # The "length" of each fingerprint.
    f = 64

    # Create an Annoy index.
    # This object will store many past embeddings and help us quickly find the most similar ones.
    ann = AnnoyIndex(f, "angular")
    # "angular" distance is commonly used to compare patterns
    # and behaves similarly to cosine similarity.

    # Add historical embeddings (past situations).
    # Each item represents a compressed version of a past time-series window.
    # Here we use random numbers just as an example.
    for i in range(10000):
        ann.add_item(i, np.random.randn(f).tolist())

    # Build the search structure.
    # This step organizes the data so similarity searches are fast.
    # After this, the index becomes read-only.
    ann.build(10)

    # Save the index to disk.
    # This lets us load it later without rebuilding everything.
    ann.save("hist.ann")

    # Create a query embedding.
    # This represents the current situation we want to compare
    # against past situations.
    q = np.random.randn(f).tolist()

    # Find the 5 most similar past embeddings.
    # Annoy returns the IDs of the closest matches.
    neighbors = ann.get_nns_by_vector(q, 5)

    print("Nearest neighbors:", neighbors)

    # Important note:
    # Once the index is built, you cannot add new items.
    # If new historical data appears, the index must be rebuilt.
    

    3. Qdrant / Pinecone

    Qdrant and Pinecone are like Google for embeddings.

    You store lots of vector "fingerprints" (plus extra tags like city/season), and when you have a new fingerprint, you ask:

    "Show me the most similar ones, but only from this city/season/store type."
    That's what makes them easier than rolling your own retrieval: they handle fast search and filtering!

    Qdrant calls metadata "payload", and you can filter search results using conditions.

    # Example only (for intuition). Real code needs a running Qdrant instance + real embeddings.

    from qdrant_client import QdrantClient, models

    client = QdrantClient(url="http://localhost:6333")

    collection = "time_series_windows"

    # Pretend this is the embedding of the *current* time-series window
    query_vector = [0.12, -0.03, 0.98, 0.44]  # shortened for readability

    # Filter = "only consider past windows from New York in summer"
    # Qdrant's documentation shows filters built from FieldCondition + MatchValue.
    query_filter = models.Filter(
        must=[
            models.FieldCondition(
                key="city",
                match=models.MatchValue(value="New York"),
            ),
            models.FieldCondition(
                key="season",
                match=models.MatchValue(value="summer"),
            ),
        ]
    )

    # In real usage, you'd call search/query and get back the closest matches
    # plus their payload (metadata) if you request it.
    results = client.search(
        collection_name=collection,
        query_vector=query_vector,
        query_filter=query_filter,
        limit=5,
        with_payload=True,   # return metadata so you can inspect what you retrieved
    )

    print(results)

    # What you'd do next (conceptually):
    # - take the matched IDs
    # - load the actual historical windows behind them
    # - feed those windows (or their outcomes) into your forecasting model
    

    Pinecone stores metadata key-value pairs alongside vectors and lets you filter at query time (including $eq) and return metadata.

    # Example only (for intuition). Real code needs an API key + an index host.

    from pinecone import Pinecone

    pc = Pinecone(api_key="YOUR_API_KEY")
    index = pc.Index(host="INDEX_HOST")

    # Pretend this is the embedding of the current time-series window
    query_vector = [0.12, -0.03, 0.98, 0.44]  # shortened for readability

    # Ask for the most similar past windows, but only where:
    # city == "New York" AND season == "summer"
    # Pinecone docs show query-time filtering and `$eq`.
    res = index.query(
        namespace="windows",
        vector=query_vector,
        top_k=5,
        filter={
            "city": {"$eq": "New York"},
            "season": {"$eq": "summer"},
        },
        include_metadata=True,  # return tags so you can sanity-check matches
        include_values=False
    )

    print(res)

    # Conceptually next:
    # - use the returned IDs to fetch the underlying historical windows/outcomes
    # - condition your forecast on those retrieved examples
    

    Why do vector DBs help? They let you do similarity search plus "SQL-like WHERE filters" in a single step, which is hard to do cleanly with a DIY setup (both Qdrant payload filtering and Pinecone metadata filtering are first-class features in their docs).

    Every tool has its trade-offs. For instance, FAISS is great for performance but isn't suited to frequent updates. Qdrant offers flexibility and real-time filtering. Pinecone is easy to set up but SaaS-only.

    Retrieval + Forecasting: How to Combine Them

    After understanding what to retrieve, the next step is to combine that information with the current input.

    How this is done can vary depending on the architecture and the task. There are several strategies for doing this (see image below).

    Strategies for Combining Retrieval and Forecasting. Image by Author | Napkin AI.

    A. Concatenation
    Idea:
    treat retrieved context as "extra input" by appending it to the current sequence (very common in retrieval-augmented generation setups).

    This works well with transformer-based models like Chronos and doesn't require architecture changes.

    import torch

    # x_current: the model's regular input sequence (e.g., the last N timesteps or tokens)
    # shape: [batch, time, d_model]   (or [batch, time] if you think in tokens)
    x_current = torch.randn(8, 128, 256)

    # x_retrieved: retrieved context encoded in the SAME representation space
    # e.g., embeddings for similar past windows (or their summaries)
    # shape: [batch, retrieved_time, d_model]
    x_retrieved = torch.randn(8, 32, 256)

    # Simple fusion: just append the retrieved context to the end of the input sequence
    # Now the model sees: [current history ... + retrieved context ...]
    x_fused = torch.cat([x_current, x_retrieved], dim=1)

    # In practice, you'd also add:
    # - an attention mask (so the model knows what's real vs. padded)
    # - segment/type embeddings (so the model knows which part is retrieved context)
    # Then feed x_fused to your transformer.
    

    B. Cross-Attention Fusion
    Idea:
    keep the "current input" and "retrieved context" separate, and let the model attend to the retrieved context when it needs it. This is the core "fusion in the decoder via cross-attention" pattern used by retrieval-augmented architectures like FiD.

    import torch

    # current_repr: representation of the current time-series window
    # shape: [batch, time, d_model]
    current_repr = torch.randn(8, 128, 256)

    # retrieved_repr: representation of retrieved windows (could be several, concatenated)
    # shape: [batch, retrieved_time, d_model]
    retrieved_repr = torch.randn(8, 64, 256)

    # Think of cross-attention like this:
    # - Queries (Q) come from the current sequence
    # - Keys/Values (K/V) come from the retrieved context
    Q = current_repr
    K = retrieved_repr
    V = retrieved_repr

    # Attention scores: "How much should each current timestep look at each retrieved timestep?"
    scores = torch.matmul(Q, K.transpose(-1, -2)) / (Q.size(-1) ** 0.5)

    # Turn scores into weights (so they sum to 1 across retrieved positions)
    weights = torch.softmax(scores, dim=-1)

    # Weighted sum of retrieved information (this is the "fused" retrieval signal)
    retrieval_signal = torch.matmul(weights, V)

    # Final fused representation: current information + retrieved information
    # (Some models add, some concatenate, some use a learned projection)
    fused = current_repr + retrieval_signal

    # Then the forecasting head reads from `fused` to predict the future.
    

    C. Mixture-of-Experts (MoE)
    Idea: combine two "experts":

    • the retrieval-based forecaster (non-parametric, case-based)
    • the base forecaster (parametric knowledge)

    A "gate" decides which one to trust more at each time step.

    import torch

    # base_pred: forecast from the base model (what it "learned in its weights")
    # shape: [batch, horizon]
    base_pred = torch.randn(8, 24)

    # retrieval_pred: forecast suggested by retrieved similar cases
    # shape: [batch, horizon]
    retrieval_pred = torch.randn(8, 24)

    # context_for_gate: summary of the current situation (could be the last hidden state)
    # shape: [batch, d_model]
    context_for_gate = torch.randn(8, 256)

    # gate: a number between 0 and 1 saying "how much to trust retrieval"
    # (In real models, this is a tiny neural net that reads context_for_gate.)
    gate = torch.sigmoid(torch.randn(8, 1))

    # Mixture: convex combination
    # - if gate ~ 1 -> trust retrieval more
    # - if gate ~ 0 -> trust the base model more
    final_pred = gate * retrieval_pred + (1 - gate) * base_pred

    # In practice:
    # - the gate might be timestep-dependent: shape [batch, horizon, 1]
    # - you can also add training losses to stabilize routing/utilization (common in MoE)
    

    D. Channel Prompting
    Idea:
    treat retrieved series as extra input channels/features (especially natural in multivariate time series, where each variable is a "channel").

    import torch

    # x: multivariate time-series input
    # shape: [batch, time, channels]
    # Example: channels could be [sales, price, promo_flag, temperature, ...]
    x = torch.randn(8, 128, 5)

    # retrieved_series_aligned: retrieved signal aligned to the same time grid
    # Example: average of the top-k similar past windows (or one representative neighbor)
    # shape: [batch, time, retrieved_channels]
    retrieved_series_aligned = torch.randn(8, 128, 2)

    # Channel prompting = append retrieved channels as extra features
    # Now the model gets "normal channels + retrieved channels"
    x_prompted = torch.cat([x, retrieved_series_aligned], dim=-1)

    # In practice you'd likely also include:
    # - a mask or confidence score for retrieved channels
    # - normalization so retrieved signals are on a comparable scale
    # Then feed x_prompted into the forecaster.
    

    Some models even combine several of these strategies.

    A common approach is to retrieve multiple similar series, merge them with attention so the model can focus on the most relevant parts, and then feed the result to an expert, as sketched below.
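
    Here is a hedged sketch of how those pieces can be chained (the shapes, linear heads, and gate network are invented for illustration and do not represent a specific published architecture): the retrieved series are fused into the current representation with cross-attention as in B, and a gate then mixes a retrieval-informed forecast with a base forecast as in C.

    import torch

    batch, time, d_model, horizon = 8, 128, 256, 24

    # Current window representation and three retrieved windows (length 32 each), concatenated
    current_repr = torch.randn(batch, time, d_model)
    retrieved_repr = torch.randn(batch, 3 * 32, d_model)

    # Step 1: merge retrieved series into the current representation via cross-attention (as in B)
    scores = torch.matmul(current_repr, retrieved_repr.transpose(-1, -2)) / (d_model ** 0.5)
    weights = torch.softmax(scores, dim=-1)
    fused = current_repr + torch.matmul(weights, retrieved_repr)

    # Step 2: two "experts" read from the representations
    # (simple linear heads here; real models use proper forecasting heads)
    base_head = torch.nn.Linear(d_model, horizon)
    retrieval_head = torch.nn.Linear(d_model, horizon)
    base_pred = base_head(current_repr[:, -1, :])      # forecast from the current input alone
    retrieval_pred = retrieval_head(fused[:, -1, :])   # forecast informed by retrieval

    # Step 3: a gate decides how much to trust the retrieval-informed expert (as in C)
    gate_net = torch.nn.Sequential(torch.nn.Linear(d_model, 1), torch.nn.Sigmoid())
    gate = gate_net(fused[:, -1, :])                   # shape: [batch, 1]

    final_pred = gate * retrieval_pred + (1 - gate) * base_pred
    print(final_pred.shape)                            # torch.Size([8, 24])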

    Wrap-up

    Retrieval-Augmented Forecasting (RAF) lets your model learn from the past in a way that traditional time-series modeling doesn't achieve.

    It acts like an external memory that helps the model navigate unfamiliar situations with more confidence.

    It's straightforward to experiment with and can deliver meaningful improvements on forecasting tasks.

    Retrieval is no longer just academic hype; it's already delivering results in real-world systems.

    Thanks for reading!

    My name is Sara Nóbrega. I'm an AI engineer focused on MLOps and on deploying machine learning systems into production.


    References

    [1] J. Liu, Y. Zhang, Z. Wang et al., Retrieval-Augmented Time Series Forecasting (2025), arXiv preprint
    Source: https://arxiv.org/html/2505.04163v1

    [2] UConn DSIS, TS-RAG: Time-Series Retrieval-Augmented Generation (n.d.), GitHub repository
    Source: https://github.com/UConn-DSIS/TS-RAG

    [3] Y. Zhang, H. Xu, X. Chen et al., Memory-Augmented Forecasting for Time Series with Rare Events (2024), arXiv preprint
    Source: https://arxiv.org/abs/2412.20810


