    How To Significantly Enhance LLMs by Leveraging Context Engineering

    By ProfitlyAI | July 22, 2025 | 11 Mins Read


    Context engineering is the science of providing LLMs with the right context to maximize performance. When you work with LLMs, you typically create a system prompt, asking the LLM to perform a certain task. However, when working with LLMs from a programmer’s perspective, there are more elements to consider. You have to determine what other data you can feed your LLM to improve its ability to perform the task you asked it to do.

    In this article, I’ll discuss the science of context engineering and how you can apply context engineering techniques to improve your LLM’s performance.

    In this article, I discuss context engineering: the science of providing the right context for your LLMs. Correctly utilizing context engineering can significantly improve the performance of your LLM. Image by ChatGPT.

    You can also read my articles on Reliability for LLM Applications and Document QA using Multimodal LLMs.

    Definition

    Before I start, it’s important to define the term context engineering. Context engineering is essentially the science of deciding what to feed into your LLM. This can, for example, be:

    • The system prompt, which tells the LLM how to act
    • Document data fetched using RAG vector search
    • Few-shot examples
    • Tools

    The closest previous description of this has been the term prompt engineering. However, prompt engineering is a less descriptive term, considering it implies only changing the system prompt you’re feeding to the LLM. To get maximum performance out of your LLM, you have to consider all the context you’re feeding into it, not only the system prompt.

    Motivation

    My initial motivation for this article came from reading this Tweet by Andrej Karpathy.

    +1 for “context engineering” over “prompt engineering”.

    People associate prompts with short task descriptions you’d give an LLM in your day-to-day use. When in every industrial-strength LLM app, context engineering is the delicate art and science of filling the context window… https://t.co/Ne65F6vFcf

    — Andrej Karpathy (@karpathy) June 25, 2025

    I really agreed with the point Andrej made in this tweet. Prompt engineering is definitely an important science when working with LLMs. However, prompt engineering doesn’t cover everything we input into LLMs. In addition to the system prompt you write, you also have to consider elements such as:

    • Which data you should insert into your prompt
    • How you fetch that data
    • How to provide only relevant information to the LLM
    • Etc.

    I’ll discuss all of these points throughout this article.

    API vs Console usage

    One important distinction to clarify is whether you’re using the LLMs from an API (calling it with code), or via the console (for example, via the ChatGPT website or application). Context engineering is definitely important when working with LLMs through the console; however, my focus in this article will be on API usage. The reason for this is that when using an API, you have more options for dynamically changing the context you’re feeding the LLM. For example, you can do RAG, where you first perform a vector search, and only feed the LLM the most important bits of information, rather than the entire database.

    These dynamic changes are not available in the same way when interacting with LLMs through the console; thus, I’ll focus on using LLMs through an API.

    Context engineering techniques

    Zero-shot prompting

    Zero-shot prompting is the baseline for context engineering. Doing a task zero-shot means the LLM is performing a task it hasn’t seen before. You are essentially only providing a task description as context for the LLM. For example, you could provide an LLM with a long text and ask it to classify the text into class A or B, according to some definition of the classes. The context (prompt) you’re feeding the LLM could look something like this:

    You are an expert text classifier, and tasked with classifying texts into
    class A or class B. 
    - Class A: The text contains a positive sentiment
    - Class B: The text contains a negative sentiment
    
    Classify the text: {text}

    Depending on the task, this could work very well. LLMs are generalists and are able to perform most simple text-based tasks. Classifying a text into one of two classes will usually be a simple task, and zero-shot prompting will thus usually work quite well.
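As a minimal sketch, the zero-shot prompt above can be built in code by filling a fixed task description with the text to classify. The names `ZERO_SHOT_TEMPLATE` and `build_zero_shot_prompt` are illustrative assumptions, and the call to an actual LLM API is left out on purpose.

```python
# Fixed task description; only the {text} placeholder changes per request.
ZERO_SHOT_TEMPLATE = (
    "You are an expert text classifier, and tasked with classifying texts into\n"
    "class A or class B.\n"
    "- Class A: The text contains a positive sentiment\n"
    "- Class B: The text contains a negative sentiment\n"
    "\n"
    "Classify the text: {text}"
)


def build_zero_shot_prompt(text: str) -> str:
    """Return the task description with the text to classify filled in."""
    return ZERO_SHOT_TEMPLATE.format(text=text)


prompt = build_zero_shot_prompt("I loved this movie.")
print(prompt)
```

The resulting string is what you would send to the model as context; no examples or retrieved data are added at this stage.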

    Few-shot prompting

    Context engineering: few-shot prompting. This infographic highlights how you can perform few-shot prompting to enhance LLM performance. Image by ChatGPT.

    The follow-up from zero-shot prompting is few-shot prompting. With few-shot prompting, you provide the LLM with a prompt similar to the one above, but you also provide it with examples of the task it will perform. This added context will help the LLM improve at performing the task. Following up on the prompt above, a few-shot prompt could look like:

    You are an expert text classifier, and tasked with classifying texts into
    class A or class B. 
    - Class A: The text contains a positive sentiment
    - Class B: The text contains a negative sentiment
    
    <example>
    {text 1} -> Class A
    </example>
    <example>
    {text 2} -> Class B
    </example>
    
    Classify the text: {text}

    You can see I’ve provided the model some examples wrapped in <example></example> tags. I’ve discussed the topic of creating robust LLM prompts in my article on LLM reliability.

    Few-shot prompting works well because you are providing the model with examples of the task you’re asking it to perform. This usually increases performance.
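A few-shot prompt of this shape can be assembled from a list of labeled examples. The sketch below is one possible way to do it; the function name `build_few_shot_prompt`, the `(text, label)` pair format, and the `<example>` tag name are assumptions for illustration.

```python
def build_few_shot_prompt(text, examples):
    """Build a classification prompt with labeled examples.

    examples is a list of (example_text, label) pairs, where label is "A" or "B".
    """
    # Wrap every example in its own tag pair so the model can tell them apart.
    example_blocks = "\n".join(
        f"<example>\n{example_text} -> Class {label}\n</example>"
        for example_text, label in examples
    )
    return (
        "You are an expert text classifier, and tasked with classifying texts into\n"
        "class A or class B.\n"
        "- Class A: The text contains a positive sentiment\n"
        "- Class B: The text contains a negative sentiment\n"
        "\n"
        f"{example_blocks}\n"
        "\n"
        f"Classify the text: {text}"
    )


few_shot_prompt = build_few_shot_prompt(
    "The checkout process was painless.",
    [("I loved this movie.", "A"), ("The service was terrible.", "B")],
)
print(few_shot_prompt)
```

Keeping the examples as data rather than hard-coding them into the prompt string makes it easy to swap them later.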

    You can imagine this works well on humans as well. If you ask a human to do a task they’ve never done before, just by describing the task, they might perform decently (of course, depending on the difficulty of the task). However, if you also provide the human with examples, their performance will usually increase.

    Overall, I find it useful to think about LLM prompts as if I’m asking a human to perform a task. Imagine that instead of prompting an LLM, you simply provide the text to a human, and you ask yourself the question:

    Given this prompt, and no other context, will the human be able to perform the task?

    If the answer is no, you should work on clarifying and improving your prompt.


    I also want to mention dynamic few-shot prompting, considering it’s a technique I’ve had a lot of success with. Traditionally, with few-shot prompting, you have a fixed list of examples you feed into every prompt. However, you can often achieve higher performance using dynamic few-shot prompting.

    Dynamic few-shot prompting means selecting the few-shot examples dynamically when creating the prompt for a task. For example, say you are asked to classify a text into classes A and B, and you already have a list of 200 texts and their corresponding labels. You can then perform a similarity search between the new text you’re classifying and the example texts you already have: measure the vector similarity between the texts and choose only the most similar texts (out of the 200 texts) to feed into your prompt as context. This way, you’re providing the model with more relevant examples of how to perform the task.
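The selection step can be sketched as a similarity ranking over the labeled examples. To stay self-contained, this sketch uses a toy bag-of-words vector and cosine similarity; in practice you would most likely swap `embed` for a real embedding model. All function names here are assumptions for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (an assumption standing in for a real embedding model)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_examples(new_text, labeled_examples, k=2):
    """Return the k labeled examples most similar to new_text."""
    query = embed(new_text)
    ranked = sorted(
        labeled_examples,
        key=lambda pair: cosine_similarity(query, embed(pair[0])),
        reverse=True,
    )
    return ranked[:k]


labeled_examples = [
    ("The food was great and the staff were friendly", "A"),
    ("The battery died after one day", "B"),
    ("Great food, I will come back", "A"),
    ("The delivery was late and the box was damaged", "B"),
]
selected = select_examples(
    "The food was cold and the staff were rude", labeled_examples, k=2
)
for example_text, label in selected:
    print(f"{example_text} -> Class {label}")
```

The selected pairs would then be placed into the few-shot prompt instead of a fixed list. Word overlap is a crude notion of similarity, which is one reason a learned embedding is usually preferred.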

    RAG

    Retrieval-augmented generation (RAG) is a well-known technique for increasing the knowledge of LLMs. Assume you already have a database consisting of thousands of documents. You now receive a question from a user, and have to answer it, given the knowledge inside your database.

    Unfortunately, you can’t feed the entire database into the LLM. Even though we have LLMs such as Llama 4 Scout with a 10-million-token context window, databases are usually much larger. You therefore have to find the most relevant information in the database to feed into your LLM. RAG does this similarly to dynamic few-shot prompting:

    1. Perform a vector search
    2. Find the most similar documents to the user question (the most similar documents are assumed to be the most relevant)
    3. Ask the LLM to answer the question, given the most similar documents

    By performing RAG, you’re doing context engineering by only providing the LLM with the most relevant data for performing its task. To improve the performance of the LLM, you can work on the context engineering by improving your RAG search. This can, for example, be done by improving the search to find only the most relevant documents.
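The three RAG steps can be sketched end to end as follows. This is a simplified illustration, not a production retriever: `embed` is a toy bag-of-words vector standing in for a real embedding model, the `<document>` tag is an arbitrary choice, and the final LLM call is omitted.

```python
import math
import re
from collections import Counter


def embed(text):
    # Toy bag-of-words vector; an assumption standing in for a real embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=2):
    # Steps 1 and 2: vector search, keeping the k documents most similar to the question.
    query = embed(question)
    ranked = sorted(documents, key=lambda d: similarity(query, embed(d)), reverse=True)
    return ranked[:k]


def build_rag_prompt(question, documents, k=2):
    # Step 3: ask the LLM to answer, given only the retrieved documents.
    context = "\n".join(
        f"<document>\n{d}\n</document>" for d in retrieve(question, documents, k)
    )
    return (
        "Answer the question using only the documents below.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )


documents = [
    "Refunds are processed within 14 days of receiving the returned item.",
    "Our office is closed on public holidays.",
    "Shipping to Norway takes three to five business days.",
]
rag_prompt = build_rag_prompt("How long does shipping to Norway take?", documents, k=1)
print(rag_prompt)
```

Improving the RAG search then means improving `retrieve`: a better embedding, a better ranking, or a better choice of `k`.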

    You can read more about RAG in my article about creating a RAG system for your personal data.

    Tools (MCP)

    You can also provide the LLM with tools to call, which is an important part of context engineering, especially now that we see the rise of AI agents. Tool calling today is often done using Model Context Protocol (MCP), a protocol introduced by Anthropic.

    AI agents are LLMs capable of calling tools and thus performing actions. An example of this could be a weather agent. If you ask an LLM without access to tools about the weather in New York, it will not be able to provide an accurate response. The reason for this is naturally that information about the weather needs to be fetched in real time. To do this, you can, for example, give the LLM a tool such as:

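    @tool  # registration decorator provided by your agent framework
    def get_weather(city):
        """Return the current weather for a city."""
        weather = ...  # code to retrieve the current weather for a city
        return weather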
    

    If you give the LLM access to this tool and ask it about the weather, it can then look up the weather for a city and provide you with an accurate response.
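To make the mechanics concrete, here is a minimal, framework-free sketch of tool registration and dispatch. It is not MCP and not the API of any particular agent library: the `tool` decorator, the `TOOLS` registry, the tool-call dictionary format, and the canned weather data are all assumptions for illustration.

```python
TOOLS = {}


def tool(func):
    """Register a function so it can be offered to the LLM by name."""
    TOOLS[func.__name__] = func
    return func


@tool
def get_weather(city):
    """Return the current weather for a city."""
    # Placeholder data; a real tool would query a weather service here.
    fake_weather = {"New York": "12 C, light rain"}
    return fake_weather.get(city, "unknown")


def run_tool_call(tool_call):
    """Execute a tool call of the form {"name": ..., "arguments": {...}}."""
    func = TOOLS[tool_call["name"]]
    return func(**tool_call["arguments"])


# In a real agent loop, the LLM would produce this structure after seeing
# the tool names and descriptions in its context.
result = run_tool_call({"name": "get_weather", "arguments": {"city": "New York"}})
print(result)
```

The tool result is then fed back into the LLM's context, which is what lets it answer with up-to-date information.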

    Providing tools for LLMs is incredibly important, as it significantly enhances the abilities of the LLM. Other examples of tools are:

    • Search the internet
    • A calculator
    • Search via Twitter API

    Topics to consider

    In this section, I make a few notes on what you should consider when creating the context to feed into your LLM.

    Utilization of context length

    The context length of an LLM is an important consideration. As of July 2025, you can feed most frontier LLMs over 100,000 input tokens. This provides you with a lot of options for how to utilize this context. You have to consider the tradeoff between:

    • Including a lot of information in a prompt, thus risking some of the information getting lost in the context
    • Missing some important information in the prompt, thus risking the LLM not having the required context to perform a specific task

    Usually, the only way to figure out the balance is to test your LLM’s performance. For example, with a classification task, you can check the accuracy, given different prompts.
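Comparing prompts on a labeled test set can be sketched as a small evaluation loop. Here `fake_llm` is a keyword rule standing in for a real API call (an assumption, so the resulting numbers are meaningless in themselves); the templates and the test set are also made up for illustration.

```python
def evaluate_prompt(template, labeled_texts, call_llm):
    """Return the fraction of texts whose predicted label matches the true label."""
    correct = 0
    for text, label in labeled_texts:
        prediction = call_llm(template.format(text=text)).strip()
        correct += prediction == label
    return correct / len(labeled_texts)


def fake_llm(prompt):
    # Stand-in for a real LLM call (an assumption): a crude keyword rule
    # applied to the text that follows the final "Text: " marker.
    text = prompt.rsplit("Text: ", 1)[-1].lower()
    return "A" if "love" in text or "great" in text else "B"


templates = {
    "short": "Answer A for positive or B for negative sentiment.\nText: {text}",
    "detailed": (
        "You are an expert text classifier.\n"
        "- Class A: The text contains a positive sentiment\n"
        "- Class B: The text contains a negative sentiment\n"
        "Reply with a single letter.\nText: {text}"
    ),
}
test_set = [
    ("I love this phone", "A"),
    ("Great value for the price", "A"),
    ("It broke after a week", "B"),
    ("Not bad at all", "A"),
]
scores = {
    name: evaluate_prompt(template, test_set, fake_llm)
    for name, template in templates.items()
}
print(scores)
```

With a real model behind `call_llm`, the same loop lets you compare a short prompt against a longer, more detailed one and pick the variant with the higher accuracy.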

    If I discover the context to be too long for the LLM to work effectively, I sometimes split a task into several prompts. For example, having one prompt summarize a text, and a second prompt classify the text summary. This can help the LLM utilize its context effectively and thus increase performance.
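Splitting a task into two prompts can be sketched as a simple chain in which the output of the first call becomes the input of the second. The function `summarize_then_classify`, the prompt wording, and the `fake_llm` stub are assumptions for illustration; a real setup would pass in a function that calls your LLM API.

```python
def summarize_then_classify(text, call_llm):
    """Run two prompts in sequence: summarize first, then classify the summary."""
    summary = call_llm(f"Summarize the following text in two sentences:\n\n{text}")
    label = call_llm(
        "Classify the text into class A (positive sentiment) or class B "
        f"(negative sentiment). Reply with a single letter.\n\nText: {summary}"
    )
    return summary, label.strip()


def fake_llm(prompt):
    # Stand-in for a real LLM call (an assumption), so the sketch runs offline.
    if prompt.startswith("Summarize"):
        body = prompt.split("\n\n", 1)[1]
        return body.split(".")[0] + "."  # "summary" = first sentence only
    return "A" if "enjoyed" in prompt else "B"


summary, label = summarize_then_classify(
    "I enjoyed the hotel. The room was large. Breakfast ran until noon.",
    fake_llm,
)
print(summary, label)
```

The second prompt only ever sees the short summary, so its context stays small regardless of how long the original text was.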

    Furthermore, providing too much context to the model can have a significant downside, as I describe in the next section:

    Context rot

    Last week, I read an interesting article about context rot. The article was about the fact that increasing the context length lowers LLM performance, even though the task difficulty doesn’t increase. This implies that:

    Providing an LLM with irrelevant information will decrease its ability to perform tasks successfully, even when task difficulty doesn’t increase.

    The point here is essentially that you should only provide relevant information to your LLM. Providing other information decreases LLM performance (i.e., performance is not neutral to input length).
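One way to act on this is to filter and cap the context before it reaches the model. The sketch below assumes each snippet already has a relevance score (for example from a retrieval step) and that one whitespace-separated word is roughly one token; both are simplifying assumptions, as are the function names, the threshold, and the budget.

```python
def estimate_tokens(text):
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def pack_context(scored_snippets, max_tokens, min_score=0.5):
    """Keep the highest-scoring snippets that clear min_score and fit in max_tokens."""
    selected, used = [], 0
    for score, snippet in sorted(scored_snippets, reverse=True):
        if score < min_score:
            break  # everything after this point is even less relevant
        cost = estimate_tokens(snippet)
        if used + cost > max_tokens:
            continue  # skip snippets that would exceed the budget
        selected.append(snippet)
        used += cost
    return selected


scored_snippets = [
    (0.91, "Shipping to Norway takes three to five business days."),
    (0.74, "Orders above 500 NOK ship for free."),
    (0.12, "Our office is closed on public holidays."),
]
context = pack_context(scored_snippets, max_tokens=20)
print(context)
```

The low-scoring snippet is dropped even though it would fit in the budget, which reflects the idea that irrelevant input is not free.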

    Conclusion

    In this article, I’ve discussed the topic of context engineering, which is the process of providing an LLM with the right context to perform its task effectively. There are a lot of techniques you can utilize to fill up the context, such as few-shot prompting, RAG, and tools. These are all powerful techniques you can use to significantly improve an LLM’s ability to perform a task effectively. Furthermore, you also have to consider the fact that providing an LLM with too much context has downsides. Increasing the number of input tokens reduces performance, as you can read in the article about context rot.
