    How to Train a Chatbot Using RAG and Custom Data

    By ProfitlyAI · June 25, 2025 · 6 min read

    RAG, which stands for Retrieval-Augmented Generation, describes a process by which an LLM (Large Language Model) can be optimized by having it pull from a more specific, smaller knowledge base rather than relying only on its huge original training data. Strictly speaking, the model's weights are not retrained; relevant text is retrieved and added to the prompt at query time. Typically, LLMs like ChatGPT are trained on a very large share of the internet (billions of data points). This means they're prone to small errors and hallucinations.

    Here is an example of a situation where RAG could be used and be helpful:

    I want to build a US state tour guide chatbot, which contains general information about US states, such as their capitals, populations, and main tourist attractions. To do this, I can download the Wikipedia pages of these US states and have my LLM answer from the text of these specific pages.

    Creating your RAG LLM

    One of the most popular tools for building RAG systems is LlamaIndex, which:

    • Simplifies the integration between LLMs and external data sources
    • Allows developers to structure, index, and query their data in a way that's optimized for LLM consumption
    • Works with many types of data, such as PDFs and text files
    • Helps construct a RAG pipeline that retrieves and injects relevant chunks of data into a prompt before passing it to the LLM for generation
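The "retrieve, then inject into the prompt" step in the last bullet can be sketched in a few lines of plain Python. This is only an illustration and not how LlamaIndex works internally: the helper names (`tokenize`, `retrieve`, `build_prompt`) and the prompt wording are made up for this example, and chunks are ranked by simple word overlap instead of embeddings.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Rank chunks by how many words they share with the query; keep the best top_k."""
    query_words = tokenize(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: len(query_words & tokenize(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Inject the retrieved chunks into the prompt ahead of the question."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical chunks standing in for text extracted from the state pages.
chunks = [
    "Sacramento is the capital of California.",
    "Austin is the capital of Texas.",
    "Tallahassee is the capital of Florida.",
]
query = "What is the capital of Texas?"
prompt = build_prompt(query, retrieve(query, chunks))
print(prompt)
```

The string held in `prompt` is what would be sent to the LLM, so the model answers from the retrieved text rather than from memory alone.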

    Download your data

    Start by getting the data you want your model to draw on. To download PDFs from Wikipedia (CC BY-SA 4.0) in the right format, make sure you click Print and then "Save as PDF."

    Don't just export the Wikipedia page as a PDF; LlamaCloud won't like the format it's in and will reject your files.

    For the purposes of this article and to keep things simple, I'll only download the pages of the following five popular destinations (four states plus Washington, D.C.):

    • Florida
    • California
    • Washington D.C.
    • New York
    • Texas

    Make sure to save these all in a folder where your project can easily access them. I saved them in one called "data".
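Before uploading, it can help to confirm that all the PDFs really landed in that folder. A minimal sketch, assuming the folder is named `data` and sits next to your script; `list_pdfs` is a hypothetical helper written for this example:

```python
from pathlib import Path


def list_pdfs(folder: str) -> list[str]:
    """Return the sorted names of the PDF files directly inside `folder`."""
    directory = Path(folder)
    if not directory.is_dir():
        # A missing folder simply yields an empty list.
        return []
    return sorted(
        path.name
        for path in directory.iterdir()
        if path.is_file() and path.suffix.lower() == ".pdf"
    )


# Prints an empty list if the folder does not exist yet.
print(list_pdfs("data"))
```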

    Get necessary API keys

    Before you create your custom states database, there are 2 API keys you'll need to generate.

    • One from OpenAI, to access a base LLM
    • One from LlamaCloud, to access the index database you add custom data to

    Once you have these API keys, store them in a .env file in your project.

    #.env file
    LLAMA_API_KEY = "<Your-Api-Key>"
    OPENAI_API_KEY = "<Your-Api-Key>"
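A quick check that both keys are actually visible to your code can save a confusing authentication error later. The sketch below assumes the .env values have already been loaded into the process environment (for example with `load_dotenv()` from the python-dotenv package); `missing_keys` is a hypothetical helper, not part of any library.

```python
import os
from collections.abc import Mapping

# The two variable names used in the .env file above.
REQUIRED_KEYS = ["LLAMA_API_KEY", "OPENAI_API_KEY"]


def missing_keys(env: Mapping[str, str], required: list[str] = REQUIRED_KEYS) -> list[str]:
    """Return the required variable names that are absent or empty in `env`."""
    return [key for key in required if not env.get(key)]


missing = missing_keys(os.environ)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All API keys found.")
```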

    Create an Index and Upload your data

    Create a LlamaCloud account. Once you're in, find the Index section and click "Create" to create a new index.

    An index stores and manages document indexes remotely so they can be queried via an API without needing to rebuild or store them locally.

    Here's how it works:

    1. When you create your index, there will be a place where you can upload files to feed into the model's database. Upload your PDFs here.
    2. LlamaIndex parses and chunks the documents.
    3. It creates an index (e.g., vector index, keyword index).
    4. This index is stored in LlamaCloud.
    5. You can then query it using an LLM through the API.
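To give a feel for step 2, here is a generic sketch of splitting text into overlapping chunks. The article does not describe how LlamaCloud actually chunks documents, so the word-based splitting, the `chunk_words` name, and the `chunk_size` and `overlap` values are all assumptions made for illustration.

```python
def chunk_words(text: str, chunk_size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into chunks of `chunk_size` words; neighbours share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = "Texas is a state in the South Central region of the United States"
for chunk in chunk_words(sample, chunk_size=6, overlap=2):
    print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.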

    The next thing you need to do is to configure an embedding model. An embedding model converts your document chunks and your queries into numeric vectors, so the chunks most relevant to a question can be retrieved by similarity. A separate LLM then uses the retrieved chunks to produce the output text.
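To make the role of an embedding model concrete, here is a toy sketch in which text is turned into a vector and the document closest to the query is picked by cosine similarity. A real embedding model (such as the OpenAI one configured here) produces dense learned vectors; the bag-of-words `embed` function below is only a stand-in for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


documents = [
    "California has the largest population of any US state.",
    "The capital of Florida is Tallahassee.",
]
query = "Which state has the largest population?"

# Score every document against the query and keep the closest one.
scores = [cosine_similarity(embed(query), embed(doc)) for doc in documents]
best = documents[scores.index(max(scores))]
print(best)
```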

    When you're creating a new index, you want to select "Create a new OpenAI embedding":

    When you create your new embedding, you'll have to provide your OpenAI API key and name your model.

    Once you have created your model, leave the other index settings as their defaults and hit "Create Index" at the bottom.

    It may take a few minutes to parse and store all the documents, so make sure that all the documents have been processed before you try to run a query. The status should show on the right side of the screen when you create your index, in a box that says "Index Files Summary".

    Accessing your model via code

    Once you've created your index, you'll also get an Organization ID. For cleaner code, add your Organization ID and Index Name to your .env file. Then, retrieve all the necessary variables to initialize your index in your code:

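    import os
    from dotenv import load_dotenv  # assumes the python-dotenv package is installed
    from llama_index.indices.managed.llama_cloud import LlamaCloudIndex

    load_dotenv()  # read the values stored in the .env file
    index = LlamaCloudIndex(
      name=os.getenv("INDEX_NAME"),
      project_name="Default",
      organization_id=os.getenv("ORG_ID"),
      api_key=os.getenv("LLAMA_API_KEY")
    )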

    Query your index and ask a question

    To do this, you'll need to define a query (prompt) and then generate a response by calling the index as such:

    query = "What state has the highest population?"
    response = index.as_query_engine().query(query)

    # Print out just the text part of the response
    print(response.response)

    Having a long conversation with your bot

    By querying a response from the LLM the way we just did above, you are able to easily access information from the documents you loaded. However, if you ask a follow-up question, like "Which one has the least?" without context, the model won't remember what your original question was. This is because we haven't programmed it to keep track of the chat history.

    In order to do this, you need to:

    • Create memory using ChatMemoryBuffer
    • Create a chat engine and add the created memory using ContextChatEngine

    To create a chat engine:

    from llama_index.core.chat_engine import ContextChatEngine
    from llama_index.core.memory import ChatMemoryBuffer
    from llama_index.llms.openai import OpenAI

    # Create a retriever from the index
    retriever = index.as_retriever()

    # Set up memory
    memory = ChatMemoryBuffer.from_defaults(token_limit=2000)

    # Create chat engine with memory
    chat_engine = ContextChatEngine.from_defaults(
        retriever=retriever,
        memory=memory,
        llm=OpenAI(model="gpt-4o"),
    )

    Next, feed your query into your chat engine:

    # To query:
    response = chat_engine.chat("What is the population of New York?")
    print(response.response)

    This gives the response: "As of 2024, the estimated population of New York is 19,867,248."

    I can then ask a follow-up question:

    response = chat_engine.chat("What about California?")
    print(response.response)

    This gives the following response: "As of 2024, the population of California is 39,431,263." As you can see, the model remembered that what we were asking about previously was population and responded accordingly.
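The idea behind a token-limited chat memory can be sketched without any libraries. This is a conceptual illustration only, not the actual implementation of ChatMemoryBuffer: the `SimpleChatMemory` class is hypothetical, and it assumes one whitespace-separated word counts as one token, whereas a real buffer uses a proper tokenizer.

```python
from collections import deque


class SimpleChatMemory:
    """Toy chat history that drops the oldest messages once a token budget is exceeded."""

    def __init__(self, token_limit: int) -> None:
        self.token_limit = token_limit
        self.messages: deque[tuple[str, str]] = deque()

    @staticmethod
    def count_tokens(text: str) -> int:
        # Assumption: one whitespace-separated word counts as one token.
        return len(text.split())

    def total_tokens(self) -> int:
        return sum(self.count_tokens(content) for _, content in self.messages)

    def put(self, role: str, content: str) -> None:
        self.messages.append((role, content))
        # Evict from the front until the history fits; always keep the newest message.
        while len(self.messages) > 1 and self.total_tokens() > self.token_limit:
            self.messages.popleft()

    def get(self) -> list[tuple[str, str]]:
        return list(self.messages)


memory = SimpleChatMemory(token_limit=50)
memory.put("user", "What is the population of New York?")
memory.put("assistant", "As of 2024, the estimated population of New York is 19,867,248.")
memory.put("user", "What about California?")

# Everything still in memory would be sent along with the next question.
for role, content in memory.get():
    print(f"{role}: {content}")
```

Because the earlier question is still in the history when "What about California?" arrives, the model has the context it needs to work out that the follow-up is about population.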

    Streamlit UI chatbot app for US state RAG. Screenshot by author

    Conclusion

    Retrieval-Augmented Generation is an efficient way to tailor an LLM to specific data. LlamaCloud offers a simple and straightforward way to build your own RAG framework and query the model that lies beneath.

    The code I used for this tutorial was written in a notebook, but it can also be wrapped in a Streamlit app to create a more natural back-and-forth conversation with a chatbot. I've included the Streamlit code here on my GitHub.

    Thanks for reading

    • Connect with me on LinkedIn
    • Buy me a coffee to support my work!
    • I offer 1:1 data science tutoring, career coaching/mentoring, writing advice, resume reviews & more on Topmate!


