in AI-related releases continues unabated. Just a few days ago, it launched a new tool for Gemini called URL context grounding.
URL context grounding can be used stand-alone or combined with Google Search grounding to conduct deep dives into web content.
What is URL context grounding?
In a nutshell, it’s a way to programmatically have Gemini read, understand and answer questions about content and data contained in individual web URLs (including those pointing to PDFs) without the need to perform what we know as traditional RAG processing.
In other words, there is no need to extract the URL text and content, chunk it, vectorise it, store it and so on. You tell Google what URL you’re interested in and off you go. As you’ll see in a moment, it is very easy to code and highly accurate.
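To make concrete what is being skipped, here is a minimal sketch of just one of those traditional RAG steps, chunking. The function name chunk_text and the sizes used are my own illustrative assumptions, not part of any library; a real pipeline would also need text extraction, embedding and a vector store on top of this.

```python
def chunk_text(text, size=200, overlap=50):
    """Split text into fixed-size character chunks that overlap.

    Hypothetical helper for illustration only: real RAG pipelines often
    chunk by tokens or sentences rather than raw characters.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far the window moves each time
    return [text[i:i + size] for i in range(0, len(text), step)]


chunks = chunk_text("abcdefghij", size=4, overlap=2)
print(chunks)
```

With URL context grounding, none of this plumbing has to be written or maintained.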
It’s for these reasons that I said it could be another nail in RAG’s coffin.
But does it work? Let’s look at a couple of examples.
I’ll set up my development environment first under Ubuntu WSL2 for Windows. Follow along or use whichever method you’re used to.
$ uv init url_context
$ cd url_context
$ uv venv url_context
$ uv pip install jupyter
$ uv pip install "google-genai>=1.16.0"
You’ll also need a Google API key. If you don’t already have one, head over to Google AI Studio, sign up if you need to, and set your key up. The link to do so will be near the top right-hand corner of the dashboard page.
Now running this command should bring up a new TAB in your browser with a Notebook.
$ jupyter notebook
Some limitations to be aware of
Before proceeding to our coding examples, there are a few limitations and restrictions on the use of URL context grounding you should be aware of.
- A maximum of 20 URLs can be included per request.
- The maximum size for content retrieved from a single URL is 34MB.
- The following content types are not supported:
  - Paywalled content
  - YouTube videos
  - Google Workspace files, like Google Docs or spreadsheets
  - Video and audio files
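If you are building URL lists programmatically, it may be worth checking them against these limits before calling the model. The following is only a sketch: the check_urls helper, the host list and the file extensions are my own assumptions for illustration and are not an official list from Google.

```python
from urllib.parse import urlparse

MAX_URLS_PER_REQUEST = 20  # limit quoted in the list above

# Hosts assumed to serve unsupported content (YouTube, Google Workspace).
# Illustrative only, not an official list.
UNSUPPORTED_HOSTS = {"youtube.com", "youtu.be", "docs.google.com"}
# File extensions assumed to indicate video or audio files.
MEDIA_EXTENSIONS = (".mp4", ".mov", ".mp3", ".wav")


def check_urls(urls):
    """Return (ok, problems) for a list of candidate URLs."""
    problems = []
    if len(urls) > MAX_URLS_PER_REQUEST:
        problems.append(f"too many URLs: {len(urls)} > {MAX_URLS_PER_REQUEST}")
    for url in urls:
        parsed = urlparse(url)
        host = (parsed.hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        if host in UNSUPPORTED_HOSTS:
            problems.append(f"unsupported host: {url}")
        elif parsed.path.lower().endswith(MEDIA_EXTENSIONS):
            problems.append(f"media file: {url}")
    return (not problems, problems)


ok, problems = check_urls(["https://www.youtube.com/watch?v=abc"])
print(ok, problems)
```

A check like this cannot detect paywalls or oversized pages, so the model may still fail to retrieve a URL that passes it.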
With that being said, let’s get on with our examples.
Example 1: Interrogating a complex PDF
My go-to test file when I’m testing RAG or similar processing against data in PDFs is one of Tesla’s 10-Q quarterly earnings reports. It’s pretty long at around 50 pages and has some quite complex layouts with tables etc.
As it’s an SEC filing, it’s also publicly available and its contents are free to use.
If you want to have a look for yourself, the document can be found at this URL.
https://ir.tesla.com/_flysystem/s3/sec/000162828023034847/tsla-20230930-gen.pdf
For this PDF, the question I always pose is this,
"What are the Total liabilities and Total assets for 2022 and 2023"
The answer to that question is on page 4 of the document. Here is that page.
To humans, the answer is easy to find. As you can see, the Total assets for 2022/2023 were (in Millions) $82,338/$93,941. The Total liabilities were (in Millions) $36,440/$39,446.
Back in the day (i.e. about 18 months ago!), it was challenging to get this information from this document using traditional RAG methods.
How will Google URL context grounding cope?
In your Jupyter notebook, type in this code.
from google import genai
from google.genai import types
from IPython.display import HTML, Markdown
client = genai.Client(api_key='YOUR_API_KEY HERE')
# We can use most of the Gemini models such as 2.5 Flash etc... here
MODEL_ID = "gemini-2.5-pro"
prompt = """
Based on the contents of this PDF https://ir.tesla.com/_flysystem/s3/sec/000162828023034847/tsla-20230930-gen.pdf, What
are the Total liabilities and Total assets for 2022 and 2023. Lay them out in this format
September 30 2023 December 31, 2022
Total Assets $123 $456
Total Liabilities $67 $23
Don't output anything else, just the above information
"""
# Enable the URL context tool so the model fetches the URL in the prompt
config = {
    "tools": [{"url_context": {}}],
}
response = client.models.generate_content(
    contents=[prompt],
    model=MODEL_ID,
    config=config
)
display(response.text)
That’s it, just a handful of lines, but let’s see the output.
'September 30 2023 December 31, 2022\nTotal Assets $93,941 $82,338\nTotal Liabilities $39,446 $36,440'
Spot on, not too shabby.
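If you want to use these figures downstream rather than just display them, the returned string can be parsed. This is a minimal sketch that assumes the model sticks to the layout requested in the prompt, which is not guaranteed; parse_balance_lines is a hypothetical helper name of my own.

```python
import re


def parse_balance_lines(text):
    """Map each row label to its list of dollar amounts (as ints)."""
    rows = {}
    for line in text.splitlines():
        amounts = re.findall(r"\$([\d,]+)", line)
        if not amounts:
            continue  # header or blank line, nothing to extract
        label = line.split("$", 1)[0].strip()
        rows[label] = [int(a.replace(",", "")) for a in amounts]
    return rows


# The text returned by the model in the example above.
sample = (
    "September 30 2023 December 31, 2022\n"
    "Total Assets $93,941 $82,338\n"
    "Total Liabilities $39,446 $36,440"
)
rows = parse_balance_lines(sample)
print(rows)
```

In a notebook you would pass response.text instead of the hard-coded sample.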
Let’s see if it can pick out some other information. Near the end of the PDF, there’s a letter to an employee who’s about to leave the company outlining their terms of severance. Can URL context grounding determine why the exit date referred to in the letter is marked by asterisks (***)? Here’s a snippet of the letter.

The reason for the masking out of the exit date is given in a footnote.

The code we need to extract this information is very similar to our first example. In fact, the only thing that changes is the prompt, so I’ll only show that.
...
...
prompt = """
Based on https://ir.tesla.com/_flysystem/s3/sec/000162828023034847/tsla-20230930-gen.pdf, an employee severance letter is displayed
Why is the exit date referred to in the letter marked with ***
"""
...
...
And the output?
'Based on the provided document, the exit date in the employee severance
letter is marked with "[***]" because specific, non-material information
that the company treats as private or confidential has been intentionally
omitted from the public filing.\n\nThe document includes a note clarifying
this practice: "Certain identified information has been omitted from this
document because it is not material and is the type that the company treats
as private or confidential, and has been marked with "[***]" to indicate
where omissions have been made."'
As you can see, that’s spot on once again.
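Since only the prompt changes between these examples, the call can be wrapped in a small helper. This is a sketch of one way to do it; ask_with_url_context is a name I have made up, and it simply repeats the generate_content call and config used in the first example.

```python
def ask_with_url_context(client, prompt, model_id="gemini-2.5-pro"):
    """Send a prompt with the URL context tool enabled and return the reply text.

    `client` is expected to be a genai.Client like the one created earlier.
    """
    response = client.models.generate_content(
        model=model_id,
        contents=[prompt],
        config={"tools": [{"url_context": {}}]},
    )
    return response.text
```

In the notebook, answer = ask_with_url_context(client, prompt) would then replace the config and generate_content lines.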
What are other uses for URL context grounding?
In my opinion, it opens up a wealth of new opportunities, including:
In-depth Content Analysis and Synthesis.
- Data Extraction. The tool can pull specific information, such as prices, names, or key findings, from multiple URLs.
- Document Comparison. It can analyse multiple reports, articles, and even PDFs to identify differences and track trends.
- Content Creation. By combining information from several source URLs, the AI can generate accurate summaries, blog posts, or reports. For example, a developer could use the tool to compare two recipes from different websites, analysing ingredients and cooking times.
- Code and Documentation Analysis. Developers can point the AI to a GitHub repository or technical documentation to explain code, generate setup instructions, or answer specific questions about it.
Sophisticated Agentic Workflows.
- The combination of broad discovery through Google Search and deep analysis via the URL context tool forms the basis for complex, multi-step tasks. An AI agent could first search for relevant articles on a topic and then use the URL context tool to deeply “read” and synthesise information from the most pertinent search results.
- The Gemini CLI, an open-source AI agent, utilises the URL context tool for its web-fetch command. This allows developers to quickly summarise webpages, extract key information, and even translate content directly from their terminal.
Improved Factual Accuracy and Reduced Hallucinations.
- By grounding responses in the content of specific web pages, the AI’s factual accuracy is increased, reducing the risk of generating incorrect or fabricated information. This also allows the AI to provide citations for its claims, building user trust by showing the sources of its information.
Supports a wide variety of content types.
- PDFs. The AI can extract text and understand the structure of tables within PDF documents, making reports and manuals accessible for grounding.
- Images. It can process and analyse images in various formats (PNG, JPEG, BMP, WebP), leveraging multimodal capabilities to understand charts and diagrams.
- Web and Data Files. Continued support for HTML, JSON, XML, CSV, and plain text files ensures broad applicability.
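Two of the points above, combining the tool with Google Search and showing the sources of an answer, can be sketched in code. Treat this as an assumption-laden illustration: the google_search tool entry and the url_context_metadata, url_metadata, retrieved_url and url_retrieval_status field names reflect my understanding of the Gemini API and should be checked against the current documentation. build_config and retrieved_urls are hypothetical helpers, and the fake response object is a stand-in so that the sketch runs without an API key.

```python
from types import SimpleNamespace


def build_config(use_search=False):
    """Build a config enabling URL context, optionally with Google Search."""
    tools = [{"url_context": {}}]
    if use_search:
        tools.append({"google_search": {}})
    return {"tools": tools}


def retrieved_urls(response):
    """List (url, status) pairs from the response's URL context metadata, if any."""
    pairs = []
    for candidate in getattr(response, "candidates", None) or []:
        meta = getattr(candidate, "url_context_metadata", None)
        for item in getattr(meta, "url_metadata", None) or []:
            pairs.append((item.retrieved_url, str(item.url_retrieval_status)))
    return pairs


# Stand-in object shaped like a response, for illustration only.
fake = SimpleNamespace(candidates=[SimpleNamespace(
    url_context_metadata=SimpleNamespace(url_metadata=[
        SimpleNamespace(
            retrieved_url="https://example.com/a.pdf",
            url_retrieval_status="URL_RETRIEVAL_STATUS_SUCCESS",
        )
    ])
)])

print(build_config(use_search=True))
print(retrieved_urls(fake))
```

In a real notebook you would pass build_config(use_search=True) as the config argument and call retrieved_urls(response) on the actual response.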
Example 2: Perform a price comparison
For our second example, let’s assume we’re on the hunt for a new set of headphones. We will feed a list of the URLs of several online shops selling the product into our code and ask the model to retrieve the three cheapest products that meet our specification.
This example may feel a bit redundant since there are plenty of shopping comparison websites out there, but it’s really just meant to highlight the kinds of things you can do with the tool.
Say we want to buy a specific model of headphones, e.g. the Sony WH-1000XM5 Wireless Noise-Cancelling Headphones. We have identified online shops with the most competitive prices, but those prices fluctuate almost daily. Let’s create a script that can run at any time to return the stores with the three cheapest prices.
Again, the only difference between this example code and our first is the prompt. The rest of the code is the same.
prompt = """
Based on these URL links, output the three cheapest prices for these
headphones and the associated store.
https://electronics.sony.com/audio/headphones/headband/p/wh1000xm5-b?srsltid=AfmBOopJmjebTtZEieUvHEf5xEke7C7piVi3BdlSUdTPJH3wuBfTksJy
https://tristatecamera.com/product/TRI_STATE_CAMERA_Sony_WH-1000XM5_Wireless_Noise-Canceling_Over-Ear_Headphones_Black_1_Yr_WH1000XM5BS2.html?refid=279&KPID=SONWH1000XM5BS2&fl=GSOrganic&srsltid=AfmBOoqnE7vgc1uOELadhkaRlhHuJx3HGRTV5ICN7ihNkFXI_UEuImZ2gXU
https://poshmark.com/listing/Sony-WH-1000xm5-Headphones-672d0ab515ad54b37949b845#utm_source=gdm_unpaid
https://reverb.com/item/91492218-sony-wh-1000xm5-wireless-noise-canceling-over-the-ear-headphones-silver?utm_campaign=US-Shop_unpaid&utm_medium=cpc&utm_source=google
Sony WH-1000XM5 Noise-Canceling Wireless Over-Ear Headphones (Black)
https://www.newegg.com/p/0TH-000U-00JZ4?Item=9SIA29PK9N4805&utm_source=google&utm_medium=organic+shopping&utm_campaign=knc-googleadwords-_-headphones+and+accessories-_-sony-_-9SIA29PK9N4805&source=region&srsltid=AfmBOooONnd3a1lju0DgyhpdXlT1VtUp_skJdsx_uYH1DdHKLWPNe_DWBuY&com_cvv=8fb3d522dc163aeadb66e08cd7450cbbdddc64c6cf2e8891f6d48747c6d56d2c
"""
This time the output is.
'Based on the provided URLs, here are the three cheapest prices for the
Sony WH-1000XM5 headphones:\n\n1.
**$145.00** at Reverb.\n2.
**$258.99** at Teds Electronics.\n3.
**$329.99** at Sony.'
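For a script that runs on a schedule, you would probably want the prices as data rather than prose. The sketch below assumes the model keeps the "**$price** at Store." pattern seen in this particular output, which it may not do on every run; parse_prices is a hypothetical helper of my own.

```python
import re

# Matches lines such as "1. **$145.00** at Reverb."
PRICE_LINE = re.compile(
    r"\*\*\$([\d,]+(?:\.\d+)?)\*\*\s+at\s+(.+?)\.?$",
    re.MULTILINE,
)


def parse_prices(text):
    """Return (price, store) tuples sorted from cheapest to most expensive."""
    found = PRICE_LINE.findall(text)
    return sorted((float(price.replace(",", "")), store) for price, store in found)


# The text returned by the model in the example above.
sample = (
    "Based on the provided URLs, here are the three cheapest prices for the "
    "Sony WH-1000XM5 headphones:\n\n"
    "1. **$145.00** at Reverb.\n"
    "2. **$258.99** at Teds Electronics.\n"
    "3. **$329.99** at Sony."
)
prices = parse_prices(sample)
print(prices)
```

A more robust option would be to ask the model for JSON in the prompt and parse that instead of relying on a regular expression.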
Example 3: Company financial analysis and comparisons.
In this example, we’ll compare the Quarter 2, 2025 earnings reports from both Amazon and Microsoft. We’ll ask the model to analyse both reports, extract key information and conclude with a summary indicating the key strengths and strategies of both companies. The data is once again being obtained from their public SEC 10-Q earnings reports.
from google import genai
from google.genai import types
from IPython.display import HTML, Markdown
client = genai.Client(api_key='YOUR_API_KEY_HERE')
MODEL_ID = "gemini-2.5-pro"
microsoft_earnings_url = "https://www.sec.gov/ix?doc=/Archives/edgar/data/0000789019/000095017025100235/msft-20250630.htm"
amazon_earnings_url = "https://www.sec.gov/ix?doc=/Archives/edgar/data/0001018724/000101872425000086/amzn-20250630.htm"
# --- Construct the Detailed, Non-Trivial Prompt ---
# This prompt guides the AI to perform a deep, comparative analysis
# rather than just a simple data extraction.
prompt = f"""
Please act as a senior financial analyst and provide a comparative analysis of the latest quarterly earnings reports for Amazon and Microsoft.
Access and thoroughly analyse the content from the following two URLs:
1. **Microsoft Earnings Report:** {microsoft_earnings_url}
2. **Amazon's Earnings Report:** {amazon_earnings_url}
Based *solely* on the information contained within these two documents, please perform the following tasks:
1. **Extract and Compare Key Financial Metrics:**
* Identify and extract the Total Revenue, Net Income, and Diluted Earnings Per Share (EPS) for both companies.
* Present these core metrics in a clear, formatted markdown table for easy comparison.
2. **Analyse and Summarise Management Commentary:**
* Review the sections containing quotes from the CEOs (Satya Nadella for Microsoft, Jeff Bezos for Amazon) and CFOs.
* For each company, write a paragraph summarising the key themes they are emphasising. What are the primary drivers of their performance, according to them? What is the overall tone of their commentary (e.g., optimistic, cautious)?
3. **Identify and Contrast Strategic Focus:**
* Pinpoint the specific business segments or product categories that each company highlights as major growth drivers (e.g., Microsoft Cloud and AI, Amazon's AWS services, etc).
* Contrast their primary strategic focus for the quarter. Is one more focused on enterprise/cloud, while the other is more focused on consumer hardware and ecosystem growth?
4. **Synthesise a Conclusive Executive Summary:**
* Write a final, concise paragraph that synthesises the findings. Compare the overall health and current strategic posture of the two companies based on these reports. For example, conclude which company demonstrated stronger growth in this specific quarter and in which areas.
Just output your final analysis. There is no need to output intervening steps, thoughts or data
"""
# Enable the URL context tool so the model fetches both filings
config = {
    "tools": [{"url_context": {}}],
}
response = client.models.generate_content(
    contents=[prompt],
    model=MODEL_ID,
    config=config
)
display(Markdown(response.text))
Here is the output.
Here is a comparative analysis of the latest quarterly earnings reports for
Amazon and Microsoft.
1. Key Financial Metrics
Metric | Amazon (Q2 2025) | Microsoft (FY25 Q2)
Total Revenue | 143.7 billion | 61 9 billion
Net Income | 10.4 billion | 21.9 billion
Diluted EPS | 1 | 2.94
2. Management Commentary Analysis
Microsoft:
Microsoft's management, led by CEO Satya Nadella, maintains a highly
optimistic tone, emphasizing the transformative impact of AI on their
business. Nadella's commentary highlights the "Microsoft Cloud" as a primary
performance driver, citing its role in helping customers apply AI at scale.
The company's CFO, Amy Hood, reinforces this by noting that the strong
quarterly results were driven by the Microsoft Cloud, which saw a 23% increase
in revenue. The overarching theme from Microsoft's leadership is that their
investments in AI are not just paying off but are fundamentally reshaping their
product offerings and customer relationships, leading to significant growth
and market leadership.
Amazon:
Amazon's leadership, including CEO Jeff Bezos, presents a confident and
forward-looking perspective. The commentary highlights the strong performance
of Amazon Web Services (AWS) and the company's continued innovation in AI,
particularly with services like Bedrock and the development of custom AI chips.
Bezos emphasizes the company's focus on long-term, customer-centric innovation.
The overall tone is one of robust optimism, with a clear message that Amazon's
investments in AI and cloud computing are yielding substantial results and
will continue to be central to their strategy.
3. Strategic Focus Contrast
Microsoft:
Microsoft's strategic focus for the quarter is heavily centered on its
enterprise and cloud offerings, with a particular emphasis on artificial
intelligence. The report repeatedly highlights the "Microsoft Cloud" as the
cornerstone of its success, with specific mentions of products like Azure,
Dynamics 365, and Microsoft 365, all of which are being infused with AI
capabilities. The commentary from CEO Satya Nadella explicitly points to the
company's success in helping customers apply AI at scale as a key
differentiator and growth driver.
Amazon:
Amazon's strategic focus, while also heavily invested in cloud and AI through
AWS, is broader, encompassing its vast e-commerce and consumer-facing
businesses. The report underscores the growth of AWS, with a notable increase
in revenue and operating income, and highlights the company's innovation in
AI with services like Bedrock. However, unlike Microsoft, Amazon's report also
points to the strength of its consumer business, with a focus on selection,
price, and convenience, as well as the growth of its advertising services.
4. Executive Summary
Based on their latest quarterly reports, both Microsoft and Amazon are
demonstrating strong financial health and are strategically positioned to
capitalize on the growth of cloud computing and artificial intelligence.
Microsoft's report indicates a more focused strategy, with the "Microsoft
Cloud" and its integrated AI capabilities serving as the primary engine of
growth, resulting in impressive profitability. Amazon, while also experiencing
significant growth in its AWS segment, presents a more diversified strategy,
balancing its enterprise cloud business with its massive e-commerce and
advertising operations. In this specific quarter, Microsoft has shown stronger
growth in net income and EPS, reflecting the high margins of its enterprise
software and cloud business.
Summary
The article introduces Google’s new URL context grounding tool for Gemini, which allows developers to query and analyse the contents of specific web URLs (including PDFs) directly, without traditional Retrieval-Augmented Generation (RAG) steps like text extraction, chunking, and vectorisation.
I demonstrated its ease of use with Python code examples running in Jupyter notebooks, showing successful retrieval of data from Tesla’s 10-Q SEC filing PDF, product price comparisons across online shops, and a financial analysis of Amazon and Microsoft’s Q2 2025 financial results.
While noting limitations such as the tool not supporting paywalled URLs and some media content like YouTube videos, I highlighted its ability to perform deep document interrogation, data extraction, comparison, and synthesis on a wide variety of web pages and online PDFs, enhancing its accuracy by grounding responses in real sources.
For many use cases, this tool effectively replaces traditional RAG workflows, particularly when combined with Google Search grounding to enable more sophisticated agentic workflows, factual reliability, and multimodal content analysis.
I hope this article has whetted your appetite for the myriad of use cases that this useful utility can offer.
