
    OpenAI’s New Benchmark Shows AI Does Knowledge Work 100X Faster and Cheaper Than Experts

    By ProfitlyAI · September 30, 2025


    For years, the gold standard for measuring AI progress has been difficult academic tests and abstract puzzles. But the real question has always been: Can AI do the actual work people get paid for?

    OpenAI is attempting to answer that question with the launch of its new evaluation framework, GDPval, and the results are a wake-up call for every knowledge worker and business leader.

    According to blind evaluations run by industry experts, today’s best models, such as GPT-5 and Claude Opus 4.1, are already producing work rated as equal to or better than human output nearly half the time. This framework, which measures performance across 44 knowledge work occupations, is the kind of real-world assessment that AI has desperately needed.

    To unpack the significance of this new evaluation framework, I spoke to SmarterX and Marketing AI Institute founder and CEO Paul Roetzer on Episode 170 of The Artificial Intelligence Show.

    Why GDPval Is the Real-World Test That Matters

    At its core, GDPval functions like a real-world test of whether AI can do economically valuable knowledge work. Unlike traditional benchmarks that use simple text prompts or exam-style questions, the GDPval evaluation system is built on real-world deliverables and contexts:

    • The evaluation spans 1,320 specialized tasks, all based on real work products like legal briefs, engineering blueprints, customer support conversations, and nursing care plans.
    • Every task was meticulously crafted by subject matter experts with over a decade of experience, who then served as the blind graders. They compared the human- and AI-generated deliverables without knowing which was which, offering critiques and rankings.
    • The tasks aren’t simple text prompts; they include reference files and context, with expected deliverables spanning documents, slides, diagrams, spreadsheets, and multimedia.

    This focus on the reality of work is critical.

    “The thing we’ve talked about for a while is that the IQ tests [in traditional AI evaluations] have been saturated,” Roetzer says. “What we really needed to know was the implications on actual work. People do the tasks that are part of these jobs.”

    And, if GDPval is any indication, AI is getting very good at the tasks that people do as part of their jobs.

    100X Faster and 100X Cheaper

    OpenAI’s research found that frontier models can complete the GDPval tasks roughly 100 times faster and 100 times cheaper than human industry experts.

    Roetzer emphasized the significance of this finding, especially considering the comparison point: these are industry experts, not just average workers. We’re already at the point where it seems that giving some of these tasks to an AI model instead of a human would save both time and money.

    That’s going to have some disruptive effects on the economy as we know it. The occupations chosen for the study were those contributing most to total wages and compensation in the nine industries that each contribute over 5% of US GDP.

    This deliberate focus parallels the strategy of AI labs and VCs looking at the “total addressable market of salaries” to determine which markets could be most disrupted by AI technology.

    In other words, GDPval is not only an evaluation framework, but also a roadmap that points to exactly which knowledge work jobs AI could disrupt.

    2026 as the Year AI Starts to Overtake Humans

    The GDPval results are a current snapshot, but one computer scientist and AI researcher, Julian Schrittwieser (a key player in the development of Google’s AlphaGo and AlphaZero), issued a clear warning about the pace of future progress.

    In a widely shared post, Schrittwieser cautioned against the trap of concluding that AI is plateauing just because it makes occasional mistakes. Extrapolating the consistent trend of exponential performance improvement, he predicts that 2026 will be a pivotal year for widespread integration of AI into the economy:

    • By mid-2026, he says models will be able to work autonomously for full eight-hour work days.
    • By the end of 2026, at least one model will match the performance of human experts across many industries.
    • And by the end of 2027, models will frequently outperform experts on many tasks.

    This sober assessment, that “extrapolating straight lines on graphs is likely to give you a better model of the future than most experts,” is why economists are starting to sound the alarm.

    A new research paper from experts at Stanford is already recommending a research agenda to address the impact of “transformative AI” on economic growth, income distribution, and human wellbeing.

    Why You Can’t Afford to Have Blindspots

    This confluence of evidence (GDPval’s current proof of expert-level capability and the conservative timeline for AGI) means no one can afford to remain skeptical.

    The conversation is shifting from “AI doesn’t really do anything” to the realization that it’s getting really good at all the things you do. OpenAI says its goal is to keep everyone on the “up elevator” of AI by democratizing access and supporting workers through change.

    But the challenge is that the most direct evidence of AI’s impact is personal adoption.

    As Roetzer concluded, when you stop to look at the tasks that make up your job, you can see the change happening. The light bulb moment, where people realize how incredibly helpful and efficient the tools are when applied to their everyday work, is the moment the economy really starts to transform before all our eyes.

    But if you don’t use the tools enough to reach that point, you risk developing some serious blindspots when it comes to AI’s impact on your career.