The first wave of AI innovation is over. Here’s what comes next

Businesses themselves, not closed-source AI companies, need ownership and control of their proprietary models.

By Gordon Ritter
26 July 2024

Imagine asking a sophisticated AI model for tips on making the perfect pizza, only to be told to use glue to help the cheese stick. Or watching it fumble through basic arithmetic problems that a middle schooler could solve with ease. These are the limitations and quirks of generative AI and the large language models (LLMs) that underpin it. They persist because AI models are running out of good training data, which is causing their progress to plateau.


This is a cycle in innovation that repeats throughout history: For a long time, an almost undetectable amount of knowledge and craft builds up around an idea, like an invisible gas. Then, a spark. An explosion of innovation ensues but, of course, eventually stabilizes. This pattern is called an S-Curve. For example:

TCP/IP: Originating in ideas from the 1960s, TCP/IP saw significant acceleration after its 1974 RFC and stabilized with version 4 in 1981, which still underpins the modern Internet.

The Browser Wars: During the late 1990s, web browsers rapidly evolved into interactive, programmable platforms. Since then, improvements have largely been incremental.

Mobile Apps: The iPhone App Store’s launch in 2008 spurred a surge in mobile app innovation. Today, truly novel mobile apps are rare.


The AI plateau


The AI revolution is following this curve. In a 1950 paper, Alan Turing became one of the first computer scientists to explore how to build a thinking machine, starting the slow buildup of knowledge. Nearly seventy years later came the spark: a 2017 research paper, Attention Is All You Need, led to OpenAI’s development of ChatGPT, which convincingly mimics human conversation and unleashed a global shock wave of innovation based on generative AI technology.


For a while, each new LLM release, whether from OpenAI or from other companies like Anthropic, Google, and Meta, offered drastic improvements.

But lately, progress has ebbed. Consider this chart of performance increases of OpenAI’s flagship model:

Although every benchmarking system has shortcomings, clearly the pace of change is no longer setting the world on fire.


Lack of good training data is what has caused AI capabilities to plateau, and access to the next frontier of data is what AI needs to make the jump to the next S-Curve.

Today’s LLMs were primarily trained on publicly available internet data, harvested from GitHub, Reddit, WordPress, and other websites through scraping and licensing. But that data is no longer enough to sustain model improvements. To feed its insatiable hunger for new data, OpenAI, for instance, developed a neural net called Whisper to transcribe a million hours of YouTube videos for GPT-4. Other novel methods include employing human data labelers via services like Scale AI (many of which have struggled with negative press for poor working conditions). However, the data clearly shows that continuing down these paths is leading to diminishing returns.


The next curve: Business data


We believe the real breakthrough that will allow humanity to jump to the next S-Curve is data produced at work. Workplace data (e.g., product specifications, sales presentations, and customer support interactions) is of far higher quality for training purposes than what’s left of public data, and far better than whatever comes from running the dregs of the internet through the transformer mill. (Those dregs may be why a lot of AI-generated content is already being called “slop.”)

Startups that unlock and harness business data will create significant value and tools that enterprises actually want to adopt. The potential for AI in the B2B space is vast and largely untapped.
