The artificial intelligence industry is experiencing a fundamental shift in how companies approach model development. For years, tech giants like OpenAI, Meta, and Google have operated under the assumption that amassing massive amounts of training data would inevitably lead to smarter, more powerful AI models. This approach was grounded in research showing that transformer models improve predictably, following power-law scaling relationships, as they are given more training data and computational power.
However, industry leaders are now questioning whether this conventional wisdom can continue to drive AI advancement. At the recent Cerebral Valley conference, Scale AI CEO Alexandr Wang identified scaling laws as “the biggest question in the industry,” noting that much of the investment in AI has depended on the assumption that these laws would continue to hold true.
Several prominent executives are voicing concerns about the limitations of the data-scaling approach. Aidan Gomez, CEO of Cohere, described the method as “the dumbest” way to improve models, despite its reliability. He advocates smaller, more efficient models, which are gaining industry support for their cost-effectiveness. Meanwhile, Richard Socher, former Salesforce executive and CEO of You.com, argues that the current approach of training models to simply “predict the next token” is insufficient. He suggests forcing models to translate questions into computer code and generate answers based on that code’s output, which could reduce hallucinations and enhance capabilities.
Not everyone agrees that AI has hit a scaling wall. Microsoft CTO Kevin Scott stated in July that “we’re not at diminishing marginal returns on scale-up,” maintaining confidence in the traditional approach.
Companies are also exploring hybrid solutions. OpenAI’s o1 model, released in September, represents a new direction by spending more time on inference or “thinking” before answering questions. The model is specialized for quantitative tasks like coding and mathematics, unlike the general-purpose ChatGPT. However, o1 requires significantly more computational power, making it slower and more expensive to operate. This trade-off highlights the complex challenges facing AI development as the industry searches for the most effective path forward to achieve artificial general intelligence (AGI)—a theoretical form of AI that matches or surpasses human intelligence.
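OpenAI has not disclosed how o1 “thinks” internally, but one publicly documented analogue of trading extra inference-time compute for accuracy is self-consistency sampling: query a stochastic model several times and take the majority answer. A minimal sketch of that trade-off, with `noisy_solver` as a hypothetical stand-in for a model call:

```python
import random
from collections import Counter

def noisy_solver(question: str, rng: random.Random) -> int:
    """Hypothetical stand-in for one stochastic model call: right ~70% of the time."""
    correct_answer = 42
    return correct_answer if rng.random() < 0.7 else rng.randint(0, 100)

def answer_with_more_thinking(question: str, samples: int = 25, seed: int = 0) -> int:
    """Spend `samples` times the compute at inference, then majority-vote the answers."""
    rng = random.Random(seed)
    answers = [noisy_solver(question, rng) for _ in range(samples)]
    # The most common answer across attempts is far more reliable than any
    # single attempt, at the cost of 25x the latency and compute.
    return Counter(answers).most_common(1)[0][0]
```

The sketch makes the article’s trade-off concrete: accuracy rises with the number of samples, but so do the compute cost and response time, which is exactly the expense the article attributes to o1.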
Key Quotes
“It’s definitely true that if you throw more compute at the model, if you make the model bigger, it’ll get better. It’s kind of like it’s the most trustworthy way to improve models. It’s also the dumbest.”
Aidan Gomez, CEO of Cohere, expressed this criticism of the traditional scaling approach on the 20VC podcast, highlighting the tension between reliability and innovation in AI development methods.
“The money going into AI has largely hung on the idea that this scaling law would hold. It’s now the biggest question in the industry.”
Scale AI CEO Alexandr Wang made this statement at the Cerebral Valley conference, emphasizing how fundamental the scaling law debate is to AI investment and industry direction.
“Despite what other people think, we’re not at diminishing marginal returns on scale-up.”
Microsoft CTO Kevin Scott offered this counterpoint in a July interview with Sequoia Capital’s Training Data podcast, representing the view that traditional scaling approaches still have significant potential.
“Large language models are trained simply to predict the next token, given the previous set of tokens. The more effective way to train them is to force these models to translate questions into computer code and generate an answer based on the output of that code.”
Richard Socher, former Salesforce executive and CEO of You.com, proposed this alternative training methodology that could reduce hallucinations and improve AI reasoning capabilities.
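Socher’s proposal can be read as a two-step pipeline: the model emits code rather than a final answer, and the answer comes from executing that code. A minimal sketch under that reading, with `question_to_code` as a hypothetical stub standing in for the model’s translation step:

```python
def question_to_code(question: str) -> str:
    """Hypothetical stub: a real system would prompt an LLM to emit this code."""
    canned = {
        "What is 15% of 2,400?": "result = 0.15 * 2400",
    }
    return canned[question]

def answer_via_code(question: str) -> float:
    """Run the model-generated code and report its output, not a guessed token."""
    code = question_to_code(question)
    namespace: dict = {}
    # Executing model-generated code would need sandboxing in practice.
    exec(code, {}, namespace)
    return namespace["result"]

print(answer_via_code("What is 15% of 2,400?"))  # 360.0
```

The hallucination-reduction argument is visible even in this toy: the arithmetic is done by the interpreter, so the answer is exact regardless of whether the model could have “predicted” 360 token by token.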
Our Take
This industry-wide reassessment signals a maturation of AI development from brute-force scaling to more sophisticated approaches. The emergence of models like OpenAI’s o1, which prioritizes reasoning time over immediate response, suggests we’re entering an era where quality of thinking matters more than quantity of data. This shift parallels historical technology transitions where initial growth through resource expansion eventually gives way to efficiency and optimization.
The divide between executives like Kevin Scott who maintain faith in scaling and critics like Aidan Gomez reveals genuine uncertainty about AI’s path forward. What’s particularly significant is that this debate is happening publicly, suggesting the industry recognizes it may need diverse approaches rather than a single dominant paradigm. The cost implications of compute-intensive models like o1 also raise questions about AI accessibility and whether the technology will concentrate further among well-funded players or democratize through efficient alternatives. This inflection point will likely define which companies lead the next phase of AI innovation.
Why This Matters
This debate represents a critical inflection point for the AI industry and has profound implications for billions of dollars in investment and the future trajectory of artificial intelligence development. The questioning of scaling laws challenges the fundamental strategy that has driven AI progress for years and guided massive capital allocation by tech giants and venture capitalists.
For businesses, this shift could mean more cost-effective AI solutions through smaller, efficient models rather than increasingly expensive large language models. The industry’s exploration of alternative training methods like code-based reasoning and extended inference time could lead to more reliable, less hallucination-prone AI systems that are better suited for enterprise applications.
The broader implications extend to the race for artificial general intelligence. If traditional scaling approaches have limitations, companies may need to fundamentally rethink their AGI strategies and timelines. This could reshape competitive dynamics in the AI industry, potentially favoring companies that innovate on training methodologies rather than those simply accumulating more data and compute power. For society, more efficient models could democratize AI access, while improved reasoning capabilities could accelerate AI’s impact across sectors from healthcare to scientific research.
Recommended Reading
For those interested in learning more about artificial intelligence, machine learning, and effective AI communication, here are some excellent resources:
Related Stories
- The Artificial Intelligence Race: Rivalry Bathing the World in Data
- The AI Hype Cycle: Reality Check and Future Expectations
- Sam Altman’s Bold AI Predictions: AGI, Jobs, and the Future by 2025
- Google’s Gemini: A Potential Game-Changer in the AI Race
- Wall Street Asks Big Tech: Will AI Ever Make Money?
Source: https://www.businessinsider.com/ai-leaders-are-starting-to-rethink-way-to-advance-ai-2024-11