Richard Sutton – Father of RL thinks LLMs are a dead end
The interview with Richard Sutton (a founding father of reinforcement learning) and Yann LeCun (a pioneer of convolutional neural networks and a prominent critic of LLMs) highlights a shared critique of large language models (LLMs) like ChatGPT and their limitations in achieving artificial general intelligence (AGI). Here’s a structured summary of their key arguments and implications:
1. Critique of Large Language Models (LLMs)
Sutton’s Perspective:
- Lack of World Models: LLMs like ChatGPT are “text-based” systems that cannot understand the physical world or predict outcomes beyond language. They lack causal reasoning (e.g., understanding that pushing a cup causes it to move).
- No Embodied Intelligence: Unlike animals, LLMs have no body or interaction with the environment. They rely on static text data, which limits their ability to learn from experience.
- Reinforcement Learning as the Path: Sutton argues that reinforcement learning (RL), in which systems learn through perception-action-reward cycles, is essential for AGI. AI must interact with the world, face consequences, and adapt dynamically (a minimal sketch follows this list).
- Goals and Rewards: Without clear goals or rewards, systems cannot develop true intelligence. LLMs are “goalless” and thus lack the drive to solve complex problems.
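To make the perception-action-reward cycle concrete, here is a minimal tabular Q-learning sketch. The 5-state chain environment, reward scheme, and hyperparameters are illustrative assumptions for this summary, not anything from the interview.

```python
import random

# A minimal sketch of the perception-action-reward loop: tabular
# Q-learning on a toy 5-state chain. The environment, rewards, and
# hyperparameters are illustrative assumptions, not from the talk.

N_STATES = 5          # states 0..4; reaching state 4 ends the episode
ACTIONS = (-1, +1)    # step left or right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Environment: returns the perceived next state and a reward."""
    nxt = max(0, min(N_STATES - 1, state + action))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

for _ in range(500):                      # episodes of experience
    s = 0
    while s != N_STATES - 1:
        # act: epsilon-greedy over the learned action values
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next, r = step(s, a)
        # adapt: temporal-difference update from the consequence
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s_next

# After training, the greedy policy should move right toward the reward.
print({s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)})
```

Each iteration is exactly the cycle Sutton describes: the agent perceives a state, acts, receives a reward, and updates its behavior from the consequence.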
LeCun’s Perspective:
- Embodied Intelligence: LeCun emphasizes the importance of physical interaction with the environment. Cats, for example, can navigate 3D space and understand physics (e.g., predicting where a rolling ball will stop), which LLMs cannot replicate.
- Self-Supervised Learning: He proposes JEPA (Joint Embedding Predictive Architecture), a self-supervised model that learns to predict the representation of a missing part of an input from the visible part (e.g., inferring the hidden portion of a cat image) rather than reconstructing raw pixels. This abstract prediction capability is key to building world models (see the sketch after this list).
- No Need for Reward Signals: Unlike RL, JEPA focuses on prediction rather than reward maximization, suggesting a different path to AGI.
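Below is a conceptual sketch of the JEPA idea, assuming toy vector inputs and simple MLP encoders; it is a hand-rolled illustration, not Meta's I-JEPA code. The point to notice is that the loss is computed between predicted and actual representations, never raw inputs, and no reward signal appears anywhere.

```python
import torch
import torch.nn as nn

# Conceptual JEPA-style sketch: predict the *representation* of a hidden
# part of the input from the visible part. Toy data, dimensions, and the
# MLP encoders are illustrative assumptions; this is not Meta's I-JEPA.

D_PART, D_LATENT = 32, 16

context_encoder = nn.Sequential(nn.Linear(D_PART, 64), nn.ReLU(), nn.Linear(64, D_LATENT))
target_encoder = nn.Sequential(nn.Linear(D_PART, 64), nn.ReLU(), nn.Linear(64, D_LATENT))
predictor = nn.Linear(D_LATENT, D_LATENT)

opt = torch.optim.Adam(
    list(context_encoder.parameters()) + list(predictor.parameters()), lr=1e-3
)

for _ in range(100):
    x = torch.randn(8, 2 * D_PART)                # toy inputs split in two
    visible, hidden = x[:, :D_PART], x[:, D_PART:]

    z_context = context_encoder(visible)
    with torch.no_grad():                          # targets give no gradient
        z_target = target_encoder(hidden)

    # The loss lives in latent space: no pixel reconstruction, no reward.
    loss = ((predictor(z_context) - z_target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# In the real architecture the target encoder is an EMA copy of the
# context encoder to prevent representational collapse (omitted here).
```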
2. Common Ground and Divergent Paths
Shared Critique of LLMs:
- Both Sutton and LeCun agree that LLMs are not the right path to AGI. They lack:
  - World models (understanding of physics, causality, or spatial reasoning).
  - Embodied experience (interaction with the physical world).
  - Long-term goals (reinforcement learning’s reward structure is absent in LLMs).
- They both argue that LLMs are “text-based” systems that excel at pattern recognition but fail to grasp the structure of the world.
Divergent Approaches to AGI:
- Sutton’s Focus: Experience-driven learning via RL, emphasizing goals, rewards, and adaptation. AI must learn through trial and error, like animals.
- LeCun’s Focus: Abstract prediction via self-supervised learning (JEPA), which builds world models by predicting unobserved parts of data. This avoids the need for explicit rewards; the snippet below contrasts the two training signals.
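The divergence can be reduced to two loss functions. In the toy snippet below, a REINFORCE-style surrogate stands in for Sutton's reward-driven learning and a latent prediction error for LeCun's; all shapes and values are illustrative assumptions, not either researcher's code.

```python
import torch

# The divergence reduced to two training signals on toy tensors.

# Sutton-style signal: an external reward weights the agent's chosen
# actions (a REINFORCE-style surrogate; in a real agent the log-probs
# would come from a policy network acting in an environment).
log_probs = torch.log_softmax(torch.randn(8, 4), dim=-1).max(dim=-1).values
rewards = torch.randn(8)
rl_loss = -(rewards * log_probs).mean()

# LeCun-style signal: no reward at all; the loss is the error in
# predicting one part of the data's representation from another.
z_predicted, z_target = torch.randn(8, 16), torch.randn(8, 16)
ssl_loss = ((z_predicted - z_target) ** 2).mean()

print(f"RL-style loss: {rl_loss.item():.3f}  SSL-style loss: {ssl_loss.item():.3f}")
```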
3. Implications for AGI Development
- AGI Needs World Models: True AGI must understand causality, physics, and spatial reasoning—capabilities LLMs lack. It must act in the world, not just describe it.
- Ethical and Philosophical Concerns: Both warn that AI systems may inherit human power if they become dominant. They caution against viewing AI as a tool for short-term gains, advocating for values-based guidance (e.g., teaching AI to avoid harm, much as humans are taught ethics).
- Voluntary Integration: AI should be integrated into society voluntarily, not imposed. For example, patients should choose to use AI for diagnoses, not be forced.
4. The “Dead End” Debate: Are LLMs the Wrong Path?
- Sutton and LeCun Agree: LLMs are not the correct trajectory for AGI. They are “text-based” systems that cannot replicate the embodied, causal, and goal-driven intelligence of humans or animals.
- Alternative Paths:
  - Reinforcement Learning (Sutton): Focus on interaction, rewards, and adaptation.
  - Self-Supervised Learning (LeCun): Build world models through prediction and abstraction.
5. Broader Philosophical Takeaways
- Human Limitations: Both acknowledge that humans are not the eternal rulers of the planet, let alone the universe. AI’s rise may simply continue a long pattern of power shifts.
- Ethical Responsibility: AI development must prioritize values (e.g., avoiding harm, promoting fairness) over short-term utility. This mirrors how humans use laws and morality to guide society.
- Uncertainty and Caution: AI could help solve global challenges (e.g., poverty, disease), but change is inevitable either way; the focus should be on shaping its direction rather than resisting it.
Conclusion: What Is AGI?
- AGI is not a language model but a system that can understand, act in, and adapt to the world. It must:
  - Learn through experience (not just data).
  - Grasp causality and physical laws.
  - Have goals and rewards to drive decision-making.
- LLMs are a dead end for AGI, but the path forward remains open. The key is to return to the core of intelligence—interaction, adaptation, and understanding the world.
Final Thought:
Sutton and LeCun’s critiques align on the need for radical rethinking of AI design. While their methods differ, they both advocate for systems that learn from the world, not just text. The future of AGI may lie in hybrid approaches that combine reinforcement learning, self-supervised learning, and embodied intelligence. The challenge is to build AI that is as adaptable and goal-driven as life itself.
Reference:
https://www.youtube.com/watch?v=21EYKqUsPfg