Richard Sutton's and Yann LeCun's criticisms of large language models (LLMs) fundamentally point to a core tension in the development path of artificial intelligence: can language models be the correct route to artificial general intelligence (AGI)? Although their research backgrounds differ, the two share a highly consistent judgment on the limitations of LLMs and have proposed similar future directions. Below is a summary of their key viewpoints:


I. Common Criticisms of Large Language Models

  1. Lack of World Models
    • Sutton argues that LLMs are merely “describers” of language, not “understanders” of the world. They cannot interact with the physical world through perception-action-reward loops, and so cannot form the kind of structured understanding of the world that animals and humans possess.
    • LeCun uses the example of a cat to point out that LLMs cannot grasp causal relationships in the physical world (e.g., that pushing a cup makes it move), whereas a cat learns physical laws directly through embodied interaction. This “embodied intelligence” is key to building a world model; lacking bodily perception, LLMs can never acquire such experience.
  2. Missing Mechanisms for Experience Learning
    • Sutton emphasizes that intelligent systems must learn through trial and feedback (the perception-action-reward loop), whereas LLMs are trained on static text data and cannot dynamically adapt to environmental changes; both learning modes are contrasted in the sketch after this list.
    • LeCun argues that the generative mode of LLMs (token-by-token prediction) cannot capture the underlying laws of the world and remains stuck at the surface of language.
  3. Link Between Goals and Intelligence
    • Sutton posits that a system without goals is not intelligent. While LLMs can generate text, they lack clear goals and the ability to actively pursue them, unlike animals or humans who can plan actions.
    • LeCun emphasizes that intelligent systems must possess hierarchical planning and abstract prediction capabilities (e.g., his Joint-Embedding Predictive Architecture, JEPA), rather than relying solely on language descriptions.
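To make the contrast in point 2 concrete, here is a minimal, self-contained Python sketch. Everything in it is an illustrative invention, not code from either researcher: a bigram counter stands in for corpus-based training, and a tabular Q-learner in a toy 1-D world stands in for the perception-action-reward loop.

```python
import random
from collections import defaultdict

# --- 1. Learning from a static corpus (schematic stand-in for LLM training) ---
# A bigram counter passively tallies which token follows which in a fixed
# corpus. Nothing the model "does" changes the data it will see next.
corpus = "the cat pushed the cup and the cup moved".split()
bigram_counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1  # learn from static text only

# --- 2. Perception-action-reward loop (schematic) ---
# A tabular Q-learning agent in a toy 1-D world: its own actions determine
# what it observes next and what reward feedback it receives.
GOAL, N_STATES, ACTIONS = 4, 5, (+1, -1)
q = defaultdict(float)  # q[(state, action)] -> estimated return

def step(state, action):
    """Toy environment: move along a line; reward 1.0 on reaching the goal."""
    next_state = max(0, min(N_STATES - 1, state + action))
    return next_state, (1.0 if next_state == GOAL else 0.0)

for episode in range(200):
    state = 0
    while state != GOAL:
        # Perceive the current state and pick an action (epsilon-greedy).
        if random.random() < 0.1:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        next_state, reward = step(state, action)  # act and receive feedback
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Update the value estimate from the experienced reward (TD learning).
        q[(state, action)] += 0.5 * (reward + 0.9 * best_next - q[(state, action)])
        state = next_state

# The learned greedy policy moves right toward the goal at every state.
print({s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES)})
```

The structural difference is the point: the first learner consumes a fixed corpus, while the second learns from rewards produced by its own actions in an environment.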

II. Shared Vision for AGI

  1. Autonomous Intelligence That Learns from Experience
    • Both agree that future AGI must possess the ability to learn and adapt to the environment autonomously, akin to animals or humans acquiring knowledge through direct interaction with the world.
    • Sutton uses the example of a squirrel, stating that it can master survival skills without human instruction, highlighting the mechanism of autonomous learning as the core of intelligence.
    • LeCun stresses that AGI must build an understanding of the world through embodied intelligence (direct interaction between body and environment) and self-supervised learning (extracting abstract regularities from data); a minimal sketch of the joint-embedding idea follows this list.
  2. Structured Understanding Beyond Language
    • LLMs rely on language data and can only describe the world, not understand its underlying laws. AGI must be able to build structured models of the physical world, causal relationships, and social rules, rather than relying solely on linguistic symbols.
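To illustrate what “abstract prediction” means in LeCun's JEPA proposal, here is a deliberately minimal PyTorch sketch of the joint-embedding predictive idea: prediction error is measured in a learned representation space rather than at the level of raw pixels or tokens. The linear encoders and random tensors are placeholders, and real variants (e.g., I-JEPA) use an EMA target encoder plus safeguards against representation collapse.

```python
import torch
import torch.nn as nn

OBS_DIM, LATENT_DIM = 64, 16

context_encoder = nn.Linear(OBS_DIM, LATENT_DIM)  # encodes the current observation
target_encoder = nn.Linear(OBS_DIM, LATENT_DIM)   # encodes the future observation
predictor = nn.Linear(LATENT_DIM, LATENT_DIM)     # predicts the future latent

opt = torch.optim.Adam(
    list(context_encoder.parameters()) + list(predictor.parameters()), lr=1e-3
)

for _ in range(100):
    obs_now = torch.randn(32, OBS_DIM)                   # stand-in: current sensor data
    obs_next = obs_now + 0.1 * torch.randn(32, OBS_DIM)  # stand-in: what comes next

    z_pred = predictor(context_encoder(obs_now))
    with torch.no_grad():                    # no gradients through the target branch
        z_target = target_encoder(obs_next)  # (real JEPAs use an EMA copy here)

    loss = ((z_pred - z_target) ** 2).mean()  # error lives in latent space,
    opt.zero_grad()                           # not in pixel or token space
    loss.backward()
    opt.step()
```

The design choice the sketch highlights is that the model is never asked to reconstruct every detail of the future observation, only its abstract representation, which is what lets it ignore unpredictable surface noise.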

III. Judgment That LLMs Are a “Dead End”

  1. Limitations of the Technical Path
    • Sutton bluntly states: “LLMs are a dead end,” as they cannot transcend the boundaries of language description and achieve true intelligence.
    • LeCun similarly believes LLMs are not the correct path to AGI and advocates a shift toward more fundamental mechanisms of perception, action, and goal-driven behavior.
  2. Openness to Transformation
    • Sutton acknowledges that such change carries risks, but emphasizes guiding AI development rather than blindly pursuing short-term performance. He advocates instilling core values (e.g., that AI should not harm humans) to ensure technology benefits humanity, rather than imposing specific task goals.

IV. Differences and Complementarities Between the Two

  • Sutton focuses more on reinforcement learning and experience loops, emphasizing the evolution of goal-driven intelligence.
  • LeCun centers on self-supervised learning and abstract modeling, advocating for intelligent systems through hierarchical planning and world models.
  • Although their paths differ, their core consensus is that LLMs cannot replace direct exploration of perception, action, and goals; true AGI must originate from more fundamental mechanisms of intelligence. A toy illustration of how the two emphases can combine follows below.
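The two emphases are not mutually exclusive; Sutton's own Dyna architecture already combines reward-driven learning with a learned model of the world. Below is a toy Dyna-Q-style sketch (the environment and all constants are illustrative, reusing the 1-D world from Section I): real experience trains both a value function and a simple world model, and the model is then used to plan by replaying imagined transitions.

```python
import random
from collections import defaultdict

GOAL, N_STATES, ACTIONS = 4, 5, (+1, -1)
q = defaultdict(float)  # reward-driven value estimates (Sutton's emphasis)
model = {}              # learned world model, (s, a) -> (s', r) (LeCun's emphasis)

def env_step(state, action):
    """Toy 1-D world: move along a line; reward 1.0 on reaching the goal."""
    next_state = max(0, min(N_STATES - 1, state + action))
    return next_state, (1.0 if next_state == GOAL else 0.0)

def q_update(s, a, r, s2):
    best = max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += 0.5 * (r + 0.9 * best - q[(s, a)])

for episode in range(50):
    state = 0
    while state != GOAL:
        action = (random.choice(ACTIONS) if random.random() < 0.1
                  else max(ACTIONS, key=lambda a: q[(state, a)]))
        next_state, reward = env_step(state, action)   # real experience
        model[(state, action)] = (next_state, reward)  # fit the world model
        q_update(state, action, reward, next_state)    # learn from real reward
        # Planning: replay imagined transitions drawn from the learned model,
        # improving value estimates without any further real interaction.
        for (s, a), (s2, r) in random.sample(list(model.items()),
                                             min(5, len(model))):
            q_update(s, a, r, s2)
        state = next_state
```

The planning loop is where a world model pays off: once the model is reasonably accurate, much of the learning can happen in imagination, which is far cheaper than acting in the real world.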

V. Insights for the Future

  1. AI Development Must Return to the Essence of Intelligence
    • Research should transcend language models, exploring integrated mechanisms of perception, action, causal reasoning, and goal-driven systems.
  2. Guide Technology Development with Values
    • As Sutton notes, core values (e.g., that AI should not harm humans) should be set, rather than specific task goals imposed, to ensure technology benefits humanity.
  3. Guard Against Technological Monopoly and Loss of Control
    • Sutton warns that the succession of power from humans to AI is an inevitable trend, and that democratized governance and value guidance are needed to avert the disasters a technological monopoly could bring.

Conclusion

Sutton's and LeCun's criticisms are, in essence, a course correction for the current AI development path: the brilliance of language models may be obscuring the fundamental requirements of intelligence. Future AGI may not be a fluent talker like ChatGPT; instead, like a squirrel, it may learn autonomously, adapt to its environment, and achieve goals through direct interaction with the world. Exploration in this direction may be the essential path toward true intelligence.

Reference:

https://www.youtube.com/watch?v=21EYKqUsPfg

