The document appears to be a transcript of a lecture or presentation by a researcher on the topic of artificial intelligence (AI), specifically discussing the risks and challenges associated with developing more powerful AI systems.

Here are the main points from the document:

  1. Risks of uncontrolled AI: The speaker highlights the potential dangers of creating AI systems that can operate autonomously without human oversight, citing examples such as loss of human control, use by terrorists, and creation of pandemics.
  2. Agentic vs. non-agentic AI: The speaker distinguishes between agentic AI (which has autonomy and can make decisions on its own) and non-agentic AI (which remains under human control and does not act or decide autonomously).
  3. Need for aligning AI with human values: The speaker emphasizes the importance of developing AI systems that follow human moral instructions, such as not providing information that can be used to harm people.
  4. Challenges in aligning AI with human values: The speaker notes that current approaches to training AI models are based on maximum likelihood estimation, which can lead to overconfidence and errors in decision-making.
  5. Proposal for a “scientist” AI model: The speaker suggests developing an AI model (which they call the “scientist”) that is capable of generating explanations and justifications for its decisions, rather than simply trying to imitate human language.
  6. Use of latent variables models: The speaker proposes using latent variable models to train the scientist AI model, which can help it learn to generate structured explanations and make more informed decisions.
  7. Need for national regulation and international cooperation: The speaker emphasizes the importance of developing regulations and guidelines for the development and deployment of AI systems, as well as international cooperation among governments and companies.
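The overconfidence concern in point 4 can be made concrete with a toy example (my own illustration, not from the talk): a maximum likelihood estimate commits fully to the observed data, while a Bayesian estimate with even a weak prior hedges when data is scarce.

```python
import numpy as np

def mle_estimate(observations):
    """MLE for a Bernoulli parameter: just the raw sample mean."""
    return float(np.mean(observations))

def posterior_mean(observations, alpha=1.0, beta=1.0):
    """Posterior mean under a Beta(alpha, beta) prior (Laplace-style smoothing)."""
    successes = float(np.sum(observations))
    n = len(observations)
    return (successes + alpha) / (n + alpha + beta)

# Three positive observations out of three:
obs = np.array([1, 1, 1])
print(mle_estimate(obs))    # 1.0 -- fully confident after only 3 samples
print(posterior_mean(obs))  # 0.8 -- tempered by the uniform prior
```

With only three observations, the MLE outputs certainty (1.0), whereas the posterior mean (0.8) reflects remaining uncertainty. This is the kind of calibrated uncertainty the speaker argues safe AI systems need.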

Overall, the document provides a thorough discussion of the challenges and risks associated with developing more powerful AI systems, and highlights the need for careful consideration and coordination to ensure that these systems are aligned with human values and used responsibly.
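The latent-variable idea in point 6 can be sketched in a few lines (numbers invented for illustration): instead of committing to one answer, the model marginalizes over latent hypotheses z and weighs each explanation against the evidence by Bayes' rule.

```python
import numpy as np

# Two latent hypotheses z with a prior p(z), and the likelihood p(x=1 | z)
# of an observation under each. Values here are made up for the sketch.
prior = np.array([0.5, 0.5])
likelihood = np.array([0.9, 0.2])

def marginal(prior, likelihood):
    """p(x=1) = sum over z of p(z) * p(x=1 | z)."""
    return float(np.dot(prior, likelihood))

def posterior(prior, likelihood):
    """p(z | x=1) by Bayes' rule: how well each explanation fits the evidence."""
    joint = prior * likelihood
    return joint / joint.sum()

print(marginal(prior, likelihood))   # 0.55
print(posterior(prior, likelihood))  # ~[0.818, 0.182]
```

The posterior over z is the "structured explanation": the model reports how plausible each candidate explanation is, rather than emitting a single overconfident answer.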


Reference:

https://www.youtube.com/watch?v=pd4KzyXon_s


<
Previous Post
Sholto Douglas interview: Claude 4 and Path to AI Coworkers
>
Next Post
Some thoughts on human-AI relationships by Joanne Jang