This article discusses the architecture of OpenAI's o1 Pro model and the roles of pre-training and reasoning in AI development. It first covers the role of pre-trained models, including the problem of error accumulation: mistakes made while generating new tokens compound, which is especially pronounced for long-context models. It points out that o1 Pro uses self-consistency and majority voting to address this issue.
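The two ideas in this paragraph can be illustrated with a minimal Python sketch. It is not a description of how o1 Pro is implemented, which the article summarized here does not spell out. The function names are hypothetical, and the assumption that token errors are independent is a deliberate simplification.

```python
from collections import Counter


def sequence_success_probability(per_token_accuracy: float, num_tokens: int) -> float:
    """Chance that every token in a sequence is correct.

    Assumption (simplification): each token is right or wrong independently,
    so the per-token accuracies multiply.
    """
    return per_token_accuracy ** num_tokens


def majority_vote(answers: list[str]) -> str:
    """Return the most frequent final answer among several sampled attempts.

    This is the generic self-consistency idea: sample several reasoning
    chains and keep the answer most of them agree on. Ties go to the answer
    that appeared first.
    """
    return Counter(answers).most_common(1)[0][0]


# Even at 99% per-token accuracy, a 100-token output is fully correct
# only about 37% of the time under the independence assumption.
print(round(sequence_success_probability(0.99, 100), 3))

# Five hypothetical sampled answers; the majority answer wins.
print(majority_vote(["42", "41", "42", "42", "17"]))
```

The first function shows why longer outputs are more exposed to accumulated errors; the second shows how voting across several samples can mask the occasional bad chain.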

The article then discusses scaling laws in computing and how OpenAI's o1 model demonstrated the potential of reasoning models. It argues that scaling laws in AI will likewise continue as new technical paradigms emerge and scale.

The article also notes that pre-training often gets the most attention, but it is only one part of the AI life cycle. The goal of pre-training is narrow: to correctly predict the next token. Achieving that goal still falls well short of the ultimate goal of large-model development, which is to answer user prompts or complete tasks.
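To make the next-token objective concrete, here is a small Python sketch of the cross-entropy loss commonly used for it. The article only says the goal is to predict the next token correctly; the choice of cross-entropy, the function name, and the toy probabilities are assumptions for illustration.

```python
import math


def next_token_loss(predicted_probs: dict[str, float], actual_next_token: str) -> float:
    """Cross-entropy loss at one position.

    The loss is the negative log of the probability the model assigned to
    the token that actually came next, so confident correct predictions
    cost little and confident wrong ones cost a lot.
    """
    return -math.log(predicted_probs[actual_next_token])


# Hypothetical model output for the context "The cat sat on the".
probs = {"mat": 0.7, "dog": 0.2, "moon": 0.1}

print(round(next_token_loss(probs, "mat"), 4))   # low loss: the likely token occurred
print(round(next_token_loss(probs, "moon"), 4))  # high loss: an unlikely token occurred
```

Nothing in this objective measures whether a full response answers the user's prompt, which is the gap the paragraph above describes.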

The article also discusses scaling laws for test-time compute and inference-time compute. The idea is not new; it has long been applied in board games and poker. With more compute available, a reasoning model can think through more steps, which increases the chance of reaching a correct answer.
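One simple way to see why extra inference compute can raise accuracy is to model it as taking more independent attempts and keeping the majority answer. The Python sketch below rests on strong assumptions that do not come from the article: each attempt is correct with the same probability, attempts are independent, and all wrong attempts agree with each other (a pessimistic case). The function name is hypothetical.

```python
from math import comb


def majority_correct_probability(p_correct: float, num_samples: int) -> float:
    """Probability that more than half of the attempts are correct.

    Sums the binomial probabilities for every count of correct attempts
    that forms a strict majority.
    """
    threshold = num_samples // 2 + 1
    return sum(
        comb(num_samples, k) * p_correct**k * (1 - p_correct) ** (num_samples - k)
        for k in range(threshold, num_samples + 1)
    )


# With a 60% chance per attempt, spending more samples helps.
for n in (1, 5, 15):
    print(n, round(majority_correct_probability(0.6, n), 3))
```

Under these assumptions a single attempt is right 60% of the time, while a majority over five attempts is right about 68% of the time, at five times the inference cost.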

Finally, the article notes that controlling rising deployment costs is becoming critical, and that scaling pre-training can still cut costs substantially and remains more cost-effective than scaling test-time compute.

Reference:

https://semianalysis.com/2024/12/11/scaling-laws-o1-pro-architecture-reasoning-training-infrastructure-orion-and-claude-3-5-opus-failures/#scaling-sings-odes-to-the-greatest-scaling-law-of-computing-moore%e2%80%99s-law

