Mu Li's talk at SJTU: LLMs, present and future
The talk mainly covers the following topics:
- Evaluating Language Models: Mu Li stresses the importance of evaluating language models on correctness, logic, and style. The ambiguity of natural language makes this task difficult.
- The Role of Data: Mu Li argues that data sets a model's upper limit, while algorithms set its lower limit. He believes we are still far from AGI (artificial general intelligence), because today's models learn by being spoon-fed rather than by learning on their own. He also thinks Anthropic has done well on data, a result of its substantial investment and effort.
- Data and GPU Resources: Mu Li says that for a startup, a good model needs good data first and sufficient compute second, and compute depends mainly on GPUs. Because NVIDIA controls the GPU market, costs are very high and can reach 50% of a startup's costs. This angers Mu Li, who considers it unfair.
- The Limitations of Language Models: Mu Li does not think current language models qualify as true artificial intelligence. They are still a form of machine learning, only at a much larger scale than before. He urges people not to mythologize them, just to take them seriously.
- Personal Experiences and Entrepreneurial Insights: Mu Li shares moments from his career. He has done different things at different companies, and he jokes about living a "check-in" life (打卡式人生). He discusses several major choices, such as doing a PhD and starting a company, each with its own benefits and drawbacks.
- Methods for Continuous Self-Improvement: Finally, Mu Li offers advice on continuous self-improvement: keep reviewing the motivations, goals, and process behind what you do in order to improve and grow.