How will multimodal AI change the world?

The AI field will undergo dramatic changes in the coming years. The race for computing power will heat up further, video generation models and robotics will grow explosively, and multimodal AI will fundamentally change our lives. Video generation models will revolutionize movie production within the next two years, and fully AI-generated movies good enough to win awards will soon appear. This is a disruptive change for the film industry and opens up considerable opportunity for investors in related fields. Robotics will see widespread application within five years, with retail and warehouse environments leading the “robot revolution”. This also means that companies along the robotics supply chain will face unprecedented opportunities for growth. Training talent for the AI era also deserves attention: mathematics, programming, writing, art, and creativity will be the core competitive skills of future talent, and investment in education in these fields will yield substantial returns.

GPT-5 will bring major breakthroughs
Many people think that progress in large language models has stalled since the release of GPT-4, but insiders see it very differently. Developing a language model like GPT takes enormous computing power, which depends on building new data centers, a slow, multi-year process.
Going from GPT-4 to GPT-5 will require a 100-fold increase in computing power, which will take time. Before GPT-5 is officially released, there may be a transitional version with 10 times the computing power.

At present, AI companies are focusing on “test-time compute”, that is, spending more computing power while the model generates an answer so that it can produce longer and more coherent chains of thought. For example, OpenAI extended GPT-4 into the o1 model, achieving a 100-fold increase in computing power.
Test-time compute does not require building new data centers, and there is still a lot of room for algorithmic improvement. In the coming years, test-time compute will be one of the most exciting developments in the AI field.

The breakthrough of multimodal AI: Sora leads the video generation revolution
Unlike other modalities such as images, a video is an extended sequence of events, so it needs a full user interface that lets the creator think through how the story unfolds over time. Video models are also very expensive to train and run.

Sora is the first high-quality video generation model, and its storyboarding feature addresses some of the challenges of video generation by letting users place checkpoints at different points in time to guide how the video is generated.
In the future, video models will produce higher-quality and longer videos at lower cost. As has happened with LLMs, very beautiful and realistic videos will eventually cost almost nothing to generate.

It is expected that within two years, we will see fully AI-generated movies that are good enough to win awards. The appeal of these movies will lie in how directors use video models to realize their creative vision and do things that cannot be filmed.

The future of robotics: interacting with robots in our daily lives in five years

On AGI
Many people worry that AI will lead to mass unemployment, but in reality AI automates individual tasks rather than whole jobs. Most jobs, programming included, involve some tasks that cannot be automated.

AI progress will continue. It will be exciting and it will not slow down, but it will change. We are moving from a world where intelligence is a critical and scarce factor in society to a world where intelligence is ubiquitous and free.

When intelligence is no longer scarce, agency will become a scarce productive factor. Agency refers to the ability to ask the right questions and pursue the right projects. We need to think about how to develop this agency so that we can work with AI.

The future will arrive gradually, with AI progress changing our lives step by step. We should focus on tasks that require infinite patience, such as carefully checking expenses or comparison shopping, where AI can do better than people.

How to prepare children for the AI era?
Although AI is developing rapidly, we should not change the way we educate children. We should still teach them math, programming, writing, art, and creativity, because these skills help them think in a structured way. The future is unpredictable, so we should encourage children to try things that push their limits and build their adaptability.
