
This Week in Artificial Intelligence #15/2025: When Meta Fell Apart, Claude Won, and OpenAI Finally Opened Its Doors

The Llama 4 that never was, Claude Max for the elite and Copilot as a collaborator of the future

This week in artificial intelligence
Photo: Jan Macarol / AI art

Explosive drama at Meta, shocking transparency at OpenAI, and developments that put us just months away from AGI. So - this week in artificial intelligence.

If you missed the AI news this week, let us tell you just one thing: Meta stumbled (again), Claude is becoming the boss of the office, OpenAI is finally embracing open source, and Google is building its vision of an artificial super assistant that can edit video better than you. And this is no joke. This is this week in artificial intelligence – number 15.

This week in artificial intelligence #15:
Meta and the Llama 4 disaster

Llama 4 was supposed to be the crown jewel of Meta's AI development, but the end result is a disappointment. The model presented to the public is not the one that competed in the benchmarks and impressed there. Professor Ethan Mollick was among the first to notice this, confirming that the results the model posted on LMArena do not match what the public version achieves. (source: x.com/ethanmollick)

Posts then appeared on Reddit from former Meta employees who now work at OpenAI, openly distancing themselves from Llama 4. One of them wrote in their profile: “Llama 2 and Llama 3 – Llama 4? I have nothing to do with it.” (source: reddit.com)

Additionally, it was reported that the internal shake-up in Meta's AI division was triggered when an unknown, low-budget Chinese model, DeepSeek V3, beat them in benchmarks. For a company that pours billions into AI development, that is no small embarrassment.

Claude Max and predictions about Claude 4

Anthropic surprised this week with a new service, Claude Max – a subscription plan for demanding users that offers five to twenty times the usage quota, along with priority access to the latest models and features. (source: anthropic.com)

Meanwhile, Jared Kaplan, chief scientist at Anthropic, has announced that we will see Claude 4 within the next six months. According to him, the development of AI models is outpacing the development of hardware, mainly thanks to accelerated post-training and improvements in reinforcement learning. One of the more subtle but significant stories of this week in artificial intelligence.

OpenAI finally announces open source model

After years of criticism for lack of transparency and a departure from its original mission, Sam Altman announced that OpenAI will soon release an open-source model that will outperform all existing alternatives. (source: openai.com)

Moreover, ChatGPT now has long-term memory, which lets it draw on past interactions to personalize the experience: the AI can actively track the user's goals and point out inconsistencies in their thinking.
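
To make the idea concrete, here is a minimal, hypothetical sketch of how long-term memory can work in principle – snippets from earlier conversations are stored, the most relevant ones are retrieved for each new question, and they are prepended to the prompt. The class and function names below are our own illustration, not OpenAI's actual implementation.

```python
# Illustrative sketch only – not OpenAI's actual memory system.
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Keeps short notes from past chats and retrieves the most relevant ones."""
    notes: list[str] = field(default_factory=list)

    def remember(self, note: str) -> None:
        self.notes.append(note)

    def recall(self, query: str, k: int = 3) -> list[str]:
        # Naive relevance score: shared-word count. A real system would
        # use embeddings and a vector index instead.
        q_words = set(query.lower().split())
        ranked = sorted(self.notes,
                        key=lambda n: len(q_words & set(n.lower().split())),
                        reverse=True)
        return ranked[:k]


def build_prompt(memory: MemoryStore, user_message: str) -> str:
    """Prepend remembered facts so the model can personalize its answer."""
    context = "\n".join(f"- {note}" for note in memory.recall(user_message))
    return f"Known about the user:\n{context}\n\nUser: {user_message}"


memory = MemoryStore()
memory.remember("Goal: run a half-marathon in October")
memory.remember("Prefers short, bullet-point answers")
print(build_prompt(memory, "How should I plan my running this month?"))
```

The point of the sketch is the retrieval step: the model itself stays stateless, and the "memory" lives entirely in what gets injected into the prompt.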

But not everything is so rosy: shortened safety testing at OpenAI

The Financial Times revealed that OpenAI has significantly cut the time and scope of safety testing for its models. Instead of weeks, testers now get just a few days, raising concerns that models could be released to the public with undiscovered vulnerabilities. A former test engineer told the FT that vulnerabilities in GPT-4 were discovered only two months after its release. (source: ft.com)

The reason is simple – competitive pressure. Companies are rushing to introduce new models to keep up. And even though these are the most powerful tools of our time, security is being pushed to the sidelines.

DeepCoder 14B: an open-source competitor to OpenAI

DeepSeek and Agentica presented DeepCoder 14B, an open-source model for generating program code. With only 14 billion parameters, it achieves results comparable to commercial models. It was trained on more than 24,000 unique coding tasks using the GRPO+ method, which rewards the model only for fully correct solutions. (source: github.com/aentica)
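
The "reward only for fully correct solutions" idea is simple to illustrate: a generated program earns a reward of 1 only if it passes every unit test for the task, otherwise 0, with no partial credit. The sketch below shows that all-or-nothing reward signal in isolation; it is our simplified illustration, not the project's actual GRPO+ training code, and the function names are ours.

```python
# Simplified illustration of an all-or-nothing code reward, not the actual GRPO+ pipeline.
from typing import Callable


def sparse_code_reward(candidate_fn: Callable, test_cases: list[tuple]) -> float:
    """Return 1.0 only if the candidate passes *every* test case, else 0.0."""
    for args, expected in test_cases:
        try:
            if candidate_fn(*args) != expected:
                return 0.0
        except Exception:
            return 0.0  # crashes count as failure, too
    return 1.0


# Toy task: the model is asked to generate an add() function.
tests = [((1, 2), 3), ((-1, 1), 0), ((10, 5), 15)]

good_candidate = lambda a, b: a + b
buggy_candidate = lambda a, b: a - b

print(sparse_code_reward(good_candidate, tests))   # 1.0
print(sparse_code_reward(buggy_candidate, tests))  # 0.0
```

Withholding partial credit makes the signal sparse but hard to game: the model cannot collect reward by passing only the easy tests.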

BrowseComp: a new league for AI agents

OpenAI has introduced BrowseComp, a benchmark for AI agents that browse the web to dig up hard-to-find information. It is designed to test models that have to work through dozens of pages before reaching the relevant answer. (source: github.com/openai/simple-evals)
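
Conceptually, a benchmark like this boils down to a harness that hands the agent a hard question, lets it browse however it likes, and then grades the final answer against a hidden reference. The sketch below shows that structure with a stubbed-out agent and simple exact-match grading; it is not OpenAI's simple-evals code, and all names are placeholders.

```python
# Conceptual sketch of a browsing-benchmark harness; not OpenAI's simple-evals API.
from dataclasses import dataclass
from typing import Callable


@dataclass
class BrowseTask:
    question: str   # hard-to-find fact the agent must dig up
    reference: str  # hidden gold answer, used only for grading


def grade(agent: Callable[[str], str], tasks: list[BrowseTask]) -> float:
    """Run the agent on every task and return the fraction answered correctly."""
    correct = 0
    for task in tasks:
        answer = agent(task.question)  # the agent may browse many pages internally
        if answer.strip().lower() == task.reference.strip().lower():
            correct += 1
    return correct / len(tasks)


# Stub agent standing in for a real web-browsing model.
def dummy_agent(question: str) -> str:
    return "unknown"


tasks = [BrowseTask("In which year was the first TPU paper presented?", "2017")]
print(f"accuracy: {grade(dummy_agent, tasks):.2f}")
```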

Google leads the game

Google has unveiled plans to merge its Gemini and Veo models – combining text, image and audio understanding with video generation. The goal is a multimodal super assistant that understands the world the way we do. (source: googlecloudnext.com)

In addition, Google introduced a new generation of AI chips, TPU Ironwood, which is 3,600 times more powerful than the first generation from 2018. This allows Google to train larger models and run them faster without depending on Nvidia.

Microsoft Copilot is becoming a serious competitor

Microsoft Copilot has a revamped interface, features for finding apartments, help with writing letters, and even image-editing capabilities. It acts as a real-time personal assistant with access to your screen and its context. (source: microsoft.com)

Mustafa Suleyman, head of Microsoft AI, believes AGI could be here within five years, although he acknowledges that basic problems – such as hallucinations and poor instruction-following – have not yet been solved.

Midjourney v7: stunning images, but still no text

Midjourney has released the seventh generation of its image model, which impresses with its hyperrealism. Rendering text inside images, however, still lags behind – as the company itself admits, users hardly use it, so it is not a priority. (source: midjourney.com)

Neo robot working live

The 1X Neo robotic platform has shown that it can perform tasks autonomously in a live demonstration. This is not just another PR stunt; the robot moved, cleaned and acted without scripts. Its design includes artificial muscles and mobility that allow it to coexist safely with humans. (source: 1x.tech)

AI scientist writes first professional article

Sakana AI Labs announced that their model has written the first scientific paper to pass peer review at a workshop. The AI formulated a hypothesis, analyzed data and drew conclusions – without human assistance. (source: sakana.ai)


Conclusion

In just one week, we’ve seen the collapse of Meta’s vision, the acceleration of open source models, dangerous trends in security testing, and a new generation of multimodal agents. The world of AI is not only evolving rapidly—it’s evolving in a direction that seemed like science fiction just a year ago.

Next week promises even more. If you miss anything, we'll be here. Every Monday.
