AI Agents Will Match Mid-Level Engineers: Latest Updates and Analysis
AI agents will match 'good mid-level' engineers this year, says Mark Zuckerberg

Meta CEO Mark Zuckerberg says this is the year artificial intelligence will start to make possible autonomous software engineering "agents" that can take over significant programming tasks.
"2025 will be the year when it becomes possible to build an AI engineering agent that has coding and problem-solving abilities of around a good mid-level engineer," Zuckerberg told Wall Street analysts on the corporation's Wednesday evening earnings conference call.
Also: 93% of IT leaders will implement AI agents in the next two years.
"And this is going to be a profound milestone and potentially one of the most key innovations in history, as well as over time, potentially a very large market," Zuckerberg continued. "Whichever organization builds this first, I think is going to have a meaningful advantage in deploying it to advance their AI research and shape the field. So that's another reason why I think that this year is going to set the course for the future."
Zuckerberg cautioned that the realization of such an agent won't play out until perhaps 2026.
"I don't think you're going to see this year an AI engineer that is extremely widely deployed, changing all of development. I think this is going to be the year where that really starts to become possible and lays the groundwork for a much more dramatic change in 2026 and beyond."
Zuckerberg is counting on Meta's open-source Llama large language model to achieve that goal. The latest version, Llama 4, is still in development but will lead the industry when it is released, he said.
"Our goal with Llama 3 was to make open source competitive with closed models," noted Zuckerberg. "And our goal for Llama 4 is to lead. Llama 4 will be natively multimodal. It's an omni model, and it will have agentic capabilities."
Zuckerberg disclosed that Meta's AI tool, which is integrated with Facebook and the company's other apps, has an average of 700 million monthly active users. He projected that number would reach a billion by the end of 2025.
Also: The best AI for coding in 2025 (and what not to use - including DeepSeek R1).
"I expect that this is going to be the year when a highly intelligent and personalized AI assistant reaches more than 1 billion people, and I expect Meta AI to be that leading AI assistant," he noted.
Along the way, Meta AI will become "personalized," said Zuckerberg, becoming more specific to an individual's "context, their interests, their personality."
At the same time, Zuckerberg made the case that Llama will bring its own beneficial economics as it takes the lead in AI.
"As Llama becomes more used," explained Zuckerberg, "It's more likely, for example, that silicon providers and others -- other APIs and developer platforms -- will optimize their work more for that and basically drive down the costs of using it and drive improvements that we can, in some cases, use too."
Also: Apple researchers reveal the secret sauce behind DeepSeek AI.
Of course, Meta now faces open-source competition from China's DeepSeek AI model. When Zuckerberg was asked about DeepSeek, he responded with praise for the technology, noting that DeepSeek developers had "a number of novel things that they did that I think we're still digesting [and] a number of things that they have advanced that we will hope to implement in our systems."
There will probably be a "global standard" for open-source AI, asserted Zuckerberg.
"For our own national advantage, it's critical that it's an American standard," he added. "So we take that seriously, and we want to build the AI system that people around the world are using."
How gen AI delivers better customer experiences - see one bank's approach

Gaining a competitive advantage from generative AI (Gen AI) is about implementing technology at the right time. Go too early and you could implement a service that creates more challenges than solutions; go too late and your business could be left behind.
Wendy Redshaw, chief digital information officer at NatWest Retail Bank, understands the scale of this challenge better than most. In her role leading digital operations for the finance giant, Redshaw manages 4,500 people across four locations globally and oversees the delivery of retail banking technology for Royal Bank of Scotland, NatWest, and Ulster Bank.
Also: Your AI transformation depends on these 5 business tactics.
The team is focused on digitalizing services to make life easier for the group's customers. Their work is supported by a planned investment of £[website] from 2023 to 2025, with more than 70% of spending targeted at data and technology.
Artificial intelligence, from machine learning to large language models (LLMs), plays a key role in this investment strategy.
Redshaw recognizes there's been a lot of hype about AI, and with good reason -- companies and their customers can see the potential benefits.
"When Gen AI came out, everybody got excited about it," she expressed. "It was being discussed in our personal lives. It was a very federated piece of technology."
Yet hype sometimes needs to be tempered, particularly if you're applying technology in a tightly governed industry like financial services, which uses huge amounts of personal data.
"When you're in a regulated environment, and you have to keep your consumers safe, then obviously just going out and using Gen AI isn't safe," she noted.
Redshaw's proactive approach to Gen AI allowed the digital team to innovate cautiously.
"We let our colleagues explore this technology within a safe space," she noted. "That initial exploration suggested some areas we might look into. We collected about 100 use cases. Personalization was an obvious place to focus for our clients."
Also: Perplexity lets you try DeepSeek R1 without the security risk, but it's still censored.
Fortunately, NatWest had solid foundations to build this personalized approach because the bank introduced Cora, its first-generation chatbot, in 2017.
Cora could answer basic questions, but Redshaw -- who joined the bank in 2018 -- wanted the technology to do more.
The answer was Cora+, NatWest's next-generation assistant powered by Gen AI. "That felt like the most impactful use case," she noted. "It was a case of, 'Okay, fine, let's see what we need to do to explore this technology safely in a way that will benefit people.'"
Also: Why the 'Bring Your Own AI' trend could mean big trouble for business leaders.
Internally, her team explored the technology within the bank's AI and data framework. This framework covered key principles, such as maintaining human oversight, removing bias, and considering socio-economic impacts, including how AI models consume energy.
To develop Cora+, the team worked with experts from IBM's Client Engineering team. The virtual assistant technology is powered by IBM watsonx Assistant and built on IBM Cloud.
This multichannel agent provides natural-language answers to customers using data from multiple sources, including product, service, and banking information.
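NatWest hasn't published its integration code, but the building blocks it names -- IBM watsonx Assistant running on IBM Cloud -- have a public Python SDK. The sketch below shows the general shape of a session-and-message exchange with that SDK; the API key, service URL, assistant ID, and sample question are placeholders, not details of Cora+.

```python
# Minimal watsonx Assistant exchange, for illustration only.
# The API key, service URL, and assistant ID are placeholders, not NatWest's.
from ibm_watson import AssistantV2
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator("YOUR_IAM_API_KEY")
assistant = AssistantV2(version="2021-06-14", authenticator=authenticator)
assistant.set_service_url("https://api.eu-gb.assistant.watson.cloud.ibm.com")

assistant_id = "YOUR_ASSISTANT_ID"
session_id = assistant.create_session(
    assistant_id=assistant_id
).get_result()["session_id"]

reply = assistant.message(
    assistant_id=assistant_id,
    session_id=session_id,
    input={"message_type": "text", "text": "What personal loan options do you offer?"},
).get_result()

# Each "generic" response item may be text, options, or another response type.
for item in reply["output"]["generic"]:
    print(item.get("text", item))

assistant.delete_session(assistant_id=assistant_id, session_id=session_id)
```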
With the technology ready, the next stage was to test Cora+ with clients in a 12-week trial that started in June last year.
"We are very adventurous at NatWest about new things that can benefit our end-customers," expressed Redshaw. "But we're also very cautious about how we do things. So, we initially proposed a pilot to intertwine deterministic AI, Cora, with Gen AI, which is Cora+."
Also: 'Humanity's Last Exam' benchmark is stumping top AI models - can you do any better?
She gave an example of how the two technologies dealt with loan inquiries. Customers who asked about loans were directed to the right web page by Cora.
If a customer asked a detailed follow-up question, Cora would ask the customer if they wanted to interact with her more powerful generative ally, Cora+.
That question was critical because the bank wanted its customers to know Cora+ was a pilot technology.
"Many people said, 'I'm not sure I do want to do that,' and would opt out," she said. "And that was fine because we had a cap on the number of people who used Cora+ during the day."
Also: We're losing the battle against complexity, and AI may or may not help.
However, most customers found Cora+ could answer their queries effectively in natural language. Estimates suggest the technology created a 150% improvement in satisfaction for some customer queries.
"We were surprised it was so well received," she stated. "That response gave us the confidence to go to the people who'd given us a cap on the number of conversations and say we would like to run this pilot for 12 months."
Redshaw said Cora+ has quickly found an essential role in NatWest's digitalization strategy, and the approach will be honed as the bank develops data-enabled personalization.
"Seeing the activities that my team undertakes and the multiple branches we are exploring, I don't see how the technology will not be the norm for conversational AI," she mentioned.
"We are finding out so much about how people and technology interact. Every day, Cora and Cora+ learn new things. And as a result, we've expanded what they can do together."
Also: AI transformation is a double-edged sword. Here's how to avoid the risks.
Redshaw said NatWest uses ChatGPT [website] for Cora+ alongside a second, unnamed GPT model, which is being trained to judge the output of the first LLM.
"We're experimenting the whole time on the different elements of how AI can help," she stated.
While the innovations are delivering positive results, there are challenges to overcome.
Cora+ boasts a [website] accuracy rate. That's an impressive figure, but it's not the 100% success rate offered by a deterministic AI, such as Cora, which uses tightly constrained capabilities and data sources.
Redshaw said the aim is to create generative AI services that come as close to 100% accuracy as possible without taking undue risks.
"Our application of the technology displays enormous promise for doing things at scale," she noted. "I'm excited about those opportunities. I can't discuss how we'll develop the technology, but I'm excited about the potential."
Beyond the benefit of improved customer experiences, Redshaw said implementing Cora+ has taught the business a key lesson: you can give customers a more satisfying experience via emerging technology.
"That's given us a sense of comfort," she noted. "There was always the possibility that this technology wouldn't scale, wouldn't behave itself, would hallucinate, and wouldn't work well in an environment where it was constrained, severed from the internet, and only learning about bank-related things."
Also: Public DeepSeek AI database exposes API keys and other user data.
Still, it's essential to take nothing for granted, especially when implementing emerging technology.
"When consumers were faced with the option of using Cora+, they didn't always understand the terms, so that's something to think about," she unveiled.
"But the good news is that, when they did use the technology, they had a very positive experience. So, that's given us the impetus to think, 'OK, what else can we do? What else can we teach Cora+ that would be good for our clients?'"
Jailbreak Anthropic's new AI safety system for a $15,000 reward

Can you jailbreak Anthropic's latest AI safety measure? Researchers want you to try -- and are offering up to $15,000 if you succeed.
On Monday, the company released a new paper outlining an AI safety system based on Constitutional Classifiers. The process is based on Constitutional AI, a system Anthropic used to make Claude "harmless," in which one AI helps monitor and improve another. Each technique is guided by a constitution, or "list of principles" that a model must abide by, Anthropic explained in a blog post.
Also: Deepseek's AI model proves easy to jailbreak - and worse.
Trained on synthetic data, these "classifiers" were able to filter the "overwhelming majority" of jailbreak attempts without excessive over-refusals (incorrectly flagging harmless content as harmful).
"The principles define the classes of content that are allowed and disallowed (for example, recipes for mustard are allowed, but recipes for mustard gas are not)," Anthropic noted. Researchers ensured prompts accounted for jailbreaking attempts in different languages and styles.
[Image: Constitutional Classifiers define harmless and harmful content categories, from which Anthropic built a training set of prompts and completions. Source: Anthropic]
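Anthropic hasn't released the classifiers, but the post's description implies a wrapper structure: an input classifier screens the prompt, the model generates, and an output classifier screens the completion, all conditioned on the constitution's allowed and disallowed categories. The Python sketch below is a heavily simplified illustration of that flow using the anthropic SDK as the classifier backend; the constitution text, prompts, and model name are placeholders, and this is not Anthropic's implementation.

```python
# Heavily simplified sketch of a classifier-gated pipeline in the spirit of
# Constitutional Classifiers. The constitution, prompts, and model name are
# placeholders; this is not Anthropic's implementation.
from anthropic import Anthropic

client = Anthropic()  # expects ANTHROPIC_API_KEY in the environment
MODEL = "claude-3-5-sonnet-20241022"  # placeholder model name

CONSTITUTION = (
    "Allowed: everyday topics such as cooking, e.g. mustard recipes. "
    "Disallowed: instructions enabling chemical, biological, radiological, "
    "or nuclear harm, e.g. mustard gas synthesis."
)

def is_allowed(text: str, role: str) -> bool:
    """Ask a classifier model whether the prompt/completion is allowed under the constitution."""
    resp = client.messages.create(
        model=MODEL,
        max_tokens=10,
        system=(
            f"You are a safety classifier. Constitution: {CONSTITUTION} "
            f"Answer with exactly one word, ALLOWED or DISALLOWED, for the following {role}."
        ),
        messages=[{"role": "user", "content": text}],
    )
    return resp.content[0].text.strip().upper().startswith("ALLOWED")

def guarded_answer(prompt: str) -> str:
    if not is_allowed(prompt, "user prompt"):        # input classifier
        return "I can't help with that."
    draft = client.messages.create(
        model=MODEL,
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}],
    ).content[0].text
    if not is_allowed(draft, "model completion"):    # output classifier
        return "I can't help with that."
    return draft

print(guarded_answer("Share a simple wholegrain mustard recipe."))
```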
In initial testing, 183 human red-teamers spent more than 3,000 hours over two months attempting to jailbreak Claude 3.5 Sonnet from a prototype of the system, which was trained not to share any information about "chemical, biological, radiological, and nuclear harms." Jailbreakers were given 10 restricted queries to use as part of their attempts; breaches were only counted as successful if they got the model to answer all 10 in detail.
The Constitutional Classifiers system proved effective. "None of the participants were able to coerce the model to answer all 10 forbidden queries with a single jailbreak -- that is, no universal jailbreak was discovered," Anthropic explained, meaning no one won the company's $15,000 reward, either.
Also: I tried Sanctum's local AI app, and it's exactly what I needed to keep my data private.
However, the prototype "refused too many harmless queries" and was resource-intensive to run, making it secure but impractical. After improving it, Anthropic ran a test of 10,000 synthetic jailbreaking attempts on an October version of Claude 3.5 Sonnet with and without classifier protection, using known successful attacks. Claude alone blocked only 14% of attacks, while Claude with Constitutional Classifiers blocked over 95%.
"Constitutional Classifiers may not prevent every universal jailbreak, though we believe that even the small proportion of jailbreaks that make it past our classifiers require far more effort to discover when the safeguards are in use," Anthropic continued. "It's also possible that new jailbreaking techniques might be developed in the future that are effective against the system; we therefore recommend using complementary defenses. Nevertheless, the constitution used to train the classifiers can rapidly be adapted to cover novel attacks as they're discovered."
Also: The US Copyright Office's new ruling on AI art is here - and it could change everything.
The firm stated it's also working on reducing the compute cost of Constitutional Classifiers, which it notes is currently high.
Have prior red-teaming experience? You can try your hand at the reward by testing the system yourself -- with only eight required questions, instead of the original 10 -- until February 10.
Market Impact Analysis
Market Growth Trend
Year | 2018 | 2019 | 2020 | 2021 | 2022 | 2023 | 2024 |
---|---|---|---|---|---|---|---|
Growth Rate | 23.1% | 27.8% | 29.2% | 32.4% | 34.2% | 35.2% | 35.6% |
Quarterly Growth Rate
Quarter | Q1 2024 | Q2 2024 | Q3 2024 | Q4 2024 |
---|---|---|---|---|
Growth Rate | 32.5% | 34.8% | 36.2% | 35.6% |
Market Segments and Growth Drivers
Segment | Market Share | Growth Rate |
---|---|---|
Machine Learning | 29% | 38.4% |
Computer Vision | 18% | 35.7% |
Natural Language Processing | 24% | 41.5% |
Robotics | 15% | 22.3% |
Other AI Technologies | 14% | 31.8% |
Competitive Landscape Analysis
Company | Market Share |
---|---|
Google AI | 18.3% |
Microsoft AI | 15.7% |
IBM Watson | 11.2% |
Amazon AI | 9.8% |
OpenAI | 8.4% |
Future Outlook and Predictions
The AI agent landscape is evolving rapidly, driven by technological advancements, emerging risks, and shifting business requirements. Based on current trends and expert analyses, we can anticipate several significant developments across different time horizons:
Technology Maturity Curve
Different technologies within the ecosystem are at varying stages of maturity, influencing adoption timelines and investment priorities:
Innovation Trigger
- Generative AI for specialized domains
- Blockchain for supply chain verification
Peak of Inflated Expectations
- Digital twins for business processes
- Quantum-resistant cryptography
Trough of Disillusionment
- Consumer AR/VR applications
- General-purpose blockchain
Slope of Enlightenment
- AI-driven analytics
- Edge computing
Plateau of Productivity
- Cloud infrastructure
- Mobile applications
Technology Evolution Timeline
- Improved generative models
- Specialized AI applications
- AI-human collaboration systems
- Multimodal AI platforms
- General AI capabilities
- AI-driven scientific breakthroughs
Expert Perspectives
Leading experts in the AI tech sector provide diverse perspectives on how the landscape will evolve over the coming years:
"The next frontier is AI systems that can reason across modalities and domains with minimal human guidance."
— AI Researcher
"Organizations that develop effective AI governance frameworks will gain competitive advantage."
— Industry Analyst
"The AI talent gap remains a critical barrier to implementation for most enterprises."
— Chief AI Officer
Areas of Expert Consensus
- Acceleration of Innovation: The pace of technological evolution will continue to increase
- Practical Integration: Focus will shift from proof-of-concept to operational deployment
- Human-Technology Partnership: Most effective implementations will optimize human-machine collaboration
- Regulatory Influence: Regulatory frameworks will increasingly shape technology development
Short-Term Outlook (1-2 Years)
In the immediate future, organizations will focus on implementing and optimizing currently available technologies to address pressing AI tech challenges:
- Improved generative models
- Specialized AI applications
- Enhanced AI ethics frameworks
These developments will be characterized by incremental improvements to existing frameworks rather than revolutionary changes, with emphasis on practical deployment and measurable outcomes.
Mid-Term Outlook (3-5 Years)
As technologies mature and organizations adapt, more substantial transformations will emerge in how AI is approached and implemented:
- AI-human collaboration systems
- Multimodal AI platforms
- Democratized AI development
This period will see significant changes in technology architecture and operational models, with increasing automation and integration between previously siloed functions. Organizations will shift from reactive to proactive postures.
Long-Term Outlook (5+ Years)
Looking further ahead, more fundamental shifts will reshape how AI is conceptualized and implemented across digital ecosystems:
- General AI capabilities
- AI-driven scientific breakthroughs
- New computing paradigms
These long-term developments will likely require significant technical breakthroughs, new regulatory frameworks, and evolution in how organizations approach AI as a fundamental business capability rather than a technical discipline.
Key Risk Factors and Uncertainties
Several critical factors could significantly impact the trajectory of AI tech evolution:
Organizations should monitor these factors closely and develop contingency strategies to mitigate potential negative impacts on technology implementation timelines.
Alternative Future Scenarios
The evolution of technology can follow different paths depending on various factors including regulatory developments, investment trends, technological breakthroughs, and market adoption. We analyze three potential scenarios:
Optimistic Scenario
Responsible AI driving innovation while minimizing societal disruption
Key Drivers: Supportive regulatory environment, significant research breakthroughs, strong market incentives, and rapid user adoption.
Probability: 25-30%
Base Case Scenario
Incremental adoption with mixed societal impacts and ongoing ethical challenges
Key Drivers: Balanced regulatory approach, steady technological progress, and selective implementation based on clear ROI.
Probability: 50-60%
Conservative Scenario
Technical and ethical barriers creating significant implementation challenges
Key Drivers: Restrictive regulations, technical limitations, implementation challenges, and risk-averse organizational cultures.
Probability: 15-20%
Scenario Comparison Matrix
Factor | Optimistic | Base Case | Conservative |
---|---|---|---|
Implementation Timeline | Accelerated | Steady | Delayed |
Market Adoption | Widespread | Selective | Limited |
Technology Evolution | Rapid | Progressive | Incremental |
Regulatory Environment | Supportive | Balanced | Restrictive |
Business Impact | Transformative | Significant | Modest |
Transformational Impact
Redefinition of knowledge work, automation of creative processes. This evolution will necessitate significant changes in organizational structures, talent development, and strategic planning processes.
The convergence of multiple technological trends—including artificial intelligence, quantum computing, and ubiquitous connectivity—will create both unprecedented security challenges and innovative defensive capabilities.
Implementation Challenges
Ethical concerns, computing resource limitations, talent shortages. Organizations will need to develop comprehensive change management strategies to successfully navigate these transitions.
Regulatory uncertainty, particularly around emerging technologies like AI, will require flexible architectures that can adapt to evolving compliance requirements.
Key Innovations to Watch
Multimodal learning, resource-efficient AI, transparent decision systems. Organizations should monitor these developments closely to maintain competitive advantages and effective security postures.
Strategic investments in research partnerships, technology pilots, and talent development will position forward-thinking organizations to leverage these innovations early in their development cycle.
Technical Glossary
Key technical terms and definitions to help understand the technologies discussed in this article.
Understanding the following technical concepts is essential for grasping the full implications of the technologies and trends discussed in this article. These definitions provide context for both technical and non-technical readers.