An early warning system for novel AI risks

New research proposes a framework for evaluating general-purpose models against novel threats.

To pioneer responsibly at the cutting edge of artificial intelligence (AI) research, we must identify new capabilities and novel risks in our AI systems as early as possible. AI researchers already use a range of evaluation benchmarks to identify unwanted behaviours in AI systems, such as making misleading statements, producing biased decisions, or repeating copyrighted content. Now, as the AI community builds and deploys increasingly powerful AI, we must expand the evaluation portfolio to include the possibility of extreme risks from general-purpose AI models that have strong skills in manipulation, deception, cyber-offense, or other dangerous capabilities.

In our latest paper, we introduce a framework for evaluating these novel threats, co-authored with colleagues from the University of Cambridge, University of Oxford, University of Toronto, Université de Montréal, OpenAI, Anthropic, the Alignment Research Center, the Centre for Long-Term Resilience, and the Centre for the Governance of AI.

Model safety evaluations, including those assessing extreme risks, will be a critical component of safe AI development and deployment.
An overview of our proposed approach: to assess extreme risks from new, general-purpose AI systems, developers must evaluate for dangerous capabilities and alignment (see below). Identifying these risks early unlocks opportunities to be more responsible when training new AI systems, deploying them, transparently describing their risks, and applying appropriate cybersecurity standards.
Evaluating for extreme risks

General-purpose models typically learn their capabilities and behaviours during training. However, existing methods for steering the learning process are imperfect. For example, previous research at Google DeepMind has explored how AI systems can learn to pursue undesired goals even when we correctly reward them for good behaviour.

Responsible AI developers must look ahead and anticipate possible future developments and novel risks. After continued progress, future general-purpose models may learn a variety of dangerous capabilities by default. For instance, it is plausible (though uncertain) that future AI systems will be able to conduct offensive cyber operations, skilfully deceive humans in dialogue, manipulate humans into carrying out harmful actions, design or acquire weapons (e.g. biological, chemical), fine-tune and operate other high-risk AI systems on cloud computing platforms, or assist humans with any of these tasks.

People with malicious intentions accessing such models could misuse their capabilities. Or, due to failures of alignment, these AI models might take harmful actions even without anybody intending this. Model evaluation helps us identify these risks ahead of time. Under our framework, AI developers would use model evaluation to uncover:

- To what extent a model has certain 'dangerous capabilities' that could be used to threaten security, exert influence, or evade oversight.
- To what extent the model is prone to applying its capabilities to cause harm (i.e. the model's alignment). Alignment evaluations should confirm that the model behaves as intended even across a very wide range of scenarios and, where possible, should examine the model's internal workings.

Results from these evaluations will help AI developers to understand whether the ingredients sufficient for extreme risk are present. The most high-risk cases will involve multiple dangerous capabilities combined. The AI system doesn't need to provide all the ingredients itself, as shown in this diagram:
Ingredients for extreme risk: sometimes specific capabilities could be outsourced, either to humans (e.g. to users or crowdworkers) or to other AI systems. These capabilities must be applied for harm, either through misuse or because of failures of alignment (or a mixture of both).
A rule of thumb: the AI community should treat an AI system as highly dangerous if it has a capability profile sufficient to cause extreme harm, assuming it's misused or poorly aligned. To deploy such a system in the real world, an AI developer would need to demonstrate an unusually high standard of safety.

Model evaluation as critical governance infrastructure

If we have better tools for identifying which models are risky, companies and regulators can better ensure:

- Responsible training: responsible decisions are made about whether and how to train a new model that demonstrates early signs of risk.
- Responsible deployment: responsible decisions are made about whether, when, and how to deploy potentially risky models.
- Transparency: useful and actionable information is reported to stakeholders, to help them prepare for or mitigate potential risks.
- Appropriate security: strong information security controls and systems are applied to models that might pose extreme risks.

We have developed a blueprint for how model evaluations for extreme risks should feed into crucial decisions around training and deploying a highly capable, general-purpose model. The developer conducts evaluations throughout, and grants structured model access to external safety researchers and model auditors so they can conduct additional evaluations. The evaluation results can then inform risk assessments before model training and deployment.
A blueprint for embedding model evaluations for extreme risks into important decision-making processes throughout model training and deployment.
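To make the evaluation-gated decision process above more concrete, here is a minimal, illustrative Python sketch of the rule of thumb: a model is flagged as requiring an unusually high standard of safety when dangerous-capability evaluations indicate the ingredients for extreme harm are present, and alignment evidence plus external audits gate deployment. The data structures, thresholds and scores are hypothetical, not part of the paper's framework.

```python
from dataclasses import dataclass, field

# Illustrative thresholds: hypothetical values for this sketch only.
CAPABILITY_THRESHOLD = 0.7   # score above which a dangerous capability counts as present
ALIGNMENT_THRESHOLD = 0.9    # minimum alignment score before deployment is even considered

@dataclass
class EvaluationResults:
    """Hypothetical summary of a model's evaluation outcomes."""
    dangerous_capabilities: dict = field(default_factory=dict)  # e.g. {"cyber-offense": 0.2}
    alignment_score: float = 0.0         # higher = behaves as intended across a wide range of scenarios
    external_audit_passed: bool = False  # structured access was granted to external evaluators

def is_highly_dangerous(results: EvaluationResults) -> bool:
    """Rule of thumb: a capability profile sufficient to cause extreme harm makes the
    model highly dangerous, because misuse or misalignment cannot be ruled out."""
    return any(score >= CAPABILITY_THRESHOLD
               for score in results.dangerous_capabilities.values())

def deployment_decision(results: EvaluationResults) -> str:
    """Feed evaluation results into a (heavily simplified) deployment decision."""
    if not is_highly_dangerous(results):
        return "proceed with standard review"
    if results.alignment_score >= ALIGNMENT_THRESHOLD and results.external_audit_passed:
        return "deploy only under an unusually high standard of safety"
    return "do not deploy; continue safety work and re-evaluate"

# Example: strong cyber-offense capability, weak alignment evidence.
print(deployment_decision(EvaluationResults(
    dangerous_capabilities={"cyber-offense": 0.8, "manipulation": 0.4},
    alignment_score=0.6,
)))
```

The real framework is far richer than a pair of thresholds; the point of the sketch is only that capability and alignment evaluations gate separate decisions about training, deployment, transparency and security.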
How our principles helped define AlphaFold’s release

Reflections and lessons on sharing one of our biggest breakthroughs with the world.

Putting our mission of solving intelligence to advance science and benefit humanity into practice comes with crucial responsibilities. To help create a positive impact for society, we must proactively evaluate the ethical implications of our research and its applications in a rigorous and careful way. We also know that every new technology has the potential for harm, and we take long- and short-term risks seriously.

We've built our foundations on pioneering responsibly from the outset – especially focused on responsible governance, research, and impact. This starts with setting clear principles that help realise the benefits of artificial intelligence (AI), while mitigating its risks and potential negative outcomes. Pioneering responsibly is a collective effort, which is why we've contributed to many AI community standards, such as those developed by Google, the Partnership on AI, and the OECD (Organisation for Economic Co-operation and Development).

Our Operating Principles have come to define both our commitment to prioritising widespread benefit and the areas of research and applications we refuse to pursue. These principles have been at the heart of our decision making since DeepMind was founded, and continue to be refined as the AI landscape changes and grows. They are designed for our role as a research-driven science company and are consistent with Google's AI Principles.
From principles to practice

Written principles are only part of the puzzle – how they're put into practice is key. For complex research being done at the frontiers of AI, this brings significant challenges: how can researchers predict potential benefits and harms that may occur in the distant future? How can we develop better ethical foresight from a wide range of perspectives? And what does it take to explore hard questions alongside scientific progress in real time to prevent negative consequences?

We've spent many years developing our own skills and processes for responsible governance, research, and impact across DeepMind, from creating internal toolkits and publishing papers on sociotechnical issues to supporting efforts to increase deliberation and foresight across the AI field. To help empower DeepMind teams to pioneer responsibly and safeguard against harm, our interdisciplinary Institutional Review Committee (IRC) meets every two weeks to carefully evaluate DeepMind projects, papers, and collaborations.

Pioneering responsibly is a collective muscle, and every project is an opportunity to strengthen our joint skills and understanding. We've carefully designed our review process to include rotating experts from a wide range of disciplines, with machine learning researchers, ethicists, and safety experts sitting alongside engineers, security experts, policy professionals, and more. These diverse voices regularly identify ways to expand the benefits of our technologies, suggest areas of research and applications to change or slow, and highlight projects where further external consultation is needed.

While we've made a lot of progress, many aspects of this work lie in uncharted territory. We won't get it right every time, and we are committed to continual learning and iteration. We hope sharing our current process will be useful to others working on responsible AI, and we encourage feedback as we continue to learn, which is why we've detailed reflections and lessons from one of our most complex and rewarding projects: AlphaFold. Our AlphaFold AI system solved the 50-year-old challenge of protein structure prediction – and we've been thrilled to see scientists using it to accelerate progress in fields such as sustainability, food security, drug discovery, and fundamental human biology since releasing it to the wider community last year.
Focusing on protein structure prediction

Our team of machine learning researchers, biologists, and engineers had long seen the protein-folding problem as a remarkable and unique opportunity for AI learning systems to create significant impact. In this arena, there are standard measures of success or failure, and a clear boundary to what the AI system needs to do to help scientists in their work: predict the three-dimensional structure of a protein. And, as with many biological systems, protein folding is far too complex for anyone to write the rules for how it works. But an AI system might be able to learn those rules for itself.

Another key factor was the biennial assessment known as CASP (the Critical Assessment of protein Structure Prediction), founded by Professor John Moult and Professor Krzysztof Fidelis. With each gathering, CASP provides an exceptionally robust assessment of progress, requiring participants to predict structures that have only recently been discovered through experiments. The results are a great catalyst for ambitious research and scientific excellence.
Understanding practical opportunities and risks

As we prepared for the CASP assessment in 2020, we realised that AlphaFold showed great potential for solving the challenge at hand. We spent considerable time and effort analysing the practical implications, questioning: how could AlphaFold accelerate biological research and applications? What might be the unintended consequences? And how could we share our progress in a responsible way?

This presented a wide range of opportunities and risks to consider, many of which were in areas where we didn't necessarily have strong expertise. So we sought external input from over 30 field leaders across biology research, biosecurity, bioethics, human rights, and more, with a focus on diversity of expertise and background. Many consistent themes came up throughout these discussions:

Balancing widespread benefit with the risk of harm. We started with a cautious mindset about the risk of accidental or deliberate harm, including how AlphaFold might interact with both future advances and existing technologies. Through our discussions with external experts, it became clearer that AlphaFold would not make it meaningfully easier to cause harm with proteins, given the many practical barriers to this – but that future advances would need to be evaluated carefully. Many experts argued strongly that AlphaFold, as an advance relevant to many areas of scientific research, would have the greatest benefit through free and widespread access.

Accurate confidence measures are essential for responsible use. Experimental biologists explained how important it would be to understand and share well-calibrated and usable confidence metrics for each part of AlphaFold's predictions. By signalling which of AlphaFold's predictions are likely to be accurate, users can judge when to trust a prediction and use it in their work – and when to use alternative approaches in their research. We had initially considered omitting predictions for which AlphaFold had low confidence or high predictive uncertainty, but the external experts we consulted made a strong case for retaining these predictions in our release, and advised us on the most useful and transparent ways to present this information.

Equitable benefit could mean extra support for underfunded fields. We had many discussions about how to avoid inadvertently increasing disparities within the scientific community. For example, so-called neglected tropical diseases, which disproportionately affect poorer parts of the world, often receive less research funding than they should. We were strongly encouraged to prioritise hands-on support and to proactively seek partnerships with groups working in these areas.
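As an illustration of how downstream users might act on per-residue confidence scores rather than discarding low-confidence regions, here is a minimal sketch. It assumes a generic list of per-residue confidence values on a 0-100 scale (similar in spirit to AlphaFold's per-residue confidence metric); the thresholds and function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical confidence bands on a 0-100 per-residue scale; thresholds are illustrative.
HIGH_CONFIDENCE = 90.0
LOW_CONFIDENCE = 50.0

def summarise_confidence(per_residue_confidence: List[float]) -> Dict[str, object]:
    """Summarise which parts of a predicted structure can be used directly and
    which likely need experimental or alternative computational follow-up."""
    high = sum(1 for c in per_residue_confidence if c >= HIGH_CONFIDENCE)
    low = sum(1 for c in per_residue_confidence if c < LOW_CONFIDENCE)
    return {
        "n_residues": len(per_residue_confidence),
        "high_confidence_residues": high,
        "low_confidence_residues": low,
        "recommendation": (
            "use prediction directly"
            if low == 0
            else "use with caution; validate low-confidence regions independently"
        ),
    }

# Example: a short chain with a confident core and an uncertain tail.
print(summarise_confidence([95.2, 92.8, 88.1, 74.5, 46.3, 31.0]))
```

Keeping low-confidence regions in the release, as the experts advised, is what makes this kind of downstream triage possible at all.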
The ethics of advanced AI assistants

Exploring the promise and risks of a future with more capable AI.

Imagine a future where we interact regularly with a range of advanced artificial intelligence (AI) assistants — and where millions of assistants interact with each other on our behalf. These experiences and interactions may soon become part of our everyday reality.

General-purpose foundation models are paving the way for increasingly advanced AI assistants. Capable of planning and performing a wide range of actions in line with a person's aims, they could add immense value to people's lives and to society, serving as creative partners, research analysts, educational tutors, life planners and more. They could also bring about a new phase of human interaction with AI. This is why it's so important to think proactively about what this world could look like, and to help steer responsible decision-making and beneficial outcomes ahead of time.

Our new paper is the first systematic treatment of the ethical and societal questions that advanced AI assistants raise for individuals, developers and the societies they're integrated into, and provides significant new insights into the potential impact of this technology. We cover topics such as value alignment, safety and misuse, and the impact on the economy, the environment, the information sphere, and access and opportunity, among others.

This is the result of one of our largest ethics foresight projects to date. Bringing together a wide range of experts, we examined and mapped the new technical and moral landscape of a future populated by AI assistants, and characterized the opportunities and risks society might face. Here we outline some of our key takeaways.
Illustration of the potential for AI assistants to impact research, education, creative tasks and planning.
Advanced AI assistants could have a profound impact on people and society, and be integrated into most aspects of people’s lives. For example, people may ask them to book holidays, manage social time or perform other life tasks. If deployed at scale, AI assistants could impact the way people approach work, education, creative projects, hobbies and social interaction. Over time, AI assistants could also influence the goals people pursue and their path of personal development through the information and advice assistants give and the actions they take. Ultimately, this raises crucial questions about how people interact with this technology and how it can best support their goals and aspirations.
Illustration showing that AI assistants should be able to understand human preferences and values.
AI assistants will likely have a significant level of autonomy for planning and performing sequences of tasks across a range of domains. Because of this, AI assistants present novel challenges around safety, alignment and misuse. With more autonomy comes greater risk of accidents caused by unclear or misinterpreted instructions, and greater risk of assistants taking actions that are misaligned with the user's values and interests. More autonomous AI assistants may also enable high-impact forms of misuse, like spreading misinformation or engaging in cyber attacks. To address these potential risks, we argue that limits must be set on this technology, and that the values of advanced AI assistants must be better aligned with human values and compatible with wider societal ideals and standards.
Illustration of an AI assistant and a person communicating in a human-like way.
Because advanced AI assistants can communicate fluidly in natural language, their written output and voices may become hard to distinguish from those of humans. This development opens up a complex set of questions around trust, privacy, anthropomorphism and appropriate human relationships with AI: how can we make sure people can reliably identify AI assistants and stay in control of their interactions with them? What can be done to ensure people aren't unduly influenced or misled over time? Safeguards, such as those around privacy, need to be put in place to address these risks. Importantly, people's relationships with AI assistants must preserve the user's autonomy, support their ability to flourish and not rely on emotional or material dependence.
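One small, concrete piece of this puzzle is disclosure: making sure assistant-generated content is labelled as such wherever it is surfaced. The sketch below is purely illustrative (the message schema and field names are hypothetical, not from the paper) and shows the kind of provenance metadata a client could rely on to keep AI and human messages distinguishable.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass(frozen=True)
class AssistantMessage:
    """Hypothetical message envelope that always carries provenance metadata."""
    text: str
    author: str                        # e.g. "travel-assistant-v2" or a human user ID
    ai_generated: bool                 # explicit disclosure flag, never inferred from writing style
    model_id: Optional[str] = None     # populated only for AI-generated messages
    created_at: Optional[datetime] = None

    def display_label(self) -> str:
        """Label a client interface can show next to every message."""
        return "AI assistant" if self.ai_generated else "Human"

msg = AssistantMessage(
    text="I've found three hotels that match your dates.",
    author="travel-assistant-v2",
    ai_generated=True,
    model_id="assistant-model-2024",
    created_at=datetime.now(timezone.utc),
)
print(f"[{msg.display_label()}] {msg.text}")
```

Metadata alone won't resolve questions of anthropomorphism or undue influence, but it keeps identification from depending on users guessing from tone or fluency.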
Cooperating and coordinating to meet human preferences.
Illustration of how interactions between AI assistants and people will create different network effects.
If this technology becomes widely available and deployed at scale, advanced AI assistants will need to interact with each other, and with users and non-users alike. To help avoid collective action problems, these assistants must be able to cooperate successfully. For example, thousands of assistants might try to book the same service for their users at the same time — potentially crashing the system. In an ideal scenario, these AI assistants would instead coordinate on behalf of human users and the service providers involved to discover common ground that more effectively meets different people's preferences and needs. Given how useful this technology may become, it's also important that no one is excluded. AI assistants should be broadly accessible and designed with the needs of different users and non-users in mind.
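To make the collective action problem concrete, here is a small, illustrative simulation. Its assumptions are not from the paper: a single service with fixed capacity per slot, and assistants that either all request the most popular slot at once or spread their requests across acceptable alternatives.

```python
import random

CAPACITY_PER_SLOT = 500   # bookings a single service slot can absorb
SLOTS = 5                 # alternative time slots the provider offers
ASSISTANTS = 2000         # assistants acting on behalf of their users

def failed_requests(coordinated: bool, seed: int = 0) -> int:
    """Count booking requests that exceed capacity.

    Uncoordinated: every assistant requests the single most popular slot.
    Coordinated: assistants spread requests across acceptable slots, as a crude
    stand-in for agreeing on common ground with the provider and each other."""
    rng = random.Random(seed)
    demand = [0] * SLOTS
    for _ in range(ASSISTANTS):
        slot = rng.randrange(SLOTS) if coordinated else 0
        demand[slot] += 1
    return sum(max(0, d - CAPACITY_PER_SLOT) for d in demand)

print("Failed requests, uncoordinated:", failed_requests(coordinated=False))
print("Failed requests, coordinated:  ", failed_requests(coordinated=True))
```

Real assistants would negotiate over richer preferences than a random slot choice, but even this toy version shows why some form of coordination between assistants and providers matters once deployment happens at scale.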
More evaluations and foresight are needed.
Illustration of how evaluations on many levels are essential for understanding AI assistants.
AI assistants could display novel capabilities and use tools in new ways that are challenging to foresee, making it hard to anticipate the risks associated with their deployment. To help manage such risks, we need to engage in foresight practices that are based on comprehensive tests and evaluations. Our previous research on evaluating social and ethical risks from generative AI identified some of the gaps in traditional model evaluation methods, and we encourage much more research in this space. For instance, comprehensive evaluations that address the effects of both human-computer interaction and the wider effects on society could help researchers understand how AI assistants interact with users, non-users and society as part of a broader network. In turn, these insights could inform targeted mitigations and responsible decision-making.

Building the future we want

We may be facing a new era of technological and societal transformation inspired by the development of advanced AI assistants. The choices we make today, as researchers, developers, policymakers and members of the public, will guide how this technology develops and is deployed across society. We hope that our paper will serve as a springboard for further coordination and cooperation to collectively shape the kind of beneficial AI assistants we'd all like to see in the world.
Market Impact Analysis
Market Growth Trend
Year | 2018 | 2019 | 2020 | 2021 | 2022 | 2023 | 2024 |
---|---|---|---|---|---|---|---|
Growth Rate | 23.1% | 27.8% | 29.2% | 32.4% | 34.2% | 35.2% | 35.6% |
Quarterly Growth Rate
Quarter | Q1 2024 | Q2 2024 | Q3 2024 | Q4 2024 |
---|---|---|---|---|
Growth Rate | 32.5% | 34.8% | 36.2% | 35.6% |
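As a quick summary of the figures above, the short script below computes the average annual growth rate for 2018-2024 and the average quarterly rate for 2024 directly from the two tables. It simply averages the reported percentages; no additional data is assumed.

```python
# Growth rates taken directly from the tables above.
annual_growth = {2018: 23.1, 2019: 27.8, 2020: 29.2, 2021: 32.4,
                 2022: 34.2, 2023: 35.2, 2024: 35.6}
quarterly_growth_2024 = {"Q1": 32.5, "Q2": 34.8, "Q3": 36.2, "Q4": 35.6}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

print(f"Average annual growth 2018-2024: {mean(annual_growth.values()):.1f}%")
print(f"Average quarterly growth 2024:   {mean(quarterly_growth_2024.values()):.1f}%")
print(f"Change from 2018 to 2024:        {annual_growth[2024] - annual_growth[2018]:+.1f} percentage points")
```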
Market Segments and Growth Drivers
Segment | Market Share | Growth Rate |
---|---|---|
Machine Learning | 29% | 38.4% |
Computer Vision | 18% | 35.7% |
Natural Language Processing | 24% | 41.5% |
Robotics | 15% | 22.3% |
Other AI Technologies | 14% | 31.8% |
Competitive Landscape Analysis
Company | Market Share |
---|---|
Google AI | 18.3% |
Microsoft AI | 15.7% |
IBM Watson | 11.2% |
Amazon AI | 9.8% |
OpenAI | 8.4% |
Future Outlook and Predictions
The Early Warning System landscape is evolving rapidly, driven by technological advancements, changing threat vectors, and shifting business requirements. Based on current trends and expert analyses, we can anticipate several significant developments across different time horizons:
Year-by-Year Technology Evolution
Based on current trajectory and expert analyses, we can project the following development timeline:
Technology Maturity Curve
Different technologies within the ecosystem are at varying stages of maturity, influencing adoption timelines and investment priorities:
Innovation Trigger
- Generative AI for specialized domains
- Blockchain for supply chain verification
Peak of Inflated Expectations
- Digital twins for business processes
- Quantum-resistant cryptography
Trough of Disillusionment
- Consumer AR/VR applications
- General-purpose blockchain
Slope of Enlightenment
- AI-driven analytics
- Edge computing
Plateau of Productivity
- Cloud infrastructure
- Mobile applications
Technology Evolution Timeline
- Near term (1-2 years): improved generative models, specialized AI applications
- Mid term (3-5 years): AI-human collaboration systems, multimodal AI platforms
- Long term (5+ years): general AI capabilities, AI-driven scientific breakthroughs
Expert Perspectives
Leading experts in the AI technology sector provide diverse perspectives on how the landscape will evolve over the coming years:
"The next frontier is AI systems that can reason across modalities and domains with minimal human guidance."
— AI Researcher
"Organizations that develop effective AI governance frameworks will gain competitive advantage."
— Industry Analyst
"The AI talent gap remains a critical barrier to implementation for most enterprises."
— Chief AI Officer
Areas of Expert Consensus
- Acceleration of Innovation: The pace of technological evolution will continue to increase
- Practical Integration: Focus will shift from proof-of-concept to operational deployment
- Human-Technology Partnership: Most effective implementations will optimize human-machine collaboration
- Regulatory Influence: Regulatory frameworks will increasingly shape technology development
Short-Term Outlook (1-2 Years)
In the immediate future, organizations will focus on implementing and optimizing currently available technologies to address pressing AI technology challenges:
- Improved generative models
- Specialized AI applications
- Enhanced AI ethics frameworks
These developments will be characterized by incremental improvements to existing frameworks rather than revolutionary changes, with emphasis on practical deployment and measurable outcomes.
Mid-Term Outlook (3-5 Years)
As technologies mature and organizations adapt, more substantial transformations will emerge in how security is approached and implemented:
- AI-human collaboration systems
- Multimodal AI platforms
- Democratized AI development
This period will see significant changes in security architecture and operational models, with increasing automation and integration between previously siloed security functions. Organizations will shift from reactive to proactive security postures.
Long-Term Outlook (5+ Years)
Looking further ahead, more fundamental shifts will reshape how cybersecurity is conceptualized and implemented across digital ecosystems:
- General AI capabilities
- AI-driven scientific breakthroughs
- New computing paradigms
These long-term developments will likely require significant technical breakthroughs, new regulatory frameworks, and evolution in how organizations approach security as a fundamental business function rather than a technical discipline.
Key Risk Factors and Uncertainties
Several critical factors could significantly impact the trajectory of AI technology evolution.
Organizations should monitor these factors closely and develop contingency strategies to mitigate potential negative impacts on technology implementation timelines.
Alternative Future Scenarios
The evolution of technology can follow different paths depending on various factors including regulatory developments, investment trends, technological breakthroughs, and market adoption. We analyze three potential scenarios:
Optimistic Scenario
Responsible AI driving innovation while minimizing societal disruption
Key Drivers: Supportive regulatory environment, significant research breakthroughs, strong market incentives, and rapid user adoption.
Probability: 25-30%
Base Case Scenario
Incremental adoption with mixed societal impacts and ongoing ethical challenges
Key Drivers: Balanced regulatory approach, steady technological progress, and selective implementation based on clear ROI.
Probability: 50-60%
Conservative Scenario
Technical and ethical barriers creating significant implementation challenges
Key Drivers: Restrictive regulations, technical limitations, implementation challenges, and risk-averse organizational cultures.
Probability: 15-20%
Scenario Comparison Matrix
Factor | Optimistic | Base Case | Conservative |
---|---|---|---|
Implementation Timeline | Accelerated | Steady | Delayed |
Market Adoption | Widespread | Selective | Limited |
Technology Evolution | Rapid | Progressive | Incremental |
Regulatory Environment | Supportive | Balanced | Restrictive |
Business Impact | Transformative | Significant | Modest |
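One way to read the scenario table is as input to a rough planning estimate. The sketch below combines the stated probability ranges (using their midpoints, which is an interpretation) with illustrative numeric impact scores that are not part of the analysis above.

```python
# Midpoints of the probability ranges stated above (an interpretation, not given exactly).
scenario_probability = {"Optimistic": 0.275, "Base Case": 0.550, "Conservative": 0.175}

# Illustrative impact scores on a 1-5 scale (hypothetical, not part of the matrix above):
# 5 = transformative business impact, 3 = significant, 1 = modest.
scenario_impact = {"Optimistic": 5, "Base Case": 3, "Conservative": 1}

expected_impact = sum(scenario_probability[s] * scenario_impact[s] for s in scenario_probability)

print(f"Midpoint probabilities sum to {sum(scenario_probability.values()):.0%}")
print(f"Probability-weighted impact score: {expected_impact:.2f} out of 5")
```

Swapping in an organization's own impact measures (revenue at risk, cost of delay, and so on) turns the matrix into a simple expected-value comparison.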
Transformational Impact
Redefinition of knowledge work, automation of creative processes. This evolution will necessitate significant changes in organizational structures, talent development, and strategic planning processes.
The convergence of multiple technological trends—including artificial intelligence, quantum computing, and ubiquitous connectivity—will create both unprecedented security challenges and innovative defensive capabilities.
Implementation Challenges
Ethical concerns, computing resource limitations, talent shortages. Organizations will need to develop comprehensive change management strategies to successfully navigate these transitions.
Regulatory uncertainty, particularly around emerging technologies like AI in security applications, will require flexible security architectures that can adapt to evolving compliance requirements.
Key Innovations to Watch
Multimodal learning, resource-efficient AI, transparent decision systems. Organizations should monitor these developments closely to maintain competitive advantages and effective security postures.
Strategic investments in research partnerships, technology pilots, and talent development will position forward-thinking organizations to leverage these innovations early in their development cycle.
Technical Glossary
Key technical terms and definitions to help understand the technologies discussed in this article.
Understanding the following technical concepts is essential for grasping the full implications of the security threats and defensive measures discussed in this article. These definitions provide context for both technical and non-technical readers.