AI achieves silver-medal standard solving International Mathematical Olympiad problems - Related to silver-medal, misuse, audio, mathematical, pushing

AI achieves silver-medal standard solving International Mathematical Olympiad problems

Research AI achieves silver-medal standard solving International Mathematical Olympiad problems Share.

Breakthrough models AlphaProof and AlphaGeometry 2 solve advanced reasoning problems in mathematics Artificial general intelligence (AGI) with advanced mathematical reasoning has the potential to unlock new frontiers in science and technology. We’ve made great progress building AI systems that help mathematicians discover new insights, novel algorithms and answers to open problems. But current AI systems still struggle with solving general math problems because of limitations in reasoning skills and training data. Today, we present AlphaProof, a new reinforcement-learning based system for formal math reasoning, and AlphaGeometry 2, an improved version of our geometry-solving system. Together, these systems solved four out of six problems from this year’s International Mathematical Olympiad (IMO), achieving the same level as a silver medalist in the competition for the first time.

Breakthrough AI performance solving complex math problems The IMO is the oldest, largest and most prestigious competition for young mathematicians, held annually since 1959. Each year, elite pre-college mathematicians train, sometimes for thousands of hours, to solve six exceptionally difficult problems in algebra, combinatorics, geometry and number theory. Many of the winners of the Fields Medal, one of the highest honors for mathematicians, have represented their country at the IMO. More lately, the annual IMO competition has also become widely recognised as a grand challenge in machine learning and an aspirational benchmark for measuring an AI system’s advanced mathematical reasoning capabilities. This year, we applied our combined AI system to the competition problems, provided by the IMO organizers. Our solutions were scored ’s point-awarding rules by prominent mathematicians Prof Sir Timothy Gowers, an IMO gold medalist and Fields Medal winner, and Dr Joseph Myers, a two-time IMO gold medalist and Chair of the IMO 2024 Problem Selection Committee.

“ The fact that the program can come up with a non-obvious construction like this is very impressive, and well beyond what I thought was state of the art. Prof Sir Timothy Gowers,.

IMO gold medalist and Fields Medal winner.

First, the problems were manually translated into formal mathematical language for our systems to understand. In the official competition, students submit answers in two sessions of [website] hours each. Our systems solved one problem within minutes and took up to three days to solve the others. AlphaProof solved two algebra problems and one number theory problem by determining the answer and proving it was correct. This included the hardest problem in the competition, solved by only five contestants at this year’s IMO. AlphaGeometry 2 proved the geometry problem, while the two combinatorics problems remained unsolved.

Each of the six problems can earn seven points, with a total maximum of 42. Our system achieved a final score of 28 points, earning a perfect score on each problem solved — equivalent to the top end of the silver-medal category. This year, the gold-medal threshold starts at 29 points, and was achieved by 58 of 609 contestants at the official competition.

Graph showing performance of our AI system relative to human competitors at IMO 2024. We earned 28 out of 42 total points, achieving the same level as a silver medalist in the competition.

AlphaProof: a formal approach to reasoning AlphaProof is a system that trains itself to prove mathematical statements in the formal language Lean. It couples a pre-trained language model with the AlphaZero reinforcement learning algorithm, which previously taught itself how to master the games of chess, shogi and Go. Formal languages offer the critical advantage that proofs involving mathematical reasoning can be formally verified for correctness. Their use in machine learning has, however, previously been constrained by the very limited amount of human-written data available. In contrast, natural language based approaches can hallucinate plausible but incorrect intermediate reasoning steps and solutions, despite having access to orders of magnitudes more data. We established a bridge between these two complementary spheres by fine-tuning a Gemini model to automatically translate natural language problem statements into formal statements, creating a large library of formal problems of varying difficulty. When presented with a problem, AlphaProof generates solution candidates and then proves or disproves them by searching over possible proof steps in Lean. Each proof that was found and verified is used to reinforce AlphaProof’s language model, enhancing its ability to solve subsequent, more challenging problems. We trained AlphaProof for the IMO by proving or disproving millions of problems, covering a wide range of difficulties and mathematical topic areas over a period of weeks leading up to the competition. The training loop was also applied during the contest, reinforcing proofs of self-generated variations of the contest problems until a full solution could be found.

Process infographic of AlphaProof’s reinforcement learning training loop: Around one million informal math problems are translated into a formal math language by a formalizer network. Then a solver network searches for proofs or disproofs of the problems, progressively training itself via the AlphaZero algorithm to solve more challenging problems.

A more competitive AlphaGeometry 2 AlphaGeometry 2 is a significantly improved version of AlphaGeometry. It’s a neuro-symbolic hybrid system in which the language model was based on Gemini and trained from scratch on an order of magnitude more synthetic data than its predecessor. This helped the model tackle much more challenging geometry problems, including problems about movements of objects and equations of angles, ratio or distances. AlphaGeometry 2 employs a symbolic engine that is two orders of magnitude faster than its predecessor. When presented with a new problem, a novel knowledge-sharing mechanism is used to enable advanced combinations of different search trees to tackle more complex problems. Before this year’s competition, AlphaGeometry 2 could solve 83% of all historical IMO geometry problems from the past 25 years, compared to the 53% rate achieved by its predecessor. For IMO 2024, AlphaGeometry 2 solved Problem 4 within 19 seconds after receiving its formalization.

Illustration of Problem 4, which asks to prove the sum of ∠KIL and ∠XPY equals 180°. AlphaGeometry 2 proposed to construct E, a point on the line BI so that ∠AEB = 90°. Point E helps give purpose to the midpoint L of AB, creating many pairs of similar triangles such as ABE ~ YBI and ALE ~ IPC needed to prove the conclusion.

New frontiers in mathematical reasoning As part of our IMO work, we also experimented with a natural language reasoning system, built upon Gemini and our latest research to enable advanced problem-solving skills. This system doesn’t require the problems to be translated into a formal language and could be combined with other AI systems. We also tested this approach on this year’s IMO problems and the results showed great promise. Our teams are continuing to explore multiple AI approaches for advancing mathematical reasoning and plan to release more technical details on AlphaProof soon. We’re excited for a future in which mathematicians work with AI tools to explore hypotheses, try bold new approaches to solving long-standing problems and quickly complete time-consuming elements of proofs — and where AI systems like Gemini become more capable at math and broader reasoning.

Technologies Generating audio for video Share.

Video-to-audio research uses video pixels and text prompts to generate rich soundtra...

This morning, Co-founder and CEO of Google DeepMind and Isomorphic Labs Sir Demis Hassabis, and Google DeepMind Director Dr. John Jumper were co-award...

A note from Google and Alphabet CEO Sundar Pichai:

Every technology shift is an opportunity to advance scientific discovery, accelerate human progres...

Pushing the frontiers of audio generation

Technologies Pushing the frontiers of audio generation Share.

Illuminate creates formal AI-generated discussions about research papers to help make knowledge more accessible and digestible. Here, we provide an overview of our latest speech generation research underpinning all of these products and experimental tools.

Pioneering techniques for audio generation For years, we've been investing in audio generation research and exploring new ways for generating more natural dialogue in our products and experimental tools. In our previous research on SoundStorm, we first demonstrated the ability to generate 30-second segments of natural dialogue between multiple speakers. This extended our earlier work, SoundStream and AudioLM, which allowed us to apply many text-based language modeling techniques to the problem of audio generation. SoundStream is a neural audio codec that efficiently compresses and decompresses an audio input, without compromising its quality. As part of the training process, SoundStream learns how to map audio to a range of acoustic tokens. These tokens capture all of the information needed to reconstruct the audio with high fidelity, including properties such as prosody and timbre. AudioLM treats audio generation as a language modeling task to produce the acoustic tokens of codecs like SoundStream. As a result, the AudioLM framework makes no assumptions about the type or makeup of the audio being generated, and can flexibly handle a variety of sounds without needing architectural adjustments — making it a good candidate for modeling multi-speaker dialogues.

Download audio Audio sample of two speakers demonstrating surprise and disbelief. Download audio Audio sample of two speakers with overlapping speech. Download audio Audio clip of two speakers telling a funny story, with laughter at the punchline. Download audio Audio clip of two speakers expressing excitement about a surprise birthday party.

Download audio Example of a multi-speaker dialogue generated by NotebookLM Audio Overview, based on a few potato-related documents.

Building upon this research, our latest speech generation technology can produce 2 minutes of dialogue, with improved naturalness, speaker consistency and acoustic quality, when given a script of dialogue and speaker turn markers. The model also performs this task in under 3 seconds on a single Tensor Processing Unit (TPU) v5e chip, in one inference pass. This means it generates audio over 40-times faster than real time.

Scaling our audio generation models Scaling our single-speaker generation models to multi-speaker models then became a matter of data and model capacity. To help our latest speech generation model produce longer speech segments, we created an even more efficient speech codec for compressing audio into a sequence of tokens, in as low as 600 bits per second, without compromising the quality of its output. The tokens produced by our codec have a hierarchical structure and are grouped by time frames. The first tokens within a group capture phonetic and prosodic information, while the last tokens encode fine acoustic details. Even with our new speech codec, producing a 2-minute dialogue requires generating over 5000 tokens. To model these long sequences, we developed a specialized Transformer architecture that can efficiently handle hierarchies of information, matching the structure of our acoustic tokens. With this technique, we can efficiently generate acoustic tokens that correspond to the dialogue, within a single autoregressive inference pass. Once generated, these tokens can be decoded back into an audio waveform using our speech codec.

Unmute video Mute video Pause video Play video Animation showing how our speech generation model produces a stream of audio tokens autoregressively, which are decoded back to a waveform consisting of a two-speaker dialogue.

To teach our model how to generate realistic exchanges between multiple speakers, we pretrained it on hundreds of thousands of hours of speech data. Then we finetuned it on a much smaller dataset of dialogue with high acoustic quality and precise speaker annotations, consisting of unscripted conversations from a number of voice actors and realistic disfluencies — the “umm”s and “aah”s of real conversation. This step taught the model how to reliably switch between speakers during a generated dialogue and to output only studio quality audio with realistic pauses, tone and timing. In line with our AI Principles and our commitment to developing and deploying AI technologies responsibly, we’re incorporating our SynthID technology to watermark non-transient AI-generated audio content from these models, to help safeguard against the potential misuse of this technology.

New speech experiences ahead We’re now focused on improving our model’s fluency, acoustic quality and adding more fine-grained controls for functions, like prosody, while exploring how best to combine these advances with other modalities, such as video. The potential applications for advanced speech generation are vast, especially when combined with our Gemini family of models. From enhancing learning experiences to making content more universally accessible, we’re excited to continue pushing the boundaries of what’s possible with voice-based technologies.

Research Building interactive agents in video game worlds Share.

Introducing a framework to create AI agents that can understand hu...

AI is revolutionizing the landscape of scientific research, enabling advancements at a pace that was once unimaginable — from accelerating drug discov...

Impact AlphaFold unlocks one of the greatest puzzles in biology Share.

AI system helps researchers piece together one of the larges...

Mapping the misuse of generative AI

Responsibility & Safety Mapping the misuse of generative AI Share.

New research analyzes the misuse of multimodal generative AI today, in order to help build safer and more responsible technologies Generative artificial intelligence (AI) models that can produce image, text, audio, video and more are enabling a new era of creativity and commercial opportunity. Yet, as these capabilities grow, so does the potential for their misuse, including manipulation, fraud, bullying or harassment. As part of our commitment to develop and use AI responsibly, we , in partnership with Jigsaw and [website], analyzing how generative AI technologies are being misused today. Teams across Google are using this and other research to develop more effective safeguards for our generative AI technologies, amongst other safety initiatives. Together, we gathered and analyzed nearly 200 media reports capturing public incidents of misuse, . From these reports, we defined and categorized common tactics for misusing generative AI and found novel patterns in how these technologies are being exploited or compromised. By clarifying the current threats and tactics used across different types of generative AI outputs, our work can help shape AI governance and guide companies like Google and others building AI technologies in developing more comprehensive safety evaluations and mitigation strategies.

Highlighting the main categories of misuse While generative AI tools represent a unique and compelling means to enhance creativity, the ability to produce bespoke, realistic content has the potential to be used in inappropriate ways by malicious actors. By analyzing media reports, we identified two main categories of generative AI misuse tactics: the exploitation of generative AI capabilities and the compromise of generative AI systems. Examples of the technologies being exploited included creating realistic depictions of human likenesses to impersonate public figures; while instances of the technologies being compromised included ‘jailbreaking’ to remove model safeguards and using adversarial inputs to cause malfunctions.

Relative frequency generative AI misuse tactics in our dataset. Any given case of misuse reported in the media could involve one or more tactics.

Cases of exploitation — involving malicious actors exploiting easily accessible, consumer-level generative AI tools, often in ways that didn’t require advanced technical skills — were the most prevalent in our dataset. For example, we reviewed a high-profile case from February 2024 where an international corporation reportedly lost HK$200 million (approx. US $26M) after an employee was tricked into making a financial transfer during an online meeting. In this instance, every other “person” in the meeting, including the corporation’s chief financial officer, was in fact a convincing, computer-generated imposter. Some of the most prominent tactics we observed, such as impersonation, scams, and synthetic personas, pre-date the invention of generative AI and have long been used to influence the information ecosystem and manipulate others. But wider access to generative AI tools may alter the costs and incentives behind information manipulation, giving these age-old tactics new potency and potential, especially to those who previously lacked the technical sophistication to incorporate such tactics.

Identifying strategies and combinations of misuse Falsifying evidence and manipulating human likenesses underlie the most prevalent tactics in real-world cases of misuse. In the time period we analyzed, most cases of generative AI misuse were deployed in efforts to influence public opinion, enable scams or fraudulent activities, or to generate profit. By observing how bad actors combine their generative AI misuse tactics in pursuit of their various goals, we identified specific combinations of misuse and labeled these combinations as strategies.

Diagram of how the goals of bad actors (left) map onto their strategies of misuse (right).

Emerging forms of generative AI misuse, which aren’t overtly malicious, still raise ethical concerns. For example, new forms of political outreach are blurring the lines between authenticity and deception, such as government officials suddenly speaking a variety of voter-friendly languages without transparent disclosure that they’re using generative AI, and activists using the AI-generated voices of deceased victims to plead for gun reform. While the study provides novel insights on emerging forms of misuse, it’s worth noting that this dataset is a limited sample of media reports. Media reports may prioritize sensational incidents, which in turn may skew the dataset towards particular types of misuse. Detecting or reporting cases of misuse may also be more challenging for those involved because generative AI systems are so novel. The dataset also doesn’t make a direct comparison between misuse of generative AI systems and traditional content creation and manipulation tactics, such as image editing or setting up 'content farms' to create large amounts of text, video, gifs, images and more. So far, anecdotal evidence hints at that traditional content manipulation tactics remain more prevalent.

Impact Stopping malaria in its tracks Share.

Developing a superior malaria vaccine with the help of AI that could save hundreds of th...

Research FermiNet: Quantum physics and chemistry from first principles Share.

Today we introduce Genie 2, a foundation world model capable of generating an endless variety of action-controllable, playable 3D environments for tra...

Market Impact Analysis

Market Growth Trend

2018	2019	2020	2021	2022	2023	2024
23.1%	27.8%	29.2%	32.4%	34.2%	35.2%	35.6%

Quarterly Growth Rate

Q1 2024	Q2 2024	Q3 2024	Q4 2024
32.5%	34.8%	36.2%	35.6%

Market Segments and Growth Drivers

Segment	Market Share	Growth Rate
Machine Learning	29%	38.4%
Computer Vision	18%	35.7%
Natural Language Processing	24%	41.5%
Robotics	15%	22.3%
Other AI Technologies	14%	31.8%

Technology Maturity Curve

Different technologies within the ecosystem are at varying stages of maturity:

Competitive Landscape Analysis

Company	Market Share
Google AI	18.3%
Microsoft AI	15.7%
IBM Watson	11.2%
Amazon AI	9.8%
OpenAI	8.4%

Future Outlook and Predictions

The Achieves Silver Medal landscape is evolving rapidly, driven by technological advancements, changing threat vectors, and shifting business requirements. Based on current trends and expert analyses, we can anticipate several significant developments across different time horizons:

Year-by-Year Technology Evolution

Based on current trajectory and expert analyses, we can project the following development timeline:

2024Early adopters begin implementing specialized solutions with measurable results

2025Industry standards emerging to facilitate broader adoption and integration

2026Mainstream adoption begins as technical barriers are addressed

2027Integration with adjacent technologies creates new capabilities

2028Business models transform as capabilities mature

2029Technology becomes embedded in core infrastructure and processes

2030New paradigms emerge as the technology reaches full maturity

Technology Maturity Curve

Different technologies within the ecosystem are at varying stages of maturity, influencing adoption timelines and investment priorities:

(Interactive diagram available in full report)

Innovation Trigger

Generative AI for specialized domains
Blockchain for supply chain verification

Peak of Inflated Expectations

Digital twins for business processes
Quantum-resistant cryptography

Trough of Disillusionment

Consumer AR/VR applications
General-purpose blockchain

Slope of Enlightenment

AI-driven analytics
Edge computing

Plateau of Productivity

Cloud infrastructure
Mobile applications

Technology Evolution Timeline

1-2 Years

Improved generative models
specialized AI applications

3-5 Years

AI-human collaboration systems
multimodal AI platforms

5+ Years

General AI capabilities
AI-driven scientific breakthroughs

Expert Perspectives

Leading experts in the ai tech sector provide diverse perspectives on how the landscape will evolve over the coming years:

"The next frontier is AI systems that can reason across modalities and domains with minimal human guidance."
— AI Researcher

"Organizations that develop effective AI governance frameworks will gain competitive advantage."
— Industry Analyst

"The AI talent gap remains a critical barrier to implementation for most enterprises."
— Chief AI Officer

Areas of Expert Consensus

Acceleration of Innovation: The pace of technological evolution will continue to increase
Practical Integration: Focus will shift from proof-of-concept to operational deployment
Human-Technology Partnership: Most effective implementations will optimize human-machine collaboration
Regulatory Influence: Regulatory frameworks will increasingly shape technology development

Short-Term Outlook (1-2 Years)

In the immediate future, organizations will focus on implementing and optimizing currently available technologies to address pressing ai tech challenges:

Improved generative models
specialized AI applications
enhanced AI ethics frameworks

These developments will be characterized by incremental improvements to existing frameworks rather than revolutionary changes, with emphasis on practical deployment and measurable outcomes.

Mid-Term Outlook (3-5 Years)

As technologies mature and organizations adapt, more substantial transformations will emerge in how security is approached and implemented:

AI-human collaboration systems
multimodal AI platforms
democratized AI development

This period will see significant changes in security architecture and operational models, with increasing automation and integration between previously siloed security functions. Organizations will shift from reactive to proactive security postures.

Long-Term Outlook (5+ Years)

Looking further ahead, more fundamental shifts will reshape how cybersecurity is conceptualized and implemented across digital ecosystems:

General AI capabilities
AI-driven scientific breakthroughs
new computing paradigms

These long-term developments will likely require significant technical breakthroughs, new regulatory frameworks, and evolution in how organizations approach security as a fundamental business function rather than a technical discipline.

Key Risk Factors and Uncertainties

Several critical factors could significantly impact the trajectory of ai tech evolution:

Ethical concerns about AI decision-making

Data privacy regulations

Algorithm bias

Organizations should monitor these factors closely and develop contingency strategies to mitigate potential negative impacts on technology implementation timelines.

Alternative Future Scenarios

The evolution of technology can follow different paths depending on various factors including regulatory developments, investment trends, technological breakthroughs, and market adoption. We analyze three potential scenarios:

Optimistic Scenario

Responsible AI driving innovation while minimizing societal disruption

Key Drivers: Supportive regulatory environment, significant research breakthroughs, strong market incentives, and rapid user adoption.

Probability: 25-30%

Base Case Scenario

Incremental adoption with mixed societal impacts and ongoing ethical challenges

Key Drivers: Balanced regulatory approach, steady technological progress, and selective implementation based on clear ROI.

Probability: 50-60%

Conservative Scenario

Technical and ethical barriers creating significant implementation challenges

Key Drivers: Restrictive regulations, technical limitations, implementation challenges, and risk-averse organizational cultures.

Probability: 15-20%

Scenario Comparison Matrix

Factor	Optimistic	Base Case	Conservative
Implementation Timeline	Accelerated	Steady	Delayed
Market Adoption	Widespread	Selective	Limited
Technology Evolution	Rapid	Progressive	Incremental
Regulatory Environment	Supportive	Balanced	Restrictive
Business Impact	Transformative	Significant	Modest

Transformational Impact

Redefinition of knowledge work, automation of creative processes. This evolution will necessitate significant changes in organizational structures, talent development, and strategic planning processes.

The convergence of multiple technological trends—including artificial intelligence, quantum computing, and ubiquitous connectivity—will create both unprecedented security challenges and innovative defensive capabilities.

Implementation Challenges

Ethical concerns, computing resource limitations, talent shortages. Organizations will need to develop comprehensive change management strategies to successfully navigate these transitions.

Regulatory uncertainty, particularly around emerging technologies like AI in security applications, will require flexible security architectures that can adapt to evolving compliance requirements.

Key Innovations to Watch

Multimodal learning, resource-efficient AI, transparent decision systems. Organizations should monitor these developments closely to maintain competitive advantages and effective security postures.

Strategic investments in research partnerships, technology pilots, and talent development will position forward-thinking organizations to leverage these innovations early in their development cycle.

Technical Glossary

Key technical terms and definitions to help understand the technologies discussed in this article.

Understanding the following technical concepts is essential for grasping the full implications of the security threats and defensive measures discussed in this article. These definitions provide context for both technical and non-technical readers.

platform intermediate

algorithm Platforms provide standardized environments that reduce development complexity and enable ecosystem growth through shared functionality and integration capabilities.

synthetic data intermediate

interface

reinforcement learning intermediate

platform

machine learning intermediate

encryption

algorithm intermediate

API

generative AI intermediate

cloud computing

AI achieves silver-medal standard solving International Mathematical Olympiad problems - Related to silver-medal, misuse, audio, mathematical, pushing