FunSearch: Making new discoveries in mathematical sciences using Large Language Models - Related to chemistry, language, building, quantum, agents

Building interactive agents in video game worlds


Introducing a framework to create AI agents that can understand human instructions and perform actions in open-ended settings

Human behaviour is remarkably complex. Even a simple request like “Put the ball close to the box” still requires deep understanding of situated intent and language. The meaning of a word like ‘close’ can be difficult to pin down – placing the ball inside the box might technically be the closest, but it’s likely the speaker wants the ball placed next to the box. For a person to correctly act on the request, they must be able to understand and judge the situation and surrounding context.

Most artificial intelligence (AI) researchers now believe that writing computer code which can capture the nuances of situated interactions is impossible. Instead, modern machine learning (ML) researchers have focused on learning about these types of interactions from data. To explore these learning-based approaches and quickly build agents that can make sense of human instructions and safely perform actions in open-ended conditions, we created a research framework within a video game environment.

Today, we’re publishing a paper and collection of videos, showing our early steps in building video game AIs that can understand fuzzy human concepts – and therefore, can begin to interact with people on their own terms.

Much of the recent progress in training video game AI relies on optimising the score of a game. Powerful AI agents for StarCraft and Dota were trained using the clear-cut wins/losses calculated by computer code. Instead of optimising a game score, we ask people to invent tasks and judge progress themselves. Using this approach, we developed a research paradigm that allows us to improve agent behaviour through grounded and open-ended interaction with humans.
While still in its infancy, this paradigm creates agents that can listen, talk, ask questions, navigate, search and retrieve, manipulate objects, and perform many other activities in real-time. This compilation presents behaviours of agents following tasks posed by human participants:

We created a virtual "playhouse" with hundreds of recognisable objects and randomised configurations. Designed for simple and safe research, the interface includes a chat for unconstrained communication.

Learning in “the playhouse”

Our framework begins with people interacting with other people in the video game world. Using imitation learning, we imbued agents with a broad but unrefined set of behaviours. This "behaviour prior" is crucial for enabling interactions that can be judged by humans. Without this initial imitation phase, agents are entirely random and virtually impossible to interact with. Further human judgement of the agent’s behaviour and optimisation of these judgements by reinforcement learning (RL) produces enhanced agents, which can then be improved again.

We built agents by (1) imitating human-human interactions, and then improving agents through a cycle of (2) human-agent interaction and human feedback, (3) reward model training, and (4) reinforcement learning.

First we built a simple video game world based on the concept of a child's “playhouse.” This environment provided a safe setting for humans and agents to interact and made it easy to rapidly collect large volumes of these interaction data. The house featured a variety of rooms, furniture, and objects configured in new arrangements for each interaction. We also created an interface for interaction. Both the human and agent have an avatar in the game that enables them to move within – and manipulate – the environment. They can also chat with each other in real-time and collaborate on activities, such as carrying objects and handing them to each other, building a tower of blocks, or cleaning a room together. Human participants set the contexts for the interactions by navigating through the world, setting goals, and asking questions for agents. In total, the project collected more than 25 years of real-time interactions between agents and hundreds of (human) participants.

Observing behaviours that emerge

The agents we trained are capable of a huge range of tasks, some of which were not anticipated by the researchers who built them. For instance, we discovered that these agents can build rows of objects using two alternating colours or retrieve an object from a house that’s similar to another object the user is holding. These surprises emerge because language permits a nearly endless set of tasks and questions via the composition of simple meanings. Also, as researchers, we do not specify the details of agent behaviour. Instead, the hundreds of humans who engaged in interactions came up with tasks and questions during the course of these interactions.

Building the framework for creating these agents

To create our AI agents, we applied three steps. We started by training agents to imitate the basic elements of simple human interactions in which one person asks another to do something or to answer a question. We refer to this phase as creating a behavioural prior that enables agents to have meaningful interactions with a human with high frequency. Without this imitative phase, agents just move randomly and speak nonsense. They’re almost impossible to interact with in any reasonable fashion and giving them feedback is even more difficult. This phase was covered in two of our earlier papers, Imitating Interactive Intelligence, and Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning, which explored building imitation-based agents.

Moving beyond imitation learning

While imitation learning leads to interesting interactions, it treats each moment of interaction as equally important. To learn efficient, goal-directed behaviour, an agent needs to pursue an objective and master particular movements and decisions at key moments. For example, imitation-based agents don’t reliably take shortcuts or perform tasks with greater dexterity than an average human player. Here we show an imitation-learning based agent and an RL-based agent following the same human instruction:

To endow our agents with a sense of purpose, surpassing what’s possible through imitation, we relied on RL, which uses trial and error combined with a measure of performance for iterative improvement. As our agents tried different actions, those that improved performance were reinforced, while those that decreased performance were penalised. In games like Atari, Dota, Go, and StarCraft, the score provides a performance measure to be improved. Instead of using a score, we asked humans to assess situations and provide feedback, which helped our agents learn a model of reward.

Training the reward model and optimising agents

To train a reward model, we asked humans to judge if they observed events indicating conspicuous progress toward the current instructed goal or conspicuous errors or mistakes. We then drew a correspondence between these positive and negative events and positive and negative preferences. Since they take place across time, we call these judgements “inter-temporal.” We trained a neural network to predict these human preferences and obtained as a result a reward (or utility / scoring) model reflecting human feedback.

Once we trained the reward model using human preferences, we used it to optimise agents. We placed our agents into the simulator and directed them to answer questions and follow instructions. As they acted and spoke in the environment, our trained reward model scored their behaviour, and we used an RL algorithm to optimise agent performance.

So where do the task instructions and questions come from? We explored two approaches for this. First, we recycled the tasks and questions posed in our human dataset. Second, we trained agents to mimic how humans set tasks and pose questions, as shown in this video, where two agents, one trained to mimic humans setting tasks and posing questions (blue) and one trained to follow instructions and answer questions (yellow), interact with each other:

Evaluating and iterating to continue improving agents

We used a variety of independent mechanisms to evaluate our agents, from hand-scripted tests to a new mechanism for offline human scoring of open-ended tasks created by people, developed in our previous work Evaluating Multimodal Interactive Agents. Importantly, we asked people to interact with our agents in real-time and judge their performance. Our agents trained by RL performed much better than those trained by imitation learning alone.

We asked people to evaluate our agents in online real-time interactions. Humans gave instructions or questions for 5 min and judged the agents’ success. By using RL, our agents obtained a higher success rate than with imitation learning alone, achieving 92% of the performance of humans in similar conditions.

Finally, recent experiments show we can iterate the RL process to repeatedly improve agent behaviour. Once an agent is trained via RL, we ask people to interact with this new agent, annotate its behaviour, update our reward model, and then perform another iteration of RL. The result of this approach was increasingly competent agents. For some types of complex instructions, we could even create agents that outperformed human players on average.

We iterated the human feedback and RL cycle on the problem of building towers. The imitation agent performs significantly worse than humans. Successive rounds of feedback and RL solve the tower-building problem more often than humans.
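The reward-model step of this cycle can be illustrated with a deliberately small sketch. Everything concrete here is an assumption made for illustration: the two-dimensional feature encoding, the synthetic "progress" and "error" labels, and the use of logistic regression in place of the neural network described in the article.

```python
import numpy as np

def train_reward_model(features, labels, lr=0.5, steps=2000):
    """Fit a logistic-regression reward model by gradient descent.

    features: (n, d) array, one row per annotated moment (hypothetical encoding).
    labels:   (n,) array, 1.0 for "progress" marks and 0.0 for "error" marks.
    """
    n, d = features.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-(features @ w + b)))
        grad = probs - labels                 # gradient of the log-loss w.r.t. the logits
        w -= lr * (features.T @ grad) / n
        b -= lr * grad.mean()
    return w, b

def reward(w, b, x):
    """Score in (-1, 1): positive when the model predicts progress."""
    return 2.0 / (1.0 + np.exp(-(x @ w + b))) - 1.0

rng = np.random.default_rng(0)
# Synthetic stand-in for human annotations: progress clusters near +1, errors near -1.
good = rng.normal(loc=1.0, scale=0.5, size=(100, 2))
bad = rng.normal(loc=-1.0, scale=0.5, size=(100, 2))
X = np.vstack([good, bad])
y = np.concatenate([np.ones(100), np.zeros(100)])
w, b = train_reward_model(X, y)
```

In an RL loop, a score like `reward(w, b, x)` would stand in for the game score that the agent is optimised against; after each round, newly annotated interactions would be added to the training set and the model refit.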

The future of training AI for situated human preferences

The idea of training AI using human preferences as a reward has been around for a long time. In Deep reinforcement learning from human preferences, researchers pioneered recent approaches to aligning neural network based agents with human preferences. Recent work to develop turn-based dialogue agents explored similar ideas for training assistants with RL from human feedback. Our research has adapted and expanded these ideas to build flexible AIs that can master a broad scope of multi-modal, embodied, real-time interactions with people.

We hope our framework may someday lead to the creation of game AIs that are capable of responding to our naturally expressed meanings, rather than relying on hand-scripted behavioural plans. Our framework could also be useful for building digital and robotic assistants for people to interact with every day. We look forward to exploring the possibility of applying elements of this framework to create safe AI that’s truly helpful.

Excited to learn more? Check out our latest paper. Feedback and comments are welcome.

FunSearch: Making new discoveries in mathematical sciences using Large Language Models

By searching for “functions” written in computer code, FunSearch made the first discoveries in open problems in mathematical sciences using LLMs

Update (December 2024): see the section below on enhancing human performance in combinatorial competitive programming.

Large Language Models (LLMs) are useful assistants – they excel at combining concepts and can read, write and code to help people solve problems. But could they discover entirely new knowledge? As LLMs have been shown to “hallucinate” factually incorrect information, using them to make verifiably correct discoveries is a challenge. But what if we could harness the creativity of LLMs by identifying and building upon only their very best ideas?

Today, in a paper, we introduce FunSearch, a method to search for new solutions in mathematics and computer science. FunSearch works by pairing a pre-trained LLM, whose goal is to provide creative solutions in the form of computer code, with an automated “evaluator”, which guards against hallucinations and incorrect ideas. By iterating back-and-forth between these two components, initial solutions “evolve” into new knowledge. The system searches for “functions” written in computer code; hence the name FunSearch.

This work represents the first time a new discovery has been made for challenging open problems in science or mathematics using LLMs. FunSearch discovered new solutions for the cap set problem, a longstanding open problem in mathematics. In addition, to demonstrate the practical usefulness of FunSearch, we used it to discover more effective algorithms for the “bin-packing” problem, which has ubiquitous applications such as making data centers more efficient.

Scientific progress has always relied on the ability to share new understanding. What makes FunSearch a particularly powerful scientific tool is that it outputs programs that reveal how its solutions are constructed, rather than just what the solutions are. We hope this can inspire further insights in the scientists who use FunSearch, driving a virtuous cycle of improvement and discovery.

Driving discovery through evolution with language models

FunSearch uses an evolutionary method powered by LLMs, which promotes and develops the highest scoring ideas. These ideas are expressed as computer programs, so that they can be run and evaluated automatically. First, the user writes a description of the problem in the form of code. This description comprises a procedure to evaluate programs, and a seed program used to initialize a pool of programs.

FunSearch is an iterative procedure; at each iteration, the system selects some programs from the current pool of programs, which are fed to an LLM. The LLM creatively builds upon these, and generates new programs, which are automatically evaluated. The best ones are added back to the pool of existing programs, creating a self-improving loop. FunSearch uses Google’s PaLM 2, but it is compatible with other LLMs trained on code.

The FunSearch process. The LLM is shown a selection of the best programs it has generated so far (retrieved from the programs database), and asked to generate an even better one. The programs proposed by the LLM are automatically executed and evaluated. The best programs are added to the database, for selection in subsequent cycles. The user can at any point retrieve the highest-scoring programs discovered so far.
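The loop described above can be sketched in a few lines. This is a toy under loud assumptions, not the FunSearch implementation: the LLM is replaced by a hypothetical `mock_llm` that randomly perturbs one parent program, the "programs" are one-line linear functions built from a hypothetical `TEMPLATE`, and the objective (matching `3 * x + 7`) is invented for the example.

```python
import random

# Hypothetical program template; each candidate is real Python source.
TEMPLATE = "def f(x):\n    return {a} * x + {b}\n"

def evaluate(source):
    """Run a candidate program and score it; higher is better."""
    namespace = {}
    exec(source, namespace)  # generated code would normally be sandboxed
    f = namespace["f"]
    # Toy objective (an assumption): match 3 * x + 7 on a few inputs.
    return -sum((f(x) - (3 * x + 7)) ** 2 for x in range(-5, 6))

def mock_llm(parents, rng):
    """Stand-in for the LLM: perturb the coefficients of one sampled parent."""
    a, b = rng.choice(parents)[1]
    return (a + rng.choice([-1, 0, 1]), b + rng.choice([-1, 0, 1]))

def funsearch_sketch(iterations=3000, pool_size=10, seed=0):
    rng = random.Random(seed)
    seed_params = (0, 0)  # the seed program: f(x) = 0 * x + 0
    pool = [(evaluate(TEMPLATE.format(a=0, b=0)), seed_params)]
    for _ in range(iterations):
        parents = rng.sample(pool, k=min(2, len(pool)))   # select programs from the pool
        a, b = mock_llm(parents, rng)                     # propose a new program
        score = evaluate(TEMPLATE.format(a=a, b=b))       # execute and score it
        pool.append((score, (a, b)))
        pool = sorted(set(pool), reverse=True)[:pool_size]  # keep only the best
    return pool[0]

best_score, best_params = funsearch_sketch()
```

The evaluator is the only source of truth in this loop: a proposal that scores badly simply never re-enters the pool, which is the sense in which automatic evaluation guards against incorrect ideas.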

Discovering new mathematical knowledge and algorithms in different domains is a notoriously difficult task, and largely beyond the power of the most advanced AI systems. To tackle such challenging problems with FunSearch, we introduced multiple key components. Instead of starting from scratch, we start the evolutionary process with common knowledge about the problem, and let FunSearch focus on finding the most critical ideas to achieve new discoveries. In addition, our evolutionary process uses a strategy to improve the diversity of ideas in order to avoid stagnation. Finally, we run the evolutionary process in parallel to improve the system efficiency.

Breaking new ground in mathematics

We first address the cap set problem, an open challenge, which has vexed mathematicians in multiple research areas for decades. Renowned mathematician Terence Tao once described it as his favorite open question. We collaborated with Jordan Ellenberg, a professor of mathematics at the University of Wisconsin–Madison, and author of an important breakthrough on the cap set problem.

The problem consists of finding the largest set of points (called a cap set) in a high-dimensional grid, where no three points lie on a line. This problem is important because it serves as a model for other problems in extremal combinatorics – the study of how large or small a collection of numbers, graphs or other objects could be. Brute-force computing approaches to this problem don’t work – the number of possibilities to consider quickly becomes greater than the number of atoms in the universe.

FunSearch generated solutions – in the form of programs – that in some settings discovered the largest cap sets ever found. This represents the largest increase in the size of cap sets in the past 20 years. Moreover, FunSearch outperformed state-of-the-art computational solvers, as this problem scales well beyond their current capabilities.
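To make the cap set condition concrete, here is a minimal sketch assuming the standard formulation in which the grid is the set of length-n vectors with entries 0, 1 or 2, and three distinct points lie on a line exactly when they sum to zero modulo 3. The `greedy_cap_set` helper and its `priority` argument are hypothetical names chosen for illustration; they show one way a short scoring function can steer the construction of a large set.

```python
from itertools import combinations, product

def is_cap_set(points, n):
    """True if no three distinct points of the n-dimensional mod-3 grid lie on a line."""
    pts = set(map(tuple, points))
    if any(len(p) != n or any(c not in (0, 1, 2) for c in p) for p in pts):
        raise ValueError("points must be length-n vectors with entries 0, 1 or 2")
    for p, q in combinations(pts, 2):
        # The unique third point on the line through p and q is -(p + q) mod 3.
        r = tuple((-(a + b)) % 3 for a, b in zip(p, q))
        if r in pts:
            return False
    return True

def greedy_cap_set(n, priority):
    """Add points in decreasing priority order while the cap set property holds."""
    cap = []
    for p in sorted(product(range(3), repeat=n), key=priority, reverse=True):
        if is_cap_set(cap + [p], n):
            cap.append(p)
    return cap

# A constant priority keeps the default lexicographic order of the grid points.
small_cap = greedy_cap_set(2, priority=lambda p: 0)
```

In two dimensions the largest cap set has four points, which this greedy pass happens to reach. In higher dimensions the result depends heavily on the ordering, so the choice of `priority` is where the search effort would go.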

Interactive figure showing the evolution from the seed program (top) to a new higher-scoring function (bottom). Each circle is a program, with its size proportional to the score assigned to it. Only ancestors of the program at the bottom are shown. The corresponding function produced by FunSearch for each node is shown on the right (see full program using this function in the paper).

These results demonstrate that the FunSearch technique can take us beyond established results on hard combinatorial problems, where intuition can be difficult to build. We expect this approach to play a role in new discoveries for similar theoretical problems in combinatorics, and in the future it may open up new possibilities in fields such as communication theory.

FunSearch favors concise and human-interpretable programs

While discovering new mathematical knowledge is significant in itself, the FunSearch approach offers an additional benefit over traditional computer search techniques. That’s because FunSearch isn’t a black box that merely generates solutions to problems. Instead, it generates programs that describe how those solutions were arrived at. This show-your-working approach is how scientists generally operate, with new discoveries or phenomena explained through the process used to produce them.

FunSearch favors finding solutions represented by highly compact programs – solutions with a low Kolmogorov complexity†. Short programs can describe very large objects, allowing FunSearch to scale to large needle-in-a-haystack problems. Moreover, this makes FunSearch’s program outputs easier for researchers to comprehend. Ellenberg said: “FunSearch offers a completely new mechanism for developing strategies of attack. The solutions generated by FunSearch are far conceptually richer than a mere list of numbers. When I study them, I learn something”.

What’s more, this interpretability of FunSearch’s programs can provide actionable insights to researchers. As we used FunSearch we noticed, for example, intriguing symmetries in the code of some of its high-scoring outputs. This gave us a new insight into the problem, and we used this insight to refine the problem presented to FunSearch, resulting in even better solutions. We see this as an exemplar for a collaborative procedure between humans and FunSearch across many problems in mathematics.

Left: Inspecting code generated by FunSearch yielded further actionable insights (highlights added by us). Right: The raw “admissible” set constructed using the (much shorter) program on the left.

“The solutions generated by FunSearch are far conceptually richer than a mere list of numbers. When I study them, I learn something.” Jordan Ellenberg, collaborator and professor of mathematics at the University of Wisconsin–Madison.

Addressing a notoriously hard challenge in computing

Encouraged by our success with the theoretical cap set problem, we decided to explore the flexibility of FunSearch by applying it to an important practical challenge in computer science. The “bin packing” problem looks at how to pack items of different sizes into the smallest number of bins. It sits at the core of many real-world problems, from loading containers with items to allocating compute jobs in data centers to minimize costs.

The online bin-packing problem is typically addressed using algorithmic rules-of-thumb (heuristics) based on human experience. But finding a set of rules for each specific situation – with differing sizes, timing, or capacity – can be challenging. Despite being very different from the cap set problem, setting up FunSearch for this problem was easy. FunSearch delivered an automatically tailored program (adapting to the specifics of the data) that outperformed established heuristics – using fewer bins to pack the same number of items.

Illustrative example of bin packing using an existing heuristic, the best-fit heuristic (left), and using a heuristic discovered by FunSearch (right).
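For readers unfamiliar with the baseline, the following sketch shows online bin packing with two classical heuristics, first-fit and best-fit. The function names and the `choose_bin` hook are assumptions made for this example; the hook marks the place where a different scoring rule, such as one found by search, could be plugged in. The heuristic discovered by FunSearch itself is not reproduced here.

```python
def pack_online(items, capacity, choose_bin):
    """Place items one at a time; `choose_bin` picks among the bins that still fit."""
    bins = []  # remaining free space in each open bin
    for item in items:
        fitting = [i for i, free in enumerate(bins) if free >= item]
        if fitting:
            i = choose_bin(bins, fitting, item)
            bins[i] -= item
        else:
            bins.append(capacity - item)  # no open bin fits: open a new one
    return len(bins)

def first_fit(bins, fitting, item):
    """Use the earliest opened bin that has room."""
    return fitting[0]

def best_fit(bins, fitting, item):
    """Use the bin that would be left with the least free space."""
    return min(fitting, key=lambda i: bins[i] - item)

items = [4, 7, 3, 6]
bins_first = pack_online(items, capacity=10, choose_bin=first_fit)
bins_best = pack_online(items, capacity=10, choose_bin=best_fit)
```

On this small input, first-fit puts the 3 next to the 4 and then has nowhere to put the 6, while best-fit pairs 7 with 3 and 4 with 6, so the two rules end up using different numbers of bins.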

Hard combinatorial problems like online bin packing can be tackled using other AI approaches, such as neural networks and reinforcement learning. Such approaches have proven to be effective too, but may also require significant resources to deploy. FunSearch, on the other hand, outputs code that can be easily inspected and deployed, meaning its solutions could potentially be slotted into a variety of real-world industrial systems to bring swift benefits.

Update (December 2024): Enhancing human performance in combinatorial competitive programming

In traditional coding contests like Codeforces, which was targeted by AlphaCode, competitors need to provide complete solutions to classical algorithmic challenges in a time- and memory-constrained setting. In comparison, combinatorial contests feature highly complex problems where the objective is not to find the right answer but the best possible approximate solution, similar to problems like finding cap sets. Even though these problems are hard for humans, our method can produce solutions that outperform those found by the top percentile of competitors. And it uses an approach that lends itself well to human-AI collaboration: human programmers write the ‘backbone’ of the solution code and then allow an LLM to creatively evolve the function that steers it.

“This is an exciting approach to combine work of human competitive programmers and LLMs, to achieve results that neither would achieve on their own.” Petr Mitrichev, Software Engineer, Google, World-class Competitive Programmer.

With improved generalist LLMs, we no longer require code-specialised models and can build on Gemini Flash. Beyond competitive programming, we used FunSearch to find more effective ways to optimize functions within the framework of Bayesian optimization.

LLM-driven discovery for science and beyond

FunSearch demonstrates that if we safeguard against LLMs’ hallucinations, the power of these models can be harnessed not only to produce new mathematical discoveries, but also to reveal potentially impactful solutions to significant real-world problems. We envision that for many problems in science and industry – longstanding or new – generating effective and tailored algorithms using LLM-driven approaches will become common practice.

Indeed, this is just the beginning. FunSearch will improve as a natural consequence of the wider progress of LLMs, and we will also be working to broaden its capabilities to address a variety of society’s pressing scientific and engineering challenges.

FermiNet: Quantum physics and chemistry from first principles

Note: Following the publication of our breakthrough work on excited states in Science on 22 August 2024, we’ve made minor updates to this blog and added a section below about this new phase of work.

Using deep learning to solve fundamental problems in computational quantum chemistry and explore how matter interacts with light

In an article, we showed how deep learning can help solve the fundamental equations of quantum mechanics for real-world systems. Not only is this an important fundamental scientific question, but it also could lead to practical uses in the future, allowing researchers to prototype new materials and chemical syntheses using computer simulation before trying to make them in the lab.

Our neural network architecture, FermiNet (Fermionic Neural Network), is well-suited to modeling the quantum state of large collections of electrons, the fundamental building blocks of chemical bonds. We released the code from this study so computational physics and chemistry communities can build on our work and apply it to a wide range of problems.

FermiNet was the first demonstration of deep learning for computing the energy of atoms and molecules from first principles that was accurate enough to be useful, and Psiformer, our novel architecture based on self-attention, remains the most accurate AI method to date. We hope the tools and ideas developed in our artificial intelligence (AI) research can help solve fundamental scientific problems, and FermiNet joins our work on protein folding, glassy dynamics, lattice quantum chromodynamics and many other projects in bringing that vision to life.

A brief history of quantum mechanics

Mention “quantum mechanics” and you’re more likely to inspire confusion than anything else. The phrase conjures up images of Schrödinger’s cat, which can paradoxically be both alive and dead, and fundamental particles that are also, somehow, waves.
In quantum systems, a particle such as an electron doesn’t have an exact location, as it would in a classical description. Instead, its position is described by a probability cloud — it’s smeared out in all places it’s allowed to be. This counterintuitive state of affairs led Richard Feynman to declare: “If you think you understand quantum mechanics, you don’t understand quantum mechanics.” Despite this spooky weirdness, the meat of the theory can be reduced down to just a few straightforward equations. The most famous of these, the Schrödinger equation, describes the behavior of particles at the quantum scale in the same way that Newton’s laws of motion describe the behavior of objects at our more familiar human scale. While the interpretation of this equation can cause endless head-scratching, the math is much easier to work with, leading to the common exhortation from professors to “shut up and calculate” when pressed with thorny philosophical questions from students. These equations are sufficient to describe the behavior of all the familiar matter we see around us at the level of atoms and nuclei. Their counterintuitive nature leads to all sorts of exotic phenomena: superconductors, superfluids, lasers and semiconductors are only possible because of quantum effects. But even the humble covalent bond — the basic building block of chemistry — is a consequence of the quantum interactions of electrons. Once these rules were worked out in the 1920s, scientists realized that, for the first time, they had a detailed theory of how chemistry works. In principle, they could just set up these equations for different molecules, solve for the energy of the system, and figure out which molecules were stable and which reactions would happen spontaneously. But when they sat down to actually calculate the solutions to these equations, they found that they could do it exactly for the simplest atom (hydrogen) and virtually nothing else. Everything else was too complicated.

“The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble. It therefore becomes desirable that approximate practical methods of applying quantum mechanics should be developed.” Paul Dirac, founder of quantum mechanics, 1929.

Many took up Dirac’s charge, and soon physicists built mathematical techniques that could approximate the qualitative behavior of molecular bonds and other chemical phenomena. These methods started from an approximate description of how electrons behave that may be familiar from introductory chemistry. In this description, each electron is assigned to a particular orbital, which gives the probability of a single electron being found at any point near an atomic nucleus. The shape of each orbital then depends on the average shape of all other orbitals. As this “mean field” description treats each electron as being assigned to just one orbital, it’s a very incomplete picture of how electrons actually behave. Nevertheless, it’s enough to estimate the total energy of a molecule with only a small relative error.

Illustration of atomic orbitals. The surface denotes the area of high probability of finding an electron. In the blue region, the wavefunction is positive, while in the purple region it’s negative.

Unfortunately, that level of error still isn’t enough to be useful to the working chemist. The energy in molecular bonds is just a tiny fraction of the total energy of a system, and correctly predicting whether a molecule is stable can often depend on a minuscule fraction of the total energy of a system, itself only a small part of the remaining “correlation” energy. For instance, while the total energy of the electrons in a butadiene molecule is almost 100,000 kilocalories per mole, the difference in energy between different possible shapes of the molecule is just 1 kilocalorie per mole. That means that if you want to correctly predict butadiene’s natural shape, then the same level of precision is needed as measuring the width of a football field down to the millimeter.

With the advent of digital computing after World War II, scientists developed a wide range of computational methods that went beyond this mean field description of electrons. While these methods come in a jumble of abbreviations, they all generally fall somewhere on an axis that trades off accuracy with efficiency. At one extreme are essentially exact methods that scale worse than exponentially with the number of electrons, making them impractical for all but the smallest molecules. At the other extreme are methods that scale linearly, but are not very accurate. These computational methods have had an enormous impact on the practice of chemistry: the 1998 Nobel Prize in chemistry was awarded to the originators of many of these algorithms.

Fermionic neural networks

Despite the breadth of existing computational quantum mechanical tools, we felt a new method was needed to address the problem of efficient representation. There’s a reason that the largest quantum chemical calculations only run into the tens of thousands of electrons for even the most approximate methods, while classical chemical calculation techniques like molecular dynamics can handle millions of atoms.
The state of a classical system can be described easily: we just have to track the position and momentum of each particle. Representing the state of a quantum system is far more challenging. A probability has to be assigned to every possible configuration of electron positions. This is encoded in the wavefunction, which assigns a positive or negative number to every configuration of electrons, and the wavefunction squared gives the probability of finding the system in that configuration. The space of all possible configurations is enormous: if you tried to represent it as a grid with 100 points along each dimension, then the number of possible electron configurations for the silicon atom would be larger than the number of atoms in the universe.

This is exactly where we thought deep neural networks could help. In the last several years, there have been huge advances in representing complex, high-dimensional probability distributions with neural networks. We now know how to train these networks efficiently and scalably. We guessed that, given these networks have already proven their ability to fit high-dimensional functions in AI problems, maybe they could be used to represent quantum wavefunctions as well. Researchers such as Giuseppe Carleo, Matthias Troyer and others have shown how modern deep learning could be used for solving idealized quantum problems. We wanted to use deep neural networks to tackle more realistic problems in chemistry and condensed matter physics, and that meant including electrons in our calculations.

There is just one wrinkle when dealing with electrons. Electrons must obey the Pauli exclusion principle, which means that they can’t be in the same space at the same time. This is because electrons are a type of particle known as fermions, which include the building blocks of most matter: protons, neutrons, quarks, neutrinos, etc. Their wavefunction must be antisymmetric. If you swap the position of two electrons, the wavefunction gets multiplied by -1. That means that if two electrons are on top of each other, the wavefunction (and the probability of that configuration) will be zero. This meant we had to develop a new type of neural network that was antisymmetric with respect to its inputs, which we called FermiNet.

In most quantum chemistry methods, antisymmetry is introduced using a function called the determinant. The determinant of a matrix has the property that if you swap two rows, the output gets multiplied by -1, just like a wavefunction for fermions. So, you can take a bunch of single-electron functions, evaluate them for every electron in your system, and pack all of the results into one matrix. The determinant of that matrix is then a properly antisymmetric wavefunction. The major limitation of this approach is that the resulting function, known as a Slater determinant, is not very general. Wavefunctions of real systems are usually far more complicated. The typical way to improve on this is to take a large linear combination of Slater determinants (sometimes millions or more) and add some simple corrections based on pairs of electrons. Even then, this may not be enough to accurately compute energies.

Animation of a Slater determinant. Each curve is a slice through one of the orbitals shown above. When electrons 1 and 2 swap positions, the rows of the Slater determinant swap, and the wavefunction is multiplied by -1. This guarantees that the Pauli exclusion principle is obeyed.
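The Slater determinant construction described above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the FermiNet project: the three one-dimensional "orbitals" and the helper names `determinant` and `slater_wavefunction` are hypothetical, chosen only to show the sign flip under exchange and the zero when two electrons coincide.

```python
import math

def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-300:
            return 0.0                      # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

# Hypothetical single-electron functions in one dimension (illustration only).
ORBITALS = [
    lambda x: math.exp(-x * x),
    lambda x: x * math.exp(-x * x),
    lambda x: (2 * x * x - 1) * math.exp(-x * x),
]

def slater_wavefunction(positions):
    """Row i holds every orbital evaluated at electron i's position."""
    matrix = [[phi(x) for phi in ORBITALS] for x in positions]
    return determinant(matrix)

psi = slater_wavefunction([0.3, -0.7, 1.1])
psi_swapped = slater_wavefunction([-0.7, 0.3, 1.1])   # electrons 1 and 2 exchanged
psi_overlap = slater_wavefunction([0.3, 0.3, 1.1])    # two electrons at one point

print(psi, psi_swapped, psi_overlap)
```

Exchanging two electrons swaps two rows of the matrix, so `psi_swapped` is the negative of `psi`, and placing two electrons at the same point makes two rows identical, so `psi_overlap` vanishes.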

Deep neural networks can often be far more efficient at representing complex functions than linear combinations of basis functions. In FermiNet, this is achieved by making each function going into the determinant a function of all electrons (see footnote). This goes far beyond methods that just use one- and two-electron functions. FermiNet has a separate stream of information for each electron. Without any interaction between these streams, the network would be no more expressive than a conventional Slater determinant. To go beyond this, we average together information from across all streams at each layer of the network, and pass this information to each stream at the next layer. That way, these streams have the right symmetry properties to create an antisymmetric function. This is similar to how graph neural networks aggregate information at each layer. Unlike the Slater determinants, FermiNets are universal function approximators, at least in the limit where the neural network layers become wide enough. That means that, if we can train these networks correctly, they should be able to fit the nearly-exact solution to the Schrödinger equation.

Animation of FermiNet. A single stream of the network (blue, purple or pink) functions very similarly to a conventional orbital. FermiNet introduces symmetric interactions between streams, making the wavefunction far more general and expressive. Just like a conventional Slater determinant, swapping two electron positions still leads to swapping two rows in the determinant, and multiplying the overall wavefunction by -1.
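The stream-mixing idea can be illustrated with a toy model. The sketch below is a heavily simplified assumption-laden stand-in, not the published FermiNet architecture: it uses scalar streams, two hand-picked mixing layers, made-up weights, and a single determinant. The function names `mixing_layer`, `stream_features` and `toy_wavefunction` are hypothetical. It shows only that averaging across streams lets every matrix entry depend on all electrons while the determinant stays antisymmetric.

```python
import math

def determinant(m):
    """Laplace expansion along the first row; fine for tiny matrices."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * determinant(minor)
    return total

def mixing_layer(streams, w_self, w_mean):
    """Each stream is updated from its own value plus the mean over all streams.
    The mean is unchanged by permuting electrons, so the layer is permutation-equivariant."""
    mean = sum(streams) / len(streams)
    return [math.tanh(w_self * h + w_mean * mean) for h in streams]

def stream_features(positions):
    """One feature per electron after two mixing layers (weights are arbitrary)."""
    h = list(positions)
    h = mixing_layer(h, 1.3, -0.7)
    h = mixing_layer(h, 0.9, 0.5)
    return h

def toy_wavefunction(positions):
    """Determinant whose entries depend on all electrons through the mixed features."""
    features = stream_features(positions)
    n = len(positions)
    matrix = [
        [math.exp(-x * x) * h ** k for k in range(n)]   # row i belongs to electron i
        for x, h in zip(positions, features)
    ]
    return determinant(matrix)

psi = toy_wavefunction([0.3, -0.7, 1.1])
psi_swapped = toy_wavefunction([-0.7, 0.3, 1.1])

# Moving electron 3 changes the feature of electron 1, which a plain orbital cannot do.
feature_before = stream_features([0.3, -0.7, 1.1])[0]
feature_after = stream_features([0.3, -0.7, 2.0])[0]

print(psi, psi_swapped, feature_before, feature_after)
```

Because the mean is symmetric in the electrons, exchanging two positions still just swaps two rows, so the sign flip is preserved even though each row is no longer a function of a single electron.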

We fit FermiNet by minimizing the energy of the system. To do that exactly, we would need to evaluate the wavefunction at all possible configurations of electrons, so we have to do it approximately instead. We pick a random selection of electron configurations, evaluate the energy locally at each arrangement of electrons, add up the contributions from each arrangement and minimize this instead of the true energy. This is known as a Monte Carlo method, because it’s a bit like a gambler rolling dice over and over again. While it’s approximate, if we need to make it more accurate we can always roll the dice again.

Since the wavefunction squared gives the probability of observing an arrangement of particles in any location, it’s most convenient to generate samples from the wavefunction itself, essentially simulating the act of observing the particles. While most neural networks are trained from some external data, in our case the inputs used to train the neural network are generated by the neural network itself. This means we don’t need any training data other than the positions of the atomic nuclei that the electrons are dancing around.

The basic idea, known as variational quantum Monte Carlo (or VMC for short), has been around since the 1960s, and it’s generally considered a cheap but not very accurate way of computing the energy of a system. By replacing the simple wavefunctions based on Slater determinants with FermiNet, we’ve dramatically increased the accuracy of this approach on every system we looked at.

Simulated electrons sampled from FermiNet move around the bicyclobutane molecule.
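To make the variational Monte Carlo recipe concrete, here is a minimal sketch for the simplest possible case: a single hydrogen atom with the trial wavefunction exp(-alpha * r), in atomic units. This is a textbook toy, not FermiNet; the trial function, the Metropolis step size and the names `local_energy` and `vmc_energy` are choices made here for illustration. Samples are drawn from the wavefunction squared and the local energy is averaged over them.

```python
import math
import random

def local_energy(r, alpha):
    """Local energy (H psi) / psi for psi(r) = exp(-alpha * r), hydrogen atom, Hartree units."""
    return -0.5 * alpha * alpha + (alpha - 1.0) / r

def vmc_energy(alpha, n_steps=50000, step_size=1.0, seed=0):
    """Estimate the energy by Metropolis sampling of |psi|^2."""
    rng = random.Random(seed)
    pos = [0.5, 0.5, 0.5]
    r = math.sqrt(sum(c * c for c in pos))
    burn_in = n_steps // 10          # discard early samples taken before equilibration
    total, kept = 0.0, 0
    for step in range(n_steps):
        trial = [c + rng.uniform(-step_size, step_size) for c in pos]
        r_trial = math.sqrt(sum(c * c for c in trial))
        # Acceptance ratio |psi(trial)|^2 / |psi(pos)|^2 = exp(-2 * alpha * (r_trial - r)).
        if rng.random() < math.exp(-2.0 * alpha * (r_trial - r)):
            pos, r = trial, r_trial
        if step >= burn_in:
            total += local_energy(r, alpha)
            kept += 1
    return total / kept

energy_exact_trial = vmc_energy(1.0)   # alpha = 1 is the exact ground state
energy_poor_trial = vmc_energy(0.8)    # a deliberately worse trial function

print(energy_exact_trial, energy_poor_trial)
```

With alpha = 1 the trial function is the exact ground state, so every sample has local energy -0.5 Hartree and the estimate carries no statistical noise. For other values of alpha the estimate is noisy and, on average, lies above -0.5, which is what makes minimizing the sampled energy a sensible training objective.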

To make sure that FermiNet represents an advance in the state of the art, we started by investigating simple, well-studied systems, like atoms in the first row of the periodic table (hydrogen through neon). These are small systems (10 electrons or fewer) and simple enough that they can be treated by the most accurate (but exponentially scaling) methods. FermiNet outperforms comparable VMC calculations by a wide margin, often cutting the error relative to the exponentially-scaling calculations by half or more.

On larger systems, the exponentially-scaling methods become intractable, so instead we use the coupled cluster method as a baseline. This method works well on molecules in their stable configuration, but struggles when bonds get stretched or broken, which is critical for understanding chemical reactions. While it scales much more favorably than exponentially, the particular coupled cluster method we used still scales as the number of electrons raised to the seventh power, so it can only be used for medium-sized molecules.

We applied FermiNet to progressively larger molecules, starting with lithium hydride and working our way up to bicyclobutane, the largest system we looked at, with 30 electrons. On the smallest molecules, FermiNet captured an astounding [website] of the difference between the coupled cluster energy and the energy you get from a single Slater determinant. On bicyclobutane, FermiNet still captured 97% or more of this correlation energy, a huge accomplishment for such a simple approach.

Graphic depiction of the fraction of correlation energy that FermiNet captures on molecules. The purple bar indicates 99% of correlation energy. Left to right: lithium hydride, nitrogen, ethene, ozone, ethanol and bicyclobutane.
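The "fraction of correlation energy" plotted here follows from the definition given in the text: the gap between the single-Slater-determinant energy and the coupled cluster energy is the reference, and a method is scored by how much of that gap it recovers. The sketch below spells this out. The function name `correlation_fraction` is hypothetical and the three energies are made-up placeholder numbers, not results from the paper.

```python
def correlation_fraction(e_single_determinant, e_reference, e_method):
    """Fraction of the (single determinant -> reference) energy gap recovered by a method."""
    gap = e_single_determinant - e_reference
    if gap <= 0:
        raise ValueError("reference energy must lie below the single-determinant energy")
    return (e_single_determinant - e_method) / gap

# Placeholder energies in Hartree, invented purely for illustration.
e_single_determinant = -100.00
e_coupled_cluster = -100.40
e_method = -100.39

fraction = correlation_fraction(e_single_determinant, e_coupled_cluster, e_method)
print(f"{fraction:.1%} of the correlation energy recovered")
```

A value of 1.0 would mean the method matches the coupled cluster baseline; values above 1.0 are possible if a method reaches a lower energy than the baseline.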

Market Impact Analysis

Market Growth Trend

2018	2019	2020	2021	2022	2023	2024
23.1%	27.8%	29.2%	32.4%	34.2%	35.2%	35.6%

Quarterly Growth Rate

Q1 2024	Q2 2024	Q3 2024	Q4 2024
32.5%	34.8%	36.2%	35.6%

Market Segments and Growth Drivers

Segment	Market Share	Growth Rate
Machine Learning	29%	38.4%
Computer Vision	18%	35.7%
Natural Language Processing	24%	41.5%
Robotics	15%	22.3%
Other AI Technologies	14%	31.8%

Competitive Landscape Analysis

Company	Market Share
Google AI	18.3%
Microsoft AI	15.7%
IBM Watson	11.2%
Amazon AI	9.8%
OpenAI	8.4%

Future Outlook and Predictions

The Building Interactive Agents landscape is evolving rapidly, driven by technological advancements, changing threat vectors, and shifting business requirements. Based on current trends and expert analyses, we can anticipate several significant developments across different time horizons:

Year-by-Year Technology Evolution

Based on current trajectory and expert analyses, we can project the following development timeline:

2024: Early adopters begin implementing specialized solutions with measurable results

2025: Industry standards emerging to facilitate broader adoption and integration

2026: Mainstream adoption begins as technical barriers are addressed

2027: Integration with adjacent technologies creates new capabilities

2028: Business models transform as capabilities mature

2029: Technology becomes embedded in core infrastructure and processes

2030: New paradigms emerge as the technology reaches full maturity

Technology Maturity Curve

Different technologies within the ecosystem are at varying stages of maturity, influencing adoption timelines and investment priorities:

Innovation Trigger

Generative AI for specialized domains
Blockchain for supply chain verification

Peak of Inflated Expectations

Digital twins for business processes
Quantum-resistant cryptography

Trough of Disillusionment

Consumer AR/VR applications
General-purpose blockchain

Slope of Enlightenment

AI-driven analytics
Edge computing

Plateau of Productivity

Cloud infrastructure
Mobile applications

Technology Evolution Timeline

1-2 Years

Improved generative models
specialized AI applications

3-5 Years

AI-human collaboration systems
multimodal AI platforms

5+ Years

General AI capabilities
AI-driven scientific breakthroughs

Expert Perspectives

Leading experts in the AI technology sector provide diverse perspectives on how the landscape will evolve over the coming years:

"The next frontier is AI systems that can reason across modalities and domains with minimal human guidance."
— AI Researcher

"Organizations that develop effective AI governance frameworks will gain competitive advantage."
— Industry Analyst

"The AI talent gap remains a critical barrier to implementation for most enterprises."
— Chief AI Officer

Areas of Expert Consensus

Acceleration of Innovation: The pace of technological evolution will continue to increase
Practical Integration: Focus will shift from proof-of-concept to operational deployment
Human-Technology Partnership: Most effective implementations will optimize human-machine collaboration
Regulatory Influence: Regulatory frameworks will increasingly shape technology development

Short-Term Outlook (1-2 Years)

In the immediate future, organizations will focus on implementing and optimizing currently available technologies to address pressing AI technology challenges:

Improved generative models
specialized AI applications
enhanced AI ethics frameworks

These developments will be characterized by incremental improvements to existing frameworks rather than revolutionary changes, with emphasis on practical deployment and measurable outcomes.

Mid-Term Outlook (3-5 Years)

As technologies mature and organizations adapt, more substantial transformations will emerge in how security is approached and implemented:

AI-human collaboration systems
multimodal AI platforms
democratized AI development

This period will see significant changes in security architecture and operational models, with increasing automation and integration between previously siloed security functions. Organizations will shift from reactive to proactive security postures.

Long-Term Outlook (5+ Years)

Looking further ahead, more fundamental shifts will reshape how cybersecurity is conceptualized and implemented across digital ecosystems:

General AI capabilities
AI-driven scientific breakthroughs
new computing paradigms

These long-term developments will likely require significant technical breakthroughs, new regulatory frameworks, and evolution in how organizations approach security as a fundamental business function rather than a technical discipline.

Key Risk Factors and Uncertainties

Several critical factors could significantly impact the trajectory of AI technology evolution:

Ethical concerns about AI decision-making

Data privacy regulations

Algorithm bias

Organizations should monitor these factors closely and develop contingency strategies to mitigate potential negative impacts on technology implementation timelines.

Alternative Future Scenarios

The evolution of technology can follow different paths depending on various factors including regulatory developments, investment trends, technological breakthroughs, and market adoption. We analyze three potential scenarios:

Optimistic Scenario

Responsible AI driving innovation while minimizing societal disruption

Key Drivers: Supportive regulatory environment, significant research breakthroughs, strong market incentives, and rapid user adoption.

Probability: 25-30%

Base Case Scenario

Incremental adoption with mixed societal impacts and ongoing ethical challenges

Key Drivers: Balanced regulatory approach, steady technological progress, and selective implementation based on clear ROI.

Probability: 50-60%

Conservative Scenario

Technical and ethical barriers creating significant implementation challenges

Key Drivers: Restrictive regulations, technical limitations, implementation challenges, and risk-averse organizational cultures.

Probability: 15-20%

Scenario Comparison Matrix

Factor	Optimistic	Base Case	Conservative
Implementation Timeline	Accelerated	Steady	Delayed
Market Adoption	Widespread	Selective	Limited
Technology Evolution	Rapid	Progressive	Incremental
Regulatory Environment	Supportive	Balanced	Restrictive
Business Impact	Transformative	Significant	Modest

Transformational Impact

Redefinition of knowledge work, automation of creative processes. This evolution will necessitate significant changes in organizational structures, talent development, and strategic planning processes.

The convergence of multiple technological trends—including artificial intelligence, quantum computing, and ubiquitous connectivity—will create both unprecedented security challenges and innovative defensive capabilities.

Implementation Challenges

Ethical concerns, computing resource limitations, talent shortages. Organizations will need to develop comprehensive change management strategies to successfully navigate these transitions.

Regulatory uncertainty, particularly around emerging technologies like AI in security applications, will require flexible security architectures that can adapt to evolving compliance requirements.

Key Innovations to Watch

Multimodal learning, resource-efficient AI, transparent decision systems. Organizations should monitor these developments closely to maintain competitive advantages and effective security postures.

Strategic investments in research partnerships, technology pilots, and talent development will position forward-thinking organizations to leverage these innovations early in their development cycle.

Technical Glossary

Key technical terms and definitions to help understand the technologies discussed in this article.

platform (intermediate; related term: algorithm)
Platforms provide standardized environments that reduce development complexity and enable ecosystem growth through shared functionality and integration capabilities.

reinforcement learning (intermediate; related term: interface)

machine learning (intermediate; related term: platform)

neural network (intermediate; related term: encryption)

deep learning (intermediate; related term: API)

algorithm (intermediate; related term: cloud computing)

large language model (intermediate; related term: middleware)

interface (intermediate; related term: scalability)
Well-designed interfaces abstract underlying complexity while providing clearly defined methods for interaction between different system components.

generative AI (intermediate; related term: DevOps)

API (beginner; related term: microservices)
APIs serve as the connective tissue in modern software architectures, enabling different applications and services to communicate and share data according to defined protocols and data formats.

How APIs enable communication between different software systems

Example: Cloud service providers like AWS, Google Cloud, and Azure offer extensive APIs that allow organizations to programmatically provision and manage infrastructure and services.
