With the recent Nobel Prize awarded for radical advances in protein discovery, foundation models (FMs) for exploring large combinatorial spaces promise to revolutionize many scientific fields. Artificial Life (ALife) has not yet integrated FMs, thus presenting a major opportunity for the field to alleviate the historical burden of relying chiefly on manual design and trial-and-error to discover the configurations of lifelike simulations. This paper presents, for the first time, a successful realization of this opportunity using vision-language FMs. The proposed approach, called Automated Search for Artificial Life (ASAL), (1) finds simulations that produce target phenomena, (2) discovers simulations that generate temporally open-ended novelty, and (3) illuminates an entire space of interestingly diverse simulations. Because of the generality of FMs, ASAL works effectively across a diverse range of ALife substrates including Boids, Particle Life, Game of Life, Lenia, and Neural Cellular Automata. A major result highlighting the potential of this technique is the discovery of previously unseen Lenia and Boids lifeforms, as well as cellular automata that are open-ended like Conway's Game of Life. Additionally, the use of FMs allows for the quantification of previously qualitative phenomena in a human-aligned way. This new paradigm promises to accelerate ALife research beyond what is possible through human ingenuity alone.
A core philosophy driving Artificial Life (ALife) is to study not only "life as we know it" but also "life as it could be".
While the specific mechanisms for evolution and learning within ALife simulations are rich and diverse, a major obstacle so far to fundamental advances in the field has been the lack of a systematic method for searching through all the possible simulation configurations themselves. Without such a method, researchers must resort to intuitions and hunches when devising perhaps the most important aspect of an artificial world--the rules of the world itself.
Part of the challenge is that large-scale interactions of simple parts can lead to complex emergent phenomena that are difficult, if not impossible, to predict in advance.
Given this present improvisational state of the field, a method to automate the search for simulations themselves would transform the practice of ALife by significantly scaling the scope of exploration. Instead of probing for rules and interactions that feel right, researchers could refocus their attention on the higher-level question of how best to describe the phenomena we ultimately want to emerge, and let the automated process of searching for those outcomes take its course.
Describing target phenomena for simulations is challenging in its own right, which in part explains why automated search for the right simulation to obtain target phenomena has languished.
While we don't yet understand why or how our universe came to be so complex, rich, and interesting, we can still use it as a guide to create compelling ALife worlds.
Foundation models (FMs) trained on large amounts of natural data possess representations that are often similar to those of humans.
In this spirit, we propose a new paradigm for ALife research called Automated Search for Artificial Life (ASAL). The researcher starts by defining a set of simulations of interest, referred to as the substrate. Then, as shown in the figure above, ASAL enables three distinct methods for FMs to identify interesting ALife simulations:
The promise of this new automated approach is demonstrated on a diverse range of ALife substrates including Boids, Particle Life, Game of Life, Lenia, and Neural Cellular Automata. In each substrate, ASAL discovered previously unseen lifeforms and expanded the frontier of emergent structures in ALife. For example, ASAL revealed exotic flocking patterns in Boids, new self-organizing cells in Lenia, and cellular automata that are open-ended like Conway's famous Game of Life. In addition to facilitating discovery, ASAL's FM framework allows for quantitative analysis of previously qualitative phenomena in ALife simulations, providing a human-like approach to measuring complexity. ASAL is agnostic to both the specific FM and the simulation substrate, enabling compatibility with future FMs and ALife substrates.
Overall, our new FM-based paradigm serves as a valuable tool for future ALife research by stepping towards the field's ultimate goal of exploring the vast space of artificial life forms. To the best of our knowledge, this is the first work to drive ALife simulation discovery through foundation models.
The figure above depicts our proposed paradigm, Automated Search for Artificial Life (ASAL), which includes three algorithms built on vision-language FMs. Each method discovers ALife simulations through a different kind of automated search. Before diving into the details, relevant concepts and notations are introduced next.
An ALife substrate $S$ encompasses any set of ALife simulations of interest (e.g., the set of all Lenia simulations). These could vary in their initial states, transition rules, or both. $S$ is parameterized by $\theta$, which defines a single simulation with three components: a distribution over initial states $\mathrm{Init}_\theta$, a forward-dynamics stepping function $\mathrm{Step}_\theta$, and a rendering function $\mathrm{Render}_\theta$ that converts a state into an image.
While parameterizing and searching for a renderer is often not needed, it becomes necessary when dealing with state values that are uninterpretable a priori. Chaining these terms together, we define a function of $\theta$ that samples an initial state $s_0 \sim \mathrm{Init}_\theta$, runs the simulation for $T$ steps, and renders the final state as an image:

$$\mathrm{Sim}_T(\theta) \;=\; \mathrm{Render}_\theta\!\big(\mathrm{Step}_\theta^{\,T}(s_0)\big), \qquad s_0 \sim \mathrm{Init}_\theta$$
Finally, two additional functions $\mathrm{VLM}_{\mathrm{img}}(\cdot)$ and $\mathrm{VLM}_{\mathrm{txt}}(\cdot)$ embed images and natural language text through the vision-language FM, along with a corresponding inner product $\langle \cdot, \cdot \rangle$ to facilitate similarity measurements in that embedding space.
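To make this notation concrete, the following minimal Python sketch shows how the pieces chain together. Here `init_fn`, `step_fn`, and `render_fn` are hypothetical stand-ins for $\mathrm{Init}_\theta$, $\mathrm{Step}_\theta$, and $\mathrm{Render}_\theta$, and the embeddings are assumed to come from the FM's encoders; this is not the actual implementation.

```python
import numpy as np

def simulate(theta, T, init_fn, step_fn, render_fn, rng):
    """Sim_T(theta): sample s0 ~ Init_theta, step T times, render the final state."""
    state = init_fn(theta, rng)            # s0 ~ Init_theta
    for _ in range(T):
        state = step_fn(theta, state)      # apply Step_theta
    return render_fn(theta, state)         # final state as an (H, W, 3) image

def embedding_similarity(z_image, z_text):
    """Inner product of unit-normalized FM embeddings (cosine similarity)."""
    z_image = z_image / np.linalg.norm(z_image)
    z_text = z_text / np.linalg.norm(z_text)
    return float(z_image @ z_text)
```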
An important goal in ALife is to find simulations where a desired event or sequence of events takes place.
For this purpose, ASAL searches for a simulation that produces images matching a target natural language prompt in the FM's representation space. The researcher controls which prompt, if any, is applied at each timestep.
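One way to write this search formally, consistent with the notation above (for the simplest case of a single prompt applied to the final state), is:

$$\theta^{*} \;=\; \arg\max_{\theta}\; \big\langle\, \mathrm{VLM}_{\mathrm{img}}\big(\mathrm{Sim}_T(\theta)\big),\; \mathrm{VLM}_{\mathrm{txt}}(\text{prompt}) \,\big\rangle$$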
A grand challenge of ALife is finding open-ended simulations.
Although open-endedness is subjective and hard to define, novelty in the right representation space captures a general notion of open-endedness.
With this capability, ASAL searches for a simulation that produces images that are historically novel in the FM's representation space. Preliminary experiments showed that historical nearest-neighbor novelty produces better results than variance-based novelty.
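One plausible formalization of this historical nearest-neighbor novelty score, in the notation above (with $z_t$ the FM embedding of the image rendered at timestep $t$ and $d$ a distance in embedding space), is the average distance of each embedding to its nearest predecessor:

$$\mathrm{OE}(\theta) \;=\; \frac{1}{T}\sum_{t=1}^{T} \min_{t' < t}\; d\big(z_t,\, z_{t'}\big), \qquad z_t = \mathrm{VLM}_{\mathrm{img}}\big(\mathrm{Render}_\theta(s_t)\big)$$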
Another key goal in ALife is to automatically illuminate the entire space of diverse phenomena that can emerge within a substrate, motivated by the quest to understand "life as it could be".
Towards this aim, ASAL searches for a set of simulations whose images are far from their nearest neighbor in the FM's representation space. We find that nearest-neighbor diversity produces better illumination than variance-based diversity.
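In the same spirit, the illumination objective can be sketched as jointly searching for $n$ parameter vectors whose final-state embeddings are mutually spread out:

$$\theta^{*}_{1:n} \;=\; \arg\max_{\theta_{1:n}}\; \sum_{i=1}^{n} \min_{j \neq i}\; d\big(z_i,\, z_j\big), \qquad z_i = \mathrm{VLM}_{\mathrm{img}}\big(\mathrm{Sim}_T(\theta_i)\big)$$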
This section experimentally validates the effectiveness of ASAL across various substrates, then presents novel quantitative analyses of some of the discovered simulations, facilitated by the FM. Before presenting the experiments, here is a summary of the FMs and substrates used. The appendix includes additional details about the substrates and experimental setups, as well as supplementary experiments.
Foundation Models
Substrates
This section explores both single targets and sequences of targets over time.
The effectiveness of searching for target simulations specified by a single prompt is explored in Lenia, Boids, and Particle Life.
The supervised target equation is optimized, with the prompt applied once after $T$ simulation timesteps.
CLIP is the FM, and Sep-CMA-ES is the black-box search algorithm.
The figure below shows that the optimization works well, from a qualitative perspective, at finding simulations matching the specified prompt. Results for more prompts are shown in the appendix. Some of the failure modes suggest that when optimization fails, it is often caused by the lack of expressivity of the substrate rather than by the optimization process itself.
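As a rough illustration of what such black-box optimization looks like, here is a simplified diagonal-covariance evolution strategy standing in for Sep-CMA-ES; this is a sketch, not the paper's implementation, and `score_fn` is a hypothetical helper.

```python
import numpy as np

def es_search(score_fn, dim, iters=100, pop=16, n_elite=4, seed=0):
    """Simplified (mu, lambda) evolution strategy with a per-dimension step
    size, standing in for Sep-CMA-ES. score_fn(theta) returns the CLIP
    alignment between the rollout's final image and the target prompt."""
    rng = np.random.default_rng(seed)
    mean, sigma = np.zeros(dim), np.full(dim, 0.1)
    for _ in range(iters):
        population = mean + sigma * rng.standard_normal((pop, dim))
        scores = np.array([score_fn(theta) for theta in population])
        elites = population[np.argsort(scores)[-n_elite:]]   # top scorers
        mean = elites.mean(axis=0)                           # recenter on elites
        sigma = elites.std(axis=0) + 1e-8                    # crude per-dim adaptation
    return mean
```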
We investigate the effectiveness of searching for simulations producing a target sequence of events using the NCA substrate.
We optimize the supervised target equation with a list of prompts, each applied at evenly spaced time intervals of the simulation rollout.
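A hedged formalization of this objective, with prompts $g_1, \dots, g_K$ applied at evenly spaced timesteps $t_1 < \dots < t_K$ of the rollout, is:

$$\theta^{*} \;=\; \arg\max_{\theta}\; \sum_{k=1}^{K} \big\langle\, \mathrm{VLM}_{\mathrm{img}}\big(\mathrm{Render}_\theta(s_{t_k})\big),\; \mathrm{VLM}_{\mathrm{txt}}(g_k) \,\big\rangle$$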
We use CLIP for the FM.
Following the original NCA paper, we use backpropagation through time and gradient descent with the Adam optimizer for the optimization algorithm.
The figure below shows it is possible to find simulations that produce trajectories following a sequence of prompts.
By specifying the desired evolutionary trajectories and employing a constraining substrate, ASAL can identify update rules that embody the essence of the desired evolutionary process.
For instance, when the sequence of prompts is "one cell" then "two cells", the corresponding update rule inherently enables self-replication.
To investigate the effectiveness of searching for open-ended simulations, we use the Life-like CA substrate and optimize the open-endedness score. CLIP serves as the FM. Because the search space is relatively small, with only $2^{18} = 262{,}144$ simulations, brute-force search is employed.
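A minimal sketch of this brute-force sweep is below; `score_fn` is a hypothetical stand-in for the FM-based open-endedness metric, and the toy score in the usage example exists only to make the snippet self-contained.

```python
from itertools import product

def all_lifelike_rules():
    """Enumerate all 2^18 = 262,144 Life-like CA rules: each rule is a pair of
    sets saying which neighbor counts (0-8) cause birth and survival."""
    for birth_bits in product([0, 1], repeat=9):
        for survive_bits in product([0, 1], repeat=9):
            yield ({n for n, bit in enumerate(birth_bits) if bit},
                   {n for n, bit in enumerate(survive_bits) if bit})

def brute_force_search(score_fn):
    """Score every rule exhaustively and return the best one."""
    return max(all_lifelike_rules(), key=lambda rule: score_fn(*rule))

# Toy usage (a placeholder score that just prefers rules close to B3/S23):
best = brute_force_search(lambda birth, survive:
                          -len(birth ^ {3}) - len(survive ^ {2, 3}))
```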
The figure below reveals the potential for open-endedness in the Life-like CAs.
The famous Conway's Game of Life ranks among the top 5% most open-ended CAs according to our open-endedness metric.
The top subfigure shows that the most open-ended CAs exhibit nontrivial dynamic patterns that lie at the edge of chaos, since they neither plateau nor explode.
We use the Lenia and Boids substrates to study the effectiveness of the illumination algorithm. CLIP is the FM. A custom genetic algorithm performs the search: at each generation, it randomly selects parents, creates mutated children, then keeps the most diverse subset of solutions.
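A hedged sketch of such a genetic algorithm is below; `embed_fn` is a hypothetical stand-in that runs a simulation from parameters and returns the FM embedding of its final rendered state, and the "most diverse subset" step is approximated greedily by per-candidate nearest-neighbor distance.

```python
import numpy as np

def illuminate(embed_fn, dim, n_solutions=64, generations=200, mutation=0.1, seed=0):
    """Keep a population whose FM embeddings are spread out: mutate random
    parents, pool them with the current population, and retain the candidates
    farthest from their nearest neighbor in embedding space."""
    rng = np.random.default_rng(seed)
    pool = rng.standard_normal((n_solutions, dim))
    for _ in range(generations):
        parents = pool[rng.integers(0, len(pool), size=n_solutions)]
        children = parents + mutation * rng.standard_normal(parents.shape)
        candidates = np.concatenate([pool, children])
        z = np.stack([embed_fn(theta) for theta in candidates])
        dist = np.linalg.norm(z[:, None] - z[None, :], axis=-1)  # pairwise distances
        np.fill_diagonal(dist, np.inf)
        novelty = dist.min(axis=1)          # distance to nearest other candidate
        pool = candidates[np.argsort(novelty)[-n_solutions:]]
    return pool
```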
The resultant set of simulations is shown in the "Simulation Atlas" in the figure below. This visualization highlights the diversity of the discovered behaviors, organized by visual similarity. In Lenia, ASAL discovers many previously unseen lifeforms resembling cells and bacteria, organized by color and shape. In Boids, ASAL rediscovers flocking behavior, as well as additional behaviors such as snaking, grouping, circling, and other variations. Larger simulation atlases are shown in the appendix.
Not only can FMs facilitate the search for interesting phenomena, but they also enable the quantification of phenomena previously only amenable to qualitative analysis, as shown in this section.
The following figures show different ways of quantifying the emergent behaviors of these complex systems.
In the figure below, we linearly interpolate the parameters between two Boids simulations. The intermediate simulations lack the characteristics of either simulation and appear disordered, demonstrating the nonlinear, chaotic nature of the Boids parameter space. Importantly, it is now possible to support this qualitative observation quantitatively by measuring the CLIP similarity of the final states of the intermediate simulations to both of the original simulations.
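A minimal sketch of this measurement, with `sim_embed_fn` a hypothetical helper that runs a simulation from parameters and returns the unit-normalized FM embedding of its final state:

```python
import numpy as np

def interpolation_similarity(theta_a, theta_b, sim_embed_fn, steps=11):
    """Linearly interpolate between two parameter vectors and measure CLIP
    similarity of each intermediate rollout's final state to both endpoints."""
    z_a, z_b = sim_embed_fn(theta_a), sim_embed_fn(theta_b)
    alphas = np.linspace(0.0, 1.0, steps)
    sims = []
    for alpha in alphas:
        z = sim_embed_fn((1 - alpha) * theta_a + alpha * theta_b)
        sims.append((float(z @ z_a), float(z @ z_b)))  # cosine sims (unit norms)
    return alphas, np.array(sims)
```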
The figure below evaluates the effect of the number of particles in Particle Life on its ability to represent certain lifeforms.
In this case, we search for "a caterpillar" and find that caterpillars emerge only when the simulation contains enough particles, matching the "more is different" observation.
The next figure quantifies the importance of each simulation parameter in Particle Life by individually sweeping each parameter and measuring the resulting standard deviation of the CLIP prompt-alignment score. The most important parameter identified in this way corresponds to the strength of interaction between the green and yellow particles, which is critical for the formation of the caterpillar.
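A sketch of this sweep, under the assumption that `score_fn(theta)` returns the CLIP alignment of a rollout to the prompt (a hypothetical helper, not the released code):

```python
import numpy as np

def parameter_importance(theta, score_fn, n_points=32, span=1.0):
    """Sweep one parameter at a time over [theta_i - span, theta_i + span] and
    record the standard deviation of the CLIP prompt-alignment score; a large
    deviation marks a parameter the phenomenon is sensitive to."""
    importance = np.zeros(len(theta))
    for i in range(len(theta)):
        scores = []
        for delta in np.linspace(-span, span, n_points):
            perturbed = np.array(theta, dtype=float)
            perturbed[i] += delta
            scores.append(score_fn(perturbed))
        importance[i] = np.std(scores)
    return importance
```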
The figure below shows the speed of change of the CLIP vector over simulation time for a Lenia simulation. This metric plateaus exactly when the simulation appears to have become qualitatively static, providing a useful halting condition for simulations.
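A hedged sketch of using this as a halting condition, with hypothetical `step_fn` and `embed_fn` helpers for stepping the simulation and embedding its rendered state:

```python
import numpy as np

def run_until_static(state, step_fn, embed_fn, max_steps=10_000, eps=1e-3, patience=10):
    """Stop once the CLIP embedding stops moving: halt when the speed
    ||z_{t+1} - z_t|| stays below eps for `patience` consecutive steps."""
    z_prev = embed_fn(state)
    quiet_steps = 0
    for t in range(max_steps):
        state = step_fn(state)
        z = embed_fn(state)
        quiet_steps = quiet_steps + 1 if np.linalg.norm(z - z_prev) < eps else 0
        z_prev = z
        if quiet_steps >= patience:
            return state, t                # simulation looks qualitatively static
    return state, max_steps
```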
To study the importance of using the proper representation space, we ablate the FM used during illumination of Lenia and Boids. We use CLIP, DINOv2, and a low-level pixel representation.
As shown in the figure below, for producing human-aligned diversity, CLIP seems slightly better than DINOv2, but both are qualitatively better than the pixel representation. This result highlights the importance of deep FM representations over low-level metrics when measuring human notions of diversity.
ALife Motivations
ALife is a diverse field that studies life through artificial simulations, with the key difference from biology being its pursuit of the general properties of all life rather than just the specific instantiation of life on Earth.
ALife Substrates
Many substrates are used in ALife to study phenomena at different levels of abstraction and imposed structure, ranging from modeling chemistry to societies.
Conway's Game of Life and other "Life-like" cellular automata (CAs) were critical to the field in its early days and are used to study how complexity may emerge from simple local rules.
Our method aims to be substrate agnostic, with the constraint that the substrate can be displayed as an image.
The majority of ALife substrates are made renderable for human interpretability, including non-visual substrates like program-based ones.
Automatic Search Algorithms in ALife
Automatic search has been a useful tool in ALife whenever the target outcome is well defined.
In the early days, genetic algorithms were used to evolve CAs to produce a target computation.
Novelty search sidesteps fixed objectives by rewarding behaviors that differ from those found previously.
Intrinsically motivated discovery uses search in the representation space of an autoencoder to discover new self-organizing patterns.
Characterizing Emergence
Many attempts have been made to quantify complexity.
Open-endedness (OE) is the ability of a system to keep generating interesting artifacts forever, which is one of the defining features of natural evolution.
Foundation Models for Automatic Search
Large pretrained neural networks, often referred to as foundation models (FMs), learn rich representations that transfer across a wide range of downstream tasks.
The potential of using FMs, particularly large language models (LLMs), for ALife, and vice versa, was highlighted in prior work.
In this work, we use CLIP, an image-language embedding model trained with contrastive learning to align text-image representations on an internet-scale dataset.
Summary

This project launches a new paradigm in ALife by taking the first step towards using FMs to automate the search for interesting simulations. Our approach is effective at finding target, open-ended, and diverse simulations across a wide spectrum of substrates. Additionally, FMs enable the quantification of many qualitative phenomena in ALife, offering a path to replacing low-level complexity metrics with deep representations aligned with humans.
Discussion
Because this project is agnostic to the FM and substrate used, it raises the question of which ones to use.
The choice of FM seems not to matter much in our experiments, and FMs in general may be converging to similar representations of reality.
The proper substrate largely depends on the phenomena being studied (e.g., self-organization, open-ended evolution, etc.). The most expressive substrate would simply parameterize all the RGB pixels of an entire video, but it would be useless for studying emergence. The most insightful substrates bake in as little information as possible while maintaining vast emergent capabilities. For example, the periodic table of elements can be defined with little information, yet gives rise to the entirety of the observable universe.
Eventually, with the proper substrate, more powerful FMs, and enough compute, this paradigm may allow researchers to automatically search for worlds which start off as "simple cells in primordial soup", then undergo "a Cambrian explosion of complexity", and eventually become "an artificial alien civilization".
Researchers could alternatively search for hypothetical worlds where life evolves without DNA.
Finding open-ended worlds would solve one of ALife's grand challenges.
This work can be generalized by replacing the image-language FM with video-language FMs that natively process the temporal nature of simulations.
If you would like to discuss any issues or give feedback, please visit the GitHub repository of this page for more information.
We thank Ettore Randazzo for a discussion on the framing of the project. This work was supported by an NSF GRFP Fellowship to A.K., a Packard Fellowship and a Sloan Research Fellowship to P.I., and by ONR MURI grant N00014-22-1-2740.
For attribution in academic contexts, please cite this work as
Akarsh Kumar and Chris Lu and Louis Kirsch and Yujin Tang and Kenneth O. Stanley and Phillip Isola and David Ha, "Automating the Search for Artificial Life with Foundation Models", 2024.
BibTeX citation
@article{kumar2024asal,
  title  = {Automating the Search for Artificial Life with Foundation Models},
  author = {Akarsh Kumar and Chris Lu and Louis Kirsch and Yujin Tang and Kenneth O. Stanley and Phillip Isola and David Ha},
  year   = {2024},
  url    = {https://asal.sakana.ai/}
}
We release our code for this project here.
Please view the PDF version of the paper for the appendix, which contains additional details and experiments.