

Examples of discovered artificial lifeforms.
In this work, we automatically search for interesting Artificial Life (ALife) simulations using a Foundation-Model-based framework called ASAL. Here, we show examples of the simulations ASAL discovered. In Lenia (left), ASAL discovers a diverse set of dynamic self-organizing patterns reminiscent of real cells. In Boids (middle left), ASAL discovers exotic emergent flocking behavior. In Particle Life and Particle Life++ (middle right), ASAL discovers dynamic open-ended ecosystems of agentic patterns. In Game of Life (right), ASAL identifies novel cellular automata rules that are more open-ended and expressive than the original Conway’s Game of Life.

Automating the Search for Artificial Life with Foundation Models

Abstract

With the recent Nobel Prize awarded for radical advances in protein discovery, foundation models (FMs) for exploring large combinatorial spaces promise to revolutionize many scientific fields. Artificial Life (ALife) has not yet integrated FMs, thus presenting a major opportunity for the field to alleviate the historical burden of relying chiefly on manual design and trial-and-error to discover the configurations of lifelike simulations. This paper presents, for the first time, a successful realization of this opportunity using vision-language FMs. The proposed approach, called Automated Search for Artificial Life (ASAL), (1) finds simulations that produce target phenomena, (2) discovers simulations that generate temporally open-ended novelty, and (3) illuminates an entire space of interestingly diverse simulations. Because of the generality of FMs, ASAL works effectively across a diverse range of ALife substrates including Boids, Particle Life, Game of Life, Lenia, and Neural Cellular Automata. A major result highlighting the potential of this technique is the discovery of previously unseen Lenia and Boids lifeforms, as well as cellular automata that are open-ended like Conway's Game of Life. Additionally, the use of FMs allows for the quantification of previously qualitative phenomena in a human-aligned way. This new paradigm promises to accelerate ALife research beyond what is possible through human ingenuity alone.


Overview: Our method, ASAL, searches for interesting ALife simulations by using a vision-language foundation model to evaluate the simulation's produced videos. ALife lifeforms are discovered across different substrates with three different mechanisms: (1) found via a text prompt, (2) found via searching for open-ended simulations, and (3) illuminating a set of diverse simulations.

Introduction

A core philosophy driving Artificial Life (ALife) is to study not only "life as we know it" but also "life as it could be." Because ALife primarily studies life through computational simulations, this approach necessarily means searching through and mapping out an entire space of possible simulations rather than investigating any single simulation. By doing so, researchers can study why and how different simulation configurations give rise to distinct emergent behaviors. In this paper, we aim, for the first time, to automate this search through simulations with the help of foundation models from AI.

While the specific mechanisms for evolution and learning within ALife simulations are rich and diverse, a major obstacle so far to fundamental advances in the field has been the lack of a systematic method for searching through all the possible simulation configurations themselves. Without such a method, researchers must resort to intuitions and hunches when devising perhaps the most important aspect of an artificial world--the rules of the world itself.

Part of the challenge is that large-scale interactions of simple parts can lead to complex emergent phenomena that are difficult, if not impossible, to predict in advance. This disconnect between the simulation configuration and its resulting behavior makes it difficult to intuitively design simulations that exhibit self-replication, ecosystem-like dynamics, or open-ended properties. As a result, the field often delivers manually designed simulations tailored to simple and anticipated outcomes, limiting the potential for unexpected discoveries.

Given this present improvisational state of the field, a method to automate the search for simulations themselves would transform the practice of ALife by significantly scaling the scope of exploration. Instead of probing for rules and interactions that feel right, researchers could refocus their attention on the higher-level question of how best to describe the phenomena we ultimately want to emerge as an outcome, and let the automated process of searching for those outcomes then take its course.

Describing target phenomena for simulations is challenging in its own right, which in part explains why automated search for the right simulation to obtain target phenomena has languished. Of course, there have been many previous attempts to quantify ALife through intricate measures of life, complexity, or "interestingness". However, these metrics almost always fail to fully capture the nuanced human notions they try to measure.

While we don't yet understand why or how our universe came to be so complex, rich, and interesting, we can still use it as a guide to create compelling ALife worlds. Foundation models (FMs) trained on large amounts of natural data possess representations often similar to humans' and may even be converging toward a 'platonic' representation of the statistics of our real world. This property makes them appealing candidates for quantifying human notions of complexity in ALife.

In this spirit, we propose a new paradigm for ALife research called Automated Search for Artificial Life (ASAL). The researcher starts by defining a set of simulations of interest, referred to as the substrate. Then, as shown in the figure above, ASAL enables three distinct methods for FMs to identify interesting ALife simulations:

  1. Supervised Target: Searching for a simulation that produces a specified target event or sequence of events, facilitating the discovery of arbitrary worlds or those similar to our own.
  2. Open-Endedness: Searching for a simulation that produces temporally open-ended novelty in the FM representation space, thereby discovering worlds that are persistently interesting to a human observer.
  3. Illumination: Searching for a set of interestingly diverse simulations, enabling the illumination of alien worlds.

The promise of this new automated approach is demonstrated on a diverse range of ALife substrates including Boids, Particle Life, Game of Life, Lenia, and Neural Cellular Automata. In each substrate, ASAL discovered previously unseen lifeforms and expanded the frontier of emergent structures in ALife. For example, ASAL revealed exotic flocking patterns in Boids, new self-organizing cells in Lenia, and identified cellular automata that are open-ended like Conway's famous Game of Life. In addition to facilitating discovery, ASAL's FM framework allows for quantitative analysis of previously qualitative phenomena in ALife simulations, providing a human-like approach to measuring complexity. ASAL is agnostic to both the specific FM and the simulation substrate, enabling compatibility with future FMs and ALife substrates.

Overall, our new FM-based paradigm serves as a valuable tool for future ALife research by stepping towards the field's ultimate goal of exploring the vast space of artificial life forms. To the best of our knowledge, this is the first work to drive ALife simulation discovery through foundation models.


Methods: Automated Search for Artificial Life

ASAL: Our proposed framework, ASAL, uses vision-language foundation models to discover ALife simulations by formulating the process as three search problems. Supervised Target: To find target simulations, ASAL searches for a simulation which produces a trajectory in the foundation model space that aligns with a given sequence of prompts. Open-Endedness: To find open-ended simulations, ASAL searches for a simulation which produces a trajectory that has high historical novelty at each timestep. Illumination: To illuminate the set of simulations, ASAL searches for a set of diverse simulations which are far from their nearest neighbor.

The figure above depicts our proposed paradigm, Automated Search for Artificial Life (ASAL), which includes three algorithms built on vision-language FMs. Each method discovers ALife simulations through a different kind of automated search. Before diving into the details, relevant concepts and notations are introduced next.

An ALife substrate, $\mathcal{S}$, encompasses any set of ALife simulations of interest (e.g. the set of all Lenia simulations). These could vary in their initial states, transition rules, or both. $\mathcal{S}$ is parameterized by $\theta$, which defines a single simulation via three components: a distribution over initial states, a forward transition rule, and a renderer that maps states to images.

While parameterizing and searching for a renderer is often not needed, it becomes necessary when dealing with state values that are uninterpretable a priori. Chaining these terms together, we define a function of $\theta$ that samples an initial state $s_0$, runs the simulation for $T$ steps, and renders the final state as an image.
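As a rough sketch, this rollout can be written as follows (the component names `init_fn`, `step_fn`, and `render_fn` are illustrative stand-ins, not names from the paper):

```python
import numpy as np

def rollout(theta, T, init_fn, step_fn, render_fn, seed=0):
    """Sample an initial state s_0, apply the transition rule T times,
    and render the final state as an image.

    init_fn, step_fn, and render_fn stand in for the three
    theta-parameterized components of a substrate.
    """
    rng = np.random.default_rng(seed)
    state = init_fn(theta, rng)      # s_0 sampled from the initial-state distribution
    for _ in range(T):               # run the simulation for T steps
        state = step_fn(theta, state)
    return render_fn(theta, state)   # final state rendered as an image

# Toy substrate: the state is a scalar that theta scales each step;
# rendering tiles the scalar into a 4x4 "image".
init = lambda theta, rng: 1.0
step = lambda theta, s: s * theta
render = lambda theta, s: np.full((4, 4), s)

img = rollout(theta=0.5, T=3, init_fn=init, step_fn=step, render_fn=render)
```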

Finally, two additional functions, $\mathtt{VLM}_\mathtt{img}(\cdot)$ and $\mathtt{VLM}_\mathtt{txt}(\cdot)$, embed images and natural-language text through the vision-language FM, along with a corresponding inner product $\left<\cdot, \cdot\right>$ to facilitate similarity measurements in that embedding space.

Supervised Target

An important goal in ALife is to find simulations where a desired event or sequence of events takes place. Such discovery would allow researchers to identify worlds similar to our own or test whether certain counterfactual evolutionary trajectories are even possible in the given substrate, thus giving insights about the feasibility of certain lifeforms.

For this purpose, ASAL searches for a simulation that produces images matching a target natural-language prompt in the FM's representation. The researcher controls which prompt, if any, to apply at each timestep.
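A minimal sketch of this objective, assuming the image and text embeddings come from a vision-language FM such as CLIP (represented here as plain vectors):

```python
import numpy as np

def unit(v):
    return v / np.linalg.norm(v)

def target_score(image_embs, prompt_embs):
    """Supervised-target objective sketch: average cosine similarity between
    each evaluated frame's image embedding and the prompt embedding assigned
    to that timestep. A search algorithm would maximize this over theta."""
    return float(np.mean([unit(i) @ unit(p)
                          for i, p in zip(image_embs, prompt_embs)]))

# A frame aligned with its prompt embedding scores 1.0; an orthogonal one, 0.0.
imgs = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
prompts = [np.array([2.0, 0.0]), np.array([0.0, 1.0])]
score = target_score(imgs, prompts)
```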

Open-Endedness

A grand challenge of ALife is finding open-ended simulations. Finding such worlds is necessary for replicating the explosion of never-ending, interesting novelty that the real world is known for.

Although open-endedness is subjective and hard to define, novelty in the right representation space captures a general notion of open-endedness. This formulation outsources the subjectivity of measuring open-endedness to the construction of the representation function, which embodies the observer. In this paper, the vision-language FM representations act as a proxy for a human's representation.

With this capability, ASAL searches for a simulation which produces images that are historically novel in the FM's representation. Preliminary experiments showed that historical nearest-neighbor novelty produces better results than variance-based novelty.
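One way to realize such a score (a sketch; the paper's exact formulation may differ) is the mean distance from each timestep's embedding to its nearest neighbor among all earlier embeddings:

```python
import numpy as np

def open_endedness_score(embs):
    """Historical nearest-neighbor novelty: for each timestep t > 0, the
    distance from emb_t to its closest predecessor, averaged over time.
    Trajectories that keep visiting new regions score higher."""
    embs = np.asarray(embs, dtype=float)
    novelty = [np.linalg.norm(embs[:t] - embs[t], axis=1).min()
               for t in range(1, len(embs))]
    return float(np.mean(novelty))

static = [[0.0, 0.0]] * 4                                  # never changes
wandering = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]  # keeps moving
```

A trajectory that converges to a fixed point scores near zero, while one that keeps producing new embeddings scores high.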

Illumination

Another key goal in ALife is to automatically illuminate the entire space of diverse phenomena that can emerge within a substrate, motivated by the quest to understand "life as it could be". Such illumination is the first step to mapping out and taxonomizing an entire substrate.

Towards this aim, ASAL searches for a set of simulations that produce images that are far from their nearest neighbor in the FM's representation. We find that nearest-neighbor diversity produces better illumination than variance-based diversity.
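A sketch of this diversity measure over a set of FM embeddings (one per candidate simulation):

```python
import numpy as np

def diversity_score(embs):
    """Mean distance from each simulation's embedding to its nearest
    neighbor in the set; higher values mean a more spread-out, and hence
    more diverse, collection of simulations."""
    embs = np.asarray(embs, dtype=float)
    d = np.linalg.norm(embs[:, None] - embs[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    return float(d.min(axis=1).mean())

clumped = [[0.0, 0.0], [0.0, 0.0], [5.0, 0.0]]  # two duplicates drag the score down
spread = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]   # every point far from its neighbor
```

Unlike variance, this penalizes duplicated solutions directly, which matches the nearest-neighbor formulation described above.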


Experiments

This section experimentally validates the effectiveness of ASAL across various substrates, then presents novel quantitative analyses of some of the discovered simulations, facilitated by the FM. Before presenting the experiments, here is a summary of the FMs and substrates used. The appendix includes additional details about the substrates and experimental setups, as well as supplementary experiments.

Foundation Models

Substrates

Searching for Target Simulations

This section explores both single targets and sequences of targets over time.

Single Target

The effectiveness of searching for target simulations specified by a single prompt is explored in Lenia, Boids, and Particle Life. The supervised target equation is optimized with the prompt applied once after $T$ simulation timesteps. CLIP is the FM and Sep-CMA-ES is the optimization algorithm.

The figure below shows that, qualitatively, the optimization finds simulations matching the specified prompt. Results for more prompts are shown in the appendix. The failure modes suggest that when optimization fails, the cause is often the limited expressivity of the substrate rather than the optimization process itself.

Discovered target simulations: Using the supervised target equation, ASAL discovered simulations that result in a final state which matches the specified prompt. Results are shown for three different substrates.

Temporal Targets

We investigate the effectiveness of searching for simulations producing a target sequence of events using the NCA substrate. We optimize the supervised target equation with a list of prompts, each applied at evenly spaced time intervals of the simulation rollout. We use CLIP for the FM. Following the original NCA paper, we use backpropagation through time and gradient descent with the Adam optimizer as the optimization algorithm.

The figure below shows it is possible to find simulations that produce trajectories following a sequence of prompts. By specifying the desired evolutionary trajectories and employing a constraining substrate, ASAL can identify update rules that embody the essence of the desired evolutionary process. For instance, when the sequence of prompts is "one cell" then "two cells", the corresponding update rule inherently enables self-replication.

Discovered temporal target simulations: Using the supervised target equation, ASAL discovered simulations that produce a sequence of events which match a list of prompts. The second row shows how the first simulation generalizes to a different initial state. The results are shown for the NCA substrate.

Searching for Open-Ended Simulations

To investigate the effectiveness of searching for open-ended simulations, we use the Life-Like CAs substrate and optimize the open-endedness score. CLIP serves as the FM. Because the search space is relatively small, with only 262,144 simulations, brute-force search is employed.
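The search-space size follows from the rule encoding: a Life-like CA is defined by which of the neighbor counts 0-8 cause birth and which cause survival, giving 2^9 × 2^9 = 262,144 rules. A sketch of enumerating them in Golly B/S notation:

```python
def all_lifelike_rules():
    """Enumerate every Life-like CA rulestring in Golly B/S notation,
    where each of the neighbor counts 0-8 independently triggers
    birth and/or survival."""
    rules = []
    for b_mask in range(2 ** 9):
        for s_mask in range(2 ** 9):
            birth = "".join(str(n) for n in range(9) if b_mask >> n & 1)
            survive = "".join(str(n) for n in range(9) if s_mask >> n & 1)
            rules.append(f"B{birth}/S{survive}")
    return rules

rules = all_lifelike_rules()  # 262,144 rules, including Conway's "B3/S23"
```

Brute force then amounts to simulating each rulestring and scoring its trajectory with the open-endedness metric.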

The figure below reveals the potential for open-endedness in the Life-Like CAs. The famous Conway's Game of Life ranks among the top 5% most open-ended CAs according to our open-endedness metric. The top subfigure shows that the most open-ended CAs demonstrate nontrivial dynamic patterns that lie on the edge of chaos, since they neither plateau nor explode. The bottom left subfigure traces the trajectories of three CAs in CLIP space over simulation time. Because the FM's representations are related to human representations, producing novelty in the trajectory through the FM's representation space yields a sequence of novelty to a human observer as well. The bottom right subfigure visualizes all Life-Like CAs with a UMAP plot of their CLIP embeddings colored by open-endedness score, and shows that meaningful structure emerges: the most open-ended CAs lie close together on a small island outside the main island of simulations. More discovered CAs are shown in the appendix.

Discovered open-ended simulations: Using the open-endedness equation, ASAL discovered open-ended simulations in the Life-Like CAs substrate. Simulations are labeled in Golly notation to denote the number of living neighbors required for birth and survival. (a) The discovered CAs rendered over a simulation rollout. (b) The temporal trajectories of three simulations in CLIP space. The pixel-space trajectory (red) is convergent, whereas the FM-space trajectory (green) is more divergent, even exceeding that of Conway's Game of Life (blue). (c) All Life-Like CAs plotted based on the UMAP projection of the CLIP embedding of their final state, colored by open-endedness score. The resulting structure reveals distinct islands of similar simulations, with the most open-ended CAs grouped together near the bottom.

Illuminating Entire Substrates

We use the Lenia and Boids substrates to study the effectiveness of the illumination algorithm. CLIP is the FM. A custom genetic algorithm performs the search: at each generation, it randomly selects parents, creates mutated children, then keeps the most diverse subset of solutions.
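A toy sketch of such a diversity-driven genetic algorithm (the population sizes, mutation scale, and greedy pruning step are illustrative assumptions, not the paper's exact procedure):

```python
import numpy as np

def illuminate(embed_fn, dim=8, n_pop=16, n_children=8, n_gens=5,
               sigma=0.1, seed=0):
    """Each generation: mutate randomly chosen parents, pool them with the
    population, then greedily drop one member of the closest pair until only
    the n_pop most mutually distant solutions (in embedding space) remain."""
    rng = np.random.default_rng(seed)
    pop = rng.normal(size=(n_pop, dim))
    for _ in range(n_gens):
        parents = pop[rng.integers(n_pop, size=n_children)]
        children = parents + sigma * rng.normal(size=parents.shape)
        cand = np.vstack([pop, children])
        while len(cand) > n_pop:
            emb = embed_fn(cand)
            d = np.linalg.norm(emb[:, None] - emb[None, :], axis=-1)
            np.fill_diagonal(d, np.inf)
            i, _ = np.unravel_index(np.argmin(d), d.shape)  # closest pair
            cand = np.delete(cand, i, axis=0)               # drop one of it
        pop = cand
    return pop

pop = illuminate(embed_fn=lambda x: x)  # identity stands in for the FM here
```

In the real system, `embed_fn` would simulate each parameter vector and embed the rendered final state with CLIP.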

The resultant set of simulations is shown in the "Simulation Atlas" in the figure below. This visualization highlights the diversity of the discovered behaviors, organized by visual similarity. In Lenia, ASAL discovers many previously unseen lifeforms resembling cells and bacteria, organized by color and shape. In Boids, ASAL rediscovers flocking behavior, as well as additional behaviors such as snaking, grouping, circling, and other variations. Larger simulation atlases are shown in the appendix.

Simulation Atlas: ASAL discovered a large set of diverse simulations by using the illumination algorithm on the Lenia and Boids substrates. The resulting final states of these simulations are embedded using CLIP and projected into 2-D with UMAP. This space is then grid-sampled and the nearest simulation within each tile is shown. This simulation atlas maps all discovered simulations in an organized manner. The top left insets show randomly sampled simulations without illumination. Larger simulation atlases can be found in the appendix.

Quantifying ALife

Not only can FMs facilitate the search for interesting phenomena, but they also enable the quantification of phenomena previously only amenable to qualitative analysis, as shown in this section.

The following figures show different ways of quantifying the emergent behaviors of these complex systems.

In the figure below, we linearly interpolate the parameters between two Boids simulations. The intermediate simulations lack the characteristics of either endpoint and appear disordered, demonstrating the nonlinear, chaotic nature of the Boids parameter space. Importantly, this qualitative observation can now be supported quantitatively by measuring the CLIP similarity of the intermediate simulations' final states to both of the original simulations.

The simulation final state as simulation parameters are linearly interpolated from one simulation to another.
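The interpolation sweep above can be sketched as follows, where `embed_final` is a hypothetical function mapping a parameter vector to the CLIP embedding of the resulting simulation's final state:

```python
import numpy as np

def interpolation_similarities(theta_a, theta_b, embed_final, n=11):
    """Linearly interpolate between two parameter vectors and measure the
    cosine similarity of each intermediate simulation's final-state
    embedding to both endpoint simulations."""
    unit = lambda v: v / np.linalg.norm(v)
    ea, eb = unit(embed_final(theta_a)), unit(embed_final(theta_b))
    sims = []
    for alpha in np.linspace(0.0, 1.0, n):
        e = unit(embed_final((1 - alpha) * theta_a + alpha * theta_b))
        sims.append((float(e @ ea), float(e @ eb)))
    return sims

# Toy stand-in for simulate-then-embed: the "embedding" is theta itself.
sims = interpolation_similarities(np.array([1.0, 0.0]),
                                  np.array([0.0, 1.0]),
                                  embed_final=lambda th: th)
```

A chaotic parameter space shows up as intermediate points with low similarity to both endpoints.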

The figure below evaluates the effect of the number of particles in Particle Life on its ability to represent certain lifeforms. In this case we search for "a caterpillar" and find that it can only be found when there are at least 1,000 particles in the simulation, matching the "more is different" observation.

Plotting the emergence of "a caterpillar" in Particle Life as the number of particles is increased.

The next figure quantifies the importance of each simulation parameter in Particle Life by sweeping each parameter individually and measuring the resulting standard deviation of the CLIP prompt-alignment score. The most important parameter identified this way corresponds to the strength of interaction between the green and yellow particles, which is critical for the formation of the caterpillar.

Ranking a Particle Life simulation's parameters by importance to the simulation behavior.
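The sweep can be sketched as follows, with `score_fn` a hypothetical function mapping a parameter vector to the CLIP prompt-alignment score (the sweep range is illustrative):

```python
import numpy as np

def parameter_importance(theta, score_fn, half_width=0.5, n=11):
    """Perturb each parameter individually around its current value and rank
    parameters by the standard deviation of the resulting alignment score:
    parameters whose perturbation changes the score most matter most."""
    theta = np.asarray(theta, dtype=float)
    deltas = np.linspace(-half_width, half_width, n)
    stds = []
    for i in range(len(theta)):
        scores = []
        for d in deltas:
            t = theta.copy()
            t[i] += d
            scores.append(score_fn(t))
        stds.append(np.std(scores))
    ranking = np.argsort(stds)[::-1]  # most important parameter first
    return ranking, np.array(stds)

# Toy score: heavily sensitive to parameter 0, barely sensitive to parameter 1.
ranking, stds = parameter_importance([0.0, 0.0],
                                     score_fn=lambda t: 3.0 * t[0] + 0.1 * t[1])
```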

The figure below shows the rate of change of the CLIP vector over simulation time for a Lenia simulation. This metric plateaus exactly when the simulation appears to become qualitatively static, providing a useful simulation halting condition.

Plotting the change of the CLIP embedding over simulation time quantifies the plateau signal in Lenia.
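A sketch of this halting condition, treating the per-step change of the normalized embedding as the plateau signal (the threshold value is an illustrative assumption):

```python
import numpy as np

def plateau_step(embs, eps=1e-3):
    """Return the first timestep at which the normalized CLIP embedding
    stops changing by more than eps, or None if it never plateaus."""
    unit = lambda v: v / np.linalg.norm(v)
    for t in range(1, len(embs)):
        if np.linalg.norm(unit(embs[t]) - unit(embs[t - 1])) < eps:
            return t
    return None

# Embeddings that move, then settle: the plateau is detected at step 2.
embs = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]])
t_stop = plateau_step(embs)
```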

Agnostic to Foundation Model

To study the importance of using the proper representation space, we ablate the FM used during illumination of Lenia and Boids. We use CLIP, DINOv2, and a low-level pixel representation.

As shown in the figure below, for producing human-aligned diversity, CLIP seems slightly better than DINOv2, but both are qualitatively better than the pixel representation. This result highlights the importance of deep FM representations over low-level metrics when measuring human notions of diversity.

Importance of Foundation Models: The FM is ablated in the illumination experiments. CLIP seems to be slightly better than DINOv2 at creating human-aligned diversity, but both are significantly better than a pixel-based representation.

Related Works

ALife Motivations   ALife is a diverse field that studies life through artificial simulations, with the key difference from biology being its pursuit of the general properties of all life rather than just the specific instantiation of life on Earth. ALife systems range widely from cellular automata to neural network agents, but the field generally focuses on emergent phenomena like self-organization, open-ended evolution, agentic behavior, and collective intelligence. These ALife ideas have also trickled into AI.

ALife Substrates   Many substrates are used in ALife to study phenomena at different levels of abstraction and imposed structure, ranging from modeling chemistry to societies. Conway's Game of Life and other "Life-Like" cellular automata (CAs) were critical to the field in the early days and are used to study how complexity may emerge from simple local rules. Lenia generalizes these to continuous dynamics and inspired later variants like FlowLenia and ParticleLenia. Neural Cellular Automata (NCA) further generalize Lenia to any continuous CA by parameterizing the update rule with a neural network. Instead of operating on a 2-D grid, Particle Life (or Clusters) uses particles in Euclidean space interacting with each other to create dynamic self-organizing patterns. Similarly, Boids uses bird-like objects to model the flocking behavior of real birds and fish. BioMaker CA and JaxLife are structured substrates designed to study the agentic behavior of plants and societies, joining other notable substrates like Evolved Virtual Creatures, Polyworld, and Neural MMO that focus on studying survival in natural environments. Some substrates are constructed to study more exotic phenomena, such as self-replicating programs. All of these systems are designed to explore specific aspects of life, with a common theme of emergence from simple components.

Our method aims to be substrate agnostic, with the constraint that the substrate can be displayed as an image. The majority of ALife substrates are made renderable for human interpretability, including non-visual substrates like program-based ones.

Automatic Search Algorithms in ALife   Automatic search has been a useful tool in ALife whenever the target outcome is well defined. In the early days, genetic algorithms were used to evolve CAs to produce a target computation. More recently, objectives specific to Lenia (e.g. speed) have been used to search for new organisms. BioMaker uses an objective measuring how many agents survive after many timesteps. NCA's objective is to make the final state look like a given target image.

Novelty search is a search algorithm inspired by ALife, but it requires a good representation space to be effective. MAP-Elites is a search algorithm which searches along two predefined axes of interest.

Intrinsically motivated discovery uses search in the representation space of an autoencoder to discover new self-organizing patterns. LeniaBreeder uses MAP-Elites to search for organisms with specific properties, e.g. mass and speed. Although LeniaBreeder does provide an unsupervised algorithm as well, it has only been shown to work in Lenia and cannot guide searches via a prompt or an open-ended formulation. Additionally, an autoencoder trained only on images from the substrate may not learn human-aligned representations due to the lack of data diversity.

Characterizing Emergence   Many attempts have been made to quantify complexity. In information theory, Kolmogorov complexity measures the length of the shortest computer program that produces an artifact. Rather than measuring the complexity of an artifact directly, sophistication measures the complexity of the set in which the artifact is a "generic" member. Stemming from biochemistry, assembly theory hopes to quantify evolution by measuring the minimal number of steps required to assemble an artifact from atomic building blocks or previously assembled pieces. Although theoretically compelling, these metrics are either not computable or fail to capture the nuanced human notions of complexity.

Wolfram claims that most complex systems are subject to computational irreducibility, meaning that the emergent behavior of such systems cannot be reduced to a simple theory.

Open-endedness (OE) is the ability of a system to keep generating interesting artifacts forever, which is one of the defining features of natural evolution. Many necessary conditions for OE have been identified but are yet to be realized. There have been some attempts at quantifying OE, but some argue that OE cannot be quantified by definition. In one case study, human intervention was essential for achieving OE evolution, suggesting that OE may depend on novelty within a particular representation space. This aligns with the idea that while all interesting things are novel, not all novel things are inherently interesting.

Foundation Models for Automatic Search   Large pretrained neural networks, often referred to as foundation models (FMs), are currently revolutionizing many scientific domains. In medicine, FMs transformed the drug discovery process by enabling accurate predictions of protein folding. In robotics, LLMs have automated the design of reward functions, alleviating a typically tedious task for humans. In physics, large models are used to predict complex systems and are later distilled into symbolic equations. AI systems have even reached Olympiad-level performance in solving geometry problems.

The potential of using FMs, particularly large language models (LLMs), for ALife and vice versa has been highlighted in prior work. LLMs have been applied in ALife-like contexts as code mutation operators and for proposing next goals. These applications are limited to text-based search spaces and often rely on LLMs within the inner simulation loop, which is not applicable when analyzing dynamical systems governed by simple update rules.

In this work, we use CLIP, an image-language embedding model trained with contrastive learning to align text and image representations on an internet-scale dataset. CLIP's simplicity and generality enable it to guide the rendering of images from vector strokes or shapes, and to guide generation in other generative models such as VQ-GAN and diffusion models. We apply CLIP to search for ALife simulations instead of static images, resulting in analyzable artifacts that provide valuable insights for ALife research.


Conclusion

Summary   This project launches a new paradigm in ALife by taking the first step towards using FMs to automate the search for interesting simulations. Our approach is effective in finding target, open-ended, and diverse simulations over a wide spectrum of substrates. Additionally, FMs enable the quantification of many qualitative phenomena in ALife, offering a path to replacing low-level complexity metrics with deep representations aligned with humans.

Discussion   Because this project is agnostic to the FM and substrate used, it raises the question of which ones to use. In our experiments, the choice of FM seems not to matter much, and FMs in general may be converging toward similar representations of reality.

The proper substrate largely depends on the phenomenon being studied (e.g. self-organization, open-ended evolution, etc.). The most expressive substrate would simply parameterize all the RGB pixels of an entire video, but such a substrate is useless for studying emergence. The most insightful substrates bake in as little information as possible while maintaining vast emergent capabilities. For example, the periodic table of elements can be defined with little information, yet gives rise to the entirety of the observable universe.

Eventually, with the proper substrate, more powerful FMs, and enough compute, this paradigm may allow researchers to automatically search for worlds which start off as "simple cells in primordial soup", then undergo "a Cambrian explosion of complexity", and eventually become "an artificial alien civilization". Researchers could alternatively search for hypothetical worlds where life evolves without DNA. Finding open-ended worlds would solve one of ALife's grand challenges. Illuminating such a substrate could help map the space of possible lifeforms and intelligences, giving a taxonomy of life as it could be in the computational universe.

This work can be generalized by replacing the image-language FM with video-language FMs that natively process the temporal nature of simulations, or with 3-D FMs to handle 3-D simulations. To leverage the recent advances of LLMs, images can be converted to text via image-to-text models, allowing all analyses to be done in text space. Instead of searching for ALife simulations, a similar approach could be constructed for low-level physics research. For example, in Wolfram's Physics Project, one could search for the hypergraph update rule that emerges structures an FM considers natural. At a meta-level, LLMs could be useful for generating code that describes the substrates themselves, driven by higher-level research agendas.

If you would like to discuss any issues or give feedback, please visit the GitHub repository of this page for more information.

Acknowledgements

We thank Ettore Randazzo for a discussion on the framing of the project. This work was supported by an NSF GRFP Fellowship to A.K., a Packard Fellowship and a Sloan Research Fellowship to P.I., and by ONR MURI grant N00014-22-1-2740.

Vision Icon by artist Laymik on Noun Project.  Neuron icon by artist Laymik.

Citation

For attribution in academic contexts, please cite this work as

Akarsh Kumar and Chris Lu and Louis Kirsch and Yujin Tang and Kenneth O. Stanley and Phillip Isola and David Ha, "Automating the Search for Artificial Life with Foundation Models", 2024.

BibTeX citation

@article{kumar2024asal,
  title = {Automating the Search for Artificial Life with Foundation Models},
  author = {Akarsh Kumar and Chris Lu and Louis Kirsch and Yujin Tang and Kenneth O. Stanley and Phillip Isola and David Ha},
  year = {2024},
  url = {https://asal.sakana.ai/}
}

Open Source Code

We release our code for this project here.

Appendix

Please view the PDF version of the paper for the appendix, which contains additional details and experiments.