Digital Ecosystems: Interactive Multi-Agent Neural Cellular Automata

Neural Cellular Automata learn local update rules through backpropagation, producing emergent global behaviour from purely local computation. Most NCA research is non-interactive: you set up an experiment, run it, and analyse the results after the fact. Batch sweeps map equilibria across parameter space but miss path-dependent transients that require sequential changes from a specific dynamical state.

This work lets you reach into the petri dish while it is running. We present a browser-based platform for live parameter steering of multi-agent neural cellular automata: observe a live ecosystem of competing neural species, adjust parameters and watch the response immediately, then save promising states and compare divergent outcomes from the same starting point, echoing the branching exploration of Picbreeder.

The system extends the Petri Dish NCA (PD-NCA) framework, where $N$ neural species compete on a shared 2D grid via online gradient descent. We introduce algorithmic updates, including a tunable growth gate whose steepness acts as a learned Langton-$\lambda$ parameter, controlling whether the ecosystem sits in a frozen, critical, or turbulent regime.

Background

A neural cellular automaton (NCA) is a 2D grid in which every cell carries a small state vector and a shared neural network repeatedly maps each cell's local neighbourhood to a state update; iterating produces emergent global structure from local rules, with extensions to texture synthesis, goal-conditioned control, and high-resolution rendering via implicit decoders.

Multi-agent extensions, where several species share a grid, have been explored in continuous domains like Lenia and Flow-Lenia, and in discrete CA settings like Biomaker CA and Coralai. Computational Life shows that self-replicating programs emerge spontaneously from simple computational substrates. These systems use evolutionary or rule-based dynamics; none trains multiple competing species by gradient descent during simulation. PD-NCA does: $N$ neural species compete for territory via attack/defence vectors with online backpropagation. To our knowledge, ours is the first work to make such a system directly interactive in real time.

Interactive evolutionary computation and Picbreeder showed that human-guided branching discovers artefacts inaccessible to automated search; NetLogo, ALiEn, and recent automated open-ended search provide complementary exploration environments. For full details on the original PD-NCA framework, see the PD-NCA blog post.

Method

We introduce six algorithmic updates to PD-NCA, grouped below by what they achieve.

Making it stable (new in v2)

Two mechanisms prevent the ecosystem from collapsing during interactive exploration. Presence gating restricts competition to cells where a species already has territory: only cells within a dilated 3×3 neighbourhood of a species' existing footprint participate in the competition softmax. This prevents phantom influence from extinct regions and stabilises the dynamics.
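
The presence mask described above can be sketched in plain Python. This is an illustration under stated assumptions, not the platform's code: the footprint is taken to be every cell whose aliveness exceeds a small cutoff (`eps` is a hypothetical parameter, as are the function names), and the grid is assumed not to wrap at its edges.

```python
from typing import List

def dilate_3x3(footprint: List[List[bool]]) -> List[List[bool]]:
    """Mark every cell inside the 3x3 window around any occupied cell."""
    h, w = len(footprint), len(footprint[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not footprint[y][x]:
                continue
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    # Assumption: cells outside the grid are ignored (no wrap-around).
                    if 0 <= ny < h and 0 <= nx < w:
                        out[ny][nx] = True
    return out

def presence_mask(aliveness: List[List[float]], eps: float = 1e-3) -> List[List[bool]]:
    """Cells where one species may enter the competition softmax."""
    # Assumption: "existing footprint" means aliveness above a small cutoff.
    footprint = [[a > eps for a in row] for row in aliveness]
    return dilate_3x3(footprint)
```

A species with a single live cell would, under these assumptions, be allowed to compete in the nine cells around it and nowhere else.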

Emergency respawn fires when a species' total aliveness drops below 1, injecting five seed cells at random non-wall locations. In the original PD-NCA, terminal extinction was permanent — a single bad gradient step could end a species forever. Respawn ensures the ecosystem remains explorable even after drastic parameter changes.
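
A minimal sketch of the respawn rule, assuming aliveness is stored per species as a 2D list and walls as a boolean grid. The seed value of 1.0 and the function name are illustrative assumptions; the source states only the trigger (total aliveness below 1) and the number of seed cells (five).

```python
import random
from typing import List

def emergency_respawn(
    aliveness: List[List[float]],
    walls: List[List[bool]],
    rng: random.Random,
    n_seeds: int = 5,
    floor: float = 1.0,
    seed_value: float = 1.0,  # assumption: seeds start fully alive
) -> int:
    """Re-seed a species in place if its total aliveness fell below `floor`.

    Returns the number of seed cells injected (0 if no respawn was needed).
    """
    total = sum(sum(row) for row in aliveness)
    if total >= floor:
        return 0
    # Candidate locations: every non-wall cell.
    free = [
        (y, x)
        for y, row in enumerate(walls)
        for x, is_wall in enumerate(row)
        if not is_wall
    ]
    chosen = rng.sample(free, min(n_seeds, len(free)))
    for y, x in chosen:
        aliveness[y][x] = seed_value
    return len(chosen)
```

Passing an explicit `random.Random` instance keeps the sketch reproducible for a given seed.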

Targeting the edge of chaos (new in v2)

The growth gate controls which cells can participate in updates via a smooth sigmoid of aliveness.

In PD-NCA v1, an aliveness threshold (default 0.4) zeroed out cells below the threshold with a hard cutoff, preventing gradient flow through the aliveness decision. The soft gate replaces this with a differentiable sigmoid. At the default $k_{\text{gate}}=20$, the sigmoid approximates a step function, but lowering it to $\sim$5 widens the transition zone, permitting stable intermediate aliveness values and pushing the system toward edge-of-chaos dynamics (see Case Study 1). The default threshold also shifts from 0.4 to 0.5, now viable because the smooth gate avoids the uninteresting dynamics v1 reported at higher thresholds (species expand until they meet, then freeze).
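
To make the effect of steepness concrete, the sketch below compares a hard cutoff with a soft gate. It assumes the gate has the standard logistic form $\sigma(k_{\text{gate}}(a - \theta))$ in aliveness $a$ and threshold $\theta$; the source names a sigmoid with steepness $k_{\text{gate}}$ but the exact expression is an assumption here, as are the function names.

```python
import math

def hard_gate(a: float, theta: float = 0.4) -> float:
    """v1-style cutoff: cells below the threshold are zeroed out."""
    return 1.0 if a >= theta else 0.0

def soft_gate(a: float, theta: float = 0.5, k_gate: float = 20.0) -> float:
    """Assumed logistic gate: differentiable in the aliveness a."""
    return 1.0 / (1.0 + math.exp(-k_gate * (a - theta)))

def transition_width(k_gate: float, lo: float = 0.1, hi: float = 0.9) -> float:
    """Aliveness interval over which the soft gate rises from lo to hi."""
    def logit(p: float) -> float:
        return math.log(p / (1.0 - p))
    return (logit(hi) - logit(lo)) / k_gate

if __name__ == "__main__":
    for k in (20.0, 5.0):
        print(f"k_gate={k}: 10%-90% transition width = {transition_width(k):.3f}")
```

Under this form the width of the transition zone scales as $1/k_{\text{gate}}$, so dropping the steepness from 20 to 5 makes the zone four times wider, which is the room that intermediate aliveness values need.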

How species compete (updated from v1)

Each species emits per-pixel attack and defence vectors. As in PD-NCA v1, competition uses cosine similarity: each species' L2-normalised attack vector is scored against the aggregate defence across all species. A temperature-controlled softmax (parameter $\tau$) over the resulting scores yields per-species competition weights that determine how much influence each species has at each cell. Lower $\tau$ sharpens the competition toward winner-take-all. Presence gating ensures only species with local territory participate in this softmax.
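
The per-cell competition can be sketched as follows. Two points are assumptions for illustration: the aggregate defence is taken to be the sum of all species' defence vectors, and absent species are handled by dropping them from the softmax. Function names are hypothetical.

```python
import math
from typing import List, Optional, Sequence

def l2_normalise(v: Sequence[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else [0.0 for _ in v]

def competition_weights(
    attacks: Sequence[Sequence[float]],
    defences: Sequence[Sequence[float]],
    tau: float = 1.0,
    present: Optional[Sequence[bool]] = None,
) -> List[float]:
    """Per-species competition weights at a single cell."""
    n, dim = len(attacks), len(attacks[0])
    # Assumption: "aggregate defence" is the sum over species, then normalised.
    aggregate = l2_normalise([sum(d[j] for d in defences) for j in range(dim)])
    # Cosine similarity between each normalised attack and the aggregate defence.
    scores = [
        sum(a * b for a, b in zip(l2_normalise(att), aggregate)) for att in attacks
    ]
    if present is None:
        present = [True] * n
    live = [s / tau for s, p in zip(scores, present) if p]
    if not live:
        return [0.0] * n
    peak = max(live)  # subtract the maximum for numerical stability
    exps = [
        math.exp(s / tau - peak) if p else 0.0 for s, p in zip(scores, present)
    ]
    total = sum(exps)
    return [e / total for e in exps]
```

Calling it twice with the same vectors but `tau=1.0` and `tau=0.1` shows the sharpening: the largest weight grows as the temperature falls.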

A spatial concentration mechanism uses a sliding-window softmax over local attack/defence energy to focus each species' competitive force on high-activity cells rather than spreading it uniformly. This prevents dilute species from having disproportionate influence over distant cells.

Making it fair (updated from v1)

Win-rate feedback is a positive-feedback mechanism. A per-cell exponential moving average tracks maximum local aliveness; cells with high aliveness update more frequently, reinforcing established territories, while weakly-held cells update less often, which damps oscillatory boundary dynamics and breaks synchronous-update artefacts.

The loss function replaces PD-NCA's simple log-population objective with a soft-minimum over per-species aliveness plus an entropy bonus:

$$\mathcal{L} = k^{-1}\log\sum_i \exp\bigl(-k \cdot \beta \cdot \operatorname{asinh}(\bar{a}_i / \beta)\bigr) - w_d \cdot \frac{H(\mathbf{p})}{\log N}$$

The soft-min (with $k=8$, $\beta=0.4$) focuses gradient on the weakest species, while the normalised entropy bonus ($w_d=0.4$) pushes toward balanced populations. The $\operatorname{asinh}$ compression ensures small-population doublings are rewarded as much as large-population gains.
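
A direct transcription of the loss into plain Python may help when checking values by hand. Two readings are assumptions: $\bar{a}_i$ is taken as the mean aliveness of species $i$, and $\mathbf{p}$ as the vector of population shares $\bar{a}_i / \sum_j \bar{a}_j$. The function name is hypothetical.

```python
import math
from typing import Sequence

def soft_min_entropy_loss(
    mean_aliveness: Sequence[float],
    k: float = 8.0,
    beta: float = 0.4,
    w_d: float = 0.4,
) -> float:
    """Soft-min over asinh-compressed aliveness minus a normalised entropy bonus."""
    n = len(mean_aliveness)
    if n < 2:
        raise ValueError("need at least two species to normalise entropy by log N")
    compressed = [beta * math.asinh(a / beta) for a in mean_aliveness]
    # Stable log-sum-exp of -k * compressed.
    peak = max(-k * c for c in compressed)
    lse = peak + math.log(sum(math.exp(-k * c - peak) for c in compressed))
    soft_min_term = lse / k
    # Assumption: p is the share of total aliveness held by each species.
    total = sum(mean_aliveness)
    shares = [a / total for a in mean_aliveness] if total > 0 else []
    entropy = -sum(p * math.log(p) for p in shares if p > 0)
    return soft_min_term - w_d * entropy / math.log(n)

if __name__ == "__main__":
    print(soft_min_entropy_loss([0.2] * 5))
    print(soft_min_entropy_loss([0.6, 0.1, 0.1, 0.1, 0.1]))
```

With the same total aliveness, the balanced population gets the lower loss under these assumptions, both because its weakest species is stronger and because its entropy is maximal.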

The model architecture is updated from v1's simple convolutions to MobileNetV2-style inverted residual blocks with a grouped per-species decoder, running entirely in the browser via TensorFlow.js.

The Platform

The platform is organised around live parameter steering: observe a running ecosystem, change something and see the response at once, then save states to compare what happens against the road not taken.

Researchers observe transient phenomena as they occur, form hypotheses, and test them by perturbing a live system from a saved state. The workflow has four key capabilities:

Real-time parameter control. Over 40 parameters — from competition temperature and learning rate to model architecture depth and kernel size — can be adjusted while the simulation runs. Changes take effect immediately, letting the researcher see how the ecosystem responds to each perturbation.

Timeline dashboard. A stacked area chart tracks per-species population over time, overlaid with checkpoint markers, parameter-change indicators, and sparklines for diversity and training loss. Hover reveals exact metrics at any step.

Checkpoint system. Five manual slots plus rolling auto-save. Each checkpoint stores the complete state: grid tensor, network weights, optimiser buffers, and metrics history. Loading a checkpoint restores the exact system state, enabling controlled comparisons. Checkpoints can be exported as .petri files for sharing.

Drawing tools. The researcher can paint walls (impassable barriers), erase them, and seed individual species by hand. Walls create biogeographic niches; seeding tests invasion dynamics.

The platform runs entirely in the browser using TensorFlow.js, the SwissGL rendering library, and WebGL: no installation, no server, no GPU drivers. URL recipes encode seed, configuration, and event logs in fragments under 2 KB for sharing reproducible starting points.
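
One way a compact URL recipe could be built is sketched below. The actual encoding used by the platform is not described in the text, so everything here is an assumption for illustration: the JSON layout, the use of zlib compression with URL-safe base64, and the function names.

```python
import base64
import json
import zlib
from typing import Any, Dict, List

def encode_recipe(seed: int, config: Dict[str, Any], events: List[Dict[str, Any]]) -> str:
    """Pack seed, configuration and event log into a URL-safe fragment."""
    payload = json.dumps(
        {"seed": seed, "config": config, "events": events},
        separators=(",", ":"),  # drop whitespace to save bytes
        sort_keys=True,
    )
    compressed = zlib.compress(payload.encode("utf-8"), 9)
    return base64.urlsafe_b64encode(compressed).decode("ascii")

def decode_recipe(fragment: str) -> Dict[str, Any]:
    """Inverse of encode_recipe."""
    compressed = base64.urlsafe_b64decode(fragment.encode("ascii"))
    return json.loads(zlib.decompress(compressed).decode("utf-8"))

if __name__ == "__main__":
    # Hypothetical recipe: parameter names and event format are illustrative.
    fragment = encode_recipe(
        seed=42,
        config={"tau": 1.0, "thr": 0.39},
        events=[{"step": 500, "set": {"thr": 0.51}}],
    )
    print(len(fragment), "characters")
```

Because only the seed, settings, and parameter-change events are stored rather than the grid itself, a recipe of this kind stays small; replaying it reproduces a starting point only to the extent the simulation is deterministic on the target device.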

The Control Surface

Growth-gate steepness ($k_{\text{gate}}$), survival threshold ($\theta$), and competition temperature ($\tau$) together span a three-dimensional regime space. The platform lets users move through it live, adjusting any of the three while the simulation runs and watching the dynamics respond.

The widget below is a flat map of that space. Toggle between $\tau$ vs $\theta$ (at a chosen $k_{\text{gate}}$) and $k_{\text{gate}}$ vs $\theta$ (at a chosen $\tau$). The heatmap colour is the gate survival probability: the chance a species passes the growth gate at each point, accounting for the flicker variance that arises at low $\tau$. At low $\tau$, per-cell winners flip every step, increasing aliveness variance and effectively smearing the gate boundary into the curved contours visible below. The dashed line marks $\theta = 1/N$: below it, a uniform population of $N$ species survives the gate. Drag the sliders, jump to a case study with a preset, or use the Edge/Default ticks under $k_{\text{gate}}$; the pipeline panel shows the resulting softmax → gate → aliveness chain at the current point.
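
The role of the $\theta = 1/N$ line can be checked with a small sketch of the softmax → gate chain. It assumes the logistic gate form $\sigma(k_{\text{gate}}(a - \theta))$, treats each species' softmax share as its aliveness, and ignores the flicker variance that the heatmap accounts for; all three are simplifying assumptions, and the function names are hypothetical.

```python
import math
from typing import List, Sequence

def softmax(scores: Sequence[float], tau: float) -> List[float]:
    peak = max(s / tau for s in scores)
    exps = [math.exp(s / tau - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gate(aliveness: float, theta: float, k_gate: float) -> float:
    # Assumed logistic form of the growth gate.
    return 1.0 / (1.0 + math.exp(-k_gate * (aliveness - theta)))

def pipeline(scores: Sequence[float], tau: float, theta: float, k_gate: float) -> List[float]:
    """softmax -> gate, with softmax shares standing in for aliveness."""
    return [gate(share, theta, k_gate) for share in softmax(scores, tau)]

if __name__ == "__main__":
    uniform = [0.0] * 5  # five species with equal scores, so each share is 1/5
    for theta in (0.1, 0.2, 0.3):
        print(theta, [round(g, 3) for g in pipeline(uniform, 0.1, theta, 20.0)])
```

With equal scores every share is $1/N$ regardless of $\tau$, so the gate sits above one half when $\theta < 1/N$ and below it when $\theta > 1/N$.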

Case Studies

We report five case studies, each exercising a different mode of interaction with the platform.

Growth Gate as Learned Edge of Chaos

Most of our parameter changes affect how the species train: learning rates, loss weights, optimiser choices. The growth gate steepness is different: it changes the simulator itself, reshaping which cells are alive and which are dead on every step.

At $k_{\text{gate}}=17.5$, the sigmoid is nearly a step function. Species territories are flat, bistable, and clean. Perturbations heal within a few steps: a stable, ordered regime.

Reduce steepness to $k_{\text{gate}} \approx 4.9$ and the system crosses into the edge band (the critical regime). The sigmoid's transition zone widens, allowing stable intermediate aliveness values. Species develop distinct phenotypes: some churn aggressively at borders, others maintain stable cores with negotiating boundaries. Texture regenerates autonomously without any further parameter changes — confirming that edge dynamics are intrinsic, not transient.

Push further to $k_{\text{gate}} \approx 4.6$ and coherent structure collapses into the chaotic regime — flickering pixels with no stable territories.

[Figure panels. Top row: Ordered (k_gate=17.5, step 350, α=0.90); Edge (k_gate=4.9, step 3237, α=0.48); Chaotic (k_gate=4.6, step 591, α=0.33). Middle row: Succession at the edge (k_gate=4.6), steps 733, 735, 739, 741, 743, 747, 749, 751, 755, 757. Bottom row: Baseline Edge (k_gate=4.8, τ=1.01, step 5181); + Temperature (k_gate=4.8, τ=2.51, step 5424); + Gate Up (k_gate=5.0, τ=1.01, step 1622); Re-Excite (k_gate=4.9, τ=1.01, step 3575).]

Growth-gate steepness as an edge-of-chaos control. Top: three regimes along one trajectory — ordered (flat bistable territories), edge (grooves, cooperation fronts, expanding waves), chaotic (mid-collapse). Middle: real-time succession at $k_{\text{gate}}$=4.6; texture regenerates autonomously without parameter changes. Bottom: perturbations from a stable edge state — raising temperature erases texture, raising the gate restores coverage, re-lowering re-introduces grooves.

The growth gate steepness plays the role of Langton's $\lambda$ parameter, but inside a learning system where species adapt to whatever value the user sets. A caveat: starting cold at $k_{\text{gate}} \lesssim 5$ collapses immediately; the system must warm up at high steepness first, then ramp down. Path dependence matters.
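
The warm-up-then-ramp procedure can be written as a simple schedule. The start and end values follow the text (the default $k_{\text{gate}}=20$ and the edge value of about 4.9); the warm-up length, ramp length, and linear shape are hypothetical choices made only for illustration, since in the study the ramp was performed by hand.

```python
def k_gate_schedule(
    step: int,
    warmup_steps: int = 300,   # hypothetical: time spent at high steepness
    ramp_steps: int = 1000,    # hypothetical: duration of the ramp down
    k_start: float = 20.0,
    k_end: float = 4.9,
) -> float:
    """Hold k_gate high during warm-up, then ramp linearly to the edge value."""
    if step < warmup_steps:
        return k_start
    progress = min(1.0, (step - warmup_steps) / ramp_steps)
    return k_start + progress * (k_end - k_start)

if __name__ == "__main__":
    for s in (0, 300, 800, 1300, 2000):
        print(s, round(k_gate_schedule(s), 2))
```

A scripted schedule like this is one way to hand an interactively discovered trajectory to a batch replication.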

Extreme Competition Temperature

At $\tau=0.1$ (the GUI slider labelled "Sharpness"), the inter-species softmax is nearly one-hot. Small fluctuations determine the per-cell winner on each step, and that winner flips the next step as the losing species' gradients catch up. The result is a persistent flicker-mixing attractor: every pixel flickers between species on every step, yet the system as a whole is stable.

The survival threshold acts as a temporal filter. Low threshold ($\leq 0.1$) shows the instantaneous competition: pixel-scale noise with no spatial coherence. At threshold 0.5, diffuse blobs emerge as the gate averages over a few steps. At threshold 0.8, only the durably dominant cells survive the strict gate, revealing fine-grained stable territories underneath the flickering.

[Figure panels: thr ≤ 0.1, pixel-level flicker; thr = 0.5, diffuse blobs; thr = 0.8, stable territories.]

The homogeneous architecture cannot escape this attractor: competition dynamics produce rapid alternating winners, which only a strict gate filters into visible structure.

Emergent Cooperation via Threshold Cycling

Every species in PD-NCA optimises a single objective: grow. There is no cooperation reward. PD-NCA v1 already observed that cooperation can emerge from purely competitive objectives; the interactive platform enables controlled investigation of how and when this happens. Under the right conditions, species learn to coexist in fine-grained interleaved patterns that persist for thousands of steps.

We apply a three-stage threshold protocol:

Stage 1, Permissive Mixing: thr = 0.2 (steps 0–500). All species intermingle freely.

Stage 2, Crystallisation: ramp thr to 0.51 (steps 500–1250). Boundaries sharpen into clean territories.

Stage 3, Cooperative Relaxation: reduce thr to 0.39. Former neighbours begin sharing territory.
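
The three stages can be expressed as a threshold schedule. The step boundaries and threshold values come from the protocol above; the linear shape of the Stage 2 ramp and the immediate drop to 0.39 at step 1250 are assumptions, since the protocol was applied by hand.

```python
def threshold_schedule(step: int) -> float:
    """Survival threshold for the three-stage cycling protocol."""
    if step < 500:
        # Stage 1: permissive mixing.
        return 0.2
    if step < 1250:
        # Stage 2: crystallisation (assumed linear ramp from 0.2 to 0.51).
        progress = (step - 500) / 750
        return 0.2 + progress * (0.51 - 0.2)
    # Stage 3: cooperative relaxation.
    return 0.39

if __name__ == "__main__":
    for s in (0, 500, 875, 1249, 1250, 3000):
        print(s, round(threshold_schedule(s), 3))
```

The cold-start control corresponds to replacing this schedule with the constant value 0.39.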

The key finding: cooperation emerges selectively between species that were territorial neighbours during Stage 2. Non-neighbours remain separated, suggesting the crystallised "social structure" seeds the relaxed state. Specific species pairs develop checkerboard and dithered co-occupation of shared zones.

A cold-start control at constant $\tau=1.0$, $\text{thr}=0.39$ also reaches cooperation by step $\sim$1500, so the dynamics appear to be an attractor of the regime itself. However, the cycled run concentrates cooperation into one dominant dyad while the cold-start distributes it across multiple pairs. We have not replicated across seeds and phrase this as "cycling appears to bias which species cooperate" rather than a general path-dependence claim.

[Figure: full comparison of the cycled run and the cold-start control.]

Top: threshold-cycled protocol. Bottom: cold-start control at constant thr=0.39. Both reach cooperation, but via different paths.

Optimiser Choice and Learning Rate

We save a single state, branch it six ways, and observe three qualitatively different ecosystems emerge from identical initial conditions.

From a settled checkpoint ($\tau=1.0$, $\text{thr}=0.39$), we branch into three optimisers (plain SGD, SGD with momentum, and Adam), each at two learning rates.
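
The branching step amounts to copying one saved state once per optimiser and learning-rate combination. The sketch below is illustrative only: the `Checkpoint` fields mirror the contents the platform's checkpoints are said to store, but the class, the `branch` function, the "low"/"high" labels, and the choice to reset optimiser buffers on branching are all assumptions.

```python
import copy
import itertools
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

@dataclass
class Checkpoint:
    grid: List[List[float]]
    weights: Dict[str, Any]
    optimiser_state: Dict[str, Any]
    metrics: List[Dict[str, float]] = field(default_factory=list)

def branch(
    checkpoint: Checkpoint,
    optimisers: Tuple[str, ...] = ("sgd", "momentum", "adam"),
    lr_levels: Tuple[str, ...] = ("low", "high"),
) -> Dict[Tuple[str, str], Checkpoint]:
    """Create one independent copy of the checkpoint per (optimiser, lr) pair."""
    branches = {}
    for optimiser, lr in itertools.product(optimisers, lr_levels):
        state = copy.deepcopy(checkpoint)  # branches must not share tensors
        # Assumption: buffers from the old optimiser are discarded on branching.
        state.optimiser_state = {"name": optimiser, "lr_level": lr}
        branches[(optimiser, lr)] = state
    return branches
```

Deep-copying matters here: if branches shared the grid or weights, a step in one branch would silently alter the others and the comparison would no longer start from identical conditions.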

Population dynamics diverge based on optimiser and learning rate. From identical initial conditions, SGD converges to equilibria, momentum produces rotating dominance, and Adam generates waves and bursts.

Plain SGD produces slow, monotonic growth into clean territorial equilibria at both learning rates. The system settles around 2,500–5,000 cells per species.

SGD with momentum generates coherent rotating dominance cycles: the momentum buffer carries updates long enough to push a species into runaway expansion before it crashes, handing dominance to the next species. At high learning rate, one species is permanently locked out — the only branch where a species effectively dies.

Adam produces wave-like repainting at low learning rate, with species swings of $\sim$9$\times$ amplitude over $\sim$200 steps. At high learning rate, single species capture $\sim$75% of the grid before collapsing in dominance bursts.

The mechanistic explanation is straightforward. SGD applies each gradient independently, so updates are small and self-correcting. Momentum accumulates gradient direction over time: when species A is winning, B's gradients consistently point toward attacking A's defence; momentum locks this in as inertia, and when B overtakes A the accumulated momentum overshoots, enabling C. The rotation is the momentum vector precessing through strategy space. Adam's per-parameter adaptive learning rate divides by the square root of a running second-moment estimate of the gradient; during stable periods that estimate shrinks, the effective LR creeps up, and a small perturbation triggers a disproportionately large update, producing the burst-quiescence cycle.
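
The three update rules behind this explanation can be compared on a single scalar parameter. This is a textbook sketch with commonly used default hyperparameters (momentum 0.9, Adam $\beta_1=0.9$, $\beta_2=0.999$), not the platform's TensorFlow.js configuration, which the text does not specify.

```python
import math

def sgd_step(grad: float, lr: float) -> float:
    """Plain SGD: each gradient is applied independently."""
    return -lr * grad

class MomentumSGD:
    def __init__(self, lr: float, mu: float = 0.9) -> None:
        self.lr, self.mu, self.velocity = lr, mu, 0.0

    def step(self, grad: float) -> float:
        # The velocity accumulates consistent gradient direction as inertia.
        self.velocity = self.mu * self.velocity + grad
        return -self.lr * self.velocity

class Adam:
    def __init__(self, lr: float, beta1: float = 0.9, beta2: float = 0.999, eps: float = 1e-8) -> None:
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m, self.v, self.t = 0.0, 0.0, 0

    def step(self, grad: float) -> float:
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1 - self.beta2) * grad * grad
        m_hat = self.m / (1 - self.beta1 ** self.t)  # bias correction
        v_hat = self.v / (1 - self.beta2 ** self.t)
        # Dividing by sqrt(v_hat) makes the step roughly scale-free in the gradient.
        return -self.lr * m_hat / (math.sqrt(v_hat) + self.eps)

if __name__ == "__main__":
    lr = 0.01
    momentum, adam = MomentumSGD(lr), Adam(lr)
    for _ in range(200):
        m_step = momentum.step(1.0)   # steady gradient of 1
        a_step = adam.step(1e-3)      # steady but tiny gradient
    print("SGD step for grad=1:", sgd_step(1.0, lr))
    print("Momentum step after 200 consistent grads:", m_step)
    print("SGD step for grad=1e-3:", sgd_step(1e-3, lr))
    print("Adam step for grad=1e-3:", a_step)
```

Two things stand out. Under a steady gradient, momentum's step grows toward $1/(1-\mu)$ times the plain SGD step, which is the inertia that overshoots. Adam's step stays near the full learning rate even when gradients are tiny, so a quiet period does not shrink its updates the way it does for SGD.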

In this experiment, optimiser choice determines the qualitative character of the simulation more than learning rate does. The differences are visible within $\sim$50 steps of branching.

Biogeographic Construction

This case study perturbs the environment rather than parameters. We painted an intricate wall field on a 200×200 grid and seeded five species by hand, then ran SGD with $\tau=0.99$, $\text{thr}=0.43$, $\text{lr}=1.3 \times 10^{-3}$.

[Figure panels, with step numbers in parentheses. Colonise + 1st cut: a. seed (14); b. growing (344); c. settled (399); d. 1st cut (556); e. filling (609); f. invaded (722). Fight-back + equilibrium: g. fighting (831); h. pushing (887); i. equilib. (997); j. stable (1062); k. dwell (1395); l. long dwell (2020). 2nd cut + regrowth: m. 2nd cut (2102); n. +2 steps; o. +7 steps; p. +13 steps; q. +38 steps; r. +70 steps.]

Biogeographic construction over ~2,200 steps. Top (a–f): colonisation, then wall erasure; nearest species race into the gap. Middle (g–l): fight-back, wall-hugging equilibrium persists >1,000 steps. Bottom (m–r): two eraser slashes open corridors; one species dominates briefly before the balancing loss restores equilibrium.

Colonisation and first cut. After seeding (a), species grow out to fill their chambers (b–c). At step ~550 we erase a section of wall near the centre-bottom (d). The nearest species races into the opened space and temporarily dominates it (e–f).

Fight-back and wall-hugging. Over the next ~400 steps the population-balancing loss slows the invaders and the displaced species pushes back (g–h), reaching a new equilibrium (i). In this settled state, species exhibit a notable behaviour: they hug walls, growing along them and using them as defensible safe zones that anchor territory. Species borders follow wall geometry rather than cutting across open space, and enclaves persist for over 1,000 steps inside protective alcoves (j–l).

Second cut and regrowth. At step ~2,100 we slash two large eraser strokes across the grid (m), clearing species and walls. One species dominates the gap initially (n–p), but the balancing loss slows it and competitors grow back within ~70 steps (q–r). Unlike scattering seed cells (which the equilibrium absorbs within ~10 steps), this perturbation is not absorbed: the topology of territorial boundaries has changed.

Walls shape which configurations the competition dynamics can sustain. Drawing or erasing them in situ is a direct way to test how geometry constrains dynamics.

Discussion

Interactive exploration is most valuable in the hypothesis-generation phase. Each of our five case studies revealed something that would be difficult to discover through batch parameter sweeps alone: the edge-of-chaos regime is reached by gradual ramping, not from a cold start; a strict survival threshold exposes stable territories beneath flicker mixing; threshold cycling appears to bias which species cooperate, following territorial history; optimiser dynamics diverge within $\sim$50 steps of branching; and wall geometry shapes equilibria in ways that are hard to anticipate without drawing them.

The growth gate steepness suggests a pattern worth testing in other differentiable ALife systems with threshold mechanisms: a single scalar can move the system between frozen, critical, and turbulent regimes, and exposing it as an interactive parameter lets the user navigate between them in real time.

The emergency respawn re-seeds a species when its total aliveness drops below 1, so the dynamics here are quasi-stationary coexistence under an immigration floor, not emergent stability. Disabling respawn would reveal whether any observed equilibria are self-sustaining.

The parameter space contains many more phenomena than five studies can sample. Holding the survival threshold at 0.6 produces a rigid equilibrium; raising $\tau$ above 3 at that threshold collapses species into isolated migrating blobs. These are visible within a few seconds of adjusting the sliders.

We acknowledge important limitations. Each case study is a single trajectory ($n=1$): an observation, not a statistical claim. WebGL simulation is not bitwise-reproducible across devices ($\sim$26% cell disagreement at boundaries after 94 steps), though qualitative dynamics are consistent. The current platform supports $N=5$ species; scaling to larger populations remains unexplored. We have explored five paths through a vast parameter space. There are hundreds more.

Live parameter steering bridges interactive discovery and systematic replication: a researcher can explore by hand, identify a candidate phenomenon, export the checkpoint, and hand it to a scripted sweep for quantification. Automated evolutionary approaches — for example, population-based training with novelty-driven competition across many parallel worlds — offer a third, complementary mode for surfacing diverse phenomena at scale.
