DueLLM is a model-evaluation environment, designated GSV-C3 (pace-layer C, games domain) in the General Systems Ventures registry, that evaluates language models under competitive conditions rather than in isolation. Its core mechanism places models on a continuous cellular automaton substrate, specifically a Flow-Lenia artificial-life system: models compete by submitting creatures into a shared simulated environment rather than by answering static prompts. This exposes strategic failure modes, such as poor adaptation to an opponent or to a dynamic environment, that conventional static benchmarks are poorly suited to measure.

Architecturally, match pages run the Flow-Lenia simulation client-side via WASM for performance, while agents submit their creatures through REST server endpoints. The system currently supports 8 species and uses a Glicko rating system to track competitive standing across matches, which gives it a persistent skill ladder rather than one-off comparisons.

DueLLM connects to a lineage of related GSV projects (elf-revel, shadow-work, and agentmud) that together form a chain of artificial-life and agent-competition work. It sits within the intercognition-jazz, life-systems-heart, and media-ecology strands of the portfolio, which links it to other projects concerned with agent cognition and simulated ecosystems.

Status: in-flight as of 2026-06-09. The core competitive infrastructure (client-side Flow-Lenia simulation, REST submission, species roster, and Glicko ratings) is already built and functioning. The grounding material notes open limitations or caveats to the current build that are not fully specified here, so the project is functional but not yet considered complete or fully validated.

DueLLM evaluates models under competitive conditions rather than in isolation. By placing them on a continuous cellular automaton substrate, it exposes strategic failure modes that static benchmarks are poorly suited to measure.
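
To make the "persistent skill ladder" concrete, here is a minimal sketch of how a Glicko rating could be updated after a batch of matches. It assumes the original Glicko-1 formulas as published by Mark Glickman. The description above does not say which Glicko variant or parameters DueLLM uses, and the names `Rating`, `expected_score`, and `update` are hypothetical, chosen only for illustration.

```python
import math
from dataclasses import dataclass

# Glicko-1 scaling constant, q = ln(10) / 400.
Q = math.log(10) / 400


@dataclass(frozen=True)
class Rating:
    """Hypothetical container; DueLLM's actual data model is not specified."""
    r: float = 1500.0   # rating (1500 is the conventional Glicko-1 start)
    rd: float = 350.0   # rating deviation, i.e. uncertainty about r


def g(rd: float) -> float:
    """Attenuation factor: discounts results against uncertain opponents."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def expected_score(player: Rating, opponent: Rating) -> float:
    """Expected score of `player` against `opponent`, between 0 and 1."""
    exponent = -g(opponent.rd) * (player.r - opponent.r) / 400.0
    return 1.0 / (1.0 + 10.0**exponent)


def update(player: Rating, results: list[tuple[Rating, float]]) -> Rating:
    """Apply one Glicko-1 rating-period update.

    `results` holds (opponent, score) pairs, with score 1.0 for a win,
    0.5 for a draw, and 0.0 for a loss. With no results the rating is
    returned unchanged (this sketch skips the between-period RD increase).
    """
    if not results:
        return player

    d2_inv = 0.0      # accumulates 1 / d^2
    surprise = 0.0    # accumulates g(RD_j) * (s_j - E_j)
    for opponent, score in results:
        e = expected_score(player, opponent)
        weight = g(opponent.rd)
        d2_inv += (Q * weight) ** 2 * e * (1.0 - e)
        surprise += weight * (score - e)

    precision = 1.0 / player.rd**2 + d2_inv
    return Rating(
        r=player.r + (Q / precision) * surprise,
        rd=math.sqrt(1.0 / precision),
    )
```

As a check, calling `update(Rating(1500, 200), [(Rating(1400, 30), 1.0), (Rating(1550, 100), 0.0), (Rating(1700, 300), 0.0)])` reproduces the worked example in Glickman's description of the system: a rating of roughly 1464 with a deviation of roughly 151.4. The deviation shrinks as matches accumulate, so a ladder built this way can distinguish an established species from a newly submitted one with the same nominal rating.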