SO WE KILLED
THE CONTEXT
WINDOW.
Managed recursive AI coding CLI. The root model writes the code; agent swarms break impossible problems into pieces and solve them in parallel.
No context limits. No API keys. No config. The prompt is the environment, not the input.
Recursive Language
Model architecture.
A root orchestrator decomposes your task. Worker agents execute in parallel. Results synthesize back. No single model sees everything — but everything gets seen.
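That decompose, fan out, synthesize loop can be sketched in a few lines of plain Node JavaScript. This is a shape sketch only: `decompose()` and `runWorker()` are hypothetical stand-ins for the real root model and worker agents, not the product API.

```javascript
// Sketch of the orchestration shape: decompose, fan out, synthesize.
// decompose() and runWorker() are hypothetical stand-ins, not real APIs.

// Root: split one big task into independent subtasks.
function decompose(task) {
  return task.files.map((file) => ({ goal: task.goal, file }));
}

// Worker: handles one subtask in isolation, returns a short result.
async function runWorker(subtask) {
  return `${subtask.file}: ${subtask.goal} done`;
}

// Orchestrator: workers run in parallel; results synthesize back.
async function orchestrate(task) {
  const subtasks = decompose(task);
  const results = await Promise.all(subtasks.map(runWorker));
  return results.join("\n"); // synthesis step
}
```

No worker ever holds the whole task; each sees one subtask, and only the joined results form the full answer.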
Your codebase loads as a variable in a sandboxed REPL — never into a model's context window. The orchestrator sees only metadata.
The root model generates JavaScript that runs in a Cloudflare V8 isolate. It reads slices of context, regex-filters with priors, chunks by AST.
llm_batch() dispatches worker agents in parallel. Each worker gets isolated context, returns a 1–2K token summary.
The result is built up across REPL turns and returned via FINAL_VAR() — bypassing every model's generation length cap.
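The steps above can be sketched as the kind of JavaScript the root model itself would write. `llm_batch()` and `FINAL_VAR()` are the names used above, but their signatures here are assumptions: `llm_batch` is mocked so the sketch runs standalone, and `FINAL_VAR` is represented by a plain return.

```javascript
// One orchestrator REPL turn, sketched. The codebase is a variable in
// the sandbox, never prompt tokens. llm_batch() is mocked so the sketch
// runs standalone; its real signature is an assumption.

const context = [
  { path: "src/auth.js", text: "function login() { /* TODO: add retry */ }" },
  { path: "src/db.js",   text: "function query(sql) { return run(sql); }" },
  { path: "README.md",   text: "# Demo project" },
];

// Regex prior: narrow the haystack before any model sees it.
const relevant = context.filter((f) => /TODO|FIXME/.test(f.text));

// Mock worker dispatch: one isolated worker per prompt, each returning
// a short summary rather than raw context.
async function llm_batch(prompts) {
  return prompts.map((p) => `summary: ${p.split("\n")[0]}`);
}

// Build the result across turns; the real loop would hand this
// accumulator to FINAL_VAR() instead of returning it.
async function turn() {
  const summaries = await llm_batch(
    relevant.map((f) => `Summarize open work in ${f.path}\n${f.text}`)
  );
  return summaries.join("\n");
}
```

Only the one file matching the regex prior ever reaches a worker; the other two stay in the sandbox as data.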
Fully managed.
You just write code.
Cloudflare runs the models, orchestration, routing, and scaling on hardened infrastructure. Your source stays local. Only targeted context reaches workers in encrypted, ephemeral channels. Code is never stored or trained on.
Real numbers.
Reproducible.
Recursive Language Models close the gap on long-context benchmarks where base models collapse — at median cost equal to or below the base model.
| Benchmark | Input size | Base GPT-5 | RLM (GPT-5) | Median cost |
|---|---|---|---|---|
| BrowseComp+ | 6–11M tokens | 0% | 91.3% | $0.99 |
| OOLONG | 131K tokens | 44.0% | 56.5% | $0.43 |
| OOLONG-Pairs | 32K tokens | 0.04% | 58.0% | $0.33 |
| CodeQA | 23K–4.2M tokens | 24.0% | 62.0% | $0.11 |
▸ ZHANG, KRASKA, KHATTAB. MIT CSAIL. arXiv:2512.24601, JAN 2026.
50+ agents.
One command.
One bill.
A live snapshot of the orchestrator at work. Red nodes are active scouts; tan nodes are idle workers waiting for dispatch. The swarm churns through chunks at sub-second cadence.
Hot copy.
Cold start.
One npx command. Sign in. The swarm dispatches.