An open-source orchestrator for scalable, disposable development environments – built for training reinforcement learning models and running AI agents.
Training LLMs with reinforcement learning (RL) on coding tasks requires tens or hundreds of thousands of generation-reward cycles, and each cycle needs a clean, isolated environment with the right source code and tools.
Scale
A single machine can't run hundreds of parallel environments that reset between iterations in seconds. You need a distributed system built for that from the ground up.
Isolation
Agents must not interfere with each other. Every environment needs its own filesystem, process tree, and network so that it can be fully reset between episodes.
Latency
Spinning up a new container for every cycle is too slow at scale. Environments need to restart or reset in under a second to keep training throughput high.
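To see why reset latency matters, here is a rough back-of-the-envelope sketch. Every figure in it (cycle count, number of parallel environments, and both reset times) is a hypothetical assumption chosen for illustration, not a measurement of IdeGYM or any other system.

```python
def reset_overhead_hours(cycles: int, reset_seconds: float, parallel_envs: int) -> float:
    """Wall-clock hours spent on resets alone, assuming resets are spread
    evenly across the parallel environments."""
    return cycles * reset_seconds / parallel_envs / 3600


# Hypothetical workload: 100,000 cycles across 200 parallel environments.
CYCLES = 100_000
PARALLEL_ENVS = 200

# Assumed 30 s to start a fresh container for every cycle.
slow = reset_overhead_hours(CYCLES, 30.0, PARALLEL_ENVS)
# Assumed 0.5 s to reset an existing environment in place.
fast = reset_overhead_hours(CYCLES, 0.5, PARALLEL_ENVS)

print(f"fresh container per cycle: {slow:.2f} h of reset overhead")
print(f"sub-second reset:          {fast * 60:.1f} min of reset overhead")
```

Under these assumptions the per-cycle reset cost is the difference between hours and minutes of pure overhead, before any generation or reward computation happens.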
What IdeGYM does
IdeGYM is a Kubernetes-based orchestration framework that manages the full lifecycle of development environments, from image build to teardown.
Environments
IdeGYM spins up isolated environments and tears them down when finished, with no manual cleanup required.
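The automatic teardown described above follows a familiar pattern: acquire a resource, use it, and release it even if something fails. The sketch below illustrates that pattern with a Python context manager. The `environment` function and the in-memory `ACTIVE` registry are hypothetical stand-ins and are not IdeGYM's actual API.

```python
import uuid
from contextlib import contextmanager
from typing import Iterator, Set

# Hypothetical in-memory registry standing in for a real cluster.
ACTIVE: Set[str] = set()


@contextmanager
def environment(image: str) -> Iterator[str]:
    """Create an environment, yield its id, and always tear it down."""
    env_id = f"{image}-{uuid.uuid4().hex[:8]}"
    ACTIVE.add(env_id)  # stand-in for "spin up"
    try:
        yield env_id
    finally:
        ACTIVE.discard(env_id)  # stand-in for "tear down"; runs even on error


with environment("python-3.12") as env_id:
    # The environment exists only inside this block.
    assert env_id in ACTIVE

# Nothing is left behind once the block exits.
assert not ACTIVE
```

Because the cleanup sits in a `finally` clause, an exception raised by the agent inside the block still triggers teardown.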
Any project, any image
IdeGYM loads projects from a Git URL, archive, or mounted volume, and it builds custom Docker images via a plugin API.
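One way to picture the three project sources is as a small tagged union. The class names, field names, and the `describe` helper below are assumptions made for illustration. They do not reflect IdeGYM's real configuration schema or plugin API.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class GitSource:
    url: str


@dataclass(frozen=True)
class ArchiveSource:
    path: str


@dataclass(frozen=True)
class VolumeSource:
    mount_path: str


# Hypothetical union of the three source kinds named in the text.
ProjectSource = Union[GitSource, ArchiveSource, VolumeSource]


def describe(source: ProjectSource) -> str:
    """Return a human-readable description of how a project would be loaded."""
    if isinstance(source, GitSource):
        return f"clone {source.url}"
    if isinstance(source, ArchiveSource):
        return f"unpack {source.path}"
    if isinstance(source, VolumeSource):
        return f"use mounted volume at {source.mount_path}"
    raise TypeError(f"unsupported source: {source!r}")


print(describe(GitSource("https://example.com/repo.git")))
```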
Request forwarding
IdeGYM proxies requests from your training loop directly to running pods and returns responses so you can compute rewards offline or replay episodes.
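The forwarding idea can be sketched as a thin layer that routes each request to a named pod and records the exchange, so rewards can be computed later from the log. Everything here is hypothetical: the `Forwarder` class, the `echo_pod` stub, and the `reward` rule are illustrative assumptions, and a plain Python function stands in for a running pod.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A "pod" is modelled as a function from request text to response text.
Handler = Callable[[str], str]


@dataclass
class Forwarder:
    """Routes requests to pods and keeps a log of every exchange."""
    pods: Dict[str, Handler]
    log: List[Tuple[str, str, str]] = field(default_factory=list)

    def forward(self, pod_id: str, request: str) -> str:
        response = self.pods[pod_id](request)
        self.log.append((pod_id, request, response))  # kept for replay
        return response


def echo_pod(request: str) -> str:
    """Stub pod that acknowledges whatever it receives."""
    return f"ok:{request}"


def reward(response: str) -> float:
    """Toy reward rule: 1.0 for an acknowledged request, else 0.0."""
    return 1.0 if response.startswith("ok:") else 0.0


fwd = Forwarder(pods={"pod-a": echo_pod})
reply = fwd.forward("pod-a", "run tests")

# Rewards are computed offline, from the recorded log rather than live calls.
rewards = [reward(resp) for _, _, resp in fwd.log]
```

Keeping the log separate from the reward rule means the same recorded episode can be rescored with a different rule without rerunning the environment.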