Product, Jun 6, 2026

freestyle-team

The Best AI Sandbox for Background Processes

The hardest part of an AI sandbox is not running a command.

The hardest part is everything that keeps running after the command returns.

A serious coding agent starts dev servers, test watchers, package installers, database daemons, queue workers, browser automation processes, language servers, build tools, log streams, and repair loops. Some of those processes should exit. Some should restart. Some should stay alive while the user closes the browser. Some should be visible in a terminal later, after the agent has already moved on to another task.

That is where narrow code sandboxes get uncomfortable. They are usually designed around a single request: send code in, get output back. Background processes are not shaped like that. They are machine behavior.

The best AI sandbox for background processes is a real Linux VM. Freestyle VMs are built for that job: hardware-virtualized, real Linux machines that can run forever when configured that way, while still exposing programmatic controls for commands, lifecycle, terminals, ports, snapshots, and cleanup.

The background process test

If you are evaluating an AI agent sandbox, ask it to do something that requires more than one foreground command.

Start a web server. Keep a file watcher running. Run a worker that prints one line every few seconds. Install a service under systemd. Follow its logs from a terminal. Detach. Reconnect. Expose the server on HTTPS. Stop the machine intentionally. Start it again. Resize it if the workload grows. Delete it cleanly when the session is done.

That sequence sounds ordinary because it is ordinary. It is how software behaves before it is packaged into a polished platform.

The sandbox either supports that lifecycle naturally, or your application has to simulate it. You add a job table, a log buffer, a process registry, a preview router, a reconnect path, a restart policy, and a cleanup system. At some point the "sandbox" becomes a half-built operating system with a provider-specific API in front of it.

Agents do not need a more elaborate imitation of a computer. They need a real one.

Background work is not a return value

A command runner has a simple mental model:

run this
return stdout
return stderr
return exit code
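To make that model concrete, here is a minimal local sketch in Python using only the standard library. It is not Freestyle's vm.exec() implementation; the names run_once and CommandResult are hypothetical and exist only for illustration.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CommandResult:
    """The three things a one-shot command runner hands back."""
    stdout: str
    stderr: str
    exit_code: int


def run_once(argv: list[str], timeout: float = 30.0) -> CommandResult:
    """Run one foreground command and return only after it has exited."""
    done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return CommandResult(done.stdout, done.stderr, done.returncode)


if __name__ == "__main__":
    # Hypothetical one-shot task: run a tiny script with the current interpreter.
    result = run_once([sys.executable, "-c", "print('hello')"])
    print(result)
```

The call blocks until the child exits, which is exactly why this shape works for installs and tests and breaks down for anything meant to keep running.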

That model is useful. Freestyle VMs expose vm.exec() because many tasks really are one-shot operations: install a package, run a test, write a script, inspect a directory, or check a process.

But background processes do not fit cleanly into a single response. A dev server is useful because it keeps serving. A queue worker is useful because it waits for future jobs. A file watcher is useful because it reacts to changes. A language server is useful because it stays warm. A database is useful because other processes can connect to it.

If the platform treats those as failed commands, the agent loses the most important part of the environment. It can run code, but it cannot operate software.
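The contrast with a one-shot command can be sketched locally in Python. This is an illustration of the general idea, not a Freestyle API: start_background, is_alive, and stop are hypothetical helper names. The point is that the start call returns while the process is still alive, so its usefulness lives outside any return value.

```python
import subprocess
import sys


def start_background(argv: list[str]) -> subprocess.Popen:
    """Start a process and return immediately with a handle to it."""
    return subprocess.Popen(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )


def is_alive(proc: subprocess.Popen) -> bool:
    """A background process is 'working' while it has no exit code yet."""
    return proc.poll() is None


def stop(proc: subprocess.Popen, grace: float = 5.0) -> int:
    """Ask the process to exit, then force it if it ignores the request."""
    proc.terminate()
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        proc.kill()
        return proc.wait()


if __name__ == "__main__":
    # Stand-in for a dev server or worker: a process that simply stays up.
    worker = start_background([sys.executable, "-c", "import time; time.sleep(60)"])
    print("alive after start returned:", is_alive(worker))
    stop(worker)
    print("alive after stop:", is_alive(worker))
```

Even this small version needs a handle, a liveness check, and a stop path. Multiply that by logs, restarts, and ports, and the case for using the operating system's own tools becomes clear.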

That distinction matters for product teams building coding agents, app builders, browser agents, eval harnesses, internal automation, and data tools. The agent may begin with code execution, but the product quickly becomes process orchestration. The runtime needs to hold multiple living things at once and let the agent inspect them through normal Linux tools.

A process needs a supervisor

The clean way to run background software on Linux is not to hope that a shell stays open forever. It is to use the operating system.

Freestyle's docs show this pattern across service guides. A Vite dev server can run under systemd so the service stays alive and restarts. A VM domain mapping can route public HTTPS traffic to a port inside the VM while the process listens on 0.0.0.0. Docker can run as a normal daemon under systemd, with Compose stacks, published ports, live logs, native overlayfs storage, cgroup v2, and bridge networking.

Those details are not ornamental. They are the difference between a sandbox that can run a demo and a sandbox that can host the messy middle of agent work.

When an agent starts a service under systemd, the product gets operating-system semantics:

the service has a unit name
the process has a restart policy
logs are available through the journal
the service can bind a port
the service can start on boot
the service can be stopped without losing the whole machine
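Most of those semantics come from a few lines of unit file. The sketch below renders one in Python. The systemd directives are standard, but the unit description, the /srv/app directory, the npm command line, and the render_unit helper are assumptions chosen for illustration, not values taken from Freestyle's guides.

```python
def render_unit(description: str, exec_start: str, working_dir: str,
                restart: str = "always") -> str:
    """Render a minimal systemd service unit as text."""
    return "\n".join([
        "[Unit]",
        f"Description={description}",
        "After=network.target",
        "",
        "[Service]",
        f"WorkingDirectory={working_dir}",
        f"ExecStart={exec_start}",
        f"Restart={restart}",          # restart policy
        "RestartSec=2",                # pause between restart attempts
        "",
        "[Install]",
        "WantedBy=multi-user.target",  # start on boot once enabled
        "",
    ])


if __name__ == "__main__":
    # Hypothetical dev server; adjust the path and command to the project.
    print(render_unit(
        description="Vite dev server",
        exec_start="/usr/bin/npm run dev -- --host 0.0.0.0 --port 3000",
        working_dir="/srv/app",
    ))
```

Written to a path such as /etc/systemd/system/vite-dev.service (a hypothetical name), the unit would typically be activated with systemctl daemon-reload followed by systemctl enable --now vite-dev.service.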

That is the interface agents already know how to use. They can run systemctl status, follow journalctl, inspect open ports, update a unit, restart a daemon, and keep working. The platform does not need to invent a custom background-job protocol for every kind of process.
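As a rough sketch of what that looks like from the agent's side, the helper below builds the standard command lines for one service. The commands and flags are ordinary systemctl, journalctl, and ss usage; the service_commands helper and the vite-dev.service unit name are hypothetical.

```python
import shlex


def service_commands(unit: str) -> dict[str, list[str]]:
    """Standard Linux commands an agent might run against one service."""
    return {
        "status": ["systemctl", "status", unit, "--no-pager"],
        "logs": ["journalctl", "-u", unit, "-n", "50", "-f"],  # last 50 lines, then follow
        "ports": ["ss", "-ltnp"],                              # listening TCP sockets
        "restart": ["systemctl", "restart", unit],
    }


if __name__ == "__main__":
    for name, argv in service_commands("vite-dev.service").items():
        print(f"{name:8} {shlex.join(argv)}")
```

Nothing here is specific to a sandbox provider, which is the point: the agent's existing knowledge of Linux transfers directly.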

Terminals are how agents debug running things

Logs are not just a storage problem. They are an interaction problem.

When a process is still running, the agent often needs to watch it, interrupt it, send input, resize a terminal, detach from the session, and come back later. vm.exec() is intentionally too coarse for that. It buffers a command's output and returns when the command ends.

Freestyle VMs expose persistent PTY sessions for the cases where the program is interactive or long-lived. A PTY is a real pseudo-terminal inside the VM. It can be opened over WebSocket, written to, detached from, and reattached later. Sessions survive client disconnects and VM suspends. They can drive REPLs, editors, debuggers, package managers, dev servers, and log tails without forcing the product to respawn each command.
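For readers who have not worked with pseudo-terminals directly, here is a minimal local sketch using Python's standard pty module on Linux. It is not Freestyle's PTY API and does not show the WebSocket transport or reattachment; open_shell and read_until are hypothetical helper names. It only shows the core idea: a long-lived shell you can write to and read from over time.

```python
import os
import pty
import select
import subprocess
import time


def open_shell() -> tuple[subprocess.Popen, int]:
    """Start /bin/sh attached to a new pseudo-terminal; return (process, master fd)."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(
        ["/bin/sh"],
        stdin=slave_fd, stdout=slave_fd, stderr=slave_fd,
        start_new_session=True,
    )
    os.close(slave_fd)  # the child keeps its own copy
    return proc, master_fd


def read_until(master_fd: int, marker: bytes, timeout: float = 5.0) -> bytes:
    """Read terminal output until marker appears or the timeout expires."""
    deadline = time.monotonic() + timeout
    buf = b""
    while marker not in buf:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        ready, _, _ = select.select([master_fd], [], [], remaining)
        if not ready:
            break
        try:
            chunk = os.read(master_fd, 4096)
        except OSError:  # raised on Linux once the child side has closed
            break
        if not chunk:
            break
        buf += chunk
    return buf


if __name__ == "__main__":
    shell, fd = open_shell()
    os.write(fd, b"echo $((6*7))\n")   # send input as if typed
    print(read_until(fd, b"42").decode(errors="replace"))
    os.write(fd, b"exit\n")
    shell.wait(timeout=5)
    os.close(fd)
```

The shell stays alive between writes, so each new command reuses the same session state instead of respawning a process.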

That matters because background processes usually fail in motion. The install script hangs on a prompt. The server prints an error only after the first request. The watcher recompiles on save. The test runner waits for a file change. The debugger needs a signal. The agent needs to read the last screen, send Ctrl-C, edit something, and try again.

Those are terminal workflows. They should remain terminal workflows.

Ports are part of the process model

Many background processes are valuable because they accept traffic.

An app builder needs a preview. A browser agent needs a target application. A webhook tester needs a callback. A notebook server needs an HTTPS URL. A worker dashboard needs a port. A local API needs another service to call it.

Freestyle VM domains map public HTTPS traffic from a hostname to a port inside a VM. The docs show the normal sequence: verify the domain, point DNS at Freestyle, map the domain to a VM port, and run a service in the VM that listens on that port. For HTTP servers, the service listens on 0.0.0.0, and HTTPS is provisioned automatically.
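The in-VM half of that sequence is just a process listening on all interfaces. A minimal sketch with Python's standard library follows. Port 3000, the Hello handler, and the make_server helper are assumptions for illustration, and the domain mapping itself happens on the platform side and is not shown.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Hello(BaseHTTPRequestHandler):
    """Tiny stand-in for a preview app or API."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port: int, host: str = "0.0.0.0") -> ThreadingHTTPServer:
    """Bind on all interfaces so traffic routed to the VM can reach the process."""
    return ThreadingHTTPServer((host, port), Hello)


if __name__ == "__main__":
    server = make_server(3000)  # hypothetical port; match it to the domain mapping
    threading.Thread(target=server.serve_forever, daemon=True).start()
    with urllib.request.urlopen("http://127.0.0.1:3000/") as resp:
        print(resp.status, resp.read())
    server.shutdown()
    server.server_close()
```

A server bound only to 127.0.0.1 would answer the local check above but would not be reachable from outside the machine, which is why the listen address matters.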

That is exactly how background process infrastructure should feel. The preview is not a screenshot feature or a temporary artifact from a command response. It is a real service running on a real machine, reachable through normal network routing.

This becomes important when agents build full applications. The generated app may run a frontend on one port, an API on another, a worker beside it, and a database daemon underneath. The agent should be able to inspect all of that with standard tools instead of translating the environment into a set of bespoke sandbox concepts.

Snapshots make services reusable

Background processes also change how you think about setup cost.

If every run starts from an empty sandbox, the agent pays the same setup penalty again and again: install the runtime, install packages, configure the service, start the daemon, wait until the port answers, then finally do useful work.

Freestyle snapshots let you turn a prepared VM into a reusable base. The docs show snapshots after installing language runtimes, starting Vite under systemd, starting Docker, and confirming services answer before capturing. A VM created from that snapshot can come up with the expensive pieces already prepared.
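The step of confirming services answer before capturing can be as simple as a TCP readiness check. The sketch below uses only Python's standard library. The wait_for_port helper and the port number are hypothetical, and the snapshot call itself is left out because its exact API is not described here.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.25) -> bool:
    """Return True once a TCP connection succeeds, False if the timeout passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not listening yet; try again shortly
    return False


if __name__ == "__main__":
    # Hypothetical gate: only capture the snapshot once the service answers.
    if wait_for_port("127.0.0.1", 3000, timeout=10.0):
        print("service is up; safe to snapshot")
    else:
        print("service never came up; do not snapshot")
```

Gating the capture on a check like this avoids baking a half-started service into every VM created from the snapshot.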

This is especially useful for agent products because most sessions have a common shape. Your product may always need Node, Python, Docker, a browser, a package cache, a specific CLI, or a service supervisor. Bake that baseline once, then create machines from it as work arrives.

The result is not just faster startup. It is a simpler agent loop. The agent begins in an environment that already looks like the work it is about to do, instead of spending its first several minutes rebuilding the same foundation.

Idle timeout is a product decision

Background processes raise a practical question: should the machine stay up?

There is no single answer. A quick code execution task should probably stop when idle. A preview session during active editing should stay responsive. A worker that is waiting for future input may need to run indefinitely. A heavyweight setup may be worth suspending and waking later rather than deleting.

Freestyle VMs make that lifecycle explicit. A VM can run, stop, start again, be resized, and be deleted. If a workload should stay running until your application stops or deletes it, set idleTimeoutSeconds to null. If the workload should be reclaimed after inactivity, give it an idle timeout.
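One way to keep that decision explicit in application code is a small policy table. In the sketch below, only the idleTimeoutSeconds field name comes from the text above. The workload categories, the timeout values, and the vm_options helper are assumptions, and how the options are passed to the platform is not shown.

```python
import json
from typing import Optional

# Hypothetical product policy: seconds of inactivity before the VM is reclaimed.
# None serializes to JSON null, meaning "keep running until stopped or deleted".
IDLE_POLICY: dict[str, Optional[int]] = {
    "one_shot": 60,    # quick code execution: reclaim fast
    "preview": 900,    # live preview: stay responsive while the user edits
    "worker": None,    # long-lived worker: never reclaim on idle
}


def vm_options(workload: str) -> dict[str, Optional[int]]:
    """Build the idle-timeout portion of a VM configuration for a workload type."""
    return {"idleTimeoutSeconds": IDLE_POLICY[workload]}


if __name__ == "__main__":
    for kind in IDLE_POLICY:
        print(kind, json.dumps(vm_options(kind)))
```

Keeping the policy in one place makes it easy to review which workloads are allowed to run indefinitely.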

That is the right abstraction boundary. The platform gives you real machine lifecycle controls. Your product decides when a background process is valuable enough to keep alive.

For agents, this is not just a billing setting. It is part of the user experience. A support agent, app builder, browser agent, or internal automation tool may need to resume the same process context later. A batch task may be disposable. A preview may need to be live only while the user is present. Treating those as distinct lifecycle choices keeps the product honest.

What to look for

When you evaluate an AI sandbox for background processes, ignore the happy-path runCode() demo for a minute.

Ask whether the sandbox can run a normal service supervisor. Ask whether it can expose ports directly from processes inside the environment. Ask whether it can stream logs while the process is still alive. Ask whether a terminal can detach and reattach. Ask whether a process can survive a client disconnect. Ask whether the environment can run indefinitely when the product needs it. Ask whether you can stop, start, resize, snapshot, and delete it through an API.

Also ask what happens when the agent does something unplanned. Can it install the missing package? Can it follow the service logs? Can it inspect the port? Can it restart the daemon? Can it open a second terminal? Can it use the same Linux commands a developer would use on a server?

If the answer to any of these is no, you are not just missing a feature. You are adding application complexity around every process-shaped task the agent will encounter.

The bottom line

The best AI sandbox for background processes is not a better command runner. It is a real VM.

Agents need to run software, not just snippets. Software includes foreground commands, but it also includes services, daemons, ports, terminals, logs, supervisors, watchers, queues, and long-running repair loops. Those things already have a mature interface: Linux.

Freestyle VMs give agents that interface with the isolation and API surface a product needs. They are hardware-virtualized Linux VMs that can run forever when configured to do so, expose services over HTTPS, keep interactive PTYs alive across reconnects, run supervised processes, snapshot prepared environments, resize for heavier work, and cleanly stop or delete when the job is done.

If your agent product depends on background work, choose the sandbox that treats background work as normal. Choose the computer.