Verl-SWE-RL

Kubernetes vs Docker sandboxes for Harbor trials

Every trial runs in a fresh sandbox. rl supports two interchangeable backends; dryrun.sh auto-detects which one is configured and validates it accordingly.

Kubernetes (production)

Each trial is a pod. This is the default for large, parallel runs.

runtime_info:
  input:
    harbor_agent:
      environment_import_path: harbor_patch.environments.remote_docker:RemoteDockerEnvironment
      agent_name: claude-code
    k8s:
      kubeconfig: /path/to/kubeconfig.yaml
      namespace: default

The preflight check verifies that kubectl can reach the cluster through the configured kubeconfig. Pods are labeled harbor-run=<pod_name_prefix>; scripts/clean.sh --pods deletes them by that label and refuses to run if the prefix is empty, so that it cannot delete unrelated pods.
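The empty-prefix guard can be illustrated with a short Python sketch. This is not the implementation of scripts/clean.sh; the function name build_pod_cleanup_command and its parameters are hypothetical, and only the harbor-run label and the standard kubectl flags come from the text above or from kubectl itself.

```python
from typing import List, Optional


def build_pod_cleanup_command(
    pod_name_prefix: str,
    namespace: str = "default",
    kubeconfig: Optional[str] = None,
) -> List[str]:
    """Build a kubectl command that deletes pods labeled harbor-run=<prefix>.

    Hypothetical helper for illustration; it only builds the argument list
    and never runs kubectl itself.
    """
    prefix = pod_name_prefix.strip()
    if not prefix:
        # Refuse instead of emitting a selector with an empty value.
        raise ValueError("pod_name_prefix is empty; refusing to build a delete command")

    cmd = ["kubectl"]
    if kubeconfig:
        cmd += ["--kubeconfig", kubeconfig]
    cmd += [
        "delete", "pods",
        "--namespace", namespace,
        "--selector", f"harbor-run={prefix}",
    ]
    return cmd


if __name__ == "__main__":
    # Print the command rather than executing it.
    print(" ".join(build_pod_cleanup_command("my-run", kubeconfig="/path/to/kubeconfig.yaml")))
```

Building the argument list separately from executing it keeps the guard easy to test and makes it possible to review the command before anything is deleted.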

Docker (minimal)

Each trial is a local or remote container — no cluster required.

Local Docker (just needs docker on the training host):

harbor_agent:
  environment_import_path: harbor.environments.docker.docker:DockerEnvironment
  environment_force_build: true
  docker_host: ""        # empty → unix:///var/run/docker.sock

Remote Docker (sandbox on a separate machine):

harbor_agent:
  environment_import_path: harbor_patch.environments.remote_docker:RemoteDockerEnvironment
  environment_force_build: true
  docker_host: "tcp://<docker-host-ip>:2376"

Docker daemon security

tcp://<ip>:2375 is the unencrypted Docker daemon port — anyone who can reach it has root-equivalent access to that host. Use TLS (:2376 with dockerd --tlsverify) for shared/production environments; :2375 is acceptable only on isolated test networks.
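As a rough illustration of the rules above, the following Python sketch resolves a docker_host value and flags the conventional plaintext port. The function name describe_docker_host and the risk labels are assumptions made for this example and are not part of the project. The port number is only a convention: whether TLS is actually enforced depends on how dockerd was started.

```python
from typing import Tuple
from urllib.parse import urlparse

DEFAULT_DOCKER_SOCKET = "unix:///var/run/docker.sock"
PLAINTEXT_PORT = 2375  # conventional unencrypted daemon port
TLS_PORT = 2376        # conventional TLS daemon port


def describe_docker_host(docker_host: str) -> Tuple[str, str]:
    """Return (resolved_url, risk_label) for a docker_host setting.

    Hypothetical helper: an empty value falls back to the local socket,
    and the label is a hint based on scheme and port only.
    """
    url = docker_host.strip() or DEFAULT_DOCKER_SOCKET
    parsed = urlparse(url)
    if parsed.scheme == "unix":
        return url, "local-socket"
    if parsed.scheme == "tcp" and parsed.port == PLAINTEXT_PORT:
        return url, "plaintext"  # acceptable only on isolated test networks
    if parsed.scheme == "tcp" and parsed.port == TLS_PORT:
        return url, "tls-port"   # still requires dockerd --tlsverify on the host
    return url, "unknown"


if __name__ == "__main__":
    for value in ("", "tcp://10.0.0.5:2375", "tcp://10.0.0.5:2376"):
        print(describe_docker_host(value))
```

A check like this could run before training starts, so that a plaintext endpoint is noticed early rather than after sandboxes are already exposed.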

The docker SDK

Docker mode needs the Python docker SDK in the venv (uv pip install --python $VENV/bin/python docker) — the docker CLI is not enough. Watch for namespace shadowing: a docker/ directory on sys.path (e.g. verl/docker/) can mask the real SDK. Verify with $VENV/bin/python -c "from docker import DockerClient; print('ok')".
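When the one-line check fails, it helps to see where Python resolves the docker name from. The sketch below uses only the standard library; the helper name locate_module is hypothetical. Run it with $VENV/bin/python from the same working directory as training, since sys.path depends on both.

```python
import importlib.util
from typing import Optional


def locate_module(name: str) -> Optional[str]:
    """Return where Python would import `name` from, without importing it.

    Returns None if the module cannot be found, and a descriptive string
    for a namespace package (a bare directory without __init__.py).
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    if spec.origin is None or spec.origin == "namespace":
        locations = ", ".join(spec.submodule_search_locations or [])
        return f"<namespace package: {locations}>"
    return spec.origin


if __name__ == "__main__":
    # The real SDK should resolve to a path inside the venv's site-packages.
    print(locate_module("docker"))
```

If the printed path points at a directory such as verl/docker/ rather than the venv's site-packages, or the result is a namespace package or None, the SDK is either shadowed or not installed.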

The k8s section is ignored in Docker mode, and vice versa.
