Verl-SWE-RL

Reference

Status & State

Where live run state lives and how it is written

config.yaml is one-shot per run — every key is configuration. Live state (what is running, what just finished) lives separately so the config stays a clean run profile.

Live run state — the process table

There is no status section in config.yaml. Whether a run is live right now is answered by the process table:

pgrep -af 'sync_1node_cc|train_1node_cc'   # launch wrappers
pgrep -af 'main_ppo'                       # verl trainer

/rl:check and /rl:run both probe these. /rl:run refuses to start while a matching training PID is alive; once nothing matches, the box is free — there is no stale flag to clean up after a crash.
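
The guard described above can be sketched in a few lines of bash. This is a minimal illustration, not the actual /rl:run implementation: the patterns are copied from the pgrep commands above, while `TRAIN_PATTERN`, `training_is_live`, and `guard_launch` are hypothetical names introduced here.

```shell
#!/usr/bin/env bash
# Sketch only: how a launcher could refuse to start while training is alive.
# TRAIN_PATTERN, training_is_live and guard_launch are hypothetical names.

# Same patterns as the pgrep probes above, joined into one extended regex.
TRAIN_PATTERN='sync_1node_cc|train_1node_cc|main_ppo'

# Exit status 0 if any process command line matches the pattern, 1 otherwise.
training_is_live() {
  pgrep -f "$TRAIN_PATTERN" >/dev/null
}

# Returns 1 (and launches nothing) while a matching process exists.
guard_launch() {
  if training_is_live; then
    echo "refusing to start: a training process is already running" >&2
    return 1
  fi
  echo "box is free"
}
```

Because the check reads the process table each time, a crashed run leaves nothing behind: once the PID is gone, `guard_launch` succeeds again without any cleanup step.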

artifacts/index.yaml

The append-only run ledger, one row per run, written by the EXIT trap in scripts/archive_run.sh:

- id: run_001
  started_at: "..."
  completed_at: "..."
  status: completed        # running | completed | failed
  archive: artifacts/archives/run_001/
  notes: "one-line summary"
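
The real scripts/archive_run.sh is not reproduced on this page, so the following is only a sketch of how an EXIT trap could append such a row. The field names come from the example above. Everything else is an assumption: the `INDEX` variable, the `append_ledger_row`, `on_exit`, and `run_with_ledger` helpers, the UTC timestamp format, and the mapping of a non-zero exit code to `failed`.

```shell
#!/usr/bin/env bash
# Sketch only: NOT the real scripts/archive_run.sh.
# INDEX, append_ledger_row, on_exit and run_with_ledger are hypothetical names.

INDEX="${INDEX:-artifacts/index.yaml}"

# Append one row to the ledger; earlier rows are never rewritten.
# Note: notes containing double quotes would need YAML escaping.
append_ledger_row() {
  local id="$1" started="$2" completed="$3" status="$4" notes="$5"
  mkdir -p "$(dirname "$INDEX")"
  {
    printf -- '- id: %s\n' "$id"
    printf '  started_at: "%s"\n' "$started"
    printf '  completed_at: "%s"\n' "$completed"
    printf '  status: %s\n' "$status"
    printf '  archive: artifacts/archives/%s/\n' "$id"
    printf '  notes: "%s"\n' "$notes"
  } >> "$INDEX"
}

# Called from the EXIT trap with the exit code of the run.
on_exit() {
  local rc="$1" status=completed
  [ "$rc" -eq 0 ] || status=failed   # assumed mapping: non-zero means failed
  append_ledger_row "$RUN_ID" "$STARTED_AT" \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$status" "exit code $rc"
}

# Run a command in a subshell whose EXIT trap records the outcome.
run_with_ledger() {
  (
    RUN_ID="$1"; shift
    STARTED_AT="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    trap 'on_exit $?' EXIT
    "$@"
  )
}
```

A call such as `run_with_ledger run_001 ./train.sh` (with a hypothetical `train.sh`) would append exactly one row whether the command succeeds or fails, which is why an EXIT trap suits a ledger that must be written on every way out of the script.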

This is the source of truth for run history. The newest entry tells you what the box last did.
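
Since the ledger is append-only, the newest entry is simply the last row in the file. A minimal sketch for pulling it out with awk, assuming the row layout shown above (each row starts with `- id:` in column one); `latest_entry` and `latest_status` are hypothetical helper names:

```shell
#!/usr/bin/env bash
# Sketch only: read the newest row of an append-only YAML ledger.
# latest_entry and latest_status are hypothetical helper names.

# Print the last row: restart the buffer at every "- id:" line,
# so whatever is buffered at end of file is the newest entry.
latest_entry() {
  awk '/^- id:/ { entry = "" }
       { entry = entry $0 "\n" }
       END { printf "%s", entry }' "$1"
}

# Print only the status value of the newest row.
latest_status() {
  latest_entry "$1" | awk '$1 == "status:" { print $2 }'
}
```

For example, `latest_status artifacts/index.yaml` prints the status word of the most recent run and ignores any trailing comment on that line.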

Visual state

For a live view, the dashboard reads the run logs directly — no status file required. It infers a run's state from log file mtime/size (running vs finished) and renders the per-step metrics.
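
The dashboard's exact rule is not spelled out here. One plausible version of the mtime check is sketched below, under the assumption that a log untouched for a few minutes belongs to a finished run. The `log_state` name and the 5-minute default threshold are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: classify a run by how recently its log was modified.
# log_state and the 5-minute default are assumptions, not the dashboard's code.

log_state() {
  local log="$1" idle_min="${2:-5}"
  if [ ! -f "$log" ]; then
    echo missing
    return 1
  fi
  # find prints the path only if it was modified less than idle_min minutes ago.
  if [ -n "$(find "$log" -mmin "-$idle_min")" ]; then
    echo running
  else
    echo finished
  fi
}
```

A heuristic like this cannot tell a finished run from a stalled one, so the threshold would need to exceed the longest expected gap between log writes.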
