BenchCI Agent¶
BenchCI Agent runs on machines connected to real hardware. It exposes an HTTP API for remote execution, registered benches, run status, events, artifacts, and remote GPIO.
The Agent can also run in Cloud Agent mode, where it polls the BenchCI backend for assignments and reports results back to it.
What the Agent does¶
The Agent can:
accept uploaded-bench runs
accept registered-bench runs
queue runs
enforce one active run per bench
execute run_local(...) near the hardware
expose structured run events
package and serve artifacts
provide remote GPIO endpoints for split deployments
connect to the BenchCI backend as a cloud execution worker
Where the Agent fits¶
The Agent is the remote execution layer in BenchCI.
It is used when you want:
developer machines to stay separate from hardware machines
CI pipelines to trigger real hardware tests over the network
multiple reusable benches behind one machine
registered bench IDs instead of repeatedly uploading bench definitions
split deployments where a Linux machine exposes GPIO remotely
In the backend-controlled cloud path, the Agent also acts as the execution worker that polls the backend for assigned runs and uploads results after execution.
Start the Agent¶
benchci agent serve
Default settings:
host: 0.0.0.0
port: 8080
Authentication¶
If BENCHCI_AGENT_TOKEN is set, the Agent requires Authorization: Bearer <token> for protected endpoints.
Example:
export BENCHCI_AGENT_TOKEN=secure-token
benchci agent serve
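On the client side, the same token has to be sent with each protected request. A minimal Python sketch, assuming the client reads the token from the same environment variable; `auth_headers` is a hypothetical helper name, not part of BenchCI:

```python
import os


def auth_headers(token=None):
    """Return HTTP headers for a protected Agent endpoint.

    Hypothetical helper: falls back to BENCHCI_AGENT_TOKEN when no token
    is passed, and adds no Authorization header when none is available.
    """
    token = token or os.environ.get("BENCHCI_AGENT_TOKEN")
    if not token:
        return {}
    return {"Authorization": f"Bearer {token}"}
```

The returned dict can be passed as the headers of any HTTP client request.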
Health check¶
curl http://localhost:8080/health
The health response includes information such as:
service name
Agent version
agent_id
agent_name
queue depth
whether auth is enabled
number of active GPIO sessions
whether registered bench mode is enabled
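The health check can also be scripted. The sketch below assumes the response body is JSON and uses only the two field names given above (agent_id and agent_name); the function names are illustrative, and the key names for the other fields are not documented here, so they are left out:

```python
import json
import urllib.request


def fetch_health(base_url="http://localhost:8080", timeout=5.0):
    """GET /health and return the decoded body (assumed to be JSON)."""
    url = base_url.rstrip("/") + "/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def describe_agent(health):
    """Format the Agent identity from a health payload, tolerating missing keys."""
    return f"{health.get('agent_name', '?')} ({health.get('agent_id', '?')})"
```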
Registered benches¶
The Agent can optionally load agent.yaml and register benches at startup.
Example:
agent:
  id: lab-agent-1
  name: Main hardware lab
benches:
  - id: nucleo-uart
    bench_file: /opt/benchci/benches/nucleo-uart/bench.yaml
    tags: [stm32, uart]
  - id: plc-modbus
    bench_file: /opt/benchci/benches/plc-modbus/bench.yaml
    tags: [modbus, industrial]
When agent.yaml is present, the Agent loads the file, resolves each bench_file, and exposes those benches through the API.
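The load-and-resolve step can be pictured roughly as follows. This is an illustrative sketch, not the Agent's actual code: it takes the already-parsed agent.yaml content as a dict, assumes `benches` is a top-level key, and assumes relative bench_file values resolve against a base directory. `build_bench_registry` is a hypothetical name.

```python
from pathlib import Path


def build_bench_registry(config, base_dir="."):
    """Map each bench id to its bench_file path.

    `config` is the parsed agent.yaml content. Resolving relative paths
    against `base_dir` is an assumption of this sketch.
    """
    registry = {}
    for entry in config.get("benches", []):
        bench_id = entry["id"]
        if bench_id in registry:
            raise ValueError(f"duplicate bench id: {bench_id}")
        path = Path(entry["bench_file"])
        if not path.is_absolute():
            path = Path(base_dir) / path
        registry[bench_id] = path
    return registry
```

Rejecting duplicate ids up front keeps each bench_id unambiguous when it is later used in registered-bench runs.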
Bench endpoints¶
List benches:
curl -H "Authorization: Bearer $BENCHCI_AGENT_TOKEN" \
http://localhost:8080/v1/benches
Get one bench:
curl -H "Authorization: Bearer $BENCHCI_AGENT_TOKEN" \
http://localhost:8080/v1/benches/nucleo-uart
Bench summaries include:
bench_id
agent_id
bench name and description
tags
busy/idle status
current run ID
capability summary:
transports
whether GPIO exists
whether power resources exist
flash backends
node count
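A client can use these summaries to pick a free bench. The sketch below assumes each summary is a dict; `bench_id` comes from the list above, while the `tags` and `busy` key names are hypothetical stand-ins for the tag list and busy/idle status:

```python
def idle_benches_with_tag(summaries, tag):
    """Return the bench_id of every idle bench that carries `tag`.

    Assumes hypothetical keys: `tags` (list of str) and `busy` (bool).
    """
    return [
        s["bench_id"]
        for s in summaries
        if tag in s.get("tags", []) and not s.get("busy", False)
    ]
```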
Run submission modes¶
Uploaded-bench mode¶
The client uploads:
bench.yaml
suite.yaml
optional artifact file
skip_flash
optional verbose
This is backward-compatible and does not require a pre-registered bench.
Registered-bench mode¶
The client submits JSON containing:
bench_id
suite_yaml
optional uploaded artifact payload
skip_flash
optional verbose
This mode reuses a bench already known by the Agent and is the preferred model for shared hardware infrastructure.
Run endpoints¶
Create uploaded-bench run:
POST /v1/runs
Create registered-bench run:
POST /v1/runs/json
Fetch run status:
GET /v1/runs/{run_id}
Fetch run events:
GET /v1/runs/{run_id}/events
Download artifacts ZIP:
GET /v1/runs/{run_id}/artifacts.zip
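As an illustration of registered-bench submission, the following sketch builds (without sending) a request for POST /v1/runs/json. The field names match the list in Registered-bench mode; the value types are assumptions (suite YAML as a string, booleans for the flags), the artifact payload is omitted because its encoding is not described here, and the function name is hypothetical:

```python
import json
import urllib.request


def build_registered_run_request(base_url, bench_id, suite_yaml,
                                 token=None, skip_flash=False, verbose=False):
    """Build a POST /v1/runs/json request for a registered bench."""
    body = {
        "bench_id": bench_id,
        "suite_yaml": suite_yaml,   # assumed: suite definition as text
        "skip_flash": skip_flash,   # assumed: boolean flag
        "verbose": verbose,         # assumed: boolean flag
    }
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/runs/json",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the run.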
Cloud Agent mode¶
In Cloud Agent mode, the Agent does not wait for direct run submissions from a user or CI runner. Instead, it polls the BenchCI backend.
Example:
benchci agent cloud \
--backend https://benchci-backend.fly.dev \
--token YOUR_AGENT_TOKEN \
--bench bench.yaml \
--bench-id my-bench \
--agent-name "Lab Agent 01"
The cloud Agent loop:
sends heartbeat
syncs bench summary/capabilities
polls for the next assignment
executes the assigned suite with run_local(...)
sends structured events
uploads artifacts
reports completion
The Agent token is created during workspace/customer onboarding by the BenchCI owner/admin process.
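One iteration of the cloud Agent loop listed above can be sketched as below. Everything here is illustrative: `backend` is a hypothetical client object, its method names are invented for the sketch, and `run_suite` stands in for run_local(...), whose real signature and return value are not described in this page.

```python
def cloud_agent_step(backend, run_suite):
    """Run one iteration of the cloud loop; return True if a run executed.

    `backend` (hypothetical) wraps the BenchCI backend calls and
    `run_suite` (hypothetical) wraps run_local(...).
    """
    backend.heartbeat()
    backend.sync_bench()
    assignment = backend.next_assignment()
    if assignment is None:
        return False  # nothing assigned; caller sleeps and polls again
    result = run_suite(assignment)
    backend.send_events(assignment, result["events"])
    backend.upload_artifacts(assignment, result["artifacts_dir"])
    backend.report_completion(assignment, result["exit_code"])
    return True
```

The sketch sends events after execution for simplicity; it only mirrors the order of the list and does not claim that the real Agent batches events this way.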
Run lifecycle¶
Agent runs move through these states:
queued
preparing
running
uploading_artifacts
done
failed
Each run stores:
current status
mode (uploaded or registered)
optional bench_id
exit_code
timestamps
current test and step
structured events
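A client waiting on a run typically polls until the status stops changing. The sketch below assumes that done and failed are the terminal states, which the names suggest but this page does not state outright. `fetch_status` is a caller-supplied callable, for example one that calls GET /v1/runs/{run_id} and extracts the status string.

```python
import time

TERMINAL_STATES = {"done", "failed"}  # assumed terminal states


def wait_for_run(fetch_status, poll_interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll until the run reaches a terminal state and return that state."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still in state {status!r}")
        sleep(poll_interval)
```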
Verbose runs¶
BenchCI supports verbose execution mode for both uploaded-bench and registered-bench runs.
When verbose mode is enabled:
the runner produces richer step-level diagnostics
failure messages include additional context
more detailed log information may appear in artifacts
event payloads may include richer execution details
run behavior stays the same; only observability changes
Verbose mode is controlled by the CLI using --verbose and is propagated through the Agent to the underlying run_local(...) execution.
Notes:
verbose output is primarily visible in artifacts and structured events
the Agent does not stream raw verbose terminal output directly to clients
the best place to inspect verbose remote diagnostics is the downloaded artifact ZIP
Scheduling model¶
The Agent keeps:
one queue for submitted runs
one lock per registered bench
This means:
a bench can only run one job at a time
different benches can still be queued independently
uploaded-bench runs are serialized through a synthetic uploaded-bench lock
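The locking side of this model can be illustrated with a small sketch. It is not the Agent's implementation: the class name and the key used for the synthetic uploaded-bench lock are hypothetical, and queue handling is left out.

```python
import threading
from collections import defaultdict

UPLOADED_LOCK_KEY = "__uploaded__"  # hypothetical key for the synthetic lock


class BenchLocks:
    """Hand out one lock per bench id, created on first use."""

    def __init__(self):
        self._guard = threading.Lock()          # protects the dict itself
        self._locks = defaultdict(threading.Lock)

    def lock_for(self, bench_id=None):
        """Return the lock for a registered bench, or the shared
        uploaded-bench lock when no bench_id is given."""
        key = bench_id if bench_id is not None else UPLOADED_LOCK_KEY
        with self._guard:
            return self._locks[key]
```

A worker would hold `lock_for(bench_id)` for the duration of a run, so two runs on the same bench serialize while runs on different benches do not block each other.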
Events¶
The Agent stores structured events emitted by the runner, such as:
run.started
test.started
step.started
step.finished
step.failed
run.finished
run.failed
These events make it possible to build richer CLI or UI progress views.
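As a rough example of such a progress view, the sketch below reduces an event list to a few counters. It assumes each event is a dict whose `type` key holds one of the names listed above; that key name and the overall payload shape are hypothetical.

```python
from collections import Counter


def summarize_events(events):
    """Count step outcomes and report the last run-level outcome, if any."""
    counts = Counter(e["type"] for e in events)
    final = next(
        (e["type"] for e in reversed(events)
         if e["type"] in ("run.finished", "run.failed")),
        None,
    )
    return {
        "steps_finished": counts["step.finished"],
        "steps_failed": counts["step.failed"],
        "final": final,
    }
```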
Remote GPIO¶
The Agent also exposes remote GPIO endpoints used by remote_gpio benches:
POST /v1/gpio/session/start
POST /v1/gpio/session/stop
POST /v1/gpio/set
POST /v1/gpio/get
POST /v1/gpio/wait_value
POST /v1/gpio/wait_edge
This allows one Linux machine to control GPIO on behalf of a runner or another Agent-accessible workflow.
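A client wrapper might centralize the endpoint paths as in the sketch below. Only the six paths come from this page; the helper name is hypothetical, and request bodies are deliberately left out because their schema is not documented here.

```python
GPIO_OPERATIONS = frozenset({
    "session/start",
    "session/stop",
    "set",
    "get",
    "wait_value",
    "wait_edge",
})


def gpio_url(base_url, operation):
    """Return the full URL for a remote GPIO operation (all use POST)."""
    if operation not in GPIO_OPERATIONS:
        raise ValueError(f"unknown GPIO operation: {operation}")
    return base_url.rstrip("/") + "/v1/gpio/" + operation
```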
Artifacts¶
Runs produce artifacts such as:
results.json
per-node transport logs
flash.log
gpio.log
power logs when power resources are used
Verbose runs may produce more detailed artifact content for debugging and failure analysis.
The Agent packages the run results directory as a ZIP and serves it through the artifacts endpoint, or uploads it to the backend in Cloud Agent mode.
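Once the ZIP has been downloaded, results.json can be read without unpacking the archive to disk. A minimal sketch, assuming results.json contains JSON and may sit at any depth inside the archive (the internal layout is not specified here); `read_results` is a hypothetical helper:

```python
import io
import json
import zipfile


def read_results(zip_bytes):
    """Return the parsed results.json from an artifacts ZIP, or None if absent.

    Members are matched by file name, so the directory prefix does not matter.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.rsplit("/", 1)[-1] == "results.json":
                return json.loads(archive.read(name).decode("utf-8"))
    return None
```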