
v2.0.0 — "shell is pure client" arc

Released 2026-04-24. 21 commits from v1.0.0 to v2.0.0.

v1.0.0 introduced multi-host orchestration (convocate-host + convocate-agent, SSH peering, rsyslog TLS forwarding). v2.0.0 finishes the split: convocate no longer runs containers locally. Every session lives on a convocate-agent. The shell host is an orchestration client plus the image-build and DNS authority.

Plan (executed in order)

Phase  Feature                                                             Commit
A      Refuse to run convocate/convocate-agent under the wrong uid         ec5b65e
B      Agent list stamps live running state per session                    7790eda
C      Remote attach via convocate-agent-attach from the TUI               0314741
D      Agent-selector dropdown in Create dialog                            26e3237
E      Strip local docker from convocate; orphan pre-v2 sessions           be6a977
F      Adopt pre-existing sessions during convocate-agent install          3716e73
G      Shell→agent heartbeat + auto-reconnect in CRUDClient                d0eac88
H1     Tag images with binary Version; Runner reads current-image pointer  00d80bf
H2     docker save | gzip | sha256 | ssh | docker load transfer helper     af97c5d
H3     init-agent pushes current image + writes current-image              812df1c
H4     convocate-host update pushes image after binary update              aedca6c
H5     Daily image-prune cron on agent (keep in-use + latest)              4e6427a
       convocate-sessions.slice 90% cgroup cap on each agent               3efa55a
#2     Port split: agent :222, shell status :223                           ec94090
#3     convocate-host migrate-session for orphan → agent moves             c030f7a
#4     Restore DNS registration, map names to agent IPs                    64943a2
#5     Move skel + claude-CLI provisioning to agent install                cd8f8df
#6     Mark orphans with O status in the TUI                               63e6501
#7     Route Enter on stopped remote session to not-running dialog         368d575
#8     Track attach state per session so TUI renders C for remote          c54d555
#11    init-agent --ca-cert / --ca-key for off-shell workstations          0e35628

Architectural snapshot after v2.0.0

Image distribution. convocate install builds convocate:<semver>, tagged with the shell binary's Version ldflag. init-agent and update ship that image to each agent via docker save | gzip | ssh | docker load, verifying a SHA-256 over the gzipped tarball on both ends. Each agent keeps a /etc/convocate-agent/current-image pointer file; container.Runner reads it at every Restart, so upgrades roll forward session by session: already-running containers keep their original tag until restarted.

Retention: a daily cron job on each agent keeps every convocate:* image that any container (running or stopped) references, plus the current pointer target; everything else is removed with docker rmi. Policy choice: no rollback, roll forward only.
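The retention rule can be sketched as a pure selection function. This is an illustration only: imagesToPrune and its parameters are hypothetical names, and gathering the tag list and the in-use set from docker is assumed to happen elsewhere.

```go
package main

import (
	"fmt"
	"sort"
)

// imagesToPrune (hypothetical name) returns the tags that are neither
// referenced by any container nor the current-image pointer target.
func imagesToPrune(all []string, inUse map[string]bool, current string) []string {
	var out []string
	for _, tag := range all {
		if tag == current || inUse[tag] {
			continue // keep: pointer target or referenced by a container
		}
		out = append(out, tag)
	}
	sort.Strings(out) // deterministic order for logs and tests
	return out
}

func main() {
	all := []string{"convocate:1.0.0", "convocate:1.1.0", "convocate:2.0.0"}
	// A stopped container still references 1.1.0, so it must survive.
	inUse := map[string]bool{"convocate:1.1.0": true}

	for _, tag := range imagesToPrune(all, inUse, "convocate:2.0.0") {
		fmt.Println("would remove", tag)
	}
}
```

With these inputs only convocate:1.0.0 is selected, which matches the "keep in-use + latest" rule from phase H5.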

Capacity. Every session container enrolls under convocate-sessions.slice via docker run --cgroup-parent=convocate-sessions.slice. The slice unit is rendered at convocate-agent install time from host totals: CPUQuota = nproc * 90%, MemoryMax = MemTotal * 90%. Kernel-enforced aggregate ceiling with 10% headroom for operator intervention.
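As a rough illustration of the arithmetic, the sketch below derives the two limits and renders a slice unit. It assumes CPUQuota is expressed in systemd's percent-of-one-CPU form (so 4 CPUs at 90% become 360%) and MemoryMax in bytes; sliceLimits, renderSlice, and the Description text are hypothetical, not the installer's actual template.

```go
package main

import "fmt"

// sliceLimits (hypothetical name) applies the 90% cap to host totals.
// cpuQuotaPercent uses systemd's convention where 100% equals one full CPU.
func sliceLimits(nproc int, memTotalBytes uint64) (cpuQuotaPercent int, memoryMax uint64) {
	cpuQuotaPercent = nproc * 90
	memoryMax = memTotalBytes / 10 * 9 // integer math, rounds down
	return cpuQuotaPercent, memoryMax
}

// renderSlice (hypothetical name) formats the limits as a slice unit body.
func renderSlice(nproc int, memTotalBytes uint64) string {
	cpu, mem := sliceLimits(nproc, memTotalBytes)
	return fmt.Sprintf(
		"[Unit]\nDescription=Aggregate cap for convocate sessions\n\n[Slice]\nCPUQuota=%d%%\nMemoryMax=%d\n",
		cpu, mem,
	)
}

func main() {
	// Example host: 4 CPUs and 8 GiB of RAM (illustrative values).
	fmt.Print(renderSlice(4, 8<<30))
}
```

Because the cap sits on the slice rather than on each container, sessions compete with each other inside the 90% envelope while the remaining 10% stays available to the host.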

Ports. Agent :222 serves the CRUD + attach SSH subsystems. Shell :223 hosts the status listener agents push heartbeats / lifecycle events to. Combined hosts (shell + agent on one machine) can coexist.

Peering. convocate-host init-agent mints two ed25519 keypairs and installs them on both ends. init-agent --ca-cert / --ca-key lets operators run the subcommand from a workstation instead of pinning to the shell host.

DNS. The shell is the cluster's dnsmasq authority. After every CRUD op, the router writes /var/lib/convocate/dnsmasq-hosts with one record per remote session's DNSName, pointing at the agent's resolved IP. Orphans are excluded.

Attach awareness. Agent-side TrackAttach / TrackDetach counters stamp Attached on every list / get response. The router reads it for IsLocked, which drives the TUI's C status indicator: operator B sees C on any row operator A is currently attached to.
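The counter behaviour might look roughly like the following. The method names TrackAttach and TrackDetach come from the text above, but the attachTracker type, the Attached accessor, and the mutex-plus-map layout are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// attachTracker is a hypothetical per-agent registry of attach counts,
// keyed by session ID.
type attachTracker struct {
	mu     sync.Mutex
	counts map[string]int
}

func newAttachTracker() *attachTracker {
	return &attachTracker{counts: make(map[string]int)}
}

// TrackAttach records one more attached client for the session.
func (t *attachTracker) TrackAttach(id string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.counts[id]++
}

// TrackDetach records a client leaving; the count never drops below zero.
func (t *attachTracker) TrackDetach(id string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.counts[id] <= 1 {
		delete(t.counts, id)
		return
	}
	t.counts[id]--
}

// Attached reports whether at least one client is attached; this is the
// value a list / get response would stamp on the session.
func (t *attachTracker) Attached(id string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.counts[id] > 0
}

func main() {
	tr := newAttachTracker()
	tr.TrackAttach("s1")
	fmt.Println(tr.Attached("s1")) // true
	tr.TrackDetach("s1")
	fmt.Println(tr.Attached("s1")) // false
}
```

Counting rather than storing a boolean keeps the flag correct when more than one client is attached to the same session.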

Orphans. Pre-v2 local session.json files surface in the TUI marked with the O status. Every write op (Kill/Background/Restart/Override/Update/Clone) returns errOrphanNeedsMigration. Delete still routes to the local Manager so operators can clean up after migrating.

Heartbeats. Both directions hold persistent SSH connections: StatusEmitter (agent→shell) and CRUDClient (shell→agent), each pinging every 30s and redialling with exponential backoff on failure.

Known limitations (deferred to v2.1+)

  1. Explicit cross-channel failure notification. Mutual heartbeats provide an implicit signal — each side's continuing heartbeat from the healthy direction proves the peer is alive even when the detecting direction is broken. An explicit "my outbound is broken, reconnecting" message would require a topology change (bidirectional status channel or a new subsystem). Deferred.
  2. Siloed heartbeat view. The systemd convocate-status.service and the TUI observe agent heartbeats independently; no shared cache. Deferred.
  3. Window-change resize is shell→agent only. Agent-side pty changes don't propagate back. Rarely matters in practice.
  4. No "Agent" column in the session table. Orphans are flagged with O; users distinguish remote sessions by UUID / name. Adding a column would widen the table past the test-harness screen width (110 cols) and would require broader changes.
  5. Remote clone stays on the source agent. Cross-host clones would need the migration pipeline applied to clone output.
  6. No rollback. Intentional. Roll forward only per project policy; keep every in-use image around so already-running containers stay functional across upgrades.

Upgrade procedure

From v1.0.0 hosts:

  1. git pull && make clean lint test build install on the shell host.
  2. convocate install — rebuilds the image with the v2 tag.
  3. For each registered agent: convocate-host update --host <agent>. This pushes the new binary AND the new image, rewrites /etc/convocate-agent/current-image, restarts convocate-agent.service.
  4. For each orphan session still needed: convocate-host migrate-session --agent <id> --session <uuid>. (docker stop convocate-session-<uuid> first if one is still running locally.)
  5. Verify in the TUI: every live session shows R or C status on its agent; any O-marked rows are legitimate pending migrations.

No downgrade path. Old containers keep running on their original image tag until Restart cuts them over to the current pointer.