# Capacity and isolation
convocate enforces resource limits at two layers. Both cap aggregate session usage at 90% of the host's CPU and memory — a deliberate buffer so the host itself, the agent process, and any non-convocate workloads keep at least 10% headroom.
## Layer 1 — admission control (in-process, at create time)

When the operator presses (N)ew, the agent's `SessionOrchestrator.Create` runs an admission check before `docker run`:

- Sum the CPU + memory + disk of every existing container in `convocate-sessions.slice` (using `docker inspect`).
- Compute the delta the new session would add (from the operator's port/protocol/DNS spec; CPU/RAM are baseline-per-session).
- If `existing + delta > 0.9 × host capacity`, refuse with a quantitative error message:

```
admission denied: would push convocate-sessions.slice to 92% of host
CPU (9.2 of 9.0 cores allowed); current usage 7.4 cores across 6
sessions
```
This catches greedy creates before they happen, gives a clear error to the operator, and prevents the kernel-level cap from having to intervene.
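The check above is simple arithmetic. A minimal sketch in Go — the function name, signature, and hard-coded numbers are illustrative, not convocate's actual API:

```go
package main

import "fmt"

// admissible reports whether a new session's CPU delta fits under the
// 90% aggregate cap, returning a quantitative error when it does not.
func admissible(existingCores, deltaCores, hostCores float64) error {
	allowed := 0.9 * hostCores
	if existingCores+deltaCores > allowed {
		return fmt.Errorf(
			"admission denied: would push convocate-sessions.slice to %.0f%% of host CPU (%.1f of %.1f cores allowed); current usage %.1f cores",
			100*(existingCores+deltaCores)/hostCores,
			existingCores+deltaCores, allowed, existingCores)
	}
	return nil
}

func main() {
	// 10-core host already running 7.4 cores of sessions; a new session
	// would add 1.8 cores — past the 9.0-core (90%) ceiling, so refuse.
	if err := admissible(7.4, 1.8, 10); err != nil {
		fmt.Println(err)
	}
}
```

The same comparison runs per resource (memory, disk); CPU is shown here because it matches the error message above.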
## Layer 2 — kernel-enforced cgroup cap
On every agent, `convocate-host init-agent` writes
`/etc/systemd/system/convocate-sessions.slice`:

```ini
[Unit]
Description=Slice for all convocate session containers (90% host cap)

[Slice]
CPUAccounting=yes
MemoryAccounting=yes
CPUQuota=<nproc * 90>%
MemoryMax=<MemTotal * 0.9>
```
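Rendering those two placeholder values is a one-liner each. A hypothetical sketch — the real `init-agent` presumably reads `nproc` and `/proc/meminfo` itself, and this function name is invented:

```go
package main

import "fmt"

// sliceLimits renders the CPUQuota= and MemoryMax= lines for
// convocate-sessions.slice from the host's core count and total RAM.
func sliceLimits(nproc int, memTotalBytes uint64) (cpuQuota, memoryMax string) {
	// systemd expresses CPUQuota as a percentage of a single CPU,
	// so 90% of a 10-core host is CPUQuota=900%.
	cpuQuota = fmt.Sprintf("CPUQuota=%d%%", nproc*90)
	// MemoryMax takes plain bytes; integer math avoids float rounding.
	memoryMax = fmt.Sprintf("MemoryMax=%d", memTotalBytes*9/10)
	return cpuQuota, memoryMax
}

func main() {
	cpu, mem := sliceLimits(10, 32<<30) // 10 cores, 32 GiB RAM
	fmt.Println(cpu)                    // CPUQuota=900%
	fmt.Println(mem)
}
```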
Every session container is created with `--cgroup-parent convocate-sessions.slice`, so the kernel enforces the aggregate cap regardless of what Layer 1 did. If a session manages to spawn beyond its planned quota (e.g. a process forks aggressively), the kernel caps it.
This is belt-and-braces by design:

- Layer 1 gives nice errors but only catches what comes through the RPC path. A session created directly via `docker run` (out-of-band) bypasses Layer 1 entirely.
- Layer 2 is the kernel — it enforces no matter what, even on sessions we didn't admit.
## Why 90%, not 100%?
The 10% reserve goes to:

- The host operating system, journald, sshd, etc.
- The `convocate-agent` process itself
- The Docker daemon
- Any non-convocate workload on the same machine
Without this reserve, a hot session can push the host into OOM territory where the kernel starts reaping random processes, and "random" in OOM-killer language often means "the agent" — at which point your control plane is dead and you can't even tell the rogue session to stop.
## Why per-host, not global?
The cap is per-agent, not cluster-wide. There's no global capacity tracker. Reasons:
- Each agent host has its own physical resources; aggregate cap doesn't translate cleanly across machines.
- Distributed capacity tracking introduces a coordinator that becomes a single point of failure.
- Operators can already see global utilization in the TUI's session list and route new sessions to a less-loaded agent themselves.
## Other isolation primitives

### Per-session container

Each session has its own `convocate-session-<uuid>` container. Sessions can't see each other's processes, files, or network namespaces.
### Per-session home dir

Each session has its own `/home/claude/<uuid>/` directory on the agent. That directory is bind-mounted into the container as `/home/claude/`. State (Claude conversation history, project files, git checkouts) is per-session, persists across detach/reattach, and is destroyed on (D)elete.
### Read-only shares

The host's claude user's `~/.claude/`, `~/.ssh/`, and `~/.gitconfig` are bind-mounted read-only into every container. Sessions get the same Claude account, the same SSH identity, the same git identity — but can't modify the source-of-truth files.
### No host network

Sessions use Docker's default bridge network. They can reach the internet (for `apt install`, `git clone`, etc.) and the host's loopback (for the dnsmasq integration to resolve their own DNS names), but they can't see other containers' interfaces or other hosts' private networks.