# Command Queue (2026-01-16)
We serialize inbound auto-reply runs (all channels) through a tiny in-process queue to prevent multiple agent runs from colliding, while still allowing safe parallelism across sessions.

## Why
- Auto-reply runs can be expensive (LLM calls) and can collide when multiple inbound messages arrive close together.
- Serializing avoids competing for shared resources (session files, logs, CLI stdin) and reduces the chance of upstream rate limits.
## How it works
- A lane-aware FIFO queue drains each lane with a configurable concurrency cap (default 1 for unconfigured lanes; `main` defaults to 4, `subagent` to 8).
- `runEmbeddedPiAgent` enqueues by session key (lane `session:<key>`) to guarantee only one active run per session.
- Each session run is then queued into a global lane (`main` by default) so overall parallelism is capped by `agents.defaults.maxConcurrent`; see the sketch after this list.
- When verbose logging is enabled, queued runs emit a short notice if they waited more than ~2s before starting.
- Typing indicators still fire immediately on enqueue (when supported by the channel) so user experience is unchanged while we wait our turn.
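A minimal sketch of the lane mechanics described above, in TypeScript. Everything here (the `LaneQueue` class, method names, internals) is illustrative rather than the actual implementation; only the lane names, caps, and the session-key nesting come from this doc.

```ts
// Illustrative only: a lane-aware FIFO queue with per-lane concurrency caps.
type Task<T> = () => Promise<T>;

class LaneQueue {
  private pending = new Map<string, Array<() => void>>();
  private active = new Map<string, number>();

  constructor(private caps: Record<string, number> = {}, private defaultCap = 1) {}

  enqueue<T>(lane: string, task: Task<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const start = () => {
        this.active.set(lane, (this.active.get(lane) ?? 0) + 1);
        Promise.resolve()
          .then(task)
          .then(resolve, reject)
          .finally(() => {
            this.active.set(lane, (this.active.get(lane) ?? 1) - 1);
            this.drain(lane); // slot freed: start the next queued run, if any
          });
      };
      const queue = this.pending.get(lane) ?? [];
      queue.push(start);
      this.pending.set(lane, queue);
      this.drain(lane);
    });
  }

  private drain(lane: string) {
    const cap = this.caps[lane] ?? this.defaultCap; // default 1 for unconfigured lanes
    const queue = this.pending.get(lane) ?? [];
    while ((this.active.get(lane) ?? 0) < cap && queue.length > 0) {
      queue.shift()!(); // FIFO: oldest queued run starts first
    }
  }
}

// Per-session serialization nested inside the global lane:
const queue = new LaneQueue({ main: 4, subagent: 8 });
const runAgentTurn = async () => { /* expensive LLM call */ };
queue.enqueue("session:alice", () => queue.enqueue("main", runAgentTurn));
```

The nested `enqueue` call at the end mirrors the guarantee above: the session lane serializes runs for a given session, while the `main` lane caps how many sessions run at once.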
## Queue modes (per channel)
Inbound messages can steer the current run, wait for a followup turn, or do both:
- `steer`: inject immediately into the current run (cancels pending tool calls after the next tool boundary). If not streaming, falls back to `followup`.
- `followup`: enqueue for the next agent turn after the current run ends.
- `collect`: coalesce all queued messages into a single followup turn (default). If messages target different channels/threads, they drain individually to preserve routing.
- `steer-backlog` (aka `steer+backlog`): steer now and preserve the message for a followup turn.
- `interrupt` (legacy): abort the active run for that session, then run the newest message.
- `queue` (legacy alias): same as `steer`.
Pick `followup` instead of `collect`/`steer` if you want one response per inbound message.
To set a mode, send `/queue collect` as a standalone command (per-session) or set `messages.queue.byChannel.discord: "collect"` in config.
Defaults (when unset in config):
- All surfaces → `collect`
Defaults are configured under `messages.queue` in your config.
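A sketch of what that section might look like, shown YAML-style for illustration. Only `byChannel`, the channel/mode names, and the option names come from this doc; the exact nesting and file format in your deployment may differ:

```yaml
messages:
  queue:
    byChannel:
      discord: collect   # per-channel queue mode (see modes above)
    debounceMs: 1000     # followup options, described under "Queue options" below
    cap: 20
    drop: summarize
```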
## Queue options
Options apply to `followup`, `collect`, and `steer-backlog` (and to `steer` when it falls back to `followup`):
- `debounceMs`: wait for quiet before starting a followup turn (prevents “continue, continue”).
- `cap`: max queued messages per session.
- `drop`: overflow policy (`old`, `new`, `summarize`).

Defaults: `debounceMs: 1000`, `cap: 20`, `drop: summarize`.
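How these options interact can be sketched in TypeScript. The `FollowupBacklog` class and its internals are hypothetical; only the option names and defaults come from this doc:

```ts
// Hypothetical sketch: debounce + cap/drop handling for queued followup messages.
type DropPolicy = "old" | "new" | "summarize";

interface QueueOptions {
  debounceMs: number; // quiet period before starting a followup turn
  cap: number;        // max queued messages per session
  drop: DropPolicy;   // what to do when the cap is exceeded
}

class FollowupBacklog {
  private backlog: string[] = [];
  private timer: ReturnType<typeof setTimeout> | undefined;

  constructor(
    private runTurn: (messages: string[]) => Promise<void>,
    private opts: QueueOptions = { debounceMs: 1000, cap: 20, drop: "summarize" },
  ) {}

  push(message: string) {
    if (this.backlog.length >= this.opts.cap) {
      switch (this.opts.drop) {
        case "old": this.backlog.shift(); break;   // discard the oldest queued message
        case "new": return;                        // ignore the newest message
        case "summarize":                          // placeholder; the real behavior would summarize
          this.backlog = [`[${this.backlog.length} earlier messages condensed]`];
          break;
      }
    }
    this.backlog.push(message);
    // Debounce: every new message restarts the quiet timer.
    if (this.timer) clearTimeout(this.timer);
    this.timer = setTimeout(() => this.flush(), this.opts.debounceMs);
  }

  private async flush() {
    const batch = this.backlog.splice(0); // drain everything queued so far as one turn
    if (batch.length > 0) await this.runTurn(batch);
  }
}
```

Because the timer restarts on every inbound message, a burst of “continue, continue” drains as a single followup turn once the sender goes quiet.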
## Per-session overrides
- Send `/queue <mode>` as a standalone command to store the mode for the current session.
- Options can be combined: `/queue collect debounce:2s cap:25 drop:summarize`
- `/queue default` or `/queue reset` clears the session override.
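A hypothetical sketch of how the `/queue` option tokens could be parsed; the real command handling may differ — this only illustrates how the mode, `debounce`, `cap`, and `drop` pieces combine:

```ts
interface QueueOverride {
  mode?: string;                        // e.g. "collect", "steer", "followup"
  debounceMs?: number;
  cap?: number;
  drop?: "old" | "new" | "summarize";
}

// "/queue collect debounce:2s cap:25 drop:summarize" -> { mode, debounceMs, cap, drop }
function parseQueueCommand(input: string): QueueOverride | "reset" {
  const [, ...tokens] = input.trim().split(/\s+/); // drop the leading "/queue"
  if (tokens[0] === "default" || tokens[0] === "reset") return "reset";

  const override: QueueOverride = {};
  for (const token of tokens) {
    const [key, value] = token.split(":");
    if (value === undefined) override.mode = key;
    else if (key === "debounce") override.debounceMs = parseDuration(value); // "2s" -> 2000
    else if (key === "cap") override.cap = Number(value);
    else if (key === "drop") override.drop = value as QueueOverride["drop"];
  }
  return override;
}

function parseDuration(value: string): number {
  return value.endsWith("s") ? Number(value.slice(0, -1)) * 1000 : Number(value);
}
```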
## Scope and guarantees
- Applies to auto-reply agent runs across all inbound channels that use the gateway reply pipeline (WhatsApp web, Telegram, Slack, Discord, Signal, iMessage, webchat, etc.).
- Default lane (`main`) is process-wide for inbound + main heartbeats; set `agents.defaults.maxConcurrent` to allow multiple sessions in parallel (see the config sketch after this list).
- Additional lanes may exist (e.g. `cron`, `subagent`) so background jobs can run in parallel without blocking inbound replies.
- Per-session lanes guarantee that only one agent run touches a given session at a time.
- No external dependencies or background worker threads; pure TypeScript + promises.
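For example, allowing four sessions to run in parallel on the `main` lane might be configured like this (shown YAML-style for illustration; only the key path `agents.defaults.maxConcurrent` comes from this doc, and the exact file format in your setup may differ):

```yaml
agents:
  defaults:
    maxConcurrent: 4   # upper bound on simultaneous session runs in the main lane
```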
## Troubleshooting
- If commands seem stuck, enable verbose logs and look for “queued for …ms” lines to confirm the queue is draining.
- If you need queue depth, enable verbose logs and watch for queue timing lines.
