Mono-run is the architectural choice that makes Seshat feel like a real runtime instead of a loose collection of wrappers. One process owns the session, prompt assembly, provider call, tool execution, permissions, compaction, persistence, and recovery. The system is modular internally, but the execution contract is unified.
Mono-run does not mean one giant package. It means one runtime boundary. The same executable owns the conversation state, the model call, the tool surface, the permission rules, and the local persistence tree, instead of delegating those responsibilities to external orchestrators.
In Seshat, the execution path is intentionally narrow. The entry surface may change, but the runtime core does not. A CLI command, a gRPC call, or a higher-level product like seshat-ai all converge on the same core layers: sdk.Client, Engine, Session, and Loop.
That matters because the hard parts of agent execution are not split across unrelated services. Prompt assembly, provider routing, permissions, tool execution, transcript compaction, and session persistence live inside the same runtime boundary. The result is less orchestration drift, fewer duplicated contracts, and much stronger recovery semantics.
Terminal command surface
Remote runtime access
Embedding and local control
Desktop and product layer
Wires providers, prompt, tools, permissions, runtime memory, browser, persistence, and monitoring once.
Owns long-lived runtime wiring and session creation.
Owns canonical transcript, counters, tool surface, and permission state.
Runs the turn until a legitimate terminal state exists.
Builder + Assembler compose the provider-facing system prompt.
Safety checks, modes, prompts, and session approvals stay in-path.
Registry + SurfaceBuilder derive one deterministic tool surface.
Prompt assembly, model call, tool execution, recovery, persistence, and continuation stay inside the same execution owner.
Streaming request path, retry strategy, fallback client, and circuit breaker.
Orchestrator validates, gates, batches, executes, and formats tool calls.
Compaction, checkpoints, artifacts, plans, and local session persistence.
One local root holds sessions, checkpoints, artifacts, logs, plans, skills, caches, and runtime-owned state. Recovery stays physically attached to execution.
The outer surfaces are replaceable. The inner execution path is not. That is the core mono-run invariant: every serious interaction with the runtime collapses into the same session engine rather than creating parallel orchestration systems for each interface.
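The invariant can be illustrated with a minimal sketch. Every name here (`Runtime`, `Surface`, `RunTurn`, `Handle`) is hypothetical and chosen for illustration; none of them is taken from the Seshat codebase. The point is only that two different entry surfaces hold a reference to the same runtime owner and contain no execution logic of their own.

```go
package main

import "fmt"

// Runtime is a hypothetical stand-in for the shared core
// (sdk.Client, Engine, Session, and Loop in the text above).
type Runtime struct {
	turns int
}

// RunTurn is the single execution path that every surface reaches.
func (r *Runtime) RunTurn(surface, input string) string {
	r.turns++
	return fmt.Sprintf("turn %d via %s: %s", r.turns, surface, input)
}

// Surface is a thin adapter: it translates an entry point into a
// call on the shared runtime and holds no execution logic itself.
type Surface struct {
	Name string
	rt   *Runtime
}

// Handle forwards the request to the shared runtime.
func (s Surface) Handle(input string) string {
	return s.rt.RunTurn(s.Name, input)
}

func main() {
	rt := &Runtime{}
	cli := Surface{Name: "cli", rt: rt}
	rpc := Surface{Name: "grpc", rt: rt}

	// Both surfaces advance the same turn counter on the same owner.
	fmt.Println(cli.Handle("list files"))
	fmt.Println(rpc.Handle("summarize"))
}
```

Because the turn counter lives in the runtime rather than in either surface, swapping or adding a surface cannot fork the execution state.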
The most explicit wiring path lives in pkg/sdk/client.go. The SDK constructs the provider client, execution orchestrator, compaction engine, prompt assembler, permission engine, tool registry, session store, browser manager, and memory service, then injects them into engine.NewEngine().
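As a rough illustration of this construct-once, inject-once pattern, the sketch below wires two stub subsystems into an engine constructor. The `Deps` struct, the interfaces, and this `NewEngine` signature are assumptions made for the example; the real `engine.NewEngine()` takes more collaborators and its actual signature is not shown in this document.

```go
package main

import (
	"errors"
	"fmt"
)

// Provider and ToolRegistry are hypothetical interfaces for this sketch.
type Provider interface{ Name() string }
type ToolRegistry interface{ Tools() []string }

type stubProvider struct{ name string }

func (p stubProvider) Name() string { return p.name }

type stubRegistry struct{ tools []string }

func (r stubRegistry) Tools() []string { return r.tools }

// Deps collects the subsystems that are constructed once and injected.
type Deps struct {
	Provider Provider
	Tools    ToolRegistry
}

// Engine holds the long-lived wiring.
type Engine struct{ deps Deps }

// NewEngine fails fast on incomplete wiring so a missing subsystem
// surfaces at startup rather than in the middle of a turn.
func NewEngine(d Deps) (*Engine, error) {
	if d.Provider == nil {
		return nil, errors.New("engine: provider is required")
	}
	if d.Tools == nil {
		return nil, errors.New("engine: tool registry is required")
	}
	return &Engine{deps: d}, nil
}

// Describe reports what the engine was wired with.
func (e *Engine) Describe() string {
	return fmt.Sprintf("provider=%s tools=%d",
		e.deps.Provider.Name(), len(e.deps.Tools.Tools()))
}

func main() {
	eng, err := NewEngine(Deps{
		Provider: stubProvider{name: "example"},
		Tools:    stubRegistry{tools: []string{"read_file", "grep"}},
	})
	if err != nil {
		fmt.Println("wiring failed:", err)
		return
	}
	fmt.Println(eng.Describe())
}
```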
Mono-run in Seshat is session-centric. Engine.NewSessionFromState() creates the live tool surface for that session, loads memory, restores metadata, and constructs a SessionState object that survives across turns. The session is not a thin wrapper around a provider API. It owns canonical messages, permission context, discovered deferred tools, token counters, compaction metadata, and turn advancement.
This is one of the most important design decisions in the codebase. The fundamental unit is not a raw request or a transient workflow node. It is a durable session that can resume, recover, and keep responsibility for the evolving transcript.
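To make "durable session" concrete, here is a minimal sketch of a session object that owns its transcript and counters and can be snapshotted and restored. The field names, the JSON encoding, and the `Snapshot` and `Restore` helpers are assumptions for illustration; the actual `SessionState` layout and persistence format are not described in this document.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is one entry in the canonical transcript.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// SessionState is an illustrative shape, not Seshat's real struct.
type SessionState struct {
	ID            string          `json:"id"`
	Turn          int             `json:"turn"`
	Messages      []Message       `json:"messages"`
	InputTokens   int             `json:"input_tokens"`
	ApprovedTools map[string]bool `json:"approved_tools"`
}

// Append adds a message to the transcript owned by the session.
func (s *SessionState) Append(role, content string) {
	s.Messages = append(s.Messages, Message{Role: role, Content: content})
}

// AdvanceTurn moves the session to the next turn.
func (s *SessionState) AdvanceTurn() { s.Turn++ }

// Snapshot serializes the session so it can be resumed later.
func Snapshot(s *SessionState) ([]byte, error) { return json.Marshal(s) }

// Restore rebuilds a session from a snapshot.
func Restore(data []byte) (*SessionState, error) {
	var s SessionState
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, fmt.Errorf("restore session: %w", err)
	}
	if s.ApprovedTools == nil {
		s.ApprovedTools = map[string]bool{} // keep the map writable after restore
	}
	return &s, nil
}

func main() {
	s := &SessionState{ID: "s1", ApprovedTools: map[string]bool{"grep": true}}
	s.Append("user", "find TODOs")
	s.AdvanceTurn()

	data, err := Snapshot(s)
	if err != nil {
		fmt.Println("snapshot failed:", err)
		return
	}
	resumed, err := Restore(data)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(resumed.ID, resumed.Turn, len(resumed.Messages), resumed.ApprovedTools["grep"])
}
```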
The loop does not stop at the first model response. It keeps ownership until it can legitimately terminate or continue with recovery.
Initialize turn state and create the bounded execution owner for this query.
Prepare counters, transcript state, recovery context, and compact early if the context window is already tight.
Send the provider request, stream the answer, classify failures, and preserve recovery context for the same turn.
If the model wants tools, the turn is not done yet.
Validate, gate, batch, execute, format, append results, then feed them back into the same transcript.
If not, the loop can still nudge, recover, or run stop hooks.
Continuation nudge, max-token resume, or recoverable retry path.
Hooks can inject follow-up work or allow a clean break.
The real runtime heart is internal/engine/loop.go. A turn is not a single model request. It is a bounded execution cycle where the runtime may compact context, call the provider, parse tool uses, execute tools, refresh the prompt, recover from retryable failures, and continue until it reaches a legitimate terminal state.
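The shape of such a bounded cycle can be sketched as follows. This is a deliberately reduced model, not the contents of `loop.go`: the `Model`, `ToolRunner`, and `RunTurn` names are hypothetical, and compaction, recovery, and stop hooks are omitted. It shows only the core property that a turn keeps going while the model requests tools and stops either at a tool-free response or at a step budget.

```go
package main

import (
	"errors"
	"fmt"
)

// Response is what the stub model returns for one request.
type Response struct {
	Text      string
	ToolCalls []string // names of tools the model asked for
}

// Model and ToolRunner are hypothetical function types for this sketch.
type Model func(transcript []string) Response
type ToolRunner func(name string) string

// ErrMaxSteps signals that the turn hit its step budget.
var ErrMaxSteps = errors.New("turn exceeded step budget")

// RunTurn loops until the model stops requesting tools or the budget runs out.
func RunTurn(model Model, run ToolRunner, transcript []string, maxSteps int) ([]string, error) {
	for step := 0; step < maxSteps; step++ {
		resp := model(transcript)
		transcript = append(transcript, "assistant: "+resp.Text)
		if len(resp.ToolCalls) == 0 {
			return transcript, nil // no tool requests: a terminal state
		}
		// Tool results are appended to the same transcript and fed back.
		for _, name := range resp.ToolCalls {
			transcript = append(transcript, "tool "+name+": "+run(name))
		}
	}
	return transcript, ErrMaxSteps
}

func main() {
	// Stub model: asks for one tool first, then answers once a result exists.
	model := func(transcript []string) Response {
		if len(transcript) < 3 {
			return Response{Text: "let me check", ToolCalls: []string{"read_file"}}
		}
		return Response{Text: "done"}
	}
	run := func(name string) string { return "ok" }

	out, err := RunTurn(model, run, []string{"user: inspect main.go"}, 8)
	if err != nil {
		fmt.Println("turn failed:", err)
		return
	}
	for _, line := range out {
		fmt.Println(line)
	}
}
```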
The builder prepares stable and dynamic sections, then the assembler injects runtime variables and preserves the cache-safe breakpoint.
The prompt path is one of the places where mono-run becomes operationally useful. The builder and assembler do not produce a random string on the side. They produce the provider-facing system prompt from the exact same session, mode, tool surface, and runtime state that the loop will execute against.
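A small sketch can show why separating stable and dynamic sections matters for caching. The `Prompt` type, the `{{name}}` placeholder syntax, and the `Assemble` function are assumptions made for this example and are not Seshat's builder or assembler API. The property being demonstrated is that runtime variables are injected only after the breakpoint, so the prefix stays byte-identical across turns.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Prompt keeps the cacheable prefix apart from the per-turn suffix.
type Prompt struct {
	Stable  string
	Dynamic string
}

// Text is the full provider-facing system prompt.
func (p Prompt) Text() string { return p.Stable + p.Dynamic }

// CacheBreakpoint is the byte offset where the stable prefix ends.
func (p Prompt) CacheBreakpoint() int { return len(p.Stable) }

// Assemble joins sections and injects variables into the dynamic part only.
func Assemble(stable, dynamic []string, vars map[string]string) Prompt {
	keys := make([]string, 0, len(vars))
	for k := range vars {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic replacement order

	pairs := make([]string, 0, 2*len(keys))
	for _, k := range keys {
		pairs = append(pairs, "{{"+k+"}}", vars[k])
	}
	replacer := strings.NewReplacer(pairs...)

	return Prompt{
		Stable:  strings.Join(stable, "\n") + "\n",
		Dynamic: replacer.Replace(strings.Join(dynamic, "\n")),
	}
}

func main() {
	stable := []string{"You are a coding agent.", "Follow the tool contract."}
	dynamic := []string{"Mode: {{mode}}", "Tools: {{tools}}"}

	p := Assemble(stable, dynamic, map[string]string{
		"mode":  "execute",
		"tools": "read_file, grep",
	})
	fmt.Println(p.Text())
	fmt.Println("cache breakpoint at byte", p.CacheBreakpoint())
}
```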
The orchestrator decides which tool calls can run together and which ones must stay serial.
The tool is resolved from the registry, checked for availability, normalized, and enriched before policy decisions.
Pre-tool hooks fire, then deny rules, tool-owned checks, allow rules, and permission modes decide allow, deny, or ask.
The runtime calls the tool, runs post-tool hooks, serializes the result, applies size limits, and records traces.
Tool results become conversation blocks for the same session and the same loop, not an external workflow hop.
The important point is not how many tools exist. It is that the same orchestrator owns them all under one permission and transcript contract.
The orchestrator in internal/execution/orchestrator.go is where tool usage stops being decorative and becomes runtime behavior. It partitions concurrent and serial work, validates inputs, runs hooks, applies permissions, executes tools, formats results, and pushes those results back into the same conversation.
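The partitioning step can be sketched in isolation. The rule used here, that consecutive concurrency-safe calls share a batch while any other call runs alone, is an assumption for illustration; the document does not state how the real orchestrator decides. `Call`, `Partition`, and `Execute` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// Call is one tool invocation requested by the model.
type Call struct {
	Tool       string
	Concurrent bool // true if the call is safe to run alongside others
}

// Partition groups consecutive concurrency-safe calls into one batch and
// gives every other call a batch of its own, preserving request order.
func Partition(calls []Call) [][]Call {
	var batches [][]Call
	for _, c := range calls {
		n := len(batches)
		if c.Concurrent && n > 0 && batches[n-1][0].Concurrent {
			batches[n-1] = append(batches[n-1], c)
			continue
		}
		batches = append(batches, []Call{c})
	}
	return batches
}

// Execute runs batches in order; calls inside a batch run in parallel.
// Results come back in the original request order.
func Execute(batches [][]Call, run func(Call) string) []string {
	var results []string
	for _, batch := range batches {
		out := make([]string, len(batch))
		var wg sync.WaitGroup
		for i, c := range batch {
			wg.Add(1)
			go func(i int, c Call) {
				defer wg.Done()
				out[i] = run(c) // each goroutine writes its own slot
			}(i, c)
		}
		wg.Wait()
		results = append(results, out...)
	}
	return results
}

func main() {
	calls := []Call{
		{Tool: "read_file", Concurrent: true},
		{Tool: "grep", Concurrent: true},
		{Tool: "write_file", Concurrent: false},
		{Tool: "read_file", Concurrent: true},
	}
	batches := Partition(calls)
	fmt.Println("batches:", len(batches))
	for _, r := range Execute(batches, func(c Call) string { return c.Tool + ": ok" }) {
		fmt.Println(r)
	}
}
```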
Permissions in Seshat are not a UI-only feature layered on top of agent output. The runtime enforces them in the same path that executes tools. The orchestrator runs safety checks and permission resolution before a tool call, while the permission integrator manages session approvals, prompt-based confirmation, dont-ask mode, auto mode, and persisted per-session allowances.
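One way to picture in-path enforcement is a single resolution function that every tool call must pass through. The evaluation order below (deny rules, then tool-owned checks, then allow rules and session approvals, then the mode) follows the order described above, but the types, the two mode names, and the meaning given to each mode are assumptions for this sketch rather than Seshat's actual semantics.

```go
package main

import (
	"errors"
	"fmt"
)

// Decision is the outcome of permission resolution.
type Decision string

const (
	Allow Decision = "allow"
	Deny  Decision = "deny"
	Ask   Decision = "ask"
)

// Mode decides what happens when no rule matches.
// Both values and their behavior are hypothetical.
type Mode string

const (
	ModeDefault Mode = "default" // unmatched calls prompt the user
	ModeAuto    Mode = "auto"    // unmatched calls are allowed
)

// Policy is an illustrative permission context for one session.
type Policy struct {
	Deny     map[string]bool // explicit deny rules
	Allow    map[string]bool // explicit allow rules
	Approved map[string]bool // approvals granted earlier in this session
	Mode     Mode
}

// Resolve applies the checks in a fixed order; the first match wins.
func (p Policy) Resolve(tool string, check func(tool string) error) Decision {
	if p.Deny[tool] {
		return Deny
	}
	if check != nil {
		if err := check(tool); err != nil {
			return Deny // the tool's own safety check rejected the call
		}
	}
	if p.Allow[tool] || p.Approved[tool] {
		return Allow
	}
	if p.Mode == ModeAuto {
		return Allow
	}
	return Ask
}

func main() {
	policy := Policy{
		Deny:     map[string]bool{"rm": true},
		Allow:    map[string]bool{"read_file": true},
		Approved: map[string]bool{"grep": true},
		Mode:     ModeDefault,
	}
	check := func(tool string) error {
		if tool == "shell" {
			return errors.New("shell is disabled in this sketch")
		}
		return nil
	}
	for _, tool := range []string{"rm", "shell", "read_file", "grep", "write_file"} {
		fmt.Printf("%-10s %s\n", tool, policy.Resolve(tool, check))
	}
}
```

Putting deny rules first means an allow rule or a session approval can never override an explicit denial.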
The runtime keeps the same session owner while it retries, falls back, and classifies recoverable failures.
Create message stream against the selected provider and model.
Rate limit, timeout, network, or temporary overload.
Exponential backoff with jitter inside the same provider.
Try the next configured model in the routing chain.
Escalate to the next provider client when models are exhausted.
The runtime can recover at several levels without breaking session ownership. A timeout, rate limit, or overload error does not automatically terminate the session. The loop can retry, continue with recovery context, or move to fallback models and fallback providers while preserving the same turn.
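The layered recovery described here (retry with backoff on the same target, then the next model, then the next provider) can be approximated with a single ordered fallback chain. Everything in this sketch is an assumption made for illustration: the `Target` type, the `ErrRetryable` sentinel, the full-jitter backoff formula, and the choice to stop immediately on a non-retryable error. Circuit breaking is left out.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// Target is one provider and model pair in the routing chain.
type Target struct{ Provider, Model string }

// ErrRetryable marks failures such as rate limits, timeouts, or overload.
var ErrRetryable = errors.New("retryable provider error")

// Backoff doubles base per attempt, caps at limit, then applies full jitter.
// It assumes base and limit are positive.
func Backoff(attempt int, base, limit time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < limit; i++ {
		d *= 2
	}
	if d > limit {
		d = limit
	}
	return time.Duration(rand.Int63n(int64(d) + 1))
}

// NoSleep lets callers and tests skip real waiting.
func NoSleep(time.Duration) {}

// CallWithFallback retries each target, then moves down the chain.
func CallWithFallback(chain []Target, call func(Target) (string, error),
	retries int, sleep func(time.Duration)) (string, Target, error) {
	if len(chain) == 0 {
		return "", Target{}, errors.New("fallback chain is empty")
	}
	var lastErr error
	for _, t := range chain {
		for attempt := 0; attempt <= retries; attempt++ {
			out, err := call(t)
			if err == nil {
				return out, t, nil
			}
			lastErr = err
			if !errors.Is(err, ErrRetryable) {
				return "", t, err // not recoverable: stop immediately
			}
			if attempt < retries {
				sleep(Backoff(attempt, 100*time.Millisecond, 2*time.Second))
			}
		}
	}
	return "", Target{}, fmt.Errorf("all targets failed: %w", lastErr)
}

func main() {
	chain := []Target{
		{Provider: "primary", Model: "large"},
		{Provider: "primary", Model: "small"},
		{Provider: "secondary", Model: "large"},
	}
	call := func(t Target) (string, error) {
		if t.Provider == "primary" {
			return "", fmt.Errorf("overloaded: %w", ErrRetryable)
		}
		return "hello from " + t.Provider, nil
	}
	out, used, err := CallWithFallback(chain, call, 1, NoSleep)
	fmt.Println(out, used, err)
}
```

Ordering the chain so that alternative models on the same provider come before the next provider reproduces the escalation order in the text.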
The runtime root gives the architecture a physical center of gravity. Sessions, checkpoints, artifacts, logs, plans, skills, caches, vector data, and temporary task state all resolve under one local tree via SESHAT_RUNTIME_ROOT.
Internally, the runtime is deliberately split into subsystems: engine, execution, prompt, providers, permissions, runtime, tools, web, memory, and more. Mono-run does not deny modularity. It constrains how those modules cooperate at runtime.
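A minimal sketch of resolving such a tree is shown below. Only the `SESHAT_RUNTIME_ROOT` variable name comes from the text; the subdirectory names, the fallback location, and the `ResolveRoot` and `Layout` helpers are assumptions for illustration, and the real on-disk layout may differ.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// subdirs lists illustrative directory names, one per kind of state.
var subdirs = []string{
	"sessions", "checkpoints", "artifacts", "logs", "plans", "skills", "caches",
}

// ResolveRoot prefers SESHAT_RUNTIME_ROOT and otherwise uses the fallback.
// getenv is injected so the lookup can be tested without touching the
// process environment.
func ResolveRoot(getenv func(string) string, fallback string) string {
	if root := getenv("SESHAT_RUNTIME_ROOT"); root != "" {
		return filepath.Clean(root)
	}
	return filepath.Clean(fallback)
}

// Layout maps each kind of state to a path under the single root.
func Layout(root string) map[string]string {
	paths := make(map[string]string, len(subdirs))
	for _, name := range subdirs {
		paths[name] = filepath.Join(root, name)
	}
	return paths
}

func main() {
	// The fallback directory is a placeholder chosen for this example.
	root := ResolveRoot(os.Getenv, filepath.Join(os.TempDir(), "seshat-example"))
	layout := Layout(root)
	for _, name := range subdirs {
		fmt.Printf("%-12s %s\n", name, layout[name])
	}
}
```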
Clear packages, explicit wiring, replaceable subsystems, and a stable engine/session/loop core.
One runtime owns the turn lifecycle, rather than outsourcing each concern to a different service.
Different interfaces exist, but they all collapse into the same runtime owner instead of forking execution logic per surface.
The engine creates sessions, sessions own the durable work state, and the loop owns turn execution until a real stop reason exists.
Stable and per-turn prompt sections are assembled with a clear cache boundary, stage overlays, and runtime context injection.
Tool resolution, validation, batching, hooks, execution, and serialization are all in the same turn path.
Permission modes, safe-path matching, approval prompts, and denial degradation are runtime behavior, not UI cosmetics.
Compaction, checkpoints, canonical transcript persistence, and restoration live under the same operational root.
Streaming, retry, fallback models, fallback providers, and circuit-breaking are first-class runtime behavior.
Project memory, user memory, caches, skills, and reusable assets attach to the runtime root instead of floating outside it.
The advantage of mono-run is not aesthetic purity. It is operational coherence. The provider request, the prompt cache boundary, the permission mode, the tool map, the transcript, the checkpoint, and the artifact paths all belong to one runtime owner. That makes the system easier to reason about, easier to recover, and much harder to split into contradictory states.
Once mono-run is clear, the next useful layer is understanding how the same runtime can expose several host surfaces while also switching session behavior between execute, plan, and pair-programming styles.