Runtime modes in Seshat are easiest to understand when you separate two axes that are often mixed together: the surface that hosts the runtime, and the execution mode that changes how the session behaves. The codebase already supports multiple entry surfaces, while the session model also tracks execution behavior such as execute, plan, and pair_programming.
A runtime mode is not just a UI choice. In Seshat, it can mean either a hosting surface like the CLI app, gRPC, HTTP API, or embedded Go SDK, or a session execution behavior like planning instead of executing. The same engine powers all of them, but not all of them change the runtime in the same way.
Surfaces are host environments and transport layers: cmd/cli, cmd/grpc, the HTTP API in seshat-ai, and direct embedding through pkg/sdk.
Execution modes are session behaviors carried in the permission context. They decide whether the agent executes, plans, or collaborates in pair-programming style.
The practical consequence is important: a CLI app and a gRPC server are not two different engines. They are two surfaces over the same runtime. By contrast, plan is not a second surface at all. It is a behavioral mode inside the same session contract.
cmd/cli: The same binary exposes single-turn run, text chat, and interactive TUI chat.
cmd/grpc: A long-running service wraps the same SDK and streams runtime events over RPC.
seshat-ai/cmd/api: seshat-ai adds REST + SSE on top of the runtime for desktop and product integrations.
pkg/sdk: Applications can instantiate the runtime directly and keep full control of callbacks and wiring.
sdk.Client: Providers, tools, permissions, browser, memory, persistence, hooks, storage, and monitoring are wired here once.
Engine: Owns long-lived runtime wiring and session creation.
Session: Owns transcript, tool surface, memory, and permission context.
Loop: Runs the turn until a legitimate terminal state exists.
This is the cleanest way to read the code. The CLI, the gRPC server, the product-layer HTTP API, and the embedded Go SDK all collapse into sdk.Client, which then wires the same engine, session, loop, providers, tools, memory, permissions, and persistence stack.
The CLI surface is more than one experience. In cmd/cli/run.go, seshat chat switches into the full TUI when stdout is a real terminal and --no-tui is not set. Otherwise it falls back to a text-mode chat loop. Separate from both, seshat run is a single-turn, non-interactive execution path.
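The selection rule for seshat chat can be sketched in a few lines of Go. This is a minimal illustration, not the code in cmd/cli/run.go: the names chatUI, selectChatUI, and stdoutIsTerminal are hypothetical, and the terminal check uses a standard-library approximation (a character-device test) rather than whatever detection the CLI actually performs.

```go
package main

import (
	"fmt"
	"os"
)

// chatUI names the two chat experiences described above (hypothetical type).
type chatUI string

const (
	uiTUI  chatUI = "tui"
	uiText chatUI = "text"
)

// selectChatUI picks the full TUI only when stdout is a terminal
// and the user did not pass --no-tui. Everything else is text mode.
func selectChatUI(stdoutIsTerminal, noTUI bool) chatUI {
	if stdoutIsTerminal && !noTUI {
		return uiTUI
	}
	return uiText
}

// stdoutIsTerminal is a stdlib-only approximation: it reports whether
// stdout is a character device. It is also true for devices such as
// /dev/null, so a real CLI would likely use a stricter check.
func stdoutIsTerminal() bool {
	info, err := os.Stdout.Stat()
	if err != nil {
		return false
	}
	return info.Mode()&os.ModeCharDevice != 0
}

func main() {
	noTUI := false // stands in for the parsed --no-tui flag
	fmt.Println(selectChatUI(stdoutIsTerminal(), noTUI))
}
```

Piping the output of this program into another command makes stdout a pipe, so the sketch selects text mode, which mirrors the fallback described above.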
The gRPC server in cmd/grpc/main.go does not implement a different runtime. It builds an SDK client for each request, persists sessions, optionally resumes by context ID, supports both unary and streaming query paths, and can restrict the live tool surface per request.
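Restricting the live tool surface per request amounts to intersecting the registered tools with a request-level allowlist. The sketch below shows that idea only. The function restrictTools, the tool names, and the rule that an empty allowlist means "no restriction" are all assumptions for illustration and are not taken from cmd/grpc/main.go.

```go
package main

import (
	"fmt"
	"sort"
)

// restrictTools returns the registered tool names that the request allows.
// Assumption: an empty allowlist means the request does not restrict tools.
func restrictTools(registered, allow []string) []string {
	if len(allow) == 0 {
		out := append([]string(nil), registered...) // copy, do not alias the input
		sort.Strings(out)
		return out
	}
	allowed := make(map[string]bool, len(allow))
	for _, name := range allow {
		allowed[name] = true
	}
	var out []string
	for _, name := range registered {
		if allowed[name] {
			out = append(out, name)
		}
	}
	sort.Strings(out) // stable ordering keeps results easy to compare
	return out
}

func main() {
	// Hypothetical tool names.
	registered := []string{"read_file", "write_file", "shell", "browser"}
	fmt.Println(restrictTools(registered, []string{"read_file", "browser"}))
	// prints [browser read_file]
}
```

Names in the allowlist that are not registered are silently ignored here. A real server might prefer to reject such a request instead.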
The HTTP API in seshat-ai/cmd/api/main.go follows the same idea from the product layer. It bootstraps an application using the Seshat runtime, then exposes REST and SSE endpoints for desktop and other clients. This means the product layer is not a parallel engine. It is another host over the same runtime foundation.
For developers, the embedded runtime is the most explicit form. sdk.NewClient() constructs the provider client, orchestrator, compactor, prompt assembler, permission engine, registry, browser manager, memory services, monitoring, and session store. If you embed Seshat this way, you do not re-create the engine logic yourself. You host it directly.
Hosts can subscribe to chunks, tool progress, runtime events, and session title generation without forking the loop.
Apps can set PromptConfig, including stage overlays such as plan mode.
Tool registries, MCP servers, browser managers, storage backends, and long-term memory are composed through the client config.
The runtime keeps permissions inside execution. Permission modes do not define the transport surface, and they do not replace execution modes. They govern approval behavior around tool use.
Default interactive mode. The runtime asks when an action should be escalated to the user.
Good default for real interactive work.
The runtime attempts automatic approval through its classifier and safety pipeline.
Fast when the action is judged safe enough to avoid user interruption.
Safe file operations inside the working directory can be approved automatically.
Useful when local code edits should flow without repeated confirmations.
Permission checks are bypassed entirely.
Powerful and dangerous. Mostly appropriate for tightly controlled automation paths.
The runtime never asks the user. If approval would be needed, the action fails immediately.
Designed for headless and background agents that cannot surface approval UI.
Fine-grained approval policy where sandbox, rules, skills, and request_permissions prompts can be controlled independently.
This is the most governance-oriented permission mode in the current code.
A session can be in plan mode while still remembering its previous approval mode in PrePlanMode. That is exactly why the code separates the two concerns.
Background agents and some automation flows tend to use never or carefully scoped bypass, because there is no interactive user there to answer prompts.
Permission modes are stored in the same session context as execution modes, but they answer a different question. They define who approves actions, what can be auto-approved, and how the runtime behaves when user interaction is impossible.
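The approval behavior described above can be read as a small decision function. The sketch below covers only the four modes named explicitly in this document (onRequest, auto, never, bypass). The type names, the decide function, and the fallback to asking the user when the auto classifier does not judge an action safe are assumptions, not the runtime's actual pipeline.

```go
package main

import "fmt"

// ApprovalMode is a hypothetical type for the approval modes named in the text.
type ApprovalMode string

const (
	ModeOnRequest ApprovalMode = "onRequest"
	ModeAuto      ApprovalMode = "auto"
	ModeNever     ApprovalMode = "never"
	ModeBypass    ApprovalMode = "bypass"
)

// Decision is the outcome of a permission check (hypothetical).
type Decision string

const (
	Allow   Decision = "allow"
	AskUser Decision = "ask_user"
	Deny    Decision = "deny"
)

// decide maps an approval mode to an outcome for one action.
// needsApproval: the action would normally be escalated to the user.
// judgedSafe: stand-in for the classifier and safety pipeline verdict.
func decide(mode ApprovalMode, needsApproval, judgedSafe bool) Decision {
	if mode == ModeBypass {
		return Allow // permission checks are bypassed entirely
	}
	if !needsApproval {
		return Allow
	}
	switch mode {
	case ModeAuto:
		if judgedSafe {
			return Allow
		}
		return AskUser // assumption: fall back to the user when not judged safe
	case ModeNever:
		return Deny // never ask; fail immediately instead
	default: // ModeOnRequest and anything unrecognized
		return AskUser
	}
}

func main() {
	for _, m := range []ApprovalMode{ModeOnRequest, ModeAuto, ModeNever, ModeBypass} {
		fmt.Println(m, decide(m, true, false))
	}
}
```

The never branch is what makes the mode usable for headless agents: the action fails fast instead of waiting on a prompt that nobody can answer.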
The session keeps a permission context with both approval mode and execution mode. That is why plan mode is no longer treated as a permission mode.
Default runtime behavior. Tools execute normally and the loop resolves real work end to end.
This is the baseline mode represented by ExecutionModeExecute.
Tool execution is suspended. The runtime asks for a concrete plan instead of performing actions immediately.
Plan files are tracked per session and approval mode is preserved separately.
Tools still run, but the collaboration style becomes more interactive and suggestion-oriented.
This is modeled as a distinct execution mode, not just a UI color change.
onRequest, auto, never, and bypass answer “who approves?”. execute, plan, and pair_programming answer “how does the agent behave?”.
The prompt layer already talks about browse as a useful behavior, but it is not yet exposed as a first-class execution-mode constant like the three modes above.
In the runtime model, execution modes live inside the session permission context. That is why the engine can preserve approval state while switching the behavior of the agent. It is also why restored sessions can come back in plan mode instead of silently dropping that context.
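A minimal sketch of that separation follows. The field names Mode, ExecutionMode, and PrePlanMode and the constant name ExecutionModeExecute come from the text. The string-based types, the other constant names, and the EnterPlan and ExitPlan methods are assumptions made for illustration.

```go
package main

import "fmt"

// Execution mode values. Only the name ExecutionModeExecute is taken from
// the text; the other constant names are assumed by analogy.
const (
	ExecutionModeExecute         = "execute"
	ExecutionModePlan            = "plan"
	ExecutionModePairProgramming = "pair_programming"
)

// PermissionContext keeps approval behavior and runtime behavior side by side.
type PermissionContext struct {
	Mode          string // approval behavior, e.g. "onRequest" or "auto"
	ExecutionMode string // runtime behavior
	PrePlanMode   string // approval mode remembered while planning
}

// EnterPlan switches behavior to planning and remembers the approval mode.
func (c *PermissionContext) EnterPlan() {
	if c.ExecutionMode == ExecutionModePlan {
		return // already planning; keep the originally remembered mode
	}
	c.PrePlanMode = c.Mode
	c.ExecutionMode = ExecutionModePlan
}

// ExitPlan returns to normal execution and restores the approval mode.
func (c *PermissionContext) ExitPlan() {
	if c.ExecutionMode != ExecutionModePlan {
		return
	}
	c.Mode = c.PrePlanMode
	c.PrePlanMode = ""
	c.ExecutionMode = ExecutionModeExecute
}

func main() {
	ctx := &PermissionContext{Mode: "auto", ExecutionMode: ExecutionModeExecute}
	ctx.EnterPlan()
	fmt.Printf("%+v\n", *ctx)
	ctx.ExitPlan()
	fmt.Printf("%+v\n", *ctx)
}
```

Because the remembered approval mode is part of the same context as the execution mode, persisting and restoring that context is enough to bring a session back in plan mode with its earlier approval behavior intact.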
This distinction is now explicit in the code. The CLI parser rejects plan as a permission mode and tells you it is now an execution mode instead. The session permission context stores Mode for approval behavior and ExecutionMode for runtime behavior.
The host surfaces present today are the CLI app, the gRPC service, the embedded Go SDK, and the HTTP API built in seshat-ai.
execute, plan, and pair_programming are represented directly in the code.
browse already appears in prompts and guidance, but it is not yet exposed at the same maturity level as the formal execution-mode constants.
That is the honest state of the project today. The architecture is already capable of hosting several surfaces and several behavior patterns, but not every planned runtime behavior should be documented as if it were already fully stabilized.
If you do not separate surfaces from behavior, the runtime becomes hard to reason about. You start treating a TUI, an API server, and a planning stage as if they were the same category. Seshat becomes clearer when you keep the hierarchy strict: one runtime core, several host surfaces, and explicit per-session behavior modes.
Once runtime modes are clear, the next technical question is how long-running work stays coherent when context grows, history accumulates, and retrieval becomes necessary.