Runtime Modes

Runtime modes in Seshat are easiest to understand when you separate two axes that are often mixed together: the surface that hosts the runtime, and the execution mode that changes how the session behaves. The codebase already supports multiple entry surfaces, while the session model also tracks execution behavior such as execute, plan, and pair_programming.

Short version

A runtime mode is not just a UI choice. In Seshat, it can mean either a hosting surface like the CLI app, gRPC, HTTP API, or embedded Go SDK, or a session execution behavior like planning instead of executing. The same engine powers all of them, but not all of them change the runtime in the same way.

Two axes, not one list

Entry surfaces

These are host environments and transport layers: cmd/cli, cmd/grpc, the HTTP API in seshat-ai, and direct embedding through pkg/sdk.

Execution modes

These are session behaviors carried in the permission context. They decide whether the agent executes, plans, or collaborates in pair-programming style.

The practical consequence is important: a CLI app and a gRPC server are not two different engines. They are two surfaces over the same runtime. By contrast, plan is not a second surface at all. It is a behavioral mode inside the same session contract.
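
The two axes can be pictured as two independent types. The sketch below is illustrative only: `Surface`, `ExecutionMode`, `SessionSetup`, and `Combinations` are hypothetical names, not identifiers from the Seshat codebase, and the text does not state that every pairing is supported in practice.

```go
package main

import "fmt"

// Surface names a host environment. Hypothetical type for illustration.
type Surface string

// ExecutionMode names a session behavior. The string values mirror the
// modes named in the text; the Go type itself is an assumption.
type ExecutionMode string

var surfaces = []Surface{"cli", "grpc", "http", "sdk"}
var modes = []ExecutionMode{"execute", "plan", "pair_programming"}

// SessionSetup pairs one value from each axis.
type SessionSetup struct {
	Surface Surface
	Mode    ExecutionMode
}

// Combinations returns every surface/mode pairing, which shows that the
// two axes vary independently of each other.
func Combinations() []SessionSetup {
	var out []SessionSetup
	for _, s := range surfaces {
		for _, m := range modes {
			out = append(out, SessionSetup{Surface: s, Mode: m})
		}
	}
	return out
}

func main() {
	for _, c := range Combinations() {
		fmt.Printf("%s + %s\n", c.Surface, c.Mode)
	}
}
```

Choosing a surface never forces a mode, and choosing a mode never forces a surface, which is the point of keeping them as separate types rather than one flat list.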

1. One engine, multiple entry surfaces

CLI Surface: CLI App (cmd/cli)

The same binary exposes single-turn run, text chat, and interactive TUI chat.

Remote Surface: gRPC Server (cmd/grpc)

A long-running service wraps the same SDK and streams runtime events over RPC.

Product Surface: HTTP API (seshat-ai/cmd/api)

seshat-ai adds REST + SSE on top of the runtime for desktop and product integrations.

Embedded Surface: Go SDK (pkg/sdk)

Applications can instantiate the runtime directly and keep full control of callbacks and wiring.

Shared Composition Boundary: sdk.Client

Providers, tools, permissions, browser, memory, persistence, hooks, storage, and monitoring are wired here once.

Engine

Owns long-lived runtime wiring and session creation.

Session

Owns transcript, tool surface, memory, and permission context.

Loop

Runs the turn until a legitimate terminal state exists.

This is the cleanest way to read the code. The CLI, the gRPC server, the product-layer HTTP API, and the embedded Go SDK all converge on sdk.Client, which then wires the same engine, session, loop, providers, tools, memory, permissions, and persistence stack.
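
A minimal sketch of that shape, assuming a stand-in `Client` type: two hosts adapt input and output but delegate the turn to the same client type. `Client`, `NewClient`, `RunTurn`, `CLIHost`, and `RPCHost` here are hypothetical and do not reproduce the real sdk.Client API.

```go
package main

import (
	"fmt"
	"strings"
)

// Client stands in for sdk.Client. Its method is hypothetical.
type Client struct{}

// NewClient is a placeholder constructor for the stand-in client.
func NewClient() *Client { return &Client{} }

// RunTurn pretends to run one turn and returns a canned reply.
func (c *Client) RunTurn(prompt string) string {
	return "echo: " + strings.TrimSpace(prompt)
}

// CLIHost adapts terminal lines to the shared client.
type CLIHost struct{ client *Client }

// Handle runs one turn and formats the reply as a terminal line.
func (h CLIHost) Handle(line string) string {
	return h.client.RunTurn(line) + "\n"
}

// RPCHost adapts request/response maps to the same client type.
type RPCHost struct{}

// Handle builds a client per request, as the gRPC server is described
// as doing, and wraps the reply in a response map.
func (RPCHost) Handle(req map[string]string) map[string]string {
	client := NewClient()
	return map[string]string{"reply": client.RunTurn(req["prompt"])}
}

func main() {
	cli := CLIHost{client: NewClient()}
	fmt.Print(cli.Handle("  hello  "))
	fmt.Println(RPCHost{}.Handle(map[string]string{"prompt": "hello"})["reply"])
}
```

Neither host contains turn logic of its own. Only the framing around the call differs.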

2. The CLI app already contains its own local interaction modes

The CLI surface is more than one experience. In cmd/cli/run.go, seshat chat switches into the full TUI when stdout is a real terminal and --no-tui is not set. Otherwise it falls back to a text-mode chat loop. Separate from both, seshat run is a single-turn, non-interactive execution path.

  • chat + TUI: interactive terminal application with session browser, settings, MCP, provider switching, and richer runtime events.
  • chat + text mode: simpler interactive loop for shells, pipes, and non-TTY environments.
  • run: one prompt in, one completed turn out, useful for scripts and direct automation.
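
The selection rule described above can be sketched as a small pure function. `SelectCLIMode` and the `CLIMode` constants are hypothetical names for illustration and are not the code in cmd/cli/run.go. In a real program the TTY check would come from a terminal-detection call, which is left to the caller here.

```go
package main

import "fmt"

// CLIMode labels the three CLI experiences described in the text.
type CLIMode string

const (
	ModeRun      CLIMode = "run"
	ModeChatTUI  CLIMode = "chat+tui"
	ModeChatText CLIMode = "chat+text"
)

// SelectCLIMode is a hypothetical helper. It picks the TUI only when
// stdout is a terminal and --no-tui was not passed.
func SelectCLIMode(subcommand string, stdoutIsTTY, noTUI bool) (CLIMode, error) {
	switch subcommand {
	case "run":
		return ModeRun, nil
	case "chat":
		if stdoutIsTTY && !noTUI {
			return ModeChatTUI, nil
		}
		return ModeChatText, nil
	default:
		return "", fmt.Errorf("unknown subcommand %q", subcommand)
	}
}

func main() {
	// Piped output: chat falls back to text mode.
	mode, err := SelectCLIMode("chat", false, false)
	fmt.Println(mode, err)
}
```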

3. gRPC and HTTP are server surfaces over the same core

The gRPC server in cmd/grpc/main.go does not implement a different runtime. It builds an SDK client for each request, persists sessions, optionally resumes by context ID, supports both unary and streaming query paths, and can restrict the live tool surface per request.

The HTTP API in seshat-ai/cmd/api/main.go follows the same idea from the product layer. It bootstraps an application using the Seshat runtime, then exposes REST and SSE endpoints for desktop and other clients. This means the product layer is not a parallel engine. It is another host over the same runtime foundation.
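
Restricting the live tool surface per request can be pictured as filtering the registered tools through a request-level allowlist. This is a sketch under assumptions: `RestrictTools`, the tool names, and the "empty allowlist means no restriction" default are all hypothetical and not taken from cmd/grpc/main.go.

```go
package main

import "fmt"

// RestrictTools returns the registered tools that the request allows,
// in registry order. An empty allowlist means "no restriction" in this
// sketch; that default is an assumption.
func RestrictTools(registered, allow []string) []string {
	if len(allow) == 0 {
		return append([]string(nil), registered...)
	}
	allowed := make(map[string]bool, len(allow))
	for _, name := range allow {
		allowed[name] = true
	}
	var out []string
	for _, name := range registered {
		if allowed[name] {
			out = append(out, name)
		}
	}
	return out
}

func main() {
	registered := []string{"shell", "read_file", "browser"}
	// Names that are not registered are ignored rather than added.
	fmt.Println(RestrictTools(registered, []string{"read_file", "unknown"}))
}
```

Filtering against the registry, rather than trusting the request's list directly, means a request can narrow the tool surface but cannot widen it.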

4. Embedded mode is the most direct developer surface

For developers, the embedded runtime is the most explicit form. sdk.NewClient() constructs the provider client, orchestrator, compactor, prompt assembler, permission engine, registry, browser manager, memory services, monitoring, and session store. If you embed Seshat this way, you do not re-create the engine logic yourself. You host it directly.

Callbacks

Hosts can subscribe to chunks, tool progress, runtime events, and session title generation without forking the loop.

Prompt control

Apps can set PromptConfig, including stage overlays such as plan mode.

Tool and MCP wiring

Tool registries, MCP servers, browser managers, storage backends, and long-term memory are composed through the client config.
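
To illustrate the callback idea, here is a minimal sketch in which a host observes streamed chunks and tool progress without touching the loop itself. `Callbacks`, `OnChunk`, `OnToolProgress`, and `RunTurn` are hypothetical names, and the real client config may look quite different.

```go
package main

import "fmt"

// Callbacks is a hypothetical stand-in for the hooks a host registers.
type Callbacks struct {
	OnChunk        func(text string)
	OnToolProgress func(tool, status string)
}

// RunTurn simulates a loop that reports one tool call and then streams
// chunks. Nil callbacks are skipped, so a host subscribes only to the
// events it cares about.
func RunTurn(chunks []string, cb Callbacks) {
	if cb.OnToolProgress != nil {
		cb.OnToolProgress("read_file", "done")
	}
	for _, c := range chunks {
		if cb.OnChunk != nil {
			cb.OnChunk(c)
		}
	}
}

func main() {
	RunTurn([]string{"Hello, ", "world"}, Callbacks{
		OnChunk: func(text string) { fmt.Print(text) },
	})
	fmt.Println()
}
```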

5. Permission modes are part of the same runtime contract

Permission Modes

Permissions answer who approves and when.

The runtime handles permissions as part of execution. These modes do not define the transport surface, and they do not replace execution modes. They govern approval behavior around tool use.

onRequest
On Request

Default interactive mode. The runtime asks when an action should be escalated to the user.

Good default for real interactive work.

auto
Auto

The runtime attempts automatic approval through its classifier and safety pipeline.

Fast when the action is judged safe enough to avoid user interruption.

acceptEdits
Accept Edits

Safe file operations inside the working directory can be approved automatically.

Useful when local code edits should flow without repeated confirmations.

bypass
Bypass

Permission checks are bypassed entirely.

Powerful and dangerous. Mostly appropriate for tightly controlled automation paths.

never
Never

The runtime never asks the user. If approval would be needed, the action fails immediately.

Designed for headless and background agents that cannot surface approval UI.

granular
Granular

Fine-grained approval policy where sandbox, rules, skills, and request_permissions prompts can be controlled independently.

This is the most governance-oriented permission mode in the current code.

Execution and permissions combine

A session can be in plan mode while still remembering its previous approval mode in PrePlanMode. That is exactly why the code separates the two concerns.

Headless paths

Background agents and some automation flows tend to use never or carefully scoped bypass, because there is no interactive user there to answer prompts.

Permission modes are stored in the same session context as execution modes, but they answer a different question. They define who approves actions, what can be auto-approved, and how the runtime behaves when user interaction is impossible.
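
One way to picture how these modes differ is a decision function that maps a mode and an action to allow, ask, or deny. This is a simplified sketch, not the runtime's permission engine: `Decide`, `Action`, and its fields are hypothetical, the fallback to asking in auto and acceptEdits is an assumption, and granular is deliberately left unmodeled.

```go
package main

import "fmt"

// PermissionMode uses the mode names from the text as string values.
type PermissionMode string

const (
	OnRequest   PermissionMode = "onRequest"
	Auto        PermissionMode = "auto"
	AcceptEdits PermissionMode = "acceptEdits"
	Bypass      PermissionMode = "bypass"
	Never       PermissionMode = "never"
)

// Decision is the outcome of a permission check in this sketch.
type Decision string

const (
	Allow Decision = "allow"
	Ask   Decision = "ask"
	Deny  Decision = "deny"
)

// Action carries the facts this sketch needs. All fields are hypothetical.
type Action struct {
	NeedsApproval  bool // would this action normally be escalated?
	ClassifiedSafe bool // verdict of an automatic safety classifier
	WorkdirEdit    bool // safe file operation inside the working directory
}

// Decide maps a mode and an action to a decision.
func Decide(mode PermissionMode, a Action) Decision {
	if mode == Bypass || !a.NeedsApproval {
		return Allow
	}
	switch mode {
	case Auto:
		if a.ClassifiedSafe {
			return Allow
		}
		return Ask // assumption: unsafe verdicts fall back to the user
	case AcceptEdits:
		if a.WorkdirEdit {
			return Allow
		}
		return Ask // assumption: everything else is still escalated
	case Never:
		return Deny // no approval UI exists, so fail immediately
	default:
		return Ask // onRequest, plus anything unmodeled such as granular
	}
}

func main() {
	risky := Action{NeedsApproval: true}
	for _, m := range []PermissionMode{OnRequest, Auto, AcceptEdits, Bypass, Never} {
		fmt.Printf("%-12s %s\n", m, Decide(m, risky))
	}
}
```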

6. Execution modes are session behavior

Execution Modes

Modes change runtime behavior, not transport.

The session keeps a permission context with both approval mode and execution mode. That is why plan mode is no longer treated as a permission mode.

execute
Execute

Default runtime behavior. Tools execute normally and the loop resolves real work end to end.

This is the baseline mode represented by ExecutionModeExecute.

plan
Plan

Tool execution is suspended. The runtime asks for a concrete plan instead of performing actions immediately.

Plan files are tracked per session and approval mode is preserved separately.

pair_programming
Pair Programming

Tools still run, but the collaboration style becomes more interactive and suggestion-oriented.

This is modeled as a distinct execution mode, not just a UI color change.

Important distinction

onRequest, auto, acceptEdits, bypass, never, and granular answer “who approves?”. execute, plan, and pair_programming answer “how does the agent behave?”.

About browse

The prompt layer already talks about browse as a useful behavior, but it is not yet exposed as a first-class execution-mode constant like the three modes above.

In the runtime model, execution modes live inside the session permission context. That is why the engine can preserve approval state while switching the behavior of the agent. It is also why restored sessions can come back in plan mode instead of silently dropping that context.
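
The following sketch shows how a permission context could carry both concerns and preserve approval state across a plan-mode round trip. The field names Mode, ExecutionMode, and PrePlanMode and the constant ExecutionModeExecute come from the text. `ExecutionModePlan`, the `EnterPlan` and `ExitPlan` methods, and the choice to return to execute on exit are assumptions made for illustration.

```go
package main

import "fmt"

// PermissionMode is the approval mode, ExecutionMode the behavior mode.
type PermissionMode string
type ExecutionMode string

const (
	ExecutionModeExecute ExecutionMode = "execute"
	ExecutionModePlan    ExecutionMode = "plan" // assumed name, by analogy
)

// PermissionContext mirrors the three field names mentioned in the text.
// The field types are assumptions.
type PermissionContext struct {
	Mode          PermissionMode
	ExecutionMode ExecutionMode
	PrePlanMode   PermissionMode
}

// EnterPlan remembers the current approval mode and switches behavior.
// Calling it twice does not overwrite the remembered mode.
func (p *PermissionContext) EnterPlan() {
	if p.ExecutionMode == ExecutionModePlan {
		return
	}
	p.PrePlanMode = p.Mode
	p.ExecutionMode = ExecutionModePlan
}

// ExitPlan restores the remembered approval mode. Returning to execute
// is an assumption of this sketch.
func (p *PermissionContext) ExitPlan() {
	if p.ExecutionMode != ExecutionModePlan {
		return
	}
	p.Mode = p.PrePlanMode
	p.PrePlanMode = ""
	p.ExecutionMode = ExecutionModeExecute
}

func main() {
	ctx := PermissionContext{Mode: "acceptEdits", ExecutionMode: ExecutionModeExecute}
	ctx.EnterPlan()
	fmt.Println(ctx.ExecutionMode, ctx.PrePlanMode)
	ctx.ExitPlan()
	fmt.Println(ctx.ExecutionMode, ctx.Mode)
}
```

Because the struct is plain data, persisting it with a session is enough for a restored session to come back in plan mode with its earlier approval mode still recoverable.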

7. Plan mode is no longer a permission mode

This distinction is now explicit in the code. The CLI parser rejects plan as a permission mode and tells you it is now an execution mode instead. The session permission context stores Mode for approval behavior and ExecutionMode for runtime behavior.

  • Approval mode answers: should this action be auto-approved, denied, or asked?
  • Execution mode answers: should this turn execute, plan, or collaborate?
  • Exiting plan mode restores the previous approval mode instead of flattening everything into one flag.
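
A parser that enforces this split might look like the sketch below. `ParsePermissionMode`, `ErrPlanIsExecutionMode`, and the error wording are hypothetical and do not reproduce the CLI's actual parser or its message.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPlanIsExecutionMode is a hypothetical sentinel error. The real CLI
// message may be worded differently.
var ErrPlanIsExecutionMode = errors.New(
	`"plan" is an execution mode, not a permission mode`)

// ParsePermissionMode accepts the six permission modes named in the text
// and rejects "plan" with a pointer to where it now belongs.
func ParsePermissionMode(s string) (string, error) {
	switch s {
	case "onRequest", "auto", "acceptEdits", "bypass", "never", "granular":
		return s, nil
	case "plan":
		return "", ErrPlanIsExecutionMode
	default:
		return "", fmt.Errorf("unknown permission mode %q", s)
	}
}

func main() {
	for _, in := range []string{"auto", "plan", "yolo"} {
		mode, err := ParsePermissionMode(in)
		fmt.Printf("%q -> mode=%q err=%v\n", in, mode, err)
	}
}
```

Giving plan its own error, instead of letting it fall through to the generic unknown-mode case, tells users of the older flag value where the setting moved.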

8. What is fully shipped versus still emerging

Shipped surfaces

CLI app, gRPC service, embedded Go SDK, and the HTTP API built in seshat-ai.

Shipped execution modes

execute, plan, and pair_programming are represented directly in the code.

Emerging behavior

browse already appears in prompts and guidance, but it is not yet exposed at the same maturity level as the formal execution-mode constants.

That is the honest state of the project today. The architecture is already capable of hosting several surfaces and several behavior patterns, but not every planned runtime behavior should be documented as if it were already fully stabilized.

Why this matters

If you do not separate surfaces from behavior, the runtime becomes hard to reason about. You start treating a TUI, an API server, and a planning stage as if they were the same category. Seshat becomes clearer when you keep the hierarchy strict: one runtime core, several host surfaces, and explicit per-session behavior modes.

Next concept

Once runtime modes are clear, the next technical question is how long-running work stays coherent when context grows, history accumulates, and retrieval becomes necessary.