Technical Hypotheses

Seshat is built on a set of explicit technical and product hypotheses about what agent systems need in order to become useful work systems. These hypotheses shape the runtime architecture, the platform layering, the team model, the safety posture, and even the open-source strategy. They should be made visible, because visible bets can be tested.

Overview

A mature architecture is not only a pile of features. It is a coherent answer to a set of beliefs about the world. If those beliefs stay hidden, contributors do not know what the project is optimizing for. If they are explicit, the community can help validate, challenge, refine, or replace them.

Hypothesis Map

These hypotheses fall into four broad families: execution, organization, memory and governance, and ecosystem direction. Together they explain why Seshat looks the way it does today and why the roadmap stages are ordered the way they are.

Execution thesis

AI needs a runtime, not only a chat surface

The work system matters: sessions, tools, permissions, memory, recovery, and provider control.

Organization thesis

The hard problem is organized work

Missions, roles, explicit coordination, and durable collaboration become the next layer of value.

Center Of Gravity

Seshat is a set of testable bets, not only a feature list

The project makes claims about how agent systems should be built, governed, extended, and eventually turned into environments for work, automation, and learning. Making those bets explicit is healthier than hiding them behind generic product language.

Agent model

Complete agents over disposable workers

Sub-agents matter, but they are not the core organizational unit.

Memory model

Structured shared memory

Explicit mission memory should beat opaque pooled context for serious coordination.

Ecosystem model

Open foundation

Community, multi-provider portability, skills, MCP, and self-hosting are part of the architectural thesis.

Detailed Table

The table below treats each thesis as something that must eventually earn its place through real usage, not only through architectural elegance.

Hypothesis	Architectural Implication	What Would Validate It	What Would Weaken It
Chat is not enough; AI needs an execution layer.	The core must own sessions, tools, persistence, permissions, and recovery instead of stopping at prompt-response UX.	Users consistently choose the runtime for real work that continues over time rather than using it only as a chat shell.	If most value comes from simple chat surfaces and not governed execution, the runtime thesis is overstated.
The next value jump is organized work, not only stronger solo agents.	The architecture must leave room for missions, roles, explicit coordination, and durable multi-agent collaboration.	Sub-agent flows, mailbox teams, and structured mission work produce better real outcomes than isolated agent sessions alone.	If teams mostly want one excellent agent and coordination remains marginal, the organization thesis weakens.
Human work patterns are the right inspiration for agent coordination.	Mailboxes, task boards, reports, blockers, and roles matter more than opaque graph magic or hidden global state.	Explicit coordination surfaces make agent behavior more governable, inspectable, and usable in real organizations.	If invisible orchestration consistently outperforms explicit coordination in practice, this design direction needs revision.
The fundamental unit is the complete agent, not the disposable sub-agent.	Identity, history, permissions, memory, and responsibility belong to full sessions first; sub-agents stay an internal execution primitive.	Persistent agents with roles and histories prove more useful for serious missions than purely ephemeral worker patterns.	If teams overwhelmingly prefer stateless specialist workers and durable identity adds little value, the model is too heavy.
Shared memory should stay explicit, minimal, and structured.	Mission memory, reports, and durable decisions should be visible artifacts, not one giant hidden context pool.	Teams can understand why an agent acted, what it knew, and which decisions were shared across the mission.	If explicit memory boundaries become too costly and hidden shared state is what actually works, the model may be too strict.
Digital work can be progressively automated when it becomes observable, structured, delegable, and governed.	The runtime and platform should support automation, scheduled execution, auditable actions, and domain-specific operational surfaces.	Organizations successfully automate real workflows through Seshat with human supervision and measurable utility.	If most target workflows remain too ambiguous or too governance-heavy to automate reliably, the automation thesis must narrow.
Education belongs inside the same agentic foundation.	The system should be able to carry teaching, explanation, exercises, progress continuity, and source-grounded guidance without a separate execution philosophy.	Learning-oriented agents can reuse the same runtime and still provide credible teaching and progression support.	If education requires a fundamentally different core model, then it should remain an adjacent product rather than a native surface.
An open, self-hostable, multi-provider foundation is strategically better than a closed single-vendor stack.	Provider portability, MCP, skills, SDKs, and community extension paths are architectural commitments, not marketing extras.	Developers, operators, and domain experts extend the system because the foundation is open enough to be worth building on.	If the ecosystem never materializes and users mainly want one tightly coupled vendor stack, openness alone is not enough differentiation.
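The shared-memory row above can be made concrete with a small sketch. It is not Seshat's actual data model: the names `MemoryEntry`, `MissionMemory`, `record`, and `visible_to`, as well as the three entry kinds, are assumptions chosen for illustration. The point is only that each shared fact is a visible artifact with an author and an explicit audience, so "what did this agent know?" becomes a simple query rather than a guess about a hidden context pool.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical entry kinds. The table mentions reports, durable decisions,
# and blockers; this exact set is an assumption for illustration.
KINDS = {"decision", "report", "blocker"}


@dataclass(frozen=True)
class MemoryEntry:
    kind: str               # one of KINDS
    author: str             # agent that wrote the entry
    text: str               # human-readable content
    shared_with: frozenset  # agents allowed to read the entry


@dataclass
class MissionMemory:
    entries: List[MemoryEntry] = field(default_factory=list)

    def record(self, kind, author, text, shared_with):
        """Append an explicit, attributable entry to the mission memory."""
        if kind not in KINDS:
            raise ValueError(f"unknown entry kind: {kind}")
        # The author can always read its own entries.
        readers = frozenset(shared_with) | {author}
        entry = MemoryEntry(kind, author, text, readers)
        self.entries.append(entry)
        return entry

    def visible_to(self, agent):
        """Return what a given agent could see, in recording order."""
        return [e for e in self.entries if agent in e.shared_with]


memory = MissionMemory()
memory.record("decision", "planner", "Ship the v1 API without pagination", ["builder"])
memory.record("report", "builder", "Endpoint implemented, tests passing", ["planner", "reviewer"])
memory.record("blocker", "reviewer", "Missing rate-limit spec", ["planner"])

for e in memory.visible_to("builder"):
    print(e.kind, "by", e.author, "-", e.text)
```

In this sketch the builder sees the planner's decision and its own report, but not the reviewer's blocker, because that entry was shared only with the planner. Whether such strict boundaries are worth their cost is exactly what the last column of the table leaves open.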

What Good Validation Looks Like

Runtime evidence

Daily serious use

Developers, operators, and products rely on the runtime because the execution contract is more trustworthy than stitching their own stack together.

Team evidence

Explicit coordination beats hidden magic

Durable teams, reports, mission memory, and mailbox coordination prove easier to govern and observe than opaque multi-agent abstractions.

Automation evidence

Real workflows become governable

Organizations use Seshat to automate repeatable digital work while keeping approval, traceability, and boundary control.

Community evidence

The ecosystem builds on the core

Skills, MCP integrations, SDK usage, and domain-specific experiments grow because the foundation is open enough and stable enough to deserve extension.
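The automation evidence above rests on three properties: approval, traceability, and boundary control. A minimal sketch of how they could fit together follows. It assumes a hypothetical `GovernedRunner` with an allow-list and a human approval hook; none of these names come from Seshat itself, and a real runtime would persist the audit log rather than keep it in memory.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AuditRecord:
    action: str
    agent: str
    outcome: str  # "executed", "denied", or "out_of_bounds"


@dataclass
class GovernedRunner:
    allowed_actions: set                 # boundary control: explicit allow-list
    approve: Callable[[str, str], bool]  # approval hook (hypothetical signature)
    audit_log: List[AuditRecord] = field(default_factory=list)

    def run(self, agent, action, fn):
        """Run fn only if the action is in bounds and approved; always log."""
        if action not in self.allowed_actions:
            outcome, result = "out_of_bounds", None
        elif not self.approve(agent, action):
            outcome, result = "denied", None
        else:
            outcome, result = "executed", fn()
        # Traceability: every attempt is recorded, including refusals.
        self.audit_log.append(AuditRecord(action, agent, outcome))
        return result


# Stand-in approval policy: a human would normally decide; here invoices
# are always refused so the example is deterministic.
runner = GovernedRunner(
    allowed_actions={"send_invoice", "archive_ticket"},
    approve=lambda agent, action: action != "send_invoice",
)

runner.run("billing-agent", "archive_ticket", lambda: "archived")
runner.run("billing-agent", "send_invoice", lambda: "sent")
runner.run("billing-agent", "delete_database", lambda: "deleted")

print([r.outcome for r in runner.audit_log])
```

The design choice worth noting is that refused and out-of-bounds attempts are logged alongside successful ones, so the audit trail shows what agents tried to do and not only what they did.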

When The Project Should Revise Its Bets

Explicit hypotheses are useful only if they can be challenged. If the market, users, or architecture consistently show that a thesis is wrong, the project should update the model rather than defend it as ideology.

If most users only want a strong solo agent, Level 3 should stay narrow and pragmatic rather than becoming a forced grand theory.
If explicit shared memory boundaries prove too rigid, the team runtime should evolve them instead of treating them as doctrine.
If open extensibility fails to create a real ecosystem, the product still needs to justify the added complexity with direct user value.
If certain verticals such as education need genuinely different execution models, they should branch at the product layer without corrupting the runtime core.

For the wider product narrative behind these hypotheses, continue with "What is Seshat?" For the execution-side argument, continue with "What is an Agent Runtime?"