MVP Spec v2: Build anything. Break nothing.

Open-source containerized AI runtime. Any agent, any model. Full autonomy by default. Recipes as the viral unit.

MVP Spec v2: [codename TBD]

Source: Gods-tier session, 2026-03-10
Status: DRAFT, needs validation against real users
Supersedes: mvp-spec.md v1


Positioning

Build anything. Break nothing.

An open-source, containerized AI runtime. Any agent. Any model. Local or cloud. Private by default. Free by default. Full autonomy by default — because the sandbox makes “dangerously” irrelevant.

Not a coding agent. Not a chatbot. Not a platform. The layer underneath all of them.

Why now

  • Vibe coding is exploding. Millions of new users discovering AI-assisted building.
  • Every provider is a walled garden. Claude-only, Codex-only, Gemini-only.
  • Local models (Ollama, llama.cpp) are getting good enough and growing fast.
  • The audience who wants AI but won’t pay $30/month (or can’t send data to the cloud) has zero tooling.
  • Nobody has made multi-provider agent interaction possible — let alone visible, shareable, or trustworthy.

Target users

Door 1: Builders

“I want to build something with AI.”

Ranges from the TikTok vibe-coder to the experienced developer running multi-agent workflows. Unified by: they want AI to make things, and they want it to work without risk.

Door 2: Everyone else

“I want to use AI privately, my way.”

The student simulating conversations. The professional rehearsing a presentation against simulated stakeholders. The person who wants a persistent, customizable AI companion that lives on their machine, not someone’s cloud. Unified by: they want AI that’s private, free, and configured exactly how they want it.

Same room

Both users run the same runtime. Both create and consume recipes. Both generate the event stream Signet will eventually index. The architecture doesn’t distinguish between them.

The product

What the user experiences

Wave 1 user (zero technical knowledge):

  • Clicks a link or runs one command
  • Lands in a working AI environment
  • Types what they want
  • Things happen. Nothing breaks. Everything is saved.

Wave 2 user (some technical knowledge):

  • docker run [name]
  • Pre-configured agent + local model, ready immediately
  • Full permissions by default (container = sandbox)
  • Swaps in their preferred model or API keys if desired

Wave 3 user (power user):

  • Custom recipes: their own agent configs, model selections, role definitions
  • Multiple agents from different providers in the same container
  • Shares recipes with others. Forks other people’s recipes.
  • Event timeline showing everything that happened

What’s underneath (user doesn’t see this)

  • Docker/OCI container runtime
  • Every action is a signed Nostr event in a local store
  • Recipe format makes configurations portable and versionable
  • Event format is relay-compatible from day one (publish later, format doesn’t change)

Recipes

A recipe is a saved AI configuration. It’s the most valuable artifact the product generates.

What’s in a recipe

my-recipe/
├── recipe.json         # Metadata: name, description, author, parent recipe
├── Dockerfile          # Environment definition (optional, for custom setups)
├── agents/
│   ├── AGENTS.md       # Agent behavior configuration
│   └── roles/          # Role-specific system prompts
├── models/
│   └── config.json     # Model selection, parameters, fallback chain
└── README.md           # Human-readable description

At minimum, a recipe can be a single system prompt file. At maximum, it’s a complete multi-agent environment definition. The format scales with complexity.

Recipe properties

  • Portable: runs identically on any machine with the runtime
  • Versionable: it’s just files in a directory, tracked by git
  • Forkable: change recipe.json parent reference, modify contents, publish
  • Shareable: send a link, recipient gets your exact setup
  • Trackable: recipe hash is embedded in session events — Signet can correlate configurations with outcomes
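
As a rough illustration of the Trackable and Forkable properties, here is a minimal Python sketch of how a recipe hash and a fork could work. Everything specific in it is an assumption for illustration: the hashing scheme (SHA-256 over sorted relative paths plus file bytes), the function names, and the "parent" key in recipe.json are not defined by this spec.

```python
import hashlib
import json
import shutil
from pathlib import Path


def recipe_hash(recipe_dir: str) -> str:
    """Return a deterministic SHA-256 hex digest of a recipe directory.

    Assumed scheme: files are visited in sorted relative-path order and
    each contributes its path and bytes, so edits and renames both
    change the hash. Anything under .git is skipped.
    """
    root = Path(recipe_dir)
    files = [p for p in root.rglob("*") if p.is_file()]
    files.sort(key=lambda p: p.relative_to(root).as_posix())
    digest = hashlib.sha256()
    for path in files:
        rel = path.relative_to(root)
        if ".git" in rel.parts:
            continue
        rel_bytes = rel.as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each field so file boundaries are unambiguous.
        digest.update(len(rel_bytes).to_bytes(8, "big") + rel_bytes)
        digest.update(len(data).to_bytes(8, "big") + data)
    return digest.hexdigest()


def fork_recipe(src_dir: str, dst_dir: str, new_name: str) -> str:
    """Copy a recipe and record the source hash as its parent.

    The "name" and "parent" keys are hypothetical recipe.json fields.
    """
    parent = recipe_hash(src_dir)
    shutil.copytree(src_dir, dst_dir, ignore=shutil.ignore_patterns(".git"))
    meta_path = Path(dst_dir) / "recipe.json"
    meta = {}
    if meta_path.exists():
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
    meta["name"] = new_name
    meta["parent"] = parent
    meta_path.write_text(json.dumps(meta, indent=2, sort_keys=True), encoding="utf-8")
    return parent
```

Because the fork rewrites recipe.json, its own hash differs from the parent's, which is what would let an indexer tell a recipe apart from its descendants.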

Recipe examples

| Recipe | Creator | Shared because |
| --- | --- | --- |
| “Three-agent code review” | Developer | Claude reviews architecture, Codex checks implementation, local model validates tests |
| “Board meeting rehearsal” | CFO | Four simulacra of board members with distinct personalities and objection patterns |
| “Crush simulator v3” | College student | Three personality variants to explore relationship dynamics privately |
| “Explain like my favorite professor” | Student | Chemistry tutor calibrated to a specific teaching style |
| “Startup pitch stress-test” | Founder | Simulated VC panel that asks hard questions and doesn’t softball |

All of these are the same thing: a configuration file loaded into a local AI runtime. The product doesn’t care what the recipe is for.

MVP scope

Phase 0: Proof of concept (now → 2 weeks)

Ship: A Docker image that provides a working AI coding/chat environment with zero configuration.

| In | Why |
| --- | --- |
| One-command launch (docker run [name]) | Zero friction |
| OpenCode pre-installed | Best OSS agent, Go, provider-agnostic |
| Ollama + default local model | Free, private, no API key needed |
| Cloud model support via env vars | For users with existing API keys |
| Full agent permissions by default | Container = sandbox = no risk |
| Browser access via localhost | Non-terminal users |
| Persistent workspace via volume mount | Work survives restart |
| Git pre-installed, auto-initialized | Undo = git reset --hard |

| Out | Why not yet |
| --- | --- |
| Event capture | Validate the experience first |
| Recipes | Let them emerge from usage |
| Multi-agent | Single agent must feel right first |
| Any UI beyond what OpenCode provides | Don’t build what exists |
| Web-hosted version (no Docker) | Docker-first, hosted later |

Success criteria: A non-engineer goes from docker run ... to “describing what they want and seeing results” in under 5 minutes. No configuration. No decisions. Nothing breaks.
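
One way the Phase 0 launch could be assembled is sketched below in Python. It only builds the argument list for docker run; the image name, the /workspace container path, port 3000, and the environment variable names are placeholders, not decisions made in this spec. The docker flags themselves (--rm, -it, -v, -p, -e) are standard.

```python
import os
from typing import List, Sequence


def build_run_command(
    image: str,
    workspace: str,
    port: int = 3000,
    passthrough_env: Sequence[str] = (),
) -> List[str]:
    """Build a `docker run` argv for the zero-config launch.

    image, port and the /workspace mount point are illustrative
    assumptions; nothing here is a fixed interface.
    """
    cmd = ["docker", "run", "--rm", "-it"]
    # Persistent workspace: bind-mount a host directory into the container.
    cmd += ["-v", f"{os.path.abspath(workspace)}:/workspace"]
    # Browser access via localhost only, not exposed to the network.
    cmd += ["-p", f"127.0.0.1:{port}:{port}"]
    # Cloud model support: forward only variables that are actually set.
    # `-e NAME` (no value) copies the host value, keeping secrets out of argv.
    for name in passthrough_env:
        if name in os.environ:
            cmd += ["-e", name]
    cmd.append(image)
    return cmd
```

A caller would hand the result to subprocess.run(cmd, check=True). Forwarding variables by name rather than by value keeps API keys out of shell history and process listings.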

Validation:

  • [ ] 5 non-engineer humans attempt the experience with zero guidance
  • [ ] Measure time-to-first-interaction
  • [ ] Count “did anything scary happen” incidents (target: zero)
  • [ ] Verify OpenCode + Ollama run cleanly in Docker with acceptable cold start
  • [ ] Verify local model quality is “good enough” for simple tasks on a MacBook Air
  • [ ] Document every point where the user got confused or stuck

Phase 1: Events + observability (2-4 weeks)

Ship: Silent event capture with a simple “what happened” view.

| In | Why |
| --- | --- |
| Local Nostr event signing + SQLite store | Foundation for everything later |
| Capture file changes, commands, git ops | The receipts |
| Minimal timeline view (web UI) | “What did my agent do?” |
| Auto-checkpointing before agent actions | One-click undo to any point |
| Recipe format v0 | Save your configuration as a portable directory |

Success criteria: After any session, user can see a clear timeline of everything that happened and undo to any checkpoint. User can save their setup as a recipe and load it in a fresh container.
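
To make the Phase 1 store and timeline concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the event names ("checkpoint", "action"), and the "commit" field are assumptions for illustration, and signing is left out entirely; the point is only to show how a timeline query and "undo to the last checkpoint" could fall out of an append-only table.

```python
import json
import sqlite3
import time
from typing import List, Optional, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local event store. Schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS events (
               seq        INTEGER PRIMARY KEY AUTOINCREMENT,
               created_at INTEGER NOT NULL,
               what       TEXT NOT NULL,
               content    TEXT NOT NULL
           )"""
    )
    return conn


def record(conn: sqlite3.Connection, what: str, content: dict,
           created_at: Optional[int] = None) -> int:
    """Append one event and return its sequence number."""
    ts = int(time.time()) if created_at is None else created_at
    cur = conn.execute(
        "INSERT INTO events (created_at, what, content) VALUES (?, ?, ?)",
        (ts, what, json.dumps(content, sort_keys=True)),
    )
    conn.commit()
    return cur.lastrowid


def timeline(conn: sqlite3.Connection) -> List[Tuple[int, str, dict]]:
    """Return every event in the order it was recorded."""
    rows = conn.execute(
        "SELECT seq, what, content FROM events ORDER BY seq"
    ).fetchall()
    return [(seq, what, json.loads(content)) for seq, what, content in rows]


def last_checkpoint_before(conn: sqlite3.Connection, seq: int) -> Optional[str]:
    """Return the git commit of the newest checkpoint before `seq`, if any."""
    row = conn.execute(
        "SELECT content FROM events "
        "WHERE what = 'checkpoint' AND seq < ? ORDER BY seq DESC LIMIT 1",
        (seq,),
    ).fetchone()
    return json.loads(row[0])["commit"] if row else None
```

Undoing an action would then mean looking up last_checkpoint_before(conn, action_seq) and resetting the workspace to that commit.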

Phase 2: Multi-agent + recipe sharing (4-8 weeks)

Ship: Multiple agents in one container, recipe distribution.

| In | Why |
| --- | --- |
| Multiple agent sessions with isolated worktrees | Claude + Codex + local model together |
| Cross-agent event visibility | See what all agents are doing |
| Cross-provider interaction | The spectacle nobody else can offer |
| Recipe packaging and sharing (git-based) | Viral mechanism activated |
| Recipe forking with parent tracking | Recipes improve through competition |

Success criteria: A user can run Claude + Codex + local model in one container, share their setup as a recipe, and someone else can reproduce it in one command.
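
The "isolated worktrees" idea could look roughly like the following Python sketch, which only generates the git commands rather than running them. The branch naming (agent/<name>), the worktree root directory, and the agent names are assumptions; git worktree add -b <branch> <path> <commit-ish> is standard git.

```python
import os
from typing import List


def worktree_commands(repo: str, worktree_root: str, agents: List[str],
                      base: str = "HEAD") -> List[List[str]]:
    """Return one `git worktree add` argv per agent.

    Each agent gets its own branch and its own checkout directory, so
    concurrent agents never write into the same working tree.
    Branch and directory naming here is hypothetical.
    """
    commands = []
    for agent in agents:
        branch = f"agent/{agent}"
        path = os.path.join(worktree_root, agent)
        commands.append(
            ["git", "-C", repo, "worktree", "add", "-b", branch, path, base]
        )
    return commands
```

Each command could be executed with subprocess.run(cmd, check=True). Since every agent commits to its own branch, merging or comparing their output afterwards is ordinary git.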

Phase 3: Relay sync + Signet bootstrap (8-16 weeks)

Ship: Optional network publication. Signet begins indexing.

| In | Why |
| --- | --- |
| Optional relay publication of events | “git push” moment |
| Signet indexes published events | Recipe reputation begins |
| Recipe browsing with outcome data | “Which recipes actually work?” |
| Agent reputation across sessions | “Which model/role combos are reliable?” |

Success criteria: A user can browse recipes sorted by verified outcomes, fork a high-performing one, and benefit from someone else’s hard-won configuration knowledge.

Architecture

┌─────────────────────────────────────────────────┐
│                   SIGNET                        │
│   Reputation │ Discovery │ Trust │ Receipts     │
│   "See what matters and why. With receipts."    │
├─────────────────────────────────────────────────┤
│               RELAY NETWORK                     │
│          Optional. "git push" moment.           │
├─────────────────────────────────────────────────┤
│                  RECIPES                        │
│    Portable, versioned AI configurations        │
│    Coding pipelines ←→ Personality configs      │
├─────────────────────────────────────────────────┤
│          AGENTS (not ours — any agent)          │
│  OpenCode │ Claude Code │ Codex │ Gemini CLI    │
├─────────────────────────────────────────────────┤
│          MODELS (not ours — any model)          │
│  Llama │ Qwen │ Claude │ GPT │ Gemini │ Mistral │
├─────────────────────────────────────────────────┤
│  ═══════════ [OUR PRODUCT] ═══════════════════  │
│                                                 │
│   Containerized AI runtime                      │
│                                                 │
│   • Sandbox (ruin impossible)                   │
│   • Any agent, any model, any use case          │
│   • Every action → signed Nostr event (local)   │
│   • Recipe format (portable configurations)     │
│   • Multi-agent via worktrees                   │
│   • Browser + CLI access                        │
│                                                 │
├─────────────────────────────────────────────────┤
│              DOCKER / OCI RUNTIME               │
├─────────────────────────────────────────────────┤
│              HARDWARE (yours)                   │
└─────────────────────────────────────────────────┘

Event layer (invisible to user)

Every session generates signed Nostr events stored locally. The user sees “timeline” and “undo.” The protocol sees cryptographically attributable, relay-publishable, Signet-indexable events.

Minimum event vocabulary

| What | Content | Stored |
| --- | --- | --- |
| Session start | Agent, model, recipe hash, environment | Always |
| Action taken | File change / command / message (with diff) | Always |
| Checkpoint | Git commit hash, state snapshot | Always |
| Error | What failed, what agent attempted | Always |
| Session end | Duration, action count, outcome | Always |

Events use NIP-01 format with custom tags. Same events work locally and on relays with zero format changes.
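
For reference, NIP-01 defines an event id as the SHA-256 of a compact JSON array: [0, pubkey, created_at, kind, tags, content]. The Python sketch below computes that id for an unsigned event. The tag names in the usage note ("d", "recipe") and the choice of kind are illustrative assumptions, and the Schnorr signature that a real event also needs is deliberately omitted.

```python
import hashlib
import json
from typing import Dict, List


def serialize_event(pubkey_hex: str, created_at: int, kind: int,
                    tags: List[List[str]], content: str) -> str:
    """Serialize an event the way NIP-01 specifies for hashing.

    Caveat: json.dumps escapes rare control characters as \\u00XX,
    whereas NIP-01 asks for them verbatim, so content containing such
    characters would need a custom serializer.
    """
    return json.dumps(
        [0, pubkey_hex, created_at, kind, tags, content],
        separators=(",", ":"),  # no whitespace between tokens
        ensure_ascii=False,     # keep non-ASCII text as UTF-8
    )


def build_unsigned_event(pubkey_hex: str, created_at: int, kind: int,
                         tags: List[List[str]], content: str) -> Dict:
    """Return the event fields plus its id; the `sig` field is left out."""
    payload = serialize_event(pubkey_hex, created_at, kind, tags, content)
    event_id = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {
        "id": event_id,
        "pubkey": pubkey_hex,
        "created_at": created_at,
        "kind": kind,
        "tags": tags,
        "content": content,
    }
```

A session-start event might be built as build_unsigned_event(pubkey, ts, 30078, [["d", "session-1"], ["recipe", recipe_hash]], "..."). Because the id depends only on these fields, the same event hashes identically in the local store and on a relay.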

Identity model

One keypair per session instance. Auto-generated, stored locally. Role, model, and tier are metadata tags, not part of the key. This means:

  • Every agent action is cryptographically attributable
  • Different sessions of the same model have distinct identities
  • Reputation can be built per-configuration, not just per-model
  • The user doesn’t manage keys — it’s invisible

Distribution / viral mechanism

The product has two natural viral loops:

Recipe sharing

“Try my setup” → one command → recipient has your exact AI configuration running. Developers share coding pipelines. Everyone else shares personality configs, tutoring setups, rehearsal simulacra. Recipes that work well get forked and improved. Competition on outcomes, not marketing.

The spectacle

Multi-provider agent interaction is inherently compelling and exists nowhere else. Claude arguing with Codex while Gemini plays devil’s advocate is shareable content that every user generates just by using the product. This selects for the right audience: people curious enough about AI to become users.

Neither loop requires building anything extra. Both are emergent properties of the product being provider-agnostic and interactions being visible.

Naming

The local tool needs its own name, distinct from Signet.

Signet = the network layer. Reputation, discovery, trust. “GitHub.”
[TBD] = the local runtime. Any agent, any model, zero risk. “git.”

Requirements:

  • Short, one word, easy to type
  • Works as CLI: [name] run, [name] status
  • Evokes building/safety/freedom, not surveillance
  • No conflicts with existing tools
  • Bonus: domain available

To be decided separately, not under time pressure.

Open questions

  1. Default local model: Which model ships in the image? Must run on a MacBook Air, must be good enough for simple tasks. 7-8B parameter range. Needs benchmarking.
  2. Browser-only experience: How do Wave 1 users (no terminal, no Docker) get in? WebContainers? Docker Desktop with web UI? Hosted version? This is the biggest unsolved UX question.
  3. Recipe format stability: v0 will change. Need a versioning and migration strategy.
  4. Event kind registration: Use generic Nostr kinds (30078) until vocabulary stabilizes, or register custom kinds early?
  5. OpenCode relationship: The containerization layer is below OpenCode. Do we contribute upstream, maintain a fork, or integrate independently?
  6. Spectacle features: How much do we invest in making multi-agent interactions visible/shareable vs. leaving it to the community? Instinct says: make the raw interaction viewable, let the community build the TikTok layer.
  7. Content moderation: Recipes can configure anything. Do we need a policy? Instinct says: the runtime is like git (neutral infrastructure), Signet is where quality signals live (reputation, not censorship).

Validation checklist (before writing code)

  • [ ] 5 non-engineer humans attempt Docker-based experience
  • [ ] 5 non-engineer humans attempt the “simulacrum” use case with local model
  • [ ] Benchmark local model quality in container vs. bare metal
  • [ ] Benchmark cold start time (image pull → working agent)
  • [ ] Map 3 real swarm tasks to proposed event vocabulary
  • [ ] Package 2 recipes manually and test portability across machines
  • [ ] Talk to 3 OpenCode community members about containerization
  • [ ] Survey 10 people: “What would you do with a free, private, unlimited AI on your laptop?”

What success looks like

Month 1: Docker image works. 50 developers try it. 5 share recipes.

Month 3: Recipe ecosystem emerging. Multi-agent support live. First non-developer users. The spectacle starts generating organic content.

Month 6: Relay sync available. Signet alpha indexing recipe outcomes. Recipe reputation visible. Community growing around recipe sharing.

Month 12: Signet is the trust layer. “Which AI configuration should I use for X?” has an answer backed by data. The event stream from thousands of users powers reputation scoring that no competitor can replicate.

AWS GAIA application (June 2026): Runtime is live, recipes are being shared, the story is concrete and demonstrable.

