MVP Spec v2: Build anything. Break nothing.
MVP Spec v2: [codename TBD]
Source: Gods-tier session, 2026-03-10
Status: DRAFT (needs validation against real users)
Supersedes: mvp-spec.md v1
Positioning
Build anything. Break nothing.
An open-source, containerized AI runtime. Any agent. Any model. Local or cloud. Private by default. Free by default. Full autonomy by default — because the sandbox makes “dangerously” irrelevant.
Not a coding agent. Not a chatbot. Not a platform. The layer underneath all of them.
Why now
- Vibe coding is exploding. Millions of new users discovering AI-assisted building.
- Every provider is a walled garden. Claude-only, Codex-only, Gemini-only.
- Local models (Ollama, llama.cpp) are getting good enough and growing fast.
- The audience who wants AI but won’t pay $30/month (or can’t send data to the cloud) has zero tooling.
- Nobody has made multi-provider agent interaction possible — let alone visible, shareable, or trustworthy.
Target users
Door 1: Builders
“I want to build something with AI.”
Ranges from the TikTok vibe-coder to the experienced developer running multi-agent workflows. Unified by: they want AI to make things, and they want it to work without risk.
Door 2: Everyone else
“I want to use AI privately, my way.”
The student simulating conversations. The professional rehearsing a presentation against simulated stakeholders. The person who wants a persistent, customizable AI companion that lives on their machine, not someone’s cloud. Unified by: they want AI that’s private, free, and configured exactly how they want it.
Same room
Both users run the same runtime. Both create and consume recipes. Both generate the event stream Signet will eventually index. The architecture doesn’t distinguish between them.
The product
What the user experiences
Wave 1 user (zero technical knowledge):
- Clicks a link or runs one command
- Lands in a working AI environment
- Types what they want
- Things happen. Nothing breaks. Everything is saved.
Wave 2 user (some technical knowledge):
- docker run [name]
- Pre-configured agent + local model, ready immediately
- Full permissions by default (container = sandbox)
- Swaps in their preferred model or API keys if desired
Wave 3 user (power user):
- Custom recipes: their own agent configs, model selections, role definitions
- Multiple agents from different providers in the same container
- Shares recipes with others. Forks other people’s recipes.
- Event timeline showing everything that happened
What’s underneath (user doesn’t see this)
- Docker/OCI container runtime
- Every action is a signed Nostr event in a local store
- Recipe format makes configurations portable and versionable
- Event format is relay-compatible from day one (publish later, format doesn’t change)
Recipes
A recipe is a saved AI configuration. It’s the most valuable artifact the product generates.
What’s in a recipe
my-recipe/
├── recipe.json # Metadata: name, description, author, parent recipe
├── Dockerfile # Environment definition (optional, for custom setups)
├── agents/
│ ├── AGENTS.md # Agent behavior configuration
│ └── roles/ # Role-specific system prompts
├── models/
│ └── config.json # Model selection, parameters, fallback chain
└── README.md # Human-readable description
At minimum, a recipe can be a single system prompt file. At maximum, it’s a complete multi-agent environment definition. The format scales with complexity.
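
As a rough illustration of how the format scales, the Python sketch below loads either form: a single prompt file or a full recipe directory. The recipe.json keys (name, description, author, parent) and the use of agents/AGENTS.md as the prompt source are assumptions made for this example; the spec names these only as metadata categories and has not fixed a schema.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Recipe:
    name: str
    description: str
    author: str
    parent: Optional[str]  # reference to the recipe this one was forked from
    system_prompt: str


def load_recipe(path: Path) -> Recipe:
    """Load a recipe from a single prompt file or from a recipe directory."""
    if path.is_file():
        # Minimal form: the file itself is the system prompt.
        return Recipe(path.stem, "", "", None, path.read_text(encoding="utf-8"))

    # Directory form: metadata comes from recipe.json (hypothetical key names).
    meta = json.loads((path / "recipe.json").read_text(encoding="utf-8"))
    agents_md = path / "agents" / "AGENTS.md"
    prompt = agents_md.read_text(encoding="utf-8") if agents_md.exists() else ""
    return Recipe(
        name=meta["name"],
        description=meta.get("description", ""),
        author=meta.get("author", ""),
        parent=meta.get("parent"),
        system_prompt=prompt,
    )
```

Both forms produce the same Recipe object, so the rest of the runtime would not need to know which one it was given.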
Recipe properties
- Portable: runs identically on any machine with the runtime
- Versionable: it’s just files in a directory, tracked by git
- Forkable: change recipe.json parent reference, modify contents, publish
- Shareable: send a link, recipient gets your exact setup
- Trackable: recipe hash is embedded in session events — Signet can correlate configurations with outcomes
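
The "recipe hash" mentioned above is not specified yet. One possible approach, sketched here under that caveat, is a SHA-256 digest over every file in the recipe directory, visited in a stable order so that two machines holding identical files derive the same hash. Skipping the .git directory and length-prefixing each path and body are choices made for this example, not part of the spec.

```python
import hashlib
from pathlib import Path


def recipe_hash(root: Path) -> str:
    """Return a deterministic SHA-256 hex digest of a recipe directory."""
    files = [
        p for p in root.rglob("*")
        if p.is_file() and ".git" not in p.relative_to(root).parts
    ]
    # Sort by POSIX-style relative path so ordering is stable across platforms.
    files.sort(key=lambda p: p.relative_to(root).as_posix())

    digest = hashlib.sha256()
    for p in files:
        rel = p.relative_to(root).as_posix().encode("utf-8")
        data = p.read_bytes()
        # Length prefixes keep (path, content) boundaries unambiguous.
        digest.update(len(rel).to_bytes(8, "big"))
        digest.update(rel)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
```

A fork that changes any file, including the parent reference in recipe.json, would get a different hash, which is what lets outcomes be attributed to one exact configuration.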
Recipe examples
| Recipe | Creator | Shared because |
|---|---|---|
| “Three-agent code review” | Developer | Claude reviews architecture, Codex checks implementation, local model validates tests |
| “Board meeting rehearsal” | CFO | Four simulacra of board members with distinct personalities and objection patterns |
| “Crush simulator v3” | College student | Three personality variants to explore relationship dynamics privately |
| “Explain like my favorite professor” | Student | Chemistry tutor calibrated to a specific teaching style |
| “Startup pitch stress-test” | Founder | Simulated VC panel that asks hard questions and doesn’t softball |
All of these are the same thing: a configuration file loaded into a local AI runtime. The product doesn’t care what the recipe is for.
MVP scope
Phase 0: Proof of concept (now → 2 weeks)
Ship: A Docker image that provides a working AI coding/chat environment with zero configuration.
| In | Why |
|---|---|
| One-command launch (docker run [name]) | Zero friction |
| OpenCode pre-installed | Best OSS agent, Go, provider-agnostic |
| Ollama + default local model | Free, private, no API key needed |
| Cloud model support via env vars | For users with existing API keys |
| Full agent permissions by default | Container = sandbox = no risk |
| Browser access via localhost | Non-terminal users |
| Persistent workspace via volume mount | Work survives restart |
| Git pre-installed, auto-initialized | Undo = git reset --hard |
| Out | Why not yet |
|---|---|
| Event capture | Validate the experience first |
| Recipes | Let them emerge from usage |
| Multi-agent | Single agent must feel right first |
| Any UI beyond what OpenCode provides | Don’t build what exists |
| Web-hosted version (no Docker) | Docker-first, hosted later |
Success criteria: A non-engineer goes from docker run ... to “describing what they want and seeing results” in under 5 minutes. No configuration. No decisions. Nothing breaks.
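
To make the Phase 0 "In" list concrete, the sketch below assembles the docker run invocation it implies, in Python. The flags used (-p, -v, -e, --rm, -it) are standard docker run options; the port number 3000, the /workspace mount point, and the image name are placeholders assumed for illustration, since the spec has not chosen them.

```python
from pathlib import Path
from typing import List, Sequence


def docker_run_argv(
    image: str,
    workspace: Path,
    port: int = 3000,                    # hypothetical port for the browser UI
    passthrough_env: Sequence[str] = (),  # names of host env vars to forward
) -> List[str]:
    """Build the argv for a one-command launch of the runtime container."""
    argv = [
        "docker", "run", "--rm", "-it",
        # Publish the UI on localhost only, so it is not exposed to the network.
        "-p", f"127.0.0.1:{port}:{port}",
        # Bind-mount the workspace so work survives container restarts.
        "-v", f"{workspace.resolve()}:/workspace",
    ]
    for name in passthrough_env:
        # "-e NAME" with no value forwards the host's variable if it is set.
        argv += ["-e", name]
    argv.append(image)
    return argv
```

Passing the result to shlex.join gives a copy-pasteable command line. A user with an existing cloud key would list its variable name in passthrough_env; a user without one would leave it empty and get the local model.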
Validation:
- [ ] 5 non-engineer humans attempt the experience with zero guidance
- [ ] Measure time-to-first-interaction
- [ ] Measure “did anything scary happen” (should be zero)
- [ ] Verify OpenCode + Ollama run cleanly in Docker with acceptable cold start
- [ ] Verify local model quality is “good enough” for simple tasks on a MacBook Air
- [ ] Document every point where the user got confused or stuck
Phase 1: Events + observability (2-4 weeks)
Ship: Silent event capture with a simple “what happened” view.
| In | Why |
|---|---|
| Local Nostr event signing + SQLite store | Foundation for everything later |
| Capture file changes, commands, git ops | The receipts |
| Minimal timeline view (web UI) | “What did my agent do?” |
| Auto-checkpointing before agent actions | One-click undo to any point |
| Recipe format v0 | Save your configuration as a portable directory |
Success criteria: After any session, user can see a clear timeline of everything that happened and undo to any checkpoint. User can save their setup as a recipe and load it in a fresh container.
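
Auto-checkpointing and undo could be built on the git repository that Phase 0 already initializes. The Python sketch below is one way to do it, assuming a checkpoint is simply a commit of the whole workspace; the function names and the placeholder committer identity are hypothetical.

```python
import subprocess
from pathlib import Path


def git(workdir: Path, *args: str) -> str:
    """Run a git command in workdir and return its trimmed stdout."""
    result = subprocess.run(
        # A fixed identity lets commits succeed in a fresh container.
        ["git", "-c", "user.name=runtime", "-c", "user.email=runtime@localhost", *args],
        cwd=workdir, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def checkpoint(workdir: Path, label: str) -> str:
    """Commit the current workspace state and return the commit hash."""
    git(workdir, "add", "-A")
    # --allow-empty records a checkpoint even when nothing has changed.
    git(workdir, "commit", "--allow-empty", "-m", f"checkpoint: {label}")
    return git(workdir, "rev-parse", "HEAD")


def undo_to(workdir: Path, commit: str) -> None:
    """Restore the workspace to a checkpoint, discarding later changes."""
    git(workdir, "reset", "--hard", commit)
    # Remove untracked files created after the checkpoint (ignored files stay).
    git(workdir, "clean", "-fd")
```

Calling checkpoint before each agent action gives the timeline one commit hash per step, and undo_to with any of those hashes is the "one-click undo".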
Phase 2: Multi-agent + recipe sharing (4-8 weeks)
Ship: Multiple agents in one container, recipe distribution.
| In | Why |
|---|---|
| Multiple agent sessions with isolated worktrees | Claude + Codex + local model together |
| Cross-agent event visibility | See what all agents are doing |
| Cross-provider interaction | The spectacle nobody else can offer |
| Recipe packaging and sharing (git-based) | Viral mechanism activated |
| Recipe forking with parent tracking | Recipes improve through competition |
Success criteria: A user can run Claude + Codex + local model in one container, share their setup as a recipe, and someone else can reproduce it in one command.
Phase 3: Relay sync + Signet bootstrap (8-16 weeks)
Ship: Optional network publication. Signet begins indexing.
| In | Why |
|---|---|
| Optional relay publication of events | “git push” moment |
| Signet indexes published events | Recipe reputation begins |
| Recipe browsing with outcome data | “Which recipes actually work?” |
| Agent reputation across sessions | “Which model/role combos are reliable?” |
Success criteria: A user can browse recipes sorted by verified outcomes, fork a high-performing one, and benefit from someone else’s hard-won configuration knowledge.
Architecture
┌─────────────────────────────────────────────────┐
│ SIGNET │
│ Reputation │ Discovery │ Trust │ Receipts │
│ "See what matters and why. With receipts." │
├─────────────────────────────────────────────────┤
│ RELAY NETWORK │
│ Optional. "git push" moment. │
├─────────────────────────────────────────────────┤
│ RECIPES │
│ Portable, versioned AI configurations │
│ Coding pipelines ←→ Personality configs │
├─────────────────────────────────────────────────┤
│ AGENTS (not ours — any agent) │
│ OpenCode │ Claude Code │ Codex │ Gemini CLI │
├─────────────────────────────────────────────────┤
│ MODELS (not ours — any model) │
│ Llama │ Qwen │ Claude │ GPT │ Gemini │ Mistral │
├─────────────────────────────────────────────────┤
│ ═══════════ [OUR PRODUCT] ═══════════════════ │
│ │
│ Containerized AI runtime │
│ │
│ • Sandbox (ruin impossible) │
│ • Any agent, any model, any use case │
│ • Every action → signed Nostr event (local) │
│ • Recipe format (portable configurations) │
│ • Multi-agent via worktrees │
│ • Browser + CLI access │
│ │
├─────────────────────────────────────────────────┤
│ DOCKER / OCI RUNTIME │
├─────────────────────────────────────────────────┤
│ HARDWARE (yours) │
└─────────────────────────────────────────────────┘
Event layer (invisible to user)
Every session generates signed Nostr events stored locally. The user sees “timeline” and “undo.” The protocol sees cryptographically attributable, relay-publishable, Signet-indexable events.
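
A local store for these events can be very small. The sketch below uses Python's built-in sqlite3 module with one row per event; the columns mirror the standard NIP-01 event fields (id, pubkey, created_at, kind, tags, content, sig), while the table name, index, and function names are assumptions for this example.

```python
import json
import sqlite3
from typing import Any, Dict, List

SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    id         TEXT PRIMARY KEY,   -- hex event id, also deduplicates inserts
    pubkey     TEXT NOT NULL,
    created_at INTEGER NOT NULL,   -- unix seconds
    kind       INTEGER NOT NULL,
    tags       TEXT NOT NULL,      -- JSON-encoded list of tag lists
    content    TEXT NOT NULL,
    sig        TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS events_by_time ON events (created_at);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local event store."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def append_event(conn: sqlite3.Connection, event: Dict[str, Any]) -> bool:
    """Insert an event; return False if an event with that id already exists."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO events VALUES (?, ?, ?, ?, ?, ?, ?)",
            (event["id"], event["pubkey"], event["created_at"], event["kind"],
             json.dumps(event["tags"]), event["content"], event["sig"]),
        )
    return cur.rowcount == 1


def timeline(conn: sqlite3.Connection) -> List[Dict[str, Any]]:
    """Return all events oldest-first, which is what a timeline view needs."""
    rows = conn.execute(
        "SELECT id, pubkey, created_at, kind, tags, content, sig "
        "FROM events ORDER BY created_at, id"
    ).fetchall()
    keys = ("id", "pubkey", "created_at", "kind", "tags", "content", "sig")
    events = [dict(zip(keys, row)) for row in rows]
    for e in events:
        e["tags"] = json.loads(e["tags"])
    return events
```

Because rows hold complete events, publishing to a relay later would amount to reading rows back out and sending them unchanged.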
Minimum event vocabulary
| What | Content | Stored |
|---|---|---|
| Session start | Agent, model, recipe hash, environment | Always |
| Action taken | File change / command / message (with diff) | Always |
| Checkpoint | Git commit hash, state snapshot | Always |
| Error | What failed, what agent attempted | Always |
| Session end | Duration, action count, outcome | Always |
Events use NIP-01 format with custom tags. Same events work locally and on relays with zero format changes.
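
For reference, NIP-01 defines the event id as the SHA-256 of a compact JSON array: [0, pubkey, created_at, kind, tags, content]. The Python sketch below builds an unsigned "session start" event in that shape. The tag names (agent, model, recipe), the d-tag value, and the default kind 30078 are assumptions for illustration, and the sig field (a BIP-340 Schnorr signature over the id) is left out because producing it needs a secp256k1 library.

```python
import hashlib
import json
import time
from typing import Any, Dict, List, Optional


def serialize(pubkey: str, created_at: int, kind: int,
              tags: List[List[str]], content: str) -> str:
    """Serialize event fields as the compact JSON array NIP-01 hashes."""
    return json.dumps(
        [0, pubkey, created_at, kind, tags, content],
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII text as UTF-8, not \u escapes
    )


def event_id(pubkey: str, created_at: int, kind: int,
             tags: List[List[str]], content: str) -> str:
    """Return the lowercase hex SHA-256 of the serialized event."""
    payload = serialize(pubkey, created_at, kind, tags, content)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def session_start_event(pubkey: str, agent: str, model: str, recipe_hash: str,
                        created_at: Optional[int] = None,
                        kind: int = 30078) -> Dict[str, Any]:
    """Build an unsigned session-start event (hypothetical tag vocabulary)."""
    ts = int(time.time()) if created_at is None else created_at
    tags = [
        ["d", f"session-start:{ts}"],  # addressable kinds are keyed by a d tag
        ["agent", agent],
        ["model", model],
        ["recipe", recipe_hash],
    ]
    content = ""
    return {
        "id": event_id(pubkey, ts, kind, tags, content),
        "pubkey": pubkey,
        "created_at": ts,
        "kind": kind,
        "tags": tags,
        "content": content,
    }
```

Since the id depends only on the event's own fields, it comes out the same whether the event sits in the local store or on a relay, which is the property the "zero format changes" claim relies on.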
Identity model
One keypair per session instance. Auto-generated, stored locally. Role, model, and tier are metadata tags, not part of the key. This means:
- Every agent action is cryptographically attributable
- Different sessions of the same model have distinct identities
- Reputation can be built per-configuration, not just per-model
- The user doesn’t manage keys — it’s invisible
Distribution / viral mechanism
The product has two natural viral loops:
Recipe sharing
“Try my setup” → one command → recipient has your exact AI configuration running. Developers share coding pipelines. Everyone else shares personality configs, tutoring setups, rehearsal simulacra. Recipes that work well get forked and improved. Competition on outcomes, not marketing.
The spectacle
Multi-provider agent interaction is inherently compelling and exists nowhere else. Claude arguing with Codex while Gemini plays devil’s advocate is shareable content that every user generates just by using the product. This selects for the right audience: people curious enough about AI to become users.
Neither loop requires building anything extra. Both are emergent properties of the product being provider-agnostic and interactions being visible.
Naming
The local tool needs its own name, distinct from Signet.
Signet = the network layer. Reputation, discovery, trust. “GitHub.”
[TBD] = the local runtime. Any agent, any model, zero risk. “git.”
Requirements:
- Short, one word, easy to type
- Works as CLI: [name] run, [name] status
- Evokes building/safety/freedom, not surveillance
- No conflicts with existing tools
- Bonus: domain available
To be decided separately, not under time pressure.
Open questions
- Default local model: Which model ships in the image? Must run on a MacBook Air, must be good enough for simple tasks. 7-8B parameter range. Needs benchmarking.
- Browser-only experience: How do Wave 1 users (no terminal, no Docker) get in? WebContainers? Docker Desktop with web UI? Hosted version? This is the biggest unsolved UX question.
- Recipe format stability: v0 will change. Need a versioning and migration strategy.
- Event kind registration: Use the generic application-data kind (30078, defined in NIP-78) until the vocabulary stabilizes, or register custom kinds early?
- OpenCode relationship: The containerization layer is below OpenCode. Do we contribute upstream, maintain a fork, or integrate independently?
- Spectacle features: How much do we invest in making multi-agent interactions visible/shareable vs. leaving it to the community? Instinct says: make the raw interaction viewable, let the community build the TikTok layer.
- Content moderation: Recipes can configure anything. Do we need a policy? Instinct says: the runtime is like git (neutral infrastructure), Signet is where quality signals live (reputation, not censorship).
Validation checklist (before writing code)
- [ ] 5 non-engineer humans attempt Docker-based experience
- [ ] 5 non-engineer humans attempt the “simulacrum” use case with local model
- [ ] Benchmark local model quality in container vs. bare metal
- [ ] Benchmark cold start time (image pull → working agent)
- [ ] Map 3 real swarm tasks to proposed event vocabulary
- [ ] Package 2 recipes manually and test portability across machines
- [ ] Talk to 3 OpenCode community members about containerization
- [ ] Survey 10 people: “What would you do with a free, private, unlimited AI on your laptop?”
What success looks like
Month 1: Docker image works. 50 developers try it. 5 share recipes.
Month 3: Recipe ecosystem emerging. Multi-agent support live. First non-developer users. The spectacle starts generating organic content.
Month 6: Relay sync available. Signet alpha indexing recipe outcomes. Recipe reputation visible. Community growing around recipe sharing.
Month 12: Signet is the trust layer. “Which AI configuration should I use for X?” has an answer backed by data. The event stream from thousands of users powers reputation scoring that no competitor can replicate.
AWS GAIA application (June 2026): Runtime is live, recipes are being shared, the story is concrete and demonstrable.