Confluence Knowledge Base: Design

Date: 2026-04-18
Status: Proposed
Scope: Neumann's Workshop Confluence (first project space: PhonoLex)

Purpose

Stand up an internal Confluence knowledge base for Neumann's Workshop, starting with a PhonoLex project space that will serve as the template for future project spaces.

The KB is optimized for three uses:

Onboarding — a new collaborator should be able to read the space and understand what the project is, how it's built, and how to operate it.
Collaboration reference — a shared surface for cross-cutting context that the repo can't carry well.
Maintenance and auditing — a durable, navigable record of decisions, operations, data provenance, legal status, and incidents.

Design test

Every candidate page passes through a single test:

Would this matter to a historian looking back at Neumann's Workshop in 100 years?

If yes, it belongs in Confluence. If it's ephemeral, implementation-level, or derivable from the code/git history, it stays in the repo.
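The test can be read as a small predicate. The sketch below is illustrative only: the `Candidate` type and its field names are hypothetical and simply mirror the three exclusion reasons above. It also assumes that a page failing the historian question defaults to the repo, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate page. All field names are hypothetical."""
    title: str
    matters_to_historian: bool         # answer to the 100-year question
    ephemeral: bool = False
    implementation_level: bool = False
    derivable_from_repo: bool = False  # recoverable from code or git history


def home_for(page: Candidate) -> str:
    """Return "Confluence" or "Repo" for a candidate page."""
    stays_in_repo = (
        page.ephemeral or page.implementation_level or page.derivable_from_repo
    )
    if page.matters_to_historian and not stays_in_repo:
        return "Confluence"
    # Assumption: anything that does not clearly pass the test stays in the repo.
    return "Repo"


print(home_for(Candidate("Why we chose the API platform", True)))
print(home_for(Candidate("API schema", True, derivable_from_repo=True)))
```

In practice the judgment stays with the author; the value of writing it down this way is that the three repo-side conditions act as a veto over the historian question.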

Audience

Internal only — current solo operator (Jared Neumann), future collaborators, contractors, and auditors. Not public-facing.

Repo vs Confluence boundary

| Concern | Home | Why |
| --- | --- | --- |
| Architecture (detailed, technical) | Repo (`docs/`, code comments) | Drifts fast; PRs should touch it alongside code |
| Feature specs & implementation plans | Repo (`docs/superpowers/specs/`, `plans/`) | Ephemeral, per-feature, archived after shipping |
| API schemas, data schemas, phoneme tables | Repo | Reference material that must match code exactly |
| AI project instructions | Repo (`CLAUDE.md`) | Tool-specific, tightly coupled to code state |
| High-level architecture overview ("whiteboard view") | Confluence | Stable, narrative, onboarding value |
| Decisions & rationale (ADRs) | Confluence | Historical record; decision ≠ the code that resulted from it |
| Operations, environments, deploy topology | Confluence | Cross-cutting, slow-changing, audit-relevant |
| Runbooks | Confluence | Procedural, not code |
| Data provenance, licensing, attribution | Confluence | Audit requirement; legal record |
| Roadmap & release history | Confluence | Historical timeline |
| Incidents & lessons learned | Confluence | Institutional memory |
| Legal, licensing, trademark | Confluence | Company-level, durable |
| People, vendors, external relationships | Confluence | Organizational record |

Rule of thumb: existing repo docs are treated as reference source material for building Confluence pages, not content to be migrated wholesale. Confluence pages are written fresh from doc + code review.

Space structure

Space key: PHONOLEX (global space)
Space name: PhonoLex

Top-level page tree under the space home:

Mission & Origins — why PhonoLex exists, the problem it addresses, who it serves
Architecture Overview — high-level "whiteboard view" of the three faces and data flow; explicit repo-vs-Confluence boundary statement
Data & Datasets — provenance, licenses, versions, update cadence for every source dataset
Decisions (ADRs) — significant technical and product decisions, one page per decision
Operations — environments, deploy topology, secrets inventory (names + locations only), monitoring, cost tracking
Runbooks — step-by-step procedures for repeatable operational tasks
Roadmap & Milestones — release history, planned work, on-hold items
Incidents & Lessons — short post-mortems: what broke, why, what changed
Legal & Licensing — license scope, dataset attribution, trademarks, domains, ToS, privacy policy source of truth
People & Relationships — contributors, contractors, vendors, external relationships
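The backbone above can be captured as data so that the 10 stub pages are created consistently. The following is a minimal sketch, not a finished seeding script: `HOME_PAGE_ID` is a hypothetical placeholder, and the payload shape assumes the JSON body accepted by Confluence's content-creation REST resource (type, title, space key, ancestors, storage-format body), which should be checked against the documentation for the Confluence instance in use. No network call is made.

```python
import html

SPACE_KEY = "PHONOLEX"
HOME_PAGE_ID = "123456"  # hypothetical placeholder for the space home page ID

# The 10 top-level sections in page-tree order, each with a short purpose stub.
SECTIONS = [
    ("Mission & Origins", "Why PhonoLex exists, the problem it addresses, who it serves."),
    ("Architecture Overview", "High-level whiteboard view of the three faces and data flow."),
    ("Data & Datasets", "Provenance, licenses, versions, update cadence for every source dataset."),
    ("Decisions (ADRs)", "Significant technical and product decisions, one page per decision."),
    ("Operations", "Environments, deploy topology, secrets inventory, monitoring, cost tracking."),
    ("Runbooks", "Step-by-step procedures for repeatable operational tasks."),
    ("Roadmap & Milestones", "Release history, planned work, on-hold items."),
    ("Incidents & Lessons", "Short post-mortems: what broke, why, what changed."),
    ("Legal & Licensing", "License scope, dataset attribution, trademarks, domains, ToS, privacy policy."),
    ("People & Relationships", "Contributors, contractors, vendors, external relationships."),
]


def page_payload(title, purpose, space_key=SPACE_KEY, parent_id=HOME_PAGE_ID):
    """Build a request body for one stub page (assumed payload shape)."""
    return {
        "type": "page",
        "title": title,
        "space": {"key": space_key},
        "ancestors": [{"id": parent_id}],
        "body": {
            "storage": {
                # Storage format is XHTML-based, so escape the purpose text.
                "value": f"<p>{html.escape(purpose)}</p>",
                "representation": "storage",
            }
        },
    }


payloads = [page_payload(title, purpose) for title, purpose in SECTIONS]
print(len(payloads), "stub pages prepared for space", SPACE_KEY)
```

Keeping the section list in one place also gives later project spaces a single definition to copy.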

Space home page

Kept brief: one paragraph on what PhonoLex is, links to the live URLs (phonolex.com, staging, repo, RunPod console), and a link to each of the 10 top-level pages with a one-line purpose statement. Updated rarely.

Out of scope (intentionally excluded)

"API Reference" / "Schema Reference" — repo territory, drifts too fast.
"Current Sprint" / "Active Tasks" — not historical; lives in GitHub issues or Jira.
Per-feature specs — stay in docs/superpowers/specs/.

Seeding strategy

Approach A: Skeleton + starter pages. Create all 10 top-level pages with a 1-paragraph purpose stub. Populate a small set of high-priority child pages immediately. Everything else is added in the flow of work.

Starter pack (v1 content)

| Section | Seed page(s) |
| --- | --- |
| Mission & Origins | "What PhonoLex is and why it exists" |
| Architecture Overview | "High-level architecture" (three faces, data flow, repo-vs-Confluence boundary) |
| Data & Datasets | "Dataset inventory" (table: source, version, license, attribution requirement, update cadence) |
| Decisions (ADRs) | Three seed ADRs: (1) Cloudflare Workers + D1 for API, (2) T5Gemma 9B-2B as the generation model, (3) Unified trie-based constrained generation architecture |
| Operations | "Environments and deploy topology" (local / staging / prod, RunPod, Cloudflare Pages, secrets map) |
| Runbooks | (stub only; populate on demand) |
| Roadmap & Milestones | (stub only; first real entry = v5.0.0 release) |
| Incidents & Lessons | (stub only; populate on incident) |
| Legal & Licensing | "License scope and dataset attributions" |
| People & Relationships | "Contributors and vendors" (Jared, Cloudflare, RunPod, HuggingFace, GitHub) |

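Seven of the 10 sections receive seed content at v1, for nine populated pages in total (the Decisions section contributes three ADRs); the remaining three sections start as stubs. Each populated page is written by reading the relevant code and repo docs and composing fresh content appropriate for this audience and format, not by copying existing docs.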
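The "Dataset inventory" seed page is the most table-like item in the starter pack. As a hedged sketch of what one row might look like, the record type below uses exactly the five columns named in the starter-pack table; the class name, the helper, and the angle-bracket placeholder values are hypothetical and stand in for real entries, which must come from the actual sources.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DatasetEntry:
    """One row of the dataset inventory (columns taken from the starter pack)."""
    source: str
    version: str
    license: str
    attribution_requirement: str
    update_cadence: str


def to_table(entries):
    """Render entries as a pipe table, one header row plus one row per entry."""
    names = [f.name for f in fields(DatasetEntry)]
    header = [n.replace("_", " ").capitalize() for n in names]
    rows = [header, ["---"] * len(header)]
    rows += [[getattr(e, n) for n in names] for e in entries]
    return "\n".join("| " + " | ".join(r) + " |" for r in rows)


# Placeholder row only; real values are filled in during the dataset review.
example = DatasetEntry(
    source="<source name>",
    version="<version>",
    license="<license>",
    attribution_requirement="<attribution text or none>",
    update_cadence="<cadence>",
)
print(to_table([example]))
```

A frozen record makes a missing column fail at construction time rather than leaving an empty table cell unnoticed.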

Growth model

When a significant decision is made → write an ADR.
When a deploy or operational procedure stabilizes → write a runbook.
When an incident occurs → write a short post-mortem.
When a dataset is added, updated, or removed → update the inventory.
When a release ships → update the roadmap/milestones page.

Confluence stays accurate because pages exist only where they were deliberately written in response to one of these triggers.
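The five triggers above form a fixed lookup from event to page update. The sketch below encodes that lookup; the `Trigger` enum and function name are hypothetical, and the mapping of each trigger to a top-level section is inferred from the section names rather than stated in the list itself.

```python
from enum import Enum


class Trigger(Enum):
    DECISION_MADE = "a significant decision is made"
    PROCEDURE_STABILIZED = "a deploy or operational procedure stabilizes"
    INCIDENT_OCCURRED = "an incident occurs"
    DATASET_CHANGED = "a dataset is added, updated, or removed"
    RELEASE_SHIPPED = "a release ships"


# Trigger -> (top-level section, action). Section assignment is an inference.
ACTIONS = {
    Trigger.DECISION_MADE: ("Decisions (ADRs)", "write an ADR"),
    Trigger.PROCEDURE_STABILIZED: ("Runbooks", "write a runbook"),
    Trigger.INCIDENT_OCCURRED: ("Incidents & Lessons", "write a short post-mortem"),
    Trigger.DATASET_CHANGED: ("Data & Datasets", "update the inventory"),
    Trigger.RELEASE_SHIPPED: ("Roadmap & Milestones", "update the roadmap/milestones page"),
}


def required_update(trigger: Trigger) -> str:
    """Return the section and action that a trigger calls for."""
    section, action = ACTIONS[trigger]
    return f"{section}: {action}"


for t in Trigger:
    print(f"When {t.value} -> {required_update(t)}")
```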

ADR format

Use a lightweight MADR-style template:

# ADR NNN: <title>

**Status:** Proposed | Accepted | Superseded by ADR NNN
**Date:** YYYY-MM-DD

## Context

What forces are at play? What problem are we solving?

## Decision

What we chose.

## Options considered

- Option A — pros/cons
- Option B — pros/cons
- Option C — pros/cons

## Consequences

What becomes easier, harder, or locked in as a result?

ADRs are immutable once accepted. A superseding decision gets a new ADR that references the old one.
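The immutability rule can be illustrated with a frozen record. This is a sketch under stated assumptions, not a prescribed tool: the `ADR` class and `supersede` helper are hypothetical, zero-padding `NNN` to three digits is a guess at the numbering format, placing the back-reference in the new ADR's Context is one possible convention, and the date and angle-bracket values are placeholders.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Tuple


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class ADR:
    number: int
    title: str
    status: str
    decided_on: date
    context: str
    decision: str
    options: Tuple[str, ...]
    consequences: str

    def render(self) -> str:
        """Render the ADR using the template's headings."""
        options = "\n".join(f"- {o}" for o in self.options)
        return (
            f"# ADR {self.number:03d}: {self.title}\n\n"
            f"**Status:** {self.status}\n"
            f"**Date:** {self.decided_on.isoformat()}\n\n"
            f"## Context\n\n{self.context}\n\n"
            f"## Decision\n\n{self.decision}\n\n"
            f"## Options considered\n\n{options}\n\n"
            f"## Consequences\n\n{self.consequences}\n"
        )


def supersede(old: ADR, new: ADR) -> Tuple[ADR, ADR]:
    """Return (old with only its status updated, new referencing the old)."""
    if old.status != "Accepted":
        raise ValueError("only an accepted ADR can be superseded")
    retired = replace(old, status=f"Superseded by ADR {new.number:03d}")
    successor = replace(new, context=f"Supersedes ADR {old.number:03d}. {new.context}")
    return retired, successor


# Title taken from the seed ADR list; date and body text are placeholders.
first = ADR(1, "Cloudflare Workers + D1 for API", "Accepted", date(2026, 1, 1),
            "<context>", "<decision>", ("Option A: <pros/cons>",), "<consequences>")
second = ADR(2, "<hypothetical replacement decision>", "Accepted", date(2026, 1, 2),
             "<context>", "<decision>", ("Option A: <pros/cons>",), "<consequences>")
retired, successor = supersede(first, second)
print(retired.render())
```

Only the status line of the old ADR changes; its context, decision, options, and consequences are carried over untouched, which is the property the immutability rule is meant to protect.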

Template-ability for future NW projects

Sections 1, 2, and 4 through 10 (numbered in page-tree order) transfer to any project. Section 3 (Data & Datasets) is dataset-heavy and carries over to any data-centric project; non-data projects can replace it with a domain-specific section.

The space-per-project pattern is the unit of replication. Each new NW project gets a fresh space with the same 10-section backbone, seeded via the same starter-pack process.
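As a rough illustration of the replication pattern, the helper below returns the 10-section backbone for a new space and optionally swaps the third section for a domain-specific one. The function name and the example replacement title are hypothetical; the section titles are those of the PhonoLex page tree.

```python
from typing import List, Optional

BACKBONE = (
    "Mission & Origins",
    "Architecture Overview",
    "Data & Datasets",
    "Decisions (ADRs)",
    "Operations",
    "Runbooks",
    "Roadmap & Milestones",
    "Incidents & Lessons",
    "Legal & Licensing",
    "People & Relationships",
)
DATA_SECTION = "Data & Datasets"  # section 3, the only project-specific slot


def backbone_for(domain_section: Optional[str] = None) -> List[str]:
    """Return the section titles for a new project space.

    Data-centric projects keep section 3 as-is; other projects pass a
    replacement title for that one slot.
    """
    return [
        domain_section if (title == DATA_SECTION and domain_section) else title
        for title in BACKBONE
    ]


print(backbone_for())                           # data-centric project
print(backbone_for("<domain-specific section>"))  # non-data project
```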

A workshop-level umbrella space (cross-project conventions, shared vendor list, company legal) is deliberately not designed in this spec — it's deferred until there's a second project space to share content with.

Non-goals

This spec does not define access controls, page permissions, or Confluence admin settings. Default space permissions for an internal workspace are assumed.
This spec does not migrate existing repo docs. Repo docs remain authoritative for their content; Confluence pages are written fresh.
This spec does not design the implementation sequence — that's for the implementation plan that follows.

Success criteria

All 10 top-level pages exist under the PhonoLex space, each with a clear 1-paragraph purpose.
The starter-pack pages are populated with content written fresh from code + doc review.
Each populated page passes the 100-year historian test.
The structure is documented well enough that a second NW project space can be stood up by following the same pattern.