Confluence Knowledge Base — Design

Date: 2026-04-18
Status: Proposed
Scope: Neumann's Workshop Confluence (first project space: PhonoLex)

Purpose

Stand up an internal Confluence knowledge base for Neumann's Workshop, starting with a PhonoLex project space that will serve as the template for future project spaces.

The KB is optimized for three uses:

  1. Onboarding — a new collaborator should be able to read the space and understand what the project is, how it's built, and how to operate it.
  2. Collaboration reference — a shared surface for cross-cutting context that the repo can't carry well.
  3. Maintenance and auditing — a durable, navigable record of decisions, operations, data provenance, legal status, and incidents.

Design test

Every candidate page passes through a single test:

Would this matter to a historian looking back at Neumann's Workshop in 100 years?

If yes, it belongs in Confluence. If it's ephemeral, implementation-level, or derivable from the code/git history, it stays in the repo.

Audience

Internal only — current solo operator (Jared Neumann), future collaborators, contractors, and auditors. Not public-facing.

Repo vs Confluence boundary

| Concern | Home | Why |
| --- | --- | --- |
| Architecture (detailed, technical) | Repo (docs/, code comments) | Drifts fast; PRs should touch it alongside code |
| Feature specs & implementation plans | Repo (docs/superpowers/specs/, plans/) | Ephemeral, per-feature, archived after shipping |
| API schemas, data schemas, phoneme tables | Repo | Reference material that must match code exactly |
| AI project instructions | Repo (CLAUDE.md) | Tool-specific, tightly coupled to code state |
| High-level architecture overview ("whiteboard view") | Confluence | Stable, narrative, onboarding value |
| Decisions & rationale (ADRs) | Confluence | Historical record; decision ≠ the code that resulted from it |
| Operations, environments, deploy topology | Confluence | Cross-cutting, slow-changing, audit-relevant |
| Runbooks | Confluence | Procedural, not code |
| Data provenance, licensing, attribution | Confluence | Audit requirement; legal record |
| Roadmap & release history | Confluence | Historical timeline |
| Incidents & lessons learned | Confluence | Institutional memory |
| Legal, licensing, trademark | Confluence | Company-level, durable |
| People, vendors, external relationships | Confluence | Organizational record |

Rule of thumb: existing repo docs are treated as reference source material for building Confluence pages, not content to be migrated wholesale. Confluence pages are written fresh from doc + code review.

Space structure

Space key: PHONOLEX (global space)
Space name: PhonoLex

Top-level page tree under the space home:

  1. Mission & Origins — why PhonoLex exists, the problem it addresses, who it serves
  2. Architecture Overview — high-level "whiteboard view" of the three faces and data flow; explicit repo-vs-Confluence boundary statement
  3. Data & Datasets — provenance, licenses, versions, update cadence for every source dataset
  4. Decisions (ADRs) — significant technical and product decisions, one page per decision
  5. Operations — environments, deploy topology, secrets inventory (names + locations only), monitoring, cost tracking
  6. Runbooks — step-by-step procedures for repeatable operational tasks
  7. Roadmap & Milestones — release history, planned work, on-hold items
  8. Incidents & Lessons — short post-mortems: what broke, why, what changed
  9. Legal & Licensing — license scope, dataset attribution, trademarks, domains, ToS, privacy policy source of truth
  10. People & Relationships — contributors, contractors, vendors, external relationships
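
As a rough illustration of how the ten-page skeleton above could be created programmatically, the sketch below builds one create-page payload per top-level section. It assumes the classic Confluence content-creation payload shape (`POST /rest/api/content`); confirm that against whichever API version the instance actually exposes. The names `SECTIONS`, `stub_payload`, `skeleton_payloads`, and the home page id `"12345"` are hypothetical, and the purpose strings are paraphrased from the list above. No network call is made here.

```python
from html import escape

SPACE_KEY = "PHONOLEX"

# (title, one-line purpose) for each top-level page, in tree order.
SECTIONS = [
    ("Mission & Origins", "Why PhonoLex exists, the problem it addresses, who it serves."),
    ("Architecture Overview", "High-level view of the three faces and data flow."),
    ("Data & Datasets", "Provenance, licenses, versions, and update cadence per dataset."),
    ("Decisions (ADRs)", "Significant technical and product decisions, one page each."),
    ("Operations", "Environments, deploy topology, secrets inventory, monitoring, cost."),
    ("Runbooks", "Step-by-step procedures for repeatable operational tasks."),
    ("Roadmap & Milestones", "Release history, planned work, on-hold items."),
    ("Incidents & Lessons", "Short post-mortems: what broke, why, what changed."),
    ("Legal & Licensing", "License scope, attribution, trademarks, domains, ToS, privacy."),
    ("People & Relationships", "Contributors, contractors, vendors, external relationships."),
]


def stub_payload(title, purpose, home_page_id):
    """Build a create-page payload for one top-level stub page.

    The body is a single paragraph in Confluence storage format;
    the purpose text is HTML-escaped before being embedded.
    """
    return {
        "type": "page",
        "title": title,
        "space": {"key": SPACE_KEY},
        "ancestors": [{"id": home_page_id}],  # parent = space home page
        "body": {
            "storage": {
                "value": f"<p>{escape(purpose)}</p>",
                "representation": "storage",
            }
        },
    }


def skeleton_payloads(home_page_id):
    """Return payloads for all ten top-level pages, in tree order."""
    return [stub_payload(t, p, home_page_id) for t, p in SECTIONS]


if __name__ == "__main__":
    # "12345" is a placeholder id for the space home page.
    for payload in skeleton_payloads("12345"):
        print(payload["title"])
```

Keeping the section list as data means the same script could seed a second project space by changing only `SPACE_KEY` and the purpose strings.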

Space home page

Brief: one paragraph on what PhonoLex is; links to the live URLs (phonolex.com, staging, repo, RunPod console); and a link to each of the 10 top-level pages with a one-line purpose statement. Updated rarely.

Out of scope (intentionally excluded)

  • "API Reference" / "Schema Reference" — repo territory, drifts too fast.
  • "Current Sprint" / "Active Tasks" — not historical; lives in GitHub issues or Jira.
  • Per-feature specs — stay in docs/superpowers/specs/.

Seeding strategy

Approach A: Skeleton + starter pages. Create all 10 top-level pages with a 1-paragraph purpose stub. Populate a small set of high-priority child pages immediately. Everything else is added in the flow of work.

Starter pack (v1 content)

| Section | Seed page(s) |
| --- | --- |
| Mission & Origins | "What PhonoLex is and why it exists" |
| Architecture Overview | "High-level architecture" (three faces, data flow, repo-vs-Confluence boundary) |
| Data & Datasets | "Dataset inventory" (table: source, version, license, attribution requirement, update cadence) |
| Decisions (ADRs) | Three seed ADRs: (1) Cloudflare Workers + D1 for API, (2) T5Gemma 9B-2B as the generation model, (3) Unified trie-based constrained generation architecture |
| Operations | "Environments and deploy topology" (local / staging / prod, RunPod, Cloudflare Pages, secrets map) |
| Runbooks | (stub only; populate on demand) |
| Roadmap & Milestones | (stub only; first real entry = v5.0.0 release) |
| Incidents & Lessons | (stub only; populate on incident) |
| Legal & Licensing | "License scope and dataset attributions" |
| People & Relationships | "Contributors and vendors" (Jared, Cloudflare, RunPod, HuggingFace, GitHub) |

Approximately 7 populated sections (9 pages, counting the three seed ADRs individually) plus 10 section headers at v1. Each populated page is written by reading the relevant code and repo docs and composing fresh content appropriate for this audience and format, not by copying existing docs.

Growth model

  • When a significant decision is made → write an ADR.
  • When a deploy or operational procedure stabilizes → write a runbook.
  • When an incident occurs → write a short post-mortem.
  • When a dataset is added, updated, or removed → update the inventory.
  • When a release ships → update the roadmap/milestones page.

Confluence stays accurate because pages exist only where they have been deliberately written.
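
The dataset trigger in the growth model depends on every inventory entry carrying the same fields. The starter pack names them as source, version, license, attribution requirement, and update cadence. A minimal sketch of that record, with a check for blank fields before an entry is published, might look like the following. The class and function names are hypothetical, and `ExampleCorpus` is an invented placeholder, not a real PhonoLex dataset.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DatasetEntry:
    """One row of the dataset inventory table."""
    source: str
    version: str
    license: str
    attribution_requirement: str
    update_cadence: str


def missing_fields(entry):
    """Return the names of fields that are blank, in column order."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name).strip()]


def to_row(entry):
    """Return the entry's values in column order, ready for a table row."""
    return [getattr(entry, f.name) for f in fields(entry)]


# Hypothetical entry, for illustration only.
example = DatasetEntry(
    source="ExampleCorpus",
    version="1.0",
    license="CC BY 4.0",
    attribution_requirement="",  # deliberately left blank
    update_cadence="annual",
)

print(missing_fields(example))  # ['attribution_requirement']
```

A blank attribution requirement is worth flagging early, since the inventory doubles as the audit and legal record.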

ADR format

Use a lightweight MADR-style template:

# ADR NNN: <title>

**Status:** Proposed | Accepted | Superseded by ADR NNN
**Date:** YYYY-MM-DD

## Context

What forces are at play? What problem are we solving?

## Decision

What we chose.

## Options considered

- Option A — pros/cons
- Option B — pros/cons
- Option C — pros/cons

## Consequences

What becomes easier, harder, or locked in as a result?

ADRs are immutable once accepted, apart from the Status line. A superseding decision gets a new ADR that references the old one, and the old ADR's status changes to "Superseded by ADR NNN".
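
To show how the template could be filled in consistently, here is a small sketch that renders an ADR from structured fields and marks an older ADR as superseded. Everything in it is an assumption layered on the template: the `ADR` and `Option` classes, the `render` and `supersede` helpers, and the three-digit zero padding for `NNN`. The context, decision, and option texts are placeholders, not the actual rationale behind any PhonoLex decision.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class Option:
    name: str
    tradeoffs: str


@dataclass(frozen=True)
class ADR:
    number: int
    title: str
    decided_on: date
    context: str
    decision: str
    options: tuple
    consequences: str
    status: str = "Proposed"


def render(adr):
    """Render an ADR as markdown following the template's section order."""
    lines = [
        f"# ADR {adr.number:03d}: {adr.title}",
        "",
        f"**Status:** {adr.status}",
        f"**Date:** {adr.decided_on.isoformat()}",
        "",
        "## Context",
        "",
        adr.context,
        "",
        "## Decision",
        "",
        adr.decision,
        "",
        "## Options considered",
        "",
        # \u2014 is the dash separator used in the template's option lines.
        *[f"- {o.name} \u2014 {o.tradeoffs}" for o in adr.options],
        "",
        "## Consequences",
        "",
        adr.consequences,
    ]
    return "\n".join(lines) + "\n"


def supersede(old, new_number):
    """Return a copy of `old` whose status points at the superseding ADR.

    The original object is frozen and left untouched.
    """
    return replace(old, status=f"Superseded by ADR {new_number:03d}")


# Placeholder content for illustration only.
seed = ADR(
    number=1,
    title="Cloudflare Workers + D1 for API",
    decided_on=date(2026, 4, 18),
    context="(placeholder context)",
    decision="(placeholder decision)",
    options=(Option("Option A", "pros/cons"), Option("Option B", "pros/cons")),
    consequences="(placeholder consequences)",
)

print(render(seed))
print(supersede(seed, 4).status)
```

Freezing the dataclass mirrors the immutability rule: changing a status produces a new object rather than editing the accepted record in place.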

Template-ability for future NW projects

Sections 1, 2, and 4 through 10 transfer to any project. Section 3 (Data & Datasets) is dataset-heavy and generalizes to any data-centric project; non-data projects can replace it with a domain-specific section.

The space-per-project pattern is the unit of replication. Each new NW project gets a fresh space with the same 10-section backbone, seeded via the same starter-pack process.

A workshop-level umbrella space (cross-project conventions, shared vendor list, company legal) is deliberately not designed in this spec — it's deferred until there's a second project space to share content with.

Non-goals

  • This spec does not define access controls, page permissions, or Confluence admin settings. Default space permissions for an internal workspace are assumed.
  • This spec does not migrate existing repo docs. Repo docs remain authoritative for their content; Confluence pages are written fresh.
  • This spec does not design the implementation sequence — that's for the implementation plan that follows.

Success criteria

  1. All 10 top-level pages exist under the PhonoLex space, each with a clear 1-paragraph purpose.
  2. The starter-pack pages are populated with content written fresh from code + doc review.
  3. Each populated page passes the 100-year historian test.
  4. The structure is documented well enough that a second NW project space can be stood up by following the same pattern.