Below is the revised first-order component list, updated to reflect your clarifications, followed by a brief note on what changed and why. The list is now stable enough to move from conceptual architecture to executable specs.


First-order components (revised)

1. Identity and accounts

Purpose: Manage users and authentication.

Core responsibilities

  • User sign-up and login (email/OAuth)
  • User profile (minimal, non-public-facing at MVP)

Design notes

  • Identity is global; permissions are resolved at the publication level.

Explicit non-goals (v1)

  • Organisation-level billing
  • Cross-publication roles

2. Publications and permissions

Purpose: Represent a data publication as a durable, subscribable container with extensible ownership.

Core responsibilities

  • Create publications
  • Publication metadata (name, description, visibility)
  • Auto-create seeded content on creation (welcome post, about page)
  • Manage publication permissions

Permission model (v1 → v2)

  • Single owner at creation
  • Owner can add other users as collaborators (roles: editor / contributor)
  • Architecture should anticipate teams without implementing full orgs

Key invariants

  • Every publication has exactly one owner
  • Publications own posts, subscribers, and revenue
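
To make the permission model concrete, here is a minimal sketch of how a publication could hold exactly one owner plus collaborators. The class, field, and method names (Publication, owner_id, collaborators, add_collaborator, role_of) are illustrative assumptions, not a settled schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Role(Enum):
    OWNER = "owner"
    EDITOR = "editor"
    CONTRIBUTOR = "contributor"


@dataclass
class Publication:
    """Hypothetical model: one owner, plus zero or more collaborators."""

    name: str
    owner_id: str  # exactly one owner, set at creation
    # user_id -> EDITOR or CONTRIBUTOR; the owner is never stored here
    collaborators: Dict[str, Role] = field(default_factory=dict)

    def add_collaborator(self, acting_user_id: str, user_id: str, role: Role) -> None:
        # Only the owner may manage permissions in this sketch.
        if acting_user_id != self.owner_id:
            raise PermissionError("only the owner can add collaborators")
        # Protect the "exactly one owner" invariant.
        if role is Role.OWNER or user_id == self.owner_id:
            raise ValueError("a publication has exactly one owner")
        self.collaborators[user_id] = role

    def role_of(self, user_id: str) -> Optional[Role]:
        # Identity is global; the role is resolved per publication.
        if user_id == self.owner_id:
            return Role.OWNER
        return self.collaborators.get(user_id)
```

Keeping the owner out of the collaborators map means the one-owner invariant cannot be broken by a role update, and a future team concept could replace the map without touching callers of role_of.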

3. Data posts (datasets and data stories)

Purpose: Define and manage the atomic publishing unit.

Core responsibilities

  • Create, edit, and publish data posts
  • Support two post modes under one model:
    • dataset-first
    • story-first (text with embedded data/graphs)
  • Manage post visibility (public vs subscribers-only)

Key invariants

  • Every post belongs to exactly one publication
  • Posts remain editable after publishing
  • Publishing creates a new snapshot/version

Design constraint

  • Constrained structure; no free-form layout system
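
As a rough illustration of "two modes under one model" and "publishing creates a new snapshot/version", the sketch below uses a single post type with a mode field and an append-only version list. All names (DataPost, PostVersion, draft_body) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class PostMode(Enum):
    DATASET_FIRST = "dataset-first"
    STORY_FIRST = "story-first"


class Visibility(Enum):
    PUBLIC = "public"
    SUBSCRIBERS_ONLY = "subscribers-only"


@dataclass(frozen=True)
class PostVersion:
    """Immutable record of what was published."""

    number: int
    body: str


@dataclass
class DataPost:
    publication_id: str  # every post belongs to exactly one publication
    mode: PostMode
    visibility: Visibility = Visibility.PUBLIC
    draft_body: str = ""
    versions: List[PostVersion] = field(default_factory=list)

    def edit(self, body: str) -> None:
        # Editing only touches the draft, including after a publish.
        self.draft_body = body

    def publish(self) -> PostVersion:
        # Each publish appends a new version; earlier ones are untouched.
        version = PostVersion(number=len(self.versions) + 1, body=self.draft_body)
        self.versions.append(version)
        return version
```

Because versions are frozen and only ever appended, readers of an older version keep seeing exactly what was published at that point.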

4. Data ingestion and versioning

Purpose: Ingest datasets and produce stable, usable snapshots.

Core responsibilities (MVP)

  • Connect to a Git repository
  • Select dataset path and branch
  • Snapshot data at publish time
  • Generate derived artefacts (preview, metadata)

Explicit non-goals (MVP)

  • CSV drag-and-drop
  • Inline editors
  • Multi-source joins

Architectural note

  • Ingestion is decoupled from posts so future inputs (CSV upload, URL ingestion) can be added cleanly.
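
One way to picture this decoupling is a small source interface that the snapshot step depends on, so new inputs only need to implement fetch(). This is a sketch under assumptions: DatasetSource, GitSource, InMemorySource, Snapshot, and take_snapshot are hypothetical names, and the Git checkout itself is deliberately left unimplemented.

```python
import hashlib
from dataclasses import dataclass
from typing import Protocol


class DatasetSource(Protocol):
    """Anything that can hand over raw dataset bytes."""

    def fetch(self) -> bytes: ...


@dataclass(frozen=True)
class GitSource:
    repo_url: str
    branch: str
    path: str

    def fetch(self) -> bytes:
        # Clone/checkout logic is out of scope for this sketch.
        raise NotImplementedError("Git checkout omitted in this sketch")


@dataclass(frozen=True)
class InMemorySource:
    """Stand-in source, used here so the example runs without a repository."""

    data: bytes

    def fetch(self) -> bytes:
        return self.data


@dataclass(frozen=True)
class Snapshot:
    content_hash: str  # SHA-256 of the raw bytes
    size_bytes: int
    preview: str  # first few lines, as a derived artefact


def take_snapshot(source: DatasetSource, preview_lines: int = 5) -> Snapshot:
    raw = source.fetch()
    text = raw.decode("utf-8", errors="replace")
    preview = "\n".join(text.splitlines()[:preview_lines])
    return Snapshot(hashlib.sha256(raw).hexdigest(), len(raw), preview)
```

The post side would only ever see a Snapshot, so adding CSV upload or URL ingestion later would mean adding another source class rather than changing the post model.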

5. Social graph

Purpose: Provide lightweight social signals and discovery primitives.

Core responsibilities

  • Likes on data posts
  • Follow publications
  • Subscriber lists (free vs paid)

Design constraints

  • No comments or resharing at MVP
  • Social signals are additive, not conversational

6. Payments and access control

Purpose: Enable monetisation via subscriptions.

Core responsibilities

  • Enable paid subscriptions per publication
  • Stripe integration
  • Access gating for subscribers-only posts
  • Platform fee extraction

Key constraints

  • Monetisation is publication-level only
  • No per-post or per-dataset pricing at MVP
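
The gating and fee rules could be sketched roughly as below. Two things here are assumptions rather than decisions from the list above: that "subscribers-only" means paid subscribers (plus publication staff), and the fee rate, which is passed in as a parameter because no rate has been specified. Stripe calls are intentionally left out.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class PublicationAccess:
    """Hypothetical per-publication access lists."""

    staff_ids: Set[str] = field(default_factory=set)  # owner and collaborators
    paid_subscriber_ids: Set[str] = field(default_factory=set)


def can_view(subscribers_only: bool, user_id: Optional[str], access: PublicationAccess) -> bool:
    # Public posts are visible to everyone, including anonymous readers.
    if not subscribers_only:
        return True
    if user_id is None:
        return False
    # Assumption: gated posts are open to staff and paid subscribers.
    return user_id in access.staff_ids or user_id in access.paid_subscriber_ids


def split_payment(amount_cents: int, fee_bps: int) -> Tuple[int, int]:
    """Return (platform_fee, publisher_share) in cents.

    fee_bps is the platform fee in basis points (100 bps = 1%).
    Integer arithmetic avoids floating-point rounding on money.
    """
    fee = amount_cents * fee_bps // 10_000
    return fee, amount_cents - fee
```

Since access is decided per publication rather than per post, the check needs only the post's visibility flag and the publication's lists, which matches the publication-level monetisation constraint.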

7. Distribution (feeds and email)

Purpose: Ensure published data reaches readers.

Core responsibilities

  • Public URLs for publications and posts
  • Follow-based user feed
  • Email delivery to subscribers (optional but first-class)

MVP stance

  • Email is architecturally supported but not required for first release
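
A follow-based feed can start out very simply: filter published posts to the publications a user follows and sort newest first. The sketch below assumes a hypothetical PublishedPost record and a build_feed helper; it ignores pagination and access gating.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Set


@dataclass(frozen=True)
class PublishedPost:
    post_id: str
    publication_id: str
    published_at: datetime


def build_feed(
    followed_publication_ids: Set[str],
    posts: Iterable[PublishedPost],
    limit: int = 20,
) -> List[PublishedPost]:
    # Keep only posts from followed publications.
    visible = [p for p in posts if p.publication_id in followed_publication_ids]
    # Reverse-chronological order, with no ranking or personalisation.
    visible.sort(key=lambda p: p.published_at, reverse=True)
    return visible[:limit]
```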

8. Analytics and notifications

Purpose: Provide publishers with feedback and reassurance.

Core responsibilities

  • Views, downloads, likes per post
  • Subscriber counts and changes
  • Event-based notifications (new subscriber, etc.)

Design constraint

  • Analytics are descriptive (what happened), not optimisation-oriented
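
Descriptive analytics at this level can be plain counting over an event stream. The sketch below assumes events arrive as (post_id, kind) pairs and that the tracked kinds are named "view", "download", and "like"; both the event shape and the names are illustrative.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Iterable, Tuple

# Assumed event kinds, mirroring the per-post metrics listed above.
TRACKED_KINDS = {"view", "download", "like"}


def per_post_counts(events: Iterable[Tuple[str, str]]) -> DefaultDict[str, Counter]:
    """Tally tracked events per post; unknown kinds are ignored."""
    counts: DefaultDict[str, Counter] = defaultdict(Counter)
    for post_id, kind in events:
        if kind in TRACKED_KINDS:
            counts[post_id][kind] += 1
    return counts
```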

What changed relative to the previous version

  • Publications now explicitly include a permissions layer, anticipating teams without overbuilding orgs.
  • Data posts are clarified as a single model supporting both datasets and data stories.
  • Data ingestion is separated from data posts as its own component, improving extensibility.
  • Post editing and versioning are first-class rather than implied.
  • Email delivery is downgraded from mandatory to optional but first-class, which simplifies MVP sequencing.