For a 90-Day Product-Led Growth Phase

Introduction

This playbook defines how DataHub improves growth without increasing traffic.

It operationalizes our shift toward a “Substack / Statista for Data” model by focusing on the only thing that matters at this stage: whether users can successfully extract value from datasets they already find via search.

This is a product-led growth phase. Marketing activity is limited to optimizing the product surfaces where demand already lands.

Strategic Frame (Read This First)

  • Time horizon: 90 days
  • Growth constraint: value realization, not awareness
  • Primary unit of optimization: individual dataset page
  • Primary success signal: Data Extraction, not email capture or subscriptions

Publisher growth, brand storytelling, and classic marketing are explicitly deferred until this foundation is strong.


Core Definition (Used Everywhere)

Data Extraction A user action that produces a reusable data artifact.

Count:

  • Full dataset downloads
  • Sample / filtered downloads
  • Visualization exports (PNG/SVG/PDF)
  • Copy-to-clipboard of tabular data

Do not count views, scrolls, or passive interactions.


North-Star Metric

Data Extractions per 1,000 Dataset Page Views

This is the single metric the team optimizes in this phase.


Metrics Dashboard (What We Track)

Required (page-level)

  • Page views
  • Unique visitors
  • Total data extractions
  • Data extractions / 1,000 views
  • Bounce rate
  • Median time on page

Diagnostic (supporting)

  • Extraction type breakdown (raw / sample / visualization)
  • Preview interactions
  • Multi-dataset sessions

Explicitly Secondary (observe only)

  • Account creation
  • Subscriptions
  • Purchases

Success = extraction density increases on top pages.


Dataset Page Audit Checklist (How We Improve Metrics)

Every dataset page should pass these tests.

1. Relevance (≤10 seconds)

  • Clear scope: what / where / when
  • One-line problem statement
  • Coverage shown (geo, time, granularity)
  • License / price visible

2. Extraction Clarity

  • Primary CTA clearly labeled
  • No surprise signup/payment
  • Multiple extraction paths visible if applicable

3. Immediate Use

  • Inline preview with real data
  • Low-friction extraction option available

4. Trust (after usefulness)

  • Source identified
  • Methodology / notes available
  • Update info visible

5. Lateral Expansion

  • Related datasets shown
  • Clear “next useful dataset”

6. Platform Signals (minimal)

  • Light DataHub attribution only
  • No interrupts before first extraction

7. Instrumentation

  • All extraction events tracked

Each page is marked: ✅ Extraction-ready | ⚠️ Fixable | ❌ Extraction-blocked


Benchmarks (Directional)

  • <10 extractions / 1,000 views → failing
  • 10–25 → baseline
  • 25–50 → good
  • >50 → excellent

These are guides, not rigid KPIs.


Operating Loop (Weekly)

  1. Identify underperforming high-traffic pages
  2. Audit with checklist
  3. Fix highest-leverage failures
  4. Confirm metric movement

Do not add new datasets or campaigns until this loop works.


How This Playbook Relates to Other Docs

This playbook is the operational distillation of those documents. If something is unclear here, the deeper docs provide justification—not alternative strategy.


Final Reminder

If users cannot extract value, nothing else compounds.

This playbook governs decisions until extraction performance is demonstrably strong.