Product: High Level Job Stories

Below is a structured set of job stories, grouped by user flow and broken down incrementally from MVP to adjacent extensions.


1. Onboarding and Publication Creation

Job: create a data publication

When I sign up for the platform, I want to create a named data publication with minimal setup, so that I immediately have a clear place to publish and share data.

Sub-jobs (incremental):

  • Choose a name and short description.
  • Get a stable home URL/subdomain automatically.
  • Understand, at a glance, what kind of content belongs here.
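The "stable home URL/subdomain" sub-job implies deterministic slug generation from the publication name. A minimal sketch in Python; `dataletter.example` is a placeholder, since the platform's real domain is undecided:

```python
import re

def slug_for(name: str) -> str:
    """Derive a stable, URL-safe subdomain slug from a publication name.

    Lowercase, collapse runs of non-alphanumerics into single hyphens,
    and trim leading/trailing hyphens.
    """
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

def home_url(name: str, base_domain: str = "dataletter.example") -> str:
    # base_domain is an assumed placeholder for the platform domain.
    return f"https://{slug_for(name)}.{base_domain}"
```

A real implementation would also need collision handling (e.g. a numeric suffix) and a reserved-word list for subdomains like `www` or `api`.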



Job: seed the publication on creation

When I create a new data publication, I want it to be pre-populated with example content, so that I can immediately understand the publishing model without imagining it abstractly.

Sub-jobs:

  • Automatically create a welcome data post that explains what this publication is and how data posts work.
  • Automatically create an about page describing the publication’s purpose, editable by the user.
  • Ensure both are fully editable and deletable, but present by default.

Design rationale: This mirrors Substack’s onboarding pattern, where users learn by editing concrete examples rather than reading instructions. It reduces blank-page anxiety, clarifies constraints, and shortens time-to-first-real-post. The key is that this requires exactly one save from the user: creation and comprehension happen together.
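The seeding behaviour above amounts to one step run at creation time. An illustrative Python sketch, not an implementation; `Post`, `Publication`, and the example copy are all assumed names:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Post:
    title: str
    body: str
    editable: bool = True   # seeded content is fully editable...
    deletable: bool = True  # ...and deletable, but present by default

@dataclass
class Publication:
    name: str
    posts: list = field(default_factory=list)
    about: Optional[Post] = None

def seed_publication(pub: Publication) -> Publication:
    """Pre-populate a new publication with a welcome data post
    and an editable about page."""
    pub.posts.append(Post(
        title="Welcome to your data publication",
        body="This example post shows how a data post works: "
             "a title, a description, and one primary data source.",
    ))
    pub.about = Post(
        title=f"About {pub.name}",
        body="Describe the purpose of this publication here.",
    )
    return pub
```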


Job: understand what I can publish

When I first arrive in the editor, I want clear constraints on what a “data post” is, so that I don’t have to design structure or make format decisions.

Sub-jobs:

  • See that a post consists of title, description, and one primary data source.
  • Understand that this is not a general CMS or notebook environment.

2. Publishing a Data Post

Job: publish a dataset with minimal friction

When I have a dataset I want to share, I want to connect it or drop it in with almost no configuration, so that publishing takes minutes rather than hours.

Sub-jobs (v1):

  • Connect a Git repository as the source of truth.
  • Select a dataset path or default file.
  • Publish without manual schema definition or chart design.

Job: make my data immediately legible

When my dataset is published, I want it to render previews and basic visuals automatically, so that readers can understand it without downloading it first.

Sub-jobs:

  • Auto-generate a table preview.
  • Infer simple charts where appropriate.
  • Display basic metadata (rows, columns, file type).
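For CSV sources, the "basic metadata" sub-job reduces to counting rows and columns and keeping a few rows for the auto-generated table preview. A sketch under narrow assumptions (CSV input with a header row; real inference would also need to handle JSON, Parquet, and malformed files):

```python
import csv
import io

def dataset_metadata(raw: str, filename: str) -> dict:
    """Infer the basic metadata a data post displays by default:
    row count, column count, column names, and file type."""
    rows = list(csv.reader(io.StringIO(raw)))
    header, data = rows[0], rows[1:]  # assumes a header row is present
    return {
        "file_type": filename.rsplit(".", 1)[-1],
        "columns": header,
        "n_columns": len(header),
        "n_rows": len(data),
        "preview": data[:10],  # first rows feed the table preview
    }
```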

Job: treat a dataset as a durable object

When I update my data later, I want previous versions to remain accessible, so that links remain stable and changes are traceable.

Sub-jobs (later iteration):

  • Snapshot data at publish time.
  • Create new versions rather than mutating old ones.
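The durable-object model above (snapshot at publish time, never mutate) amounts to an append-only version list. A minimal in-memory sketch; a production system would store content-addressed blobs rather than inline bytes:

```python
import hashlib
from datetime import datetime, timezone

class Dataset:
    """Append-only version history: publishing snapshots content as a
    new immutable version instead of mutating the previous one."""

    def __init__(self, name: str):
        self.name = name
        self.versions = []

    def publish(self, content: bytes) -> int:
        """Snapshot the content and return the new version number."""
        self.versions.append({
            "number": len(self.versions) + 1,
            "sha256": hashlib.sha256(content).hexdigest(),
            "content": content,  # real storage: a blob reference
            "published_at": datetime.now(timezone.utc),
        })
        return self.versions[-1]["number"]

    def get(self, number=None) -> bytes:
        """Old links stay stable: any past version stays retrievable;
        no number means the latest version."""
        v = self.versions[-1] if number is None else self.versions[number - 1]
        return v["content"]
```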

3. Reading, Discovery, and Social Interaction

Job: discover useful datasets from others

When I’m browsing the platform, I want to see recent and popular data posts, so that I can learn, reuse data, or follow good publishers.

Sub-jobs:

  • Browse a feed of recent data posts.
  • See simple popularity signals (likes, subscribers).

Job: signal that a dataset is valuable

When I find a data post I appreciate, I want to like it with a single action, so that I can express value without commentary.

Sub-jobs:

  • Like a data post.
  • See aggregate likes as a public signal.

Job: keep up with specific data publishers

When I find a publication I trust, I want to follow or subscribe to it, so that I receive future datasets without actively checking.

Sub-jobs:

  • Follow a publication for free updates.
  • Receive posts in a feed or by email.

4. Monetisation and Subscriptions

Job: charge for ongoing access to my data

When my data has ongoing value, I want to enable paid subscriptions at the publication level, so that monetisation is simple and predictable.

Sub-jobs (MVP):

  • Set a monthly or annual price.
  • Mark posts as free or subscribers-only.
  • Avoid per-dataset pricing complexity.
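Keeping pricing at the publication level means access control needs only one check per post. A sketch assuming the two visibility values named above; the field names are illustrative:

```python
def can_read(post_visibility: str, viewer: dict) -> bool:
    """Decide access from the MVP's two visibility levels:
    'free' (anyone) and 'subscribers' (paid subscribers only)."""
    if post_visibility == "free":
        return True
    if post_visibility == "subscribers":
        return viewer.get("has_active_subscription", False)
    raise ValueError(f"unknown visibility: {post_visibility}")
```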

Job: accept payments without managing infrastructure

When someone pays for my publication, I want payments, fees, and payouts handled automatically, so that I don’t have to manage billing logic.

Sub-jobs:

  • Connect Stripe once.
  • See net payouts after platform fees.
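The "net payouts after platform fees" sub-job is a simple breakdown of each payment. The raw notes mention a 10% platform cut; the processing-fee numbers below (2.9% + 30¢) are placeholders for whatever Stripe actually charges on a given account:

```python
def net_payout(gross_cents: int,
               platform_fee_pct: float = 0.10,   # assumed 10% platform cut
               stripe_pct: float = 0.029,        # placeholder processing %
               stripe_fixed_cents: int = 30) -> dict:
    """Split a subscription payment into platform fee, processing fee,
    and the publisher's net payout. All amounts in integer cents."""
    platform_fee = round(gross_cents * platform_fee_pct)
    processing_fee = round(gross_cents * stripe_pct) + stripe_fixed_cents
    return {
        "gross": gross_cents,
        "platform_fee": platform_fee,
        "processing_fee": processing_fee,
        "net": gross_cents - platform_fee - processing_fee,
    }
```

Using integer cents throughout avoids floating-point drift in ledger totals; only the fee percentages are floats, rounded once per charge.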

Job: validate demand before charging

When I’m not ready to enable payments yet, I want people to pledge interest, so that I can gauge demand without friction.

Sub-jobs:

  • Allow users to pledge to subscribe.
  • See counts of pledged interest.

5. Publisher Feedback and Analytics

Job: know whether my data is being used

When I publish datasets, I want basic visibility into views and downloads, so that I know whether my work is reaching people.

Sub-jobs:

  • See views per data post.
  • See dataset download counts.

Job: track audience and revenue health

When I run a paid publication, I want a simple dashboard of subscribers and revenue, so that I can understand sustainability without analytics overload.

Sub-jobs:

  • See free vs paid subscribers.
  • Track new subscriptions and cancellations.
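The dashboard numbers above can be folded from a stream of subscription events. A sketch; the event type names (`free_followed`, `paid_subscribed`, `cancelled`) are assumptions, not a defined schema:

```python
from collections import Counter

def subscriber_summary(events: list) -> dict:
    """Fold subscription events into the headline dashboard numbers:
    free vs paid counts, new paid subscriptions, cancellations."""
    counts = Counter(e["type"] for e in events)
    return {
        "free": counts["free_followed"],
        "paid": counts["paid_subscribed"] - counts["cancelled"],
        "new_paid": counts["paid_subscribed"],
        "cancellations": counts["cancelled"],
    }
```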

Job: stay aware of important activity

When something meaningful happens, I want lightweight notifications, so that I feel feedback without constant monitoring.

Sub-jobs:

  • Notification on new subscriber.
  • Notification on notable engagement events.

Raw transcript (original voice notes, preserved verbatim)

Okay, so I want you to act as a world-class product, don't you? And you're selling the product design for the Substack for Data. I think we want to focus on two areas here, which is the… Well, maybe three. The sign-up, the kind of creation experience into Substack for Data. Then the social and liking, the third area is getting kind of paid, which is kind of paid subscriptions, paid purchases of datasets. And so the idea is I kind of, the experience is I come in, I sign up, and then I create a data publication. And that can have any name at the moment. So I create that, and then I don't know if I have any of this stuff of importing subscribers or anything like that. I don't think at the moment I really do. But what happens then is that I then publish a dataset, which is the moment I connect to Git repo, or maybe I can drop a CSV file. That's, you know, that's to be seen. But I think the simplest moment is a Git repo. So that's kind of part one of the stream. There's probably a bunch of detail to work out here. Number two is once I've signed up and so on, I can like existing posts and things like that. I can like, sorry, data posts, I guess we'll call them. And we can also follow or subscribe to a data publication. I guess then what happens, that's pretty easy, our concept of liking and following. We should think about that. I don't know if there's a place that shows all the things I've liked. But anyway, then there's the getting paid, where I can set a price, I guess. Some of my posts can be open within my publication. Some of them can be private or subscribers only. And then there can be different. Basically, I can subscribe to an entire publication, and then we even have pledged to subscribe before they even have the payment system set up. But I can pledge or I can, which is, yeah, and then I can, I can set maybe different amounts, maybe sometimes even per post or for the price of a whole publication. That's interesting. 
One here, which is a little bit different, like maybe per post, per data set I can charge, or I've got the whole publication. I think for now, I can just have a whole publication. Let's keep it simple. And you have a subscription for that. Yeah, and then I have to connect from, as a publisher, I have to connect Stripe to get paid and things like that. And then this also takes us through the experience of signing up. I don't know how we take our 10% cut, but there should be some of that. Then the other aspect on social, the final feature would be also showing analytics, like having a dashboard for users that shows them, and sends them updates, like someone subscribed, something happened, like that kind of general activity, that the things are going on, that the people are viewing my data set, are downloading it, blah, blah, blah.