How MerchSage turns a YouTube channel into print-on-demand merch in 6 stages
MerchSage takes a YouTube channel URL and produces a stocked storefront. No human picks the products. No human writes artwork prompts. No human approves a design before it lands on a t-shirt. The whole thing runs as a 6-stage pipeline.
This post is the overview — what each stage contributes and how they fit together. Two of the stages are interesting enough to deserve their own posts, linked below.
The 6 stages
Scrape → Analyze → Generate Concepts → Create Designs → Generate Mockups → Finalize Listings
| Stage | Role |
|---|---|
| 1. Scrape | Pull raw material from YouTube |
| 2. Analyze | Build a structured creative brief for the channel |
| 3. Concepts | Turn the brief into design briefs |
| 4. Designs | Render the briefs as artwork |
| 5. Mockups | Show the artwork on real products |
| 6. Listings | Produce storefront-ready listing drafts |
Each stage has one job. Each stage can be re-run on its own.
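As a rough sketch of that shape (stage names from the table; the registry pattern and `run_state` dict are my assumptions, not MerchSage's actual code), each stage is a function over shared state, and re-running a stage just means starting the loop at it:

```python
# Hypothetical sketch: six stages run in order over shared state, and any
# stage can be re-run alone by starting the loop at it.
STAGES = ["scrape", "analyze", "concepts", "designs", "mockups", "listings"]

def run_pipeline(run_state: dict, start_at: str = "scrape") -> dict:
    """Run stages in order from `start_at`; each stage only touches its own key."""
    # Stand-in handlers; each real stage would read and write the shared store.
    handlers = {name: (lambda s, n=name: {**s, n: "done"}) for name in STAGES}
    for name in STAGES[STAGES.index(start_at):]:
        run_state = handlers[name](run_state)
    return run_state

state = run_pipeline({})                         # full run
state = run_pipeline(state, start_at="designs")  # re-run a single stretch
```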
Stage 1 — Scrape
We pull the raw evidence: a representative sample of the channel's videos, their transcripts, comments, and visual material. The sample is time-spread and outlier-trimmed, so a single viral hit doesn't drag the brand reading off-center. This is the only stage that reaches outside the system — everything downstream works from what we capture here.
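A minimal sketch of what "time-spread and outlier-trimmed" could mean in practice — the field names, bucket count, and outlier multiplier here are illustrative assumptions, not the real sampling logic:

```python
# Hypothetical sampling sketch: drop viral view-count outliers relative to the
# channel's median, then pick a few videos from each slice of the timeline.
from statistics import median

def sample_videos(videos: list[dict], per_bucket: int = 2,
                  outlier_mult: float = 10.0) -> list[dict]:
    med = median(v["views"] for v in videos)
    # Trim outliers so one viral hit doesn't dominate the brand reading.
    kept = [v for v in videos if v["views"] <= outlier_mult * med]
    kept.sort(key=lambda v: v["published"])
    # Spread the sample across the channel's history in rough time buckets.
    buckets = 4
    size = max(1, len(kept) // buckets)
    sample = []
    for i in range(0, len(kept), size):
        sample.extend(kept[i:i + size][:per_bucket])
    return sample
```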
Stage 2 — Analyze
A set of AI specialists read the scraped material and build a structured understanding of the channel. Between them, they produce:
- Brand understanding — what the creator stands for, who the audience is, the personality of the channel.
- A visual design guide — palette, motifs, typography, the creative range that fits the brand.
- A product plan — which products belong in this creator's lineup, and what mockup scenes match the channel's vibe.
- Asset extraction — recurring visual elements (logos, faces, characters) lifted from channel imagery for reuse in designs.
Every downstream creative decision flows from this. Anything we generate later sits on top of the design guide and the product plan.
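To make "structured understanding" concrete, here is one plausible shape for the brief — every field name below is an illustrative assumption, not MerchSage's real data model:

```python
# Hypothetical schema for the Stage 2 output that downstream stages build on.
from dataclasses import dataclass, field

@dataclass
class DesignGuide:
    palette: list[str]      # hex colors that fit the brand
    motifs: list[str]       # recurring visual themes
    typography: str

@dataclass
class ChannelBrief:
    brand_summary: str
    audience: str
    design_guide: DesignGuide
    product_plan: list[str]                       # products that fit the creator
    extracted_assets: list[str] = field(default_factory=list)  # logos, faces, characters

brief = ChannelBrief(
    brand_summary="Retro tech restoration with dry humor",
    audience="Hobbyist engineers",
    design_guide=DesignGuide(["#1A1A2E", "#E94560"], ["CRT monitors"], "mono slab"),
    product_plan=["t-shirt", "poster", "mug"],
)
```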
Stage 3 — Generate Concepts
Turn the creative brief into design briefs — the specifications that drive image generation.
We generate aggressively, far more concepts than we'll keep, to maximize creative diversity. A rating-and-pruning pass then selects the strongest, most distinctive ones to actually render.
The split between generation and selection is deliberate. Asking a model to be both wildly creative and ruthlessly discerning in a single pass produces safe, average output. Splitting it lets generation run uninhibited and selection run cold.
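The split reduces to two functions that never share a prompt — a sketch under stated assumptions (the counts and the stand-in rater are mine, not the real ones):

```python
# Sketch of the generate-then-select split: overgenerate uninhibited,
# then rate and prune with a separate, cold pass.

def overgenerate(brief: str, n: int = 40) -> list[str]:
    # Stand-in for many high-temperature LLM calls against the creative brief.
    return [f"{brief} concept #{i}" for i in range(n)]

def select(concepts: list[str], rate, keep: int = 8) -> list[str]:
    # A separate rating pass; generation never self-censors.
    return sorted(concepts, key=rate, reverse=True)[:keep]

concepts = overgenerate("retro-gaming channel")
chosen = select(concepts, rate=len)  # stand-in rater for illustration
```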
Stage 4 — Create Designs
Render the selected briefs into transparent artwork — ready to drop onto a product.
The hard part isn't the image generation itself — it's getting clean transparent cutouts reliably, on every design, at scale. That's its own post.
A rating pass scores the rendered designs. Anything the model bungled — visible artifacts, broken composition, off-brand colors — gets filtered before it ever reaches a product.
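That filter is conceptually just a threshold plus hard rejects — the score field, artifact flag, and cutoff below are assumptions for illustration:

```python
# Hypothetical QC gate: split rendered designs into passed and rejected
# before anything reaches a product.

def filter_designs(renders: list[dict], min_score: float = 0.7):
    passed = [r for r in renders if r["score"] >= min_score and not r["artifacts"]]
    rejected = [r for r in renders if r not in passed]
    return passed, rejected

passed, rejected = filter_designs([
    {"id": "d1", "score": 0.9, "artifacts": False},
    {"id": "d2", "score": 0.9, "artifacts": True},   # visible glitch: rejected
    {"id": "d3", "score": 0.4, "artifacts": False},  # low score: rejected
])
```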
Stage 5 — Generate Mockups
Send each design to Printful to be rendered on real products — t-shirts, posters, mugs, phone cases — in scenes chosen to match the channel's vibe. A visual quality gate scores how well each design sits on its product. Mockups that don't pass are kept for review but excluded from the storefront.
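The "kept for review but excluded from the storefront" rule is worth showing explicitly, since nothing is deleted — a sketch with an assumed fit score and threshold:

```python
# Sketch of the mockup quality gate: failing mockups go to a review bucket,
# not to the trash. The fit_score field and 0.75 threshold are assumptions.

def gate_mockups(mockups: list[dict], threshold: float = 0.75) -> dict:
    out = {"storefront": [], "review_only": []}
    for m in mockups:
        bucket = "storefront" if m["fit_score"] >= threshold else "review_only"
        out[bucket].append(m)
    return out
```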
Stage 6 — Finalize Listings
Pick the best design per product line, write SEO-ready listing copy, and produce drafts for the storefront. Publishing happens later — admin curation and the creator's own selection through the portal control what actually goes live.
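"Best design per product line" is a group-by-and-take-max — a sketch with assumed field names:

```python
# Sketch of winner selection: one pass over scored candidates, keeping the
# top-scoring design for each product line. Fields are illustrative.

def pick_winners(candidates: list[dict]) -> dict[str, dict]:
    winners: dict[str, dict] = {}
    for c in candidates:
        best = winners.get(c["product"])
        if best is None or c["score"] > best["score"]:
            winners[c["product"]] = c
    return winners
```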
How the stages fit together
Stages communicate through a shared database rather than by passing outputs directly from one stage to the next. Each stage persists everything it produces; the next stage loads what it needs. That decoupling is what makes any stage re-runnable in isolation, and what lets a failed stage stop a run cleanly without taking the rest of the pipeline with it.
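The persist-then-load contract looks something like this — an in-memory dict stands in for the real database, and the key shape is my assumption:

```python
# Sketch of stage decoupling through a shared store: each stage persists its
# outputs under (run_id, stage), and the next stage loads them fresh.

STORE: dict[tuple, object] = {}  # stand-in for the shared database

def persist(run_id: str, stage: str, payload: object) -> None:
    STORE[(run_id, stage)] = payload

def load(run_id: str, stage: str) -> object:
    if (run_id, stage) not in STORE:
        # No silent fallback: a missing upstream output stops the run.
        raise RuntimeError(f"stage '{stage}' has no persisted output for run {run_id}")
    return STORE[(run_id, stage)]

persist("run-1", "analyze", {"palette": ["#111"]})
brief = load("run-1", "analyze")  # a later stage, or a re-run, starts here
```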
The orchestration itself — how the stages are wired together, how concurrent pipeline runs share rate-limited APIs, how an LLM agent can author and operate the whole thing — is its own post.
A failed run is cheap. A silently degraded run produces wrong artifacts. The whole codebase leans hard into the first.
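In code, that stance means a stage raises the moment it can't produce a valid artifact instead of emitting a degraded one — a sketch with an assumed validity check:

```python
# Sketch of the fail-loudly stance: raise immediately on an invalid artifact
# rather than letting it flow downstream. The transparency check is illustrative.

def finalize_design(render: dict) -> dict:
    if not render.get("transparent"):
        # Stop the run cleanly; never ship an opaque design as if it were fine.
        raise ValueError(f"design {render['id']} is not a clean transparent cutout")
    return {**render, "status": "ready"}
```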