Brad Zhang / public archive


fireworks-design

yizhiyanhua-ai/fireworks-design

🎆 A Claude Code workflow that replicates ClaudeDesign: parallel design exploration, panel judging, synthesis, and adversarial refinement toward a single world-class frontend page. Bilingual README.


github.com/yizhiyanhua-ai/fireworks-design

fireworks-design

An open-source Claude Code workflow that replicates ClaudeDesign — fan out many distinct design directions, judge them like a panel, synthesize the best, and adversarially refine to a single world-class frontend page.


License: MIT


One-sentence pitch: Stop rolling the dice on a single model output. fireworks-design explores 6–8 independent aesthetics in parallel, scores each across 6 design dimensions, grafts the winners together, then critique-fix-loops the result until it's shippable.

[Figure: six-phase pipeline]

📦 Install

In Claude Code, just say:

Install fireworks-design into this project — repo is yizhiyanhua-ai/fireworks-design.

Claude drops the workflow into .claude/workflows/ and you're done.

Prefer the shell?
mkdir -p .claude/workflows
curl -fsSL -o .claude/workflows/fireworks-design.js \
  https://raw.githubusercontent.com/yizhiyanhua-ai/fireworks-design/main/fireworks-design.js

🚀 Usage

That's the whole setup. Now just talk to Claude in plain language — say what you want and where to save it:

Use fireworks-design to build a landing page for a coffee shop. Save it to ~/design/coffee.

Claude runs the workflow in the background (watch it live with /workflows) and hands you a finished final.html. Tweak it conversationally whenever:

…explore 6 directions, refine 3 rounds, and use brand color #7c3aed.

What you get: a single self-contained final.html (plus the explored draft-*.html directions) and a summary of which aesthetic won and why.

Prefer to call it directly?
Workflow({
  name: "fireworks-design",
  args: { prompt: "...", outputDir: "/abs/path", variants: 6, refineRounds: 2 }
})

prompt and outputDir (absolute path) are required; variants (3–8), refineRounds, brand, and lenses are optional. Full reference in the Arguments section below.

✨ Featured outputs (results walkthrough)

14 real pages across totally different domains — all live on GitHub Pages · full table + deep-dives. Four highlights:

| Page | Winner & why it fits | Signature moment |
|---|---|---|
| 🎬 LUMIÈRE (movie rating) | Dark Premium: theatrical immersion beat editorial; antique gold rationed as "prestige currency" (only ratings, CTA, and top ranks) | One-shot gold projector-beam sweeps the 9.2 score |
| 🎵 NOVA · AURORA (album) | Bold Editorial: oversized Didone wordmark, midnight-violet-teal 60/30/10 | Generative cover + 32-bar visualizer + play-state changes the whole room |
| 🎨 OBJECT & ECHO (studio) | Bold Editorial: gallery-zine, kinetic grotesque "object" vs ghosted italic "echo" | Spatial afterimage echo behind the hero wordmark |
| ✈️ AZORES (travel) | Bold Editorial: photography-as-product, NatGeo-meets-Cereal | Interactive islands map with breathing halo + cross-fade detail panel |

Winner diversity proves the point: across 14 briefs, Bold Editorial won ×8, Dark Premium ×3 (movie/restaurant/ecommerce), Swiss Minimal ×2 (fitness/edtech), Editorial ×1 (nonprofit). Different briefs crown different winners, which is why we judge instead of generating once. The full results walkthrough (winning rationale, signature moments, real bugs the refine/polish pass caught) is in examples/README.md.

🖼️ Gallery — all 14 live examples


LUMIÈRE

MAISON NOIR

OBJECT & ECHO

NOVA · AURORA

PULSE

AZORES

LUMEN

ECHOES OF THE VOID

Brightwater

AURA ONE

BUILD/2026

vector

tideline

Lin Hua

🧠 How it works — six phases

① Brief — distill a design system

One agent turns your prompt (+ optional brand) into a shared creative brief: product framing, audience, required sections, and concrete design tokens (font pairings, palette hexes, mood, references). These tokens are injected into every subsequent agent, so the whole pipeline stays on-brand.
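As a rough illustration of how a shared brief could be injected into later prompts, here is a minimal sketch. The brief field names (`product`, `audience`, `sections`, `tokens`) and the `withBrief` helper are hypothetical; the real BRIEF_SCHEMA in fireworks-design.js may look different.

```javascript
// Hypothetical brief shape; the real BRIEF_SCHEMA may differ.
const brief = {
  product: "Neighborhood coffee shop landing page",
  audience: "Local customers browsing on mobile",
  sections: ["hero", "menu", "location", "hours"],
  tokens: {
    fonts: { display: "Fraunces", body: "Inter" },
    palette: ["#7c3aed", "#1f1b24", "#f5f0e8"],
    mood: "warm, crafted, unhurried",
  },
};

// Prepend the same serialized brief to every task prompt so each agent
// works from identical design tokens.
function withBrief(brief, task) {
  return [
    "## Shared creative brief (do not deviate from these tokens)",
    JSON.stringify(brief, null, 2),
    "## Your task",
    task,
  ].join("\n\n");
}

const prompt = withBrief(brief, "Build the page in a Swiss Minimal style.");
console.log(prompt.split("\n")[0]);
```

Serializing the brief once and reusing the identical string is also what would make it cacheable across agents.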

② Diverge — wide exploration (the quality core)

Each agent commits fully to one aesthetic and produces a complete, self-contained HTML file. Each generator internally plans → self-critiques → produces, rather than dumping a first draft.
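The fan-out can be pictured with plain promises. This is a sketch under assumptions: `parallelSketch` stands in for the runtime's parallel(), `fakeGenerate` stands in for a model agent, and mapping failures to null is a guess based on the `.filter(Boolean)` in the condensed pipeline, not confirmed runtime behavior.

```javascript
// Stand-in for the runtime's parallel(): run thunks concurrently and
// map any failure to null so one bad direction cannot sink the run.
async function parallelSketch(thunks) {
  const settled = await Promise.allSettled(thunks.map((t) => t()));
  return settled.map((r) => (r.status === "fulfilled" ? r.value : null));
}

// Fake generator (hypothetical): pretends the "glass" direction failed.
async function fakeGenerate(lens) {
  if (lens === "glass") throw new Error("generation failed");
  return { lens, file: `draft-${lens}.html` };
}

async function diverge(lenses) {
  const results = await parallelSketch(lenses.map((l) => () => fakeGenerate(l)));
  return results.filter(Boolean); // keep only directions that succeeded
}

diverge(["editorial", "minimal", "glass"]).then((variants) => {
  console.log(variants.map((v) => v.file));
});
```

Passing thunks (functions that start the work) rather than already-started promises lets the orchestrator decide when each agent launches.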

[Figure: Diverge fan-out]

③ Judge — panel scoring

Independent critics score every direction on every design dimension from 1 to 10; for 6 directions and 6 dimensions that is 36 critiques, run in parallel. Each verdict also returns its single highest-leverage fix, which feeds the next phase.
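One plausible way to turn per-dimension verdicts into a ranking is a simple mean per direction. The verdict shape (`lens`, `dim`, `score`, `fix`) and the `rank` helper below are hypothetical and shown only to make the aggregation concrete.

```javascript
// Hypothetical verdict shape: one record per (direction, dimension) pair.
const verdicts = [
  { lens: "editorial", dim: "typography", score: 9, fix: "Tighten hero leading" },
  { lens: "editorial", dim: "motion", score: 7, fix: "Add reduced-motion fallback" },
  { lens: "minimal", dim: "typography", score: 8, fix: "Increase body size" },
  { lens: "minimal", dim: "motion", score: 6, fix: "Ease the scroll reveal" },
];

// Average each direction's scores, keep its suggested fixes, sort best-first.
function rank(verdicts) {
  const byLens = new Map();
  for (const v of verdicts) {
    const entry = byLens.get(v.lens) ?? { lens: v.lens, total: 0, count: 0, fixes: [] };
    entry.total += v.score;
    entry.count += 1;
    entry.fixes.push(v.fix);
    byLens.set(v.lens, entry);
  }
  return [...byLens.values()]
    .map((e) => ({ lens: e.lens, mean: e.total / e.count, fixes: e.fixes }))
    .sort((a, b) => b.mean - a.mean);
}

const ranking = rank(verdicts);
console.log(ranking[0].lens, ranking[0].mean); // editorial 8
```

Carrying the fixes along with the scores means the next phase can address them without re-reading every verdict.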

[Figure: Judge matrix]

④ Synthesize — graft the best

The synthesizer reads the source of the top 3 directions, uses the strongest as the base, folds in the best elements of the others, and fixes every issue the judges flagged. The output must clearly beat any single direction.

⑤ Refine — adversarial polishing

A ruthless reviewer returns prioritized issues (severity-tagged); a fixer applies them surgically. Looped refineRounds times (default 2). This is ClaudeDesign's "fine-grained controls," engineered.
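The critique and fix loop can be sketched with stub agents. Everything named here is illustrative: the severity tags, `fakeCritic`, and `fakeFixer` are hypothetical stand-ins for model agents, and the early exit when no issues remain is an addition of this sketch rather than documented behavior.

```javascript
const SEVERITY_ORDER = { critical: 0, major: 1, minor: 2 }; // hypothetical tags

// Fake critic: reports whichever known problems are still in the page.
async function fakeCritic(html) {
  const issues = [];
  if (!html.includes("<main")) issues.push({ severity: "major", note: "missing <main> landmark" });
  if (html.includes("TODO")) issues.push({ severity: "critical", note: "leftover placeholder" });
  return issues;
}

// Fake fixer: applies each reported issue in the order given.
async function fakeFixer(html, issues) {
  let out = html;
  for (const issue of issues) {
    if (issue.note === "leftover placeholder") out = out.replace("TODO", "Open daily 7 to 5");
    if (issue.note === "missing <main> landmark") {
      out = out.replace("<div>", "<main>").replace("</div>", "</main>");
    }
  }
  return out;
}

async function refine(html, rounds = 2) {
  let page = html;
  for (let i = 0; i < rounds; i++) {
    const issues = (await fakeCritic(page)).sort(
      (a, b) => SEVERITY_ORDER[a.severity] - SEVERITY_ORDER[b.severity]
    );
    if (issues.length === 0) break; // early exit: an assumption of this sketch
    page = await fakeFixer(page, issues);
  }
  return page;
}

refine("<div>TODO</div>").then((page) => console.log(page));
```

Keeping critic and fixer as separate calls mirrors the split described above: the reviewer only finds problems, and the fixer only applies them.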

[Figure: Refine loop]

⑥ Polish — ship-ready QA

A final gate checks and fixes: responsiveness (375/768/1280+), all interactive states, prefers-reduced-motion, semantic HTML + ARIA, WCAG AA contrast, no console errors, no leftover placeholders — then writes final.html.
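Of the checks above, WCAG AA contrast is mechanical enough to sketch directly. The following uses the WCAG 2.x relative-luminance formula and the AA thresholds (4.5:1 for normal text, 3:1 for large text). The function names are hypothetical, inputs are assumed to be six-digit `#rrggbb` hex strings, and this is not how the workflow itself performs the check, since polish is handled by an agent.

```javascript
// Relative luminance of an sRGB color given as "#rrggbb" (WCAG 2.x formula).
function luminance(hex) {
  const [r, g, b] = [1, 3, 5].map((i) => {
    const c = parseInt(hex.slice(i, i + 2), 16) / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio: lighter luminance over darker, each offset by 0.05.
function contrastRatio(fg, bg) {
  const [hi, lo] = [luminance(fg), luminance(bg)].sort((a, b) => b - a);
  return (hi + 0.05) / (lo + 0.05);
}

// WCAG AA: at least 4.5:1 for normal text, 3:1 for large text.
function passesAA(fg, bg, largeText = false) {
  return contrastRatio(fg, bg) >= (largeText ? 3 : 4.5);
}

console.log(contrastRatio("#000000", "#ffffff").toFixed(2)); // 21.00
console.log(passesAA("#7c3aed", "#ffffff")); // true
console.log(passesAA("#aaaaaa", "#ffffff")); // false
```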

✨ Why this exists

Even experienced designers ration exploration — there's rarely time to prototype a dozen directions, so you settle for two. From the Claude Design announcement:

"Even experienced designers have to ration exploration — there's rarely time to prototype a dozen directions, so you limit yourself to a few."

A single LLM generation is one draw from a distribution. Its taste, mood, and prompt interpretation are locked into that one version. fireworks-design turns that variance into a quality floor by:

  • Exploring widely — N parallel agents, each committed to a distinct aesthetic.
  • Judging independently — a panel scores every direction across separate design dimensions.
  • Synthesizing — taking the winner as the skeleton and grafting the best of the runners-up.
  • Refining adversarially — critique → fix, looped until it clears the bar.

🧩 Built on Claude Code's Dynamic Workflow

fireworks-design is not a prompt, a library, or a hosted service. It's a single JavaScript file that the Claude Code Workflow runtime executes as a Dynamic Workflow: deterministic code that spawns model agents, runs them in parallel, and composes their schema-validated outputs. Control flow belongs to the script (loops, fan-out, barriers), not to the model, so the orchestration of every run is reproducible and runs are resumable. Bring your model; the workflow brings the orchestration.

The whole pipeline is ~270 lines with a shape you can read at a glance:

// fireworks-design.js — condensed to its bones
phase('Brief');    const brief    = await agentRetry(briefPrompt, { schema: BRIEF_SCHEMA })

phase('Diverge');  const variants = (await parallel(LENSES.map(l => () =>         // fan-out: N directions
                    agent(generate(l), { schema: VARIANT_SCHEMA })))).filter(Boolean) // await first, then drop failed directions

phase('Judge');    const verdicts = await parallel(variants.flatMap(v =>          // panel: N × 6 dimensions
                    DIMS.map(d => () => agent(judge(v, d), { schema: SCORE_SCHEMA }))))

phase('Synthesize'); await agentRetry(synthesize(top(variants, verdicts)))        // graft the best

for (let i = 0; i < REFINE_ROUNDS; i++) {                                         // critique ↔ fix loop
  const issues = await agentRetry(critique, { schema: CRITIQUE_SCHEMA })
  await agentRetry(fix(issues))
}

phase('Polish');   await agentRetry(polish)                                       // ship-ready QA
return { outputPath: FINAL_PATH, winner, ranking, summary: polish }

What makes this a dynamic workflow rather than a script that calls a model:

  • Deterministic orchestration: parallel() is a real barrier, phase() groups live progress, and the for loop is yours. The model never decides what runs next.
  • Schema-validated agents: every agent() returns typed JSON, so the pipeline composes data, not prose. No regex-parsing of model output.
  • Resumable: edit a prompt and re-run; the unchanged prefix replays from cache (resumeFromRunId).
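To make "typed JSON, not prose" concrete, here is a minimal sketch of validating an agent's raw output against a schema before the pipeline uses it. The predicate-based schema format, `SCORE_SCHEMA_SKETCH`, and `parseValidated` are hypothetical; the runtime performs its own validation, and its schema format is not shown in this README.

```javascript
// Hypothetical schema description: field name -> predicate.
const SCORE_SCHEMA_SKETCH = {
  score: (v) => Number.isInteger(v) && v >= 1 && v <= 10,
  fix: (v) => typeof v === "string" && v.length > 0,
};

// Parse raw model text as JSON and check every field; throw on any mismatch
// so bad output is caught at the boundary instead of deep in the pipeline.
function parseValidated(raw, schema) {
  const data = JSON.parse(raw); // throws SyntaxError on non-JSON output
  for (const [field, isValid] of Object.entries(schema)) {
    if (!isValid(data[field])) throw new TypeError(`invalid field: ${field}`);
  }
  return data;
}

const verdict = parseValidated('{"score": 8, "fix": "Raise body contrast"}', SCORE_SCHEMA_SKETCH);
console.log(verdict.score + 1); // 9 (score is a number, usable without string parsing)
```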

Full file: fireworks-design.js.

🔬 Under the hood — orchestration, not iteration

This isn't a "generate, then ask the same model to improve it" loop. It's a deterministic multi-agent pipeline that composes several modern inference-time techniques into one reproducible run — built on the Claude Code Workflow runtime (parallel() / pipeline() / agent() primitives, schema-validated returns, resumable execution).

| Technique | Where it lives | Why it matters |
|---|---|---|
| Best-of-N + self-consistency | Diverge → Judge | Generate N directions, keep the one with the highest average across independent scores. Quality rises with N, not with luck. |
| LLM-as-judge panel | Judge | 6 dimensions × N directions, scored by critics that never see each other's answers. Kills single-judge bias. |
| Diverse generation (diverse beams) | Diverge lenses | Each agent commits to a distinct aesthetic, so the N samples cover the design space instead of clustering on one idea. |
| Critique-and-revise (Self-Refine / Reflexion) | Refine | A dedicated critic emits severity-tagged issues; a fixer applies them surgically; loop until the bar clears. |
| Synthesis / grafting | Synthesize | The winner is the skeleton and runners-up donate their best parts. Not a merge of averages. |
| Structured tool-use | every agent | Returns are schema-validated JSON, composed deterministically. No brittle regex parsing of model prose. |
| Context isolation | per-agent | Each agent runs in its own context with only the tokens it needs; the shared brief is injected (cacheable), not re-read. |
| Resumable execution | runtime | resumeFromRunId replays cached results for the unchanged prefix, so you can edit a prompt mid-run without paying to re-run everything. |
| Model-agnostic | agent() omits model | Inherits the session model. Swap Opus ↔ Sonnet ↔ anything; the pipeline is identical. |
| Budget- & rate-limit-aware | budget global, retries | Fan-out can scale to a token budget; agents auto-retry on 429s instead of crashing the run. |

The net effect: you're not hoping the model has a good day. You're engineering a quality floor out of sampling, judging, and refinement — the same inference-time-scaling ideas behind best-of-N and self-consistency, applied to design instead of math.
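The rate-limit handling mentioned in the table above could look roughly like the following. This is a sketch, not the workflow's agentRetry: `retryOn429`, its options, the `status` property on the error, and the exponential backoff schedule are all assumptions made for illustration.

```javascript
// Retry an async call when it fails with a rate-limit error (status 429),
// waiting longer after each attempt. Other errors are rethrown immediately.
async function retryOn429(call, { retries = 3, baseDelayMs = 10 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      if (err.status !== 429 || attempt >= retries) throw err;
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
}

// Fake agent call (hypothetical) that is rate-limited twice before succeeding.
let calls = 0;
async function flakyAgent() {
  calls += 1;
  if (calls < 3) throw Object.assign(new Error("rate limited"), { status: 429 });
  return { ok: true, calls };
}

retryOn429(flakyAgent).then((result) => console.log(result)); // { ok: true, calls: 3 }
```

Rethrowing non-429 errors right away keeps genuine failures visible instead of hiding them behind retries.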

⚙️ Arguments

| Argument | Required | Default | Description |
|---|---|---|---|
| prompt | yes | (none) | What to build (natural language). |
| outputDir | yes | (none) | Absolute path to write files. |
| variants | no | 6 | Number of directions (3–8). |
| refineRounds | no | 2 | Critique→fix loops. |
| brand | no | (none) | Brand notes / existing design system / references. |
| lenses | no | all | Restrict to specific aesthetics by key. |
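The constraints in this table can be expressed as a small validation step. The sketch below is an illustration only: `normalizeArgs` is a hypothetical helper, the absolute-path test is a simplified pattern for POSIX and Windows drive paths, and the rule that refineRounds must be a non-negative integer is an assumption, since the README gives only its default.

```javascript
// Validate workflow arguments and fill in the documented defaults.
function normalizeArgs(args) {
  const { prompt, outputDir, variants = 6, refineRounds = 2, brand, lenses } = args;
  if (typeof prompt !== "string" || prompt.trim() === "") {
    throw new Error("prompt is required");
  }
  // Simplified absolute-path check: "/..." or a drive prefix such as "C:\".
  if (typeof outputDir !== "string" || !/^(\/|[A-Za-z]:[\\/])/.test(outputDir)) {
    throw new Error("outputDir must be an absolute path");
  }
  if (!Number.isInteger(variants) || variants < 3 || variants > 8) {
    throw new Error("variants must be an integer from 3 to 8");
  }
  // Assumption: any non-negative integer is acceptable here.
  if (!Number.isInteger(refineRounds) || refineRounds < 0) {
    throw new Error("refineRounds must be a non-negative integer");
  }
  return { prompt, outputDir, variants, refineRounds, brand, lenses };
}

const args = normalizeArgs({ prompt: "Coffee shop landing page", outputDir: "/tmp/design/coffee" });
console.log(args.variants, args.refineRounds); // 6 2
```

Note that a path such as `~/design/coffee` fails this check, because the tilde is expanded by a shell and is not itself an absolute path.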

🎚️ Customization

Aesthetic lenses (LENSES)

The distinct directions explored in phase ②. Edit the LENSES array to add your own styles:

| Key | Style |
|---|---|
| editorial | Bold Editorial: high-contrast, expressive display type, magazine grid |
| minimal | Swiss Minimal: grid-obsessed, restrained, Inter/Geist |
| gradient | Vibrant Gradient: mesh gradients, glassmorphism, neon accents |
| dark-premium | Dark Premium: near-black canvas, gold/violet accent, cinematic |
| organic | Soft Organic: rounded forms, warm palette, approachable motion |
| brutalist | Neo-Brutalist: raw borders, hard shadows, mono, high-energy |
| glass | Glass Aurora: translucent layers, aurora blobs, backdrop blur |
| mono-tech | Mono Tech: monospace accents, terminal/data-forward |
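Adding a lens and restricting a run to specific keys might look like the sketch below. The entry shape (`key`, `name`, `style`), the `retro-print` lens, and the `selectLenses` helper are hypothetical; check the LENSES array in fireworks-design.js for the real field names before editing it.

```javascript
// Hypothetical entry shape; the real LENSES entries may carry other fields.
const LENSES_SKETCH = [
  { key: "editorial", name: "Bold Editorial", style: "high-contrast, expressive display type, magazine grid" },
  { key: "minimal", name: "Swiss Minimal", style: "grid-obsessed, restrained, Inter/Geist" },
  { key: "dark-premium", name: "Dark Premium", style: "near-black canvas, gold/violet accent, cinematic" },
];

// A custom lens (made up for this example) added alongside the built-in ones.
LENSES_SKETCH.push({
  key: "retro-print",
  name: "Retro Print",
  style: "risograph textures, two-ink palette, halftone shadows",
});

// Restrict to the keys passed in `lenses`; no filter means every lens runs.
function selectLenses(all, keys) {
  if (!keys || keys.length === 0) return all;
  const unknown = keys.filter((k) => !all.some((l) => l.key === k));
  if (unknown.length > 0) throw new Error(`unknown lens keys: ${unknown.join(", ")}`);
  return all.filter((l) => keys.includes(l.key));
}

console.log(selectLenses(LENSES_SKETCH, ["editorial", "retro-print"]).map((l) => l.name));
// [ 'Bold Editorial', 'Retro Print' ]
```

Failing loudly on an unknown key is a design choice of this sketch; it avoids silently running fewer directions than requested.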

Judge dimensions (DIMS)

What the panel scores on: hierarchy · typography · color/contrast · motion · engineering craft · delight/originality.

🤖 Model choices

Every agent() call omits the model parameter, so all subagents inherit the current session model. Run it on Opus for top quality, on Sonnet for speed, or on any model your harness exposes — no code changes needed.

💼 Example cases

A few prompts to try (all outputDir set to an absolute path):

| Type | Prompt | Args |
|---|---|---|
| SaaS landing | Pricing + landing for an open-source vector DB; speed benchmarks, code hero, comparison table. | variants: 6 |
| OSS homepage | MIT CLI tool; install command, 3 feature cards, terminal demo. | variants: 4, brand: "mono, #10b981" |
| Portfolio | Designer one-pager; asymmetric editorial, large type, works grid. | variants: 8, refineRounds: 3 |
| Event | One-day AI conference; countdown hero, speakers, schedule, register CTA. | variants: 6, brand: "#ea580c" |

More quick recipes

| Goal | Suggested args |
|---|---|
| Fast first draft | variants: 4, refineRounds: 1 |
| Maximum quality | variants: 8, refineRounds: 3 |
| Brand-locked pass | brand: with hexes + fonts |
| Specific aesthetics only | lenses: ["editorial","dark-premium"] |

🔗 Mapping to ClaudeDesign

| ClaudeDesign principle | This workflow |
|---|---|
| Wide exploration (a dozen directions) | ② Diverge: 6–8 parallel aesthetics |
| Strongest vision model judges | ③ Judge: 6 dimensions × N directions |
| Distill & merge the best | ④ Synthesize: Top-3 graft |
| Fine-grained iterative controls | ⑤ Refine: critique↔fix loop |
| Design system throughout | ① Brief: tokens injected everywhere |
| Export deliverable HTML | ⑥ Polish: QA → final.html |

💰 Cost & considerations

  • 6 variants × full pipeline ≈ 40+ agent calls. This is deliberate — quality is the point.
  • Token cost scales with variants, refineRounds, and page complexity.
  • Workflow agents read full HTML files for judging/synthesis, so very long pages cost more.
  • Want budget-aware scaling? Tie VARIANT_COUNT to budget.total (the workflow exposes budget).
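The "40+ agent calls" figure can be reproduced from the shape of the condensed pipeline: one brief, N generators, N × 6 judges, one synthesis, a critic and a fixer per refine round, and one polish pass. The sketch below counts those calls and shows one hypothetical way to scale the variant count to a token budget. `estimateAgentCalls`, `variantsForBudget`, and the `tokensPerCall` average are assumptions for illustration, and retries are not counted.

```javascript
// Count agent calls per phase, following the condensed pipeline:
// 1 brief + N generators + N x D judges + 1 synthesis + 2 per refine round + 1 polish.
function estimateAgentCalls({ variants = 6, refineRounds = 2, dims = 6 } = {}) {
  return 1 + variants + variants * dims + 1 + 2 * refineRounds + 1;
}

// Hypothetical budget scaling: pick the largest variant count in 3..8 whose
// estimated cost fits the token budget. tokensPerCall is a made-up average.
function variantsForBudget(totalTokens, { tokensPerCall = 20000, refineRounds = 2 } = {}) {
  for (let n = 8; n >= 3; n--) {
    if (estimateAgentCalls({ variants: n, refineRounds }) * tokensPerCall <= totalTokens) return n;
  }
  return 3; // floor at the documented minimum
}

console.log(estimateAgentCalls()); // 49
console.log(estimateAgentCalls({ variants: 8, refineRounds: 3 })); // 65
console.log(variantsForBudget(1000000)); // 6
```

In the real workflow the budget figure would come from budget.total rather than a literal, and a measured per-call average would replace the guessed one.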

❓ FAQ

Do I need a specific model? No. It inherits your session model. A strong model (Opus-class) gives the best results.

Can I use just one direction? Then you don't need this workflow — run a normal frontend-design pass instead. The value is the parallel exploration + judging.

Where do the draft files go? In your outputDir. They're not committed (see .gitignore); keep or delete them freely.

Can I plug in my design system? Yes — pass it via brand. Brief will fold your tokens into every agent.

🤝 Contributing

Contributions welcome — new aesthetic lenses, judge dimensions, or smarter synthesis. Open an issue first to discuss scope. See CONTRIBUTING.md.

📄 License

MIT © yizhiyanhua-ai


Built with Claude Code · Quality is the only thing that matters.
