Brad Zhang / public archive

Brad Zhang

AI product notes, agent workflow writing, and open-source dossiers for founders and early technical teams.

X Thread / EN

Inference as a Publishing Problem

Every production model is also a publishing decision about what deserves to become visible fast.

February 2026 / 9 min read / Inference

Visual structure

Essay structure map

Built from summary and key paragraph positions

Figure: essay structure map for "Inference as a Publishing Problem", with four nodes in sequence: Thesis, Signal, Operator, Implication.
Mermaid outline
flowchart LR
  thesis["Every production model is also a publishing decision about what deserves to become visible fast."]
  signal["Inference infrastructure is often described in the language of throughput and cost. Those metrics matter, b..."]
  operator["Seen this way, serving is not just distribution of compute. It is a form of editorial prioritization. Some..."]
  implication["Evaluation then stops being an afterthought and becomes the discipline that defines what may be published a..."]
  thesis -->|frames| signal
  signal -->|develops| operator
  operator -->|lands in| implication

Inference infrastructure is often described in the language of throughput and cost. Those metrics matter, but they only become meaningful when tied back to visibility: what the user sees, how quickly they see it, and what level of certainty the system is willing to attach to it.

Seen this way, serving is not just distribution of compute. It is a form of editorial prioritization. Some results deserve fast publication. Others deserve delay, review, or suppression.
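One way to make the editorial framing concrete is to treat each model result as a submission that receives one of four decisions. The sketch below is illustrative only: the `Result` fields (`confidence`, `high_stakes`), the `decide` function, and the two thresholds are hypothetical names and values chosen for this example, not anything the essay prescribes.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PUBLISH = "publish"    # show to the user immediately
    DELAY = "delay"        # hold for a slower automated check
    REVIEW = "review"      # queue for a human editor
    SUPPRESS = "suppress"  # never show


@dataclass(frozen=True)
class Result:
    text: str
    confidence: float  # assumed: a calibrated score in [0, 1]
    high_stakes: bool  # assumed: flag set upstream for sensitive content


def decide(
    result: Result,
    publish_at: float = 0.9,      # hypothetical threshold
    suppress_below: float = 0.3,  # hypothetical threshold
) -> Decision:
    """Map one result to an editorial decision."""
    # Very low confidence is suppressed regardless of topic.
    if result.confidence < suppress_below:
        return Decision.SUPPRESS
    # Sensitive content always gets a human look, however confident.
    if result.high_stakes:
        return Decision.REVIEW
    # Confident, low-stakes results earn fast publication.
    if result.confidence >= publish_at:
        return Decision.PUBLISH
    # Everything in between waits for a second check.
    return Decision.DELAY
```

The ordering of the checks is the editorial policy: here suppression outranks review, and review outranks speed. A different product could reasonably order them differently.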

Evaluation then stops being an afterthought and becomes the discipline that defines what may be published automatically at all.
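As a rough illustration of evaluation acting as the gate, the sketch below derives an automatic-publication cutoff from a labeled evaluation set. It assumes each evaluation item is a `(confidence, was_correct)` pair and that "safe to publish automatically" means meeting a target precision; the function name `auto_publish_threshold` and the default values are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def auto_publish_threshold(
    evals: Sequence[Tuple[float, bool]],
    target_precision: float = 0.95,  # hypothetical quality bar
    min_support: int = 1,            # hypothetical minimum sample size
) -> Optional[float]:
    """Return the lowest confidence cutoff whose precision meets the target.

    Precision at a cutoff is the share of correct items among all items
    with confidence at or above that cutoff. Returns None when no cutoff
    qualifies, meaning nothing should be published automatically.
    """
    ranked = sorted(evals, key=lambda e: e[0], reverse=True)
    best: Optional[float] = None
    correct = 0
    for i, (conf, ok) in enumerate(ranked, start=1):
        correct += int(ok)
        # Only consider a cut once every item tied at this confidence is counted.
        if i < len(ranked) and ranked[i][0] == conf:
            continue
        # Precision is not monotone in the cutoff, so keep scanning to the end.
        if i >= min_support and correct / i >= target_precision:
            best = conf
    return best
```

A `None` result is the interesting case: it says the evaluation data does not yet justify any automatic publication, so every result would fall back to delay or review.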
