LLM-First Web Design: Turning Pages into Knowledge APIs
“What if every webpage treated ChatGPT like its #1 reader?” A thought-experiment—plus a build plan—for re-engineering the entire stack around large-language-model consumption.
1. Treat Content Like Code
The canonical asset isn’t HTML; it’s a signed JSON-LD blob that stores every claim, citation, and hash. Writers “commit” edits the way developers push to Git. Humans see a rendered skin; machines fetch crystal-clear structure.
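The commit step can be sketched in a few lines of Python. Everything here is an illustrative assumption: the claims field, the field names, and the HMAC signature (a real deployment would sign with an asymmetric key):

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"replace-with-a-real-secret"  # placeholder; use asymmetric keys in production

def build_blob(claims: list[dict]) -> dict:
    """Assemble a JSON-LD blob whose hash and signature pin its exact contents."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Article",
        "claims": claims,  # hypothetical field: each claim carries its text and citation
    }
    # Canonical serialization (sorted keys, fixed separators) keeps hashes stable across builds.
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":")).encode()
    return {
        "document": doc,
        "sha256": hashlib.sha256(canonical).hexdigest(),
        "signature": hmac.new(SIGNING_KEY, canonical, hashlib.sha256).hexdigest(),
    }

blob = build_blob([{"id": "C123",
                    "text": "Water boils at 100 °C at sea level.",
                    "citation": "https://example.org/source"}])
```

Because the serialization is canonical, two builds of identical content hash identically, which is what makes edits diffable like commits.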
2. Semantic Chunking & Role Tags
Wrap each sentence in a micro-span (<span data-claim="C123">) and label its function: claim, evidence, counter-argument. A manifest file lets an LLM reconstruct arguments without rhetorical guesswork.
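One pass over the text can emit both the micro-spans and the manifest. The data-role attribute and the manifest shape below are assumptions for illustration, not a published schema:

```python
import json

# Hypothetical input: (sentence, role) pairs an authoring tool would produce.
sentences = [
    ("Water boils at 100 °C at sea level.", "claim"),
    ("The CRC Handbook lists the same figure.", "evidence"),
    ("At altitude, the boiling point drops.", "counter-argument"),
]

spans, manifest = [], []
for i, (text, role) in enumerate(sentences, start=1):
    claim_id = f"C{i:03d}"
    # Each sentence becomes an addressable micro-span...
    spans.append(f'<span data-claim="{claim_id}" data-role="{role}">{text}</span>')
    # ...and the manifest records its argumentative function.
    manifest.append({"id": claim_id, "role": role})

html_fragment = "\n".join(spans)
manifest_json = json.dumps({"chunks": manifest}, indent=2)
```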
3. Ship Your Own Embeddings
At build time, generate OpenAI-compatible vectors for every chunk and host them under /embeddings/{id}.bin. Expose /.well-known/nearest so agents can hit an HNSW index directly: no paid re-embedding, near-zero latency.
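A minimal build-time sketch, assuming little-endian float32 payloads for /embeddings/{id}.bin; the nearest() helper below uses brute-force cosine similarity as a stand-in for the real HNSW index:

```python
import array
import math
import pathlib

def write_embedding(out_dir: pathlib.Path, chunk_id: str, vector: list[float]) -> pathlib.Path:
    """Serialize one chunk's vector as raw float32 bytes: the /embeddings/{id}.bin payload."""
    path = out_dir / f"{chunk_id}.bin"
    path.write_bytes(array.array("f", vector).tobytes())
    return path

def nearest(query: list[float], index: dict[str, list[float]]) -> str:
    """Return the chunk id most similar to the query (brute force; HNSW replaces this at scale)."""
    def cos(a, b):
        return sum(x * y for x, y in zip(a, b)) / (math.hypot(*a) * math.hypot(*b))
    return max(index, key=lambda cid: cos(query, index[cid]))
```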
4. Time-Travel URLs & Reproducible Answers
Serve a Memento-Datetime header and pin-able hashes (/snapshot/2025-05-01T17:00Z). LLM citations stay valid forever; auditors can diff any two versions byte-for-byte.
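Memento-Datetime is defined by RFC 7089 as an HTTP-date (RFC 1123, GMT) timestamp; the ETag and Cache-Control choices in this sketch are assumptions about how a snapshot route might pin content:

```python
from datetime import datetime, timezone
from email.utils import format_datetime

def snapshot_headers(snapshot_time: datetime, content_hash: str) -> dict:
    """Headers a /snapshot/... route could attach so citations stay pinnable and diffable."""
    return {
        "Memento-Datetime": format_datetime(snapshot_time, usegmt=True),  # RFC 1123 GMT format
        "ETag": f'"{content_hash}"',  # assumed: the content hash doubles as the validator
        "Cache-Control": "public, max-age=31536000, immutable",  # snapshots never change
    }

headers = snapshot_headers(datetime(2025, 5, 1, 17, 0, tzinfo=timezone.utc), "abc123")
```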
5. One-Hop Discovery Endpoints
- /robots.txt advertises a crawl budget just for LLMs.
- /.well-known/llm-feed?topic=⇢ returns the 50 most salient claims for a query.
- Optional mini-SPARQL endpoint lets power users run live graph queries.
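The feed endpoint reduces to a filter-and-rank over the claim manifest. The in-memory store and the salience field below are illustrative assumptions:

```python
import json

# Hypothetical claim store; a real site would load this from the build manifest.
CLAIMS = [
    {"id": "C001", "topic": "boiling", "salience": 0.9, "text": "Water boils at 100 °C at sea level."},
    {"id": "C002", "topic": "boiling", "salience": 0.7, "text": "Boiling point drops with altitude."},
    {"id": "C003", "topic": "freezing", "salience": 0.8, "text": "Water freezes at 0 °C."},
]

def llm_feed(topic: str, limit: int = 50) -> str:
    """Response body for the feed endpoint: up to `limit` claims, most salient first."""
    matches = sorted((c for c in CLAIMS if c["topic"] == topic),
                     key=lambda c: c["salience"], reverse=True)
    return json.dumps({"topic": topic, "claims": matches[:limit]})
```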
6. Prompt-Injection Hardening
Default machine endpoint is plain JSON; the human HTML lives in a <script type="text/plain+human"> wrapper. Crawlers must opt in, slashing the risk of stray user-supplied prompt fragments.
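A sketch of the two-surface idea with a hypothetical wrapper helper. One real pitfall it handles: any </script> sequence inside the human HTML would terminate the wrapper early, so it must be neutralized:

```python
import json

def machine_payload(claims: list[dict]) -> str:
    """Default response for agents: plain JSON, nothing that parses as live markup."""
    return json.dumps({"claims": claims})

def human_page(rendered_html: str) -> str:
    """Wrap human-facing HTML so crawlers that haven't opted in see inert text."""
    safe = rendered_html.replace("</script>", "<\\/script>")  # keep the wrapper intact
    return f'<script type="text/plain+human">{safe}</script>'
```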
7. Moon-Shot Extras
- Self-expiring HTML that nudges agents toward the structured layer.
- Publish authors’ cot.json (chain-of-thought) so models can learn expert reasoning paths.
- WebAssembly summarizer shipped with every page, so humans & bots get pixel-perfect identical abstracts.
8. Snapshot of Pros & Cons
- Pros: zero parsing ambiguity, cheaper RAG pipelines, transparent edit history.
- Cons: authoring feels like coding, bigger headers, new privacy surface (vectors can leak identity).
9. Starter Tech Stack
- Static generator (Eleventy) extended to spit out JSON-LD + vectors.
- Vector DB (Weaviate) behind the /nearest endpoint.
- Edge workers (Cloudflare) inject version headers & snapshot routing.
10. Why Bother?
Because it turns “content” into an addressable truth graph. LLMs stop hallucinating and start retrieving—exactly the future we keep asking for.