Skip to content
6 min read engineering, astro, tutorial

Astro Content Collections with Zod

Turn a folder of Markdown into a typed, validated, queryable API — schema definition, the date-coercion gotcha, drafts, and how a bad post fails the build instead of shipping broken.

The naive way to put a blog on a site is a folder of .md files and some fs.readFile glue. It works until it doesn’t — a post with a typo’d frontmatter key, a date in the wrong format, a missing title. None of that errors at build time. It just ships, quietly broken, and you find out when someone tells you the date on your latest post says Invalid Date.

Content collections are Astro’s answer: src/content/ becomes a typed, validated, queryable API. Define a schema once with Zod, and every file in the collection is checked against it at build time. A bad post is a build failure, not a deploy. This is the setup running on this site — it’s the longer version of one section from my Astro portfolio walkthrough.

What a content collection actually is#

It’s a named group of content entries — Markdown, MDX, or even JSON/YAML data — that Astro loads, validates, and hands you as typed objects. You stop thinking in terms of “files on disk” and start thinking in terms of “a list of posts, each with a title and a date.” Astro handles the file I/O, the frontmatter parsing, the validation, and (for Markdown/MDX) the rendering.

Defining the collection#

One file: src/content.config.ts. (In Astro 5 it lives at the project’s src/ root — older tutorials say src/content/config.ts; that path still works but the new one is canonical.)

// src/content.config.ts
import { defineCollection, z } from "astro:content";
import { glob } from "astro/loaders";

const journal = defineCollection({
  loader: glob({ pattern: "**/*.{md,mdx}", base: "./src/content/journal" }),
  schema: z.object({
    title: z.string(),
    summary: z.string(),
    date: z.coerce.date(),
    updated: z.coerce.date().optional(),
    tags: z.array(z.string()).default([]),
    draft: z.boolean().default(false),
  }),
});

export const collections = { journal };

Three things doing the work here:

  • glob({ pattern, base }) — the loader. It tells Astro which files belong to this collection and where they live. **/*.{md,mdx} picks up nested folders too. (Astro 5 introduced the Content Layer API; glob is the built-in loader for files on disk. There are others — fetch from a CMS, read a JSON file — but for a journal, glob is it.)
  • schema — the Zod object that every file’s frontmatter is validated against. This is the contract.
  • export const collections — the registry. The key (journal) is the name you’ll pass to getCollection("journal").

The schema is a contract — read it as one#

Each line in that z.object({...}) is an enforced rule:

  • title: z.string() — required, must be a string. Omit it in a post’s frontmatter and the build fails with journal → some-post: title is required. No more shipping a post with no title.
  • date: z.coerce.date() — required, coerced to a Date. The coerce matters: YAML frontmatter gives you a string (2026-05-11) or sometimes a YAML date; z.coerce.date() accepts either and normalizes to a JS Date you can call .toISOString() on. Use plain z.date() and a string frontmatter value fails validation.
  • updated: z.coerce.date().optional() — same coercion, but .optional() means a post without it is fine. (I use this for substantive-revision dates — bump it when you add a section, leave it unset for typo fixes. It feeds dateModified in the article’s JSON-LD.)
  • tags: z.array(z.string()).default([]) — an array of strings; .default([]) means a post with no tags: line gets [], not undefined. Now downstream code can do entry.data.tags.map(...) without a guard.
  • draft: z.boolean().default(false) — defaults to published. Set draft: true and the post still validates — you just filter it out at query time (next section).

.optional() vs .default() is the distinction to internalize: .optional() leaves the field possibly-undefined (you handle the absence); .default(x) guarantees a value (you never handle the absence). Pick .default() whenever there’s a sane fallback — it pushes the “is this set?” branch out of every consumer.

Querying#

getCollection returns the whole collection, optionally filtered. The standard pattern:

---
import { getCollection } from "astro:content";

const posts = (await getCollection("journal", ({ data }) => !data.draft))
  .sort((a, b) => b.data.date.getTime() - a.data.date.getTime());
---

<ul>
  {posts.map((post) => (
    <li>
      <a href={`/journal/${post.id}`}>{post.data.title}</a>
      <time datetime={post.data.date.toISOString()}>
        {post.data.date.toLocaleDateString()}
      </time>
    </li>
  ))}
</ul>
  • The second argument to getCollection is a filter predicate({ data }) => !data.draft drops drafts. Crucially, this works with the dev/prod split: in astro dev you can still navigate directly to a draft’s URL to preview it; in astro build the filter excludes it from the build entirely. Best of both.
  • post.id is the entry’s identifier — derived from the file path relative to the collection’s base (so welcome.mdxwelcome). In Astro 5 it’s .id; older code used .slug. Use .id.
  • post.data is the validated frontmatter, fully typed. Your editor autocompletes post.data.title, knows post.data.date is a Date, knows post.data.tags is string[]. That’s the payoff of the schema — types for free, derived from the same definition that does the validation.

Need one specific entry? getEntry("journal", "welcome").

Rendering an entry#

For Markdown/MDX entries, render() gives you a component plus the heading list (useful for a table of contents):

---
import { getEntry, render } from "astro:content";

const entry = await getEntry("journal", Astro.params.slug);
const { Content, headings } = await render(entry);
---

<article>
  <h1>{entry.data.title}</h1>
  <Content />
</article>

Combine that with a getStaticPaths that maps every entry to a route, and you’ve got the whole /journal/[slug] page in ~15 lines.

Validation that actually pays off#

The first time a bad post fails your build, you’ll get it. Forget the title? Build fails, names the file. Write date: May 11 2026 instead of date: 2026-05-11? z.coerce.date() rejects it — build fails, names the file. Typo darft: true? Zod doesn’t know darft, ignores it, and draft defaults to falsethat one slips through, so the lesson is: the schema catches malformed known fields, not misspelled unknown ones. (If you want to catch stray keys too, add .strict() to the z.object() — it’ll error on any frontmatter key not in the schema.)

The point: broken content stops being a runtime surprise and becomes a build-time error with a filename attached. On a personal site that’s a nice-to-have; on a site where other people write posts, it’s the difference between “the CI caught it” and “it’s live.”

Gotchas worth knowing#

  • Config file location: src/content.config.ts in Astro 5. The old src/content/config.ts still resolves, but new docs and tooling assume the root path.
  • .id not .slug: Astro 5 renamed the entry identifier. If you’re following a 2023 tutorial, mentally swap entry.slugentry.id.
  • base is relative to the project root, not to the config file: base: "./src/content/journal". Get it wrong and getCollection returns an empty array with no error — the most confusing failure mode here. If a collection is mysteriously empty, check the base.
  • z.coerce.date() parses to midnight UTC for a bare 2026-05-11. If you render it with toLocaleDateString() in a timezone behind UTC, it can display as the previous day. Either format in UTC explicitly, or accept the off-by-one, or store a full timestamp if it matters.
  • Reference other collections with reference("authors") in the schema — handy once you have, say, a projects collection that points at an authors collection. Overkill for a solo journal.
  • JSON/YAML data collections use the same defineCollection + Zod, just with glob({ pattern: "**/*.json", ... }) and no render(). Good for structured data you’d otherwise hardcode in a .ts file.

What you end up with#

A content.config.ts that turns src/content/journal/* into a typed array of validated post objects. Querying is one line, rendering is two, drafts handle themselves, and a malformed post is a build error with a filename — not a Invalid Date someone spots in production three weeks later.

Validation isn’t bureaucracy. It’s the difference between “the build caught it” and “a reader did.”

This is one layer of a larger build — layout, OG cards, SEO meta, deploy — walked through end to end in How to Build a Portfolio Website with Astro.

Where to next

More writing in the journal, or jump back to the beginning.