Gatsby vs. Next.js

Over the past few months, I've moved as much code as possible away from Gatsby. While I see why people are attracted to it and its growing ecosystem, I am no longer sipping the Kool-Aid. Next.js, while not perfect either, provides a better abstraction layer on top of Webpack that is more than adequate for the vast majority of projects.

In this post, I am going to share my personal opinions about the two projects as they relate to making static websites.

What is Gatsby good at?

Before I discuss its shortcomings, I want to share where I think Gatsby excels by telling you about a recent experience.

Just the other day, I was building a Gatsby website for a friend who is opening a night club called Noir in New York City. As part of the requirements, my friend wanted an image gallery with cool photos of the venue. However, since the venue is still under construction, they didn't have any usable images for the gallery just yet.

Noir New York Homepage

In the interim, they asked if I could just pull images from their Instagram account (@noirnewyorkcity) until they officially open. Since I was using Gatsby, I did a quick Google search for "Gatsby Instagram," and lo and behold, I immediately came across gatsby-source-instagram. This Gatsby source plugin pulls the last 12 Instagram posts from a profile without an API key and puts them onto your site's GraphQL schema. After installing the plugin, I popped open Gatsby's GraphiQL IDE. I copied and pasted the query from the plugin's docs, and boom. It worked.

Gatsby Source Instagram GraphiQL

It took all of 15 minutes to integrate and get the posts onto the page. What's remarkable is that I did all of this without ever once visiting the Instagram API docs. Pretty sweet.

This kind of interaction exemplifies Gatsby's unique value proposition: a single, universal GraphQL API for any and all data sources. By leveraging GraphQL and its growing ecosystem of source plugins, Gatsby, in theory, can provide developers with a unified abstraction layer (and query language) to access any data (whether it be remote JSON, local files, or cat videos).

What the what?

The above anecdote is both the best and worst part of Gatsby. Yeah, it's cute to use GraphQL, but it's actually very, very annoying most of the time. If you think about it, I don't give a crap how that Instagram data I want gets to my React component, just that it shows up as props (and preferably with TypeScript types). Remember folks, since we are in static land, we don't really care too much about overfetching; I would be just as happy for you to give me all the Instagram data as props since there is no runtime overhead (just build time).

I feel like I'm taking crazy pills, but imagine how sweet it would have been to not write any GraphQL. What if it just automagically provided all the data to all of my pages? That would have saved oodles of time.

GraphQL is overkill for most static sites

Another notable experience with Gatsby for me was building my podcast's website: https://undefined.fm (source). Unlike your average developer markdown blog (like this one), undefined.fm's content is solely derived from our podcast's RSS feed: https://feeds.simplecast.com/8lcA0Is7. This is great because it allows our podcast hosting solution/CMS (Simplecast) to be the single source of truth[^1].

While this sounds awesome, getting this setup to work was a pain. Although there are a few Gatsby source plugins for RSS feeds, none of them were fit for podcasts[^2]. I forked the one that seemed to be working the best and copied all of it into my project's source code under plugins/gatsby-source-simplecast-rss. I then added it to my gatsby-config.js. Next, I needed to figure out how in the world to add this data to Gatsby's GraphQL API.

Fun times. It looks like this.

// gatsby-node.js
const { load, select, createChildren } = require('./internals');

exports.sourceNodes = async ({ actions }, options = {}) => {
  const { createNode } = actions;
  const { feed } = options;
  try {
    // Create nodes here, generally by downloading data
    // from a remote API.
    const { rss } = await load(feed);
    const podcast = rss.channel[0].item;
    createChildren(podcast, null, createNode);
  } catch (e) {}

  // We're done, return.
  return;
};

And now for the crazy:

// internals.js
const crypto = require('crypto');
const rp = require('request-promise');
const { parseString } = require('xml2js');
const lget = require('lodash.get');

// Utils copied from initial plugin by Uptime Ventures
const transform = (i) =>
  new Promise((resolve, reject) =>
    parseString(i, (e, p) => (e ? reject(e) : resolve(p)))
  );

const load = (uri) => rp({ uri, transform });

const select = (i, key) => {
  const value = lget(i, key);
  if (Array.isArray(value)) {
    return value[0];
  }

  return value;
};

const digest = (i) =>
  crypto.createHash('md5').update(JSON.stringify(i)).digest('hex');

/**
 * Slugify a string
 * @param s Any string
 */
function toSlug(s) {
  if (!s) {
    return '';
  }
  s = s.toLowerCase().trim();
  s = s.replace(/ & /g, ' and ');
  s = s.replace(/[ ]+/g, '-');
  s = s.replace(/[-]+/g, '-');
  s = s.replace(/[^a-z0-9-]+/g, '');
  return s;
}

const createChildren = (nodes, parent, createNode) => {
  const children = [];
  nodes.forEach((n) => {
    const link = toSlug(select(n, 'title'));
    children.push(link);
    const node = {
      id: toSlug(select(n, 'title')),
      title: select(n, 'title'),
      description: select(n, 'itunes:summary'),
      html: select(n, 'content:encoded'),
      // Fix the date
      date: new Date(select(n, 'pubDate')).toISOString(),
      // Extract out the embed URL
      artwork: n['itunes:image'][0]['$']['href'],
      embed: n.enclosure[0]['$']['url']
        .replace('.mp3', '')
        // hack @todo
        .replace('/audio/17ba21/17ba21db-66b5-4612-855e-556b20f60155', '')
        .replace('https://cdn', 'https://player')
        .split('/')
        .slice(0, 4)
        .join('/'),
      audioUrl: n.enclosure[0]['$']['url'],
      duration: select(n, 'itunes:duration'),
      keywords: select(n, 'itunes:keywords'),
      episodeNumber: select(n, 'itunes:episode'),
      link,
      parent,
      children: [],
    };

    // This is how we specify that each entry
    // in the RSS feed should become an Episode node in Gatsby's GraphQL layer
    node.internal = {
      type: 'Episode',
      contentDigest: digest(node),
    };

    createNode(node);
  });

  return children;
};
module.exports = {
  select,
  load,
  createChildren,
};

I don't need to walk through it line by line, but what this plugin is doing is as follows:

  • Fetch the RSS feed (which is in XML) specified in plugin options in gatsby-config.js
  • Convert the XML into JSON
  • Pluck the episodes array off of the JSON object
  • Walk through each entry, tweak some of it with string magic (to generate a slug and get the correct URL for the iframe embed)
  • Add each entry to Gatsby's GraphQL layer as an Episode node by specifying a type and hashing it (my guess is that Gatsby uses this hash to optimize changes? :shrug:)
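
On that last bullet: Gatsby's docs describe contentDigest as a way to avoid redoing work for nodes whose content has not changed. A minimal sketch of the idea, reusing the digest helper from internals.js with made-up episode objects (the comparison itself is only an illustration, not Gatsby's internal code):

```javascript
const crypto = require('crypto');

// Same helper as in internals.js: hash the JSON form of a node.
const digest = (value) =>
  crypto.createHash('md5').update(JSON.stringify(value)).digest('hex');

// Made-up nodes for illustration.
const episode = { id: 'hello-world', title: 'Hello World' };
const unchanged = { id: 'hello-world', title: 'Hello World' };
const edited = { id: 'hello-world', title: 'Hello World (remastered)' };

// Identical content produces an identical digest, so nothing needs redoing.
const sameContent = digest(episode) === digest(unchanged);

// Any change to the content changes the digest.
const changedContent = digest(episode) !== digest(edited);

console.log({ sameContent, changedContent });
```

Note that JSON.stringify is sensitive to key order, so two objects with the same fields in a different order would hash differently.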

This all looks great and works now, but developing this plugin was far from ideal. Each time I changed the plugin, I needed to rerun rm -rf .cache && gatsby develop. If there were any errors, the site would explode (more on that later).

I can't imagine what this must feel like for beginners. If all of this seems like a lot and not fully explained, that's because I don't get it either. It seems like a bunch of hoops to jump through. In the end, however, I did get it to work.

Or so I thought. What I just showed you was just how to get the data into Gatsby's GraphQL API. To actually generate a page for each Episode, I had to figure out this monster function inside of gatsby-node.js. I did this by copying the GraphQL query from the gatsby-source-filesystem docs and then tweaking it for my allEpisode query. It took a bunch of trial and error to get the slugs working.

const path = require('path');

// Apparently our plugin didn't actually create the node like we thought,
// this is required.
exports.onCreateNode = ({ node, getNode, actions }) => {
  const { createNodeField } = actions;
  if (node.internal.type === `Episode`) {
    createNodeField({
      node,
      name: `slug`,
      value: `/radio/${node.id}`,
    });
  }
};

// Query the GraphQL API and make a page for every episode using the
// episode template. This works a lot like the `gatsby-source-filesystem` plugin
exports.createPages = ({ graphql, actions }) => {
  return new Promise((resolve, reject) => {
    const episodeTemplate = path.resolve('./src/templates/episode.tsx');
    const episodeQuery = /* GraphQL */ `
      {
        allEpisode(sort: { fields: [date], order: DESC }, limit: 1000) {
          edges {
            node {
              id
              title
              description
              fields {
                slug
              }
            }
          }
        }
      }
    `;

    resolve(
      graphql(episodeQuery).then((result) => {
        if (result.errors) {
          reject(result.errors);
        }

        result.data.allEpisode.edges.forEach((edge) => {
          actions.createPage({
            path: edge.node.fields.slug,
            component: episodeTemplate,
            context: {
              slug: edge.node.fields.slug,
            },
          });
        });
      })
    );
  });
};

/**
 * Slugify a string
 * @param s Any string
 */
function toSlug(s) {
  if (!s) {
    return '';
  }
  s = s.toLowerCase().trim();
  s = s.replace(/ & /g, ' and ');
  s = s.replace(/[ ]+/g, '-');
  s = s.replace(/[-]+/g, '-');
  s = s.replace(/[^a-z0-9-]+/g, '');
  return s;
}

// ...

Wait! There's more! Two more GraphQL queries are needed to actually request the episode data. One for the homepage and one for the episode template!

// ./src/pages/index.tsx
// Get the list of all the episodes for the homepage
export const query = graphql`
  {
    allEpisode(sort: { fields: [date], order: DESC }, limit: 100) {
      edges {
        node {
          id
          title
          description
          episodeNumber
          duration
          date
          fields {
            slug
          }
        }
      }
    }
  }
`;

// ./src/templates/episode.tsx
// Get a single episode
export const pageQuery = graphql`
  query ($slug: String!) {
    episode(fields: { slug: { eq: $slug } }) {
      id
      title
      description
      date
      html
      embed
      duration
      artwork
      audioUrl
      fields {
        slug
      }
    }
  }
`;

If you're confused, you are not alone. If you think this is a lot of code and indirection, then you and I are in the same boat.

Debugging is a nightmare

If this was just another Webpack plugin or normal fs sorcery, debugging errors wouldn't be that big of a deal. Errors would be traceable/debuggable to their source through normal stack traces. However, because of all of the abstraction required for GraphQL and Webpack inside of Gatsby's internals, debugging is a nightmare even for me (and I know a rather unfortunate amount about Webpack and GraphQL).

For example, let's say that one of my select(n, 'itunes:xxx') calls throws an error for one specific episode, but the rest work fine. This is what the console would look like when you run gatsby develop:

~/workspace/github/jaredpalmer/theundefined master*
❯ yarn start -p 5000
yarn run v1.19.1
$ rm -rf .cache && gatsby develop -p 5000
success open and validate gatsby-configs — 0.016 s
success load plugins — 0.236 s
success onPreInit — 1.147 s
success delete html and css files from previous builds — 0.095 s
success initialize cache — 0.005 s
success copy gatsby files — 0.029 s
success onPreBootstrap — 0.009 s
warning The gatsby-transformer-sharp plugin has generated no Gatsby nodes. Do you need it?
warning The gatsby-source-simplecast-rss plugin has generated no Gatsby nodes. Do you need it?
success source and transform nodes — 0.209 s
success building schema — 0.124 s
error gatsby-node.js returned an error


  TypeError: Cannot read property 'allEpisode' of undefined

  - gatsby-node.js:81 graphql.then.result
    /Users/jared/workspace/github/jaredpalmer/theundefined/gatsby-node.js:81:21

  - util.js:16 tryCatcher
    [theundefined]/[bluebird]/js/release/util.js:16:23

  - promise.js:512 Promise._settlePromiseFromHandler
    [theundefined]/[bluebird]/js/release/promise.js:512:31

  - promise.js:569 Promise._settlePromise
    [theundefined]/[bluebird]/js/release/promise.js:569:18

  - promise.js:606 Promise._settlePromiseCtx
    [theundefined]/[bluebird]/js/release/promise.js:606:10

  - async.js:142 _drainQueueStep
    [theundefined]/[bluebird]/js/release/async.js:142:12

  - async.js:131 _drainQueue
    [theundefined]/[bluebird]/js/release/async.js:131:9

  - async.js:147 Async._drainQueues
    [theundefined]/[bluebird]/js/release/async.js:147:5

  - async.js:17 Immediate.Async.drainQueues [as _onImmediate]
    [theundefined]/[bluebird]/js/release/async.js:17:14


success createPages — 0.035 s
success createPagesStatefully — 0.042 s
success onPreExtractQueries — 0.001 s
success update schema — 0.080 s
error GraphQL Error There was an error while compiling your site's GraphQL queries.
  Error: RelayParser: Encountered 2 error(s):
- Unknown field 'episode' on type 'Query'. Source: document `usersJaredWorkspaceGithubJaredpalmerTheundefinedSrcTemplatesEpisodeTsx2642664173` file: `GraphQL request`

  GraphQL request (3:5)
  2:   query($slug: String!) {
  3:     episode(fields: { slug: { eq: $slug } }) {
         ^
  4:       id

- Unknown field 'allEpisode' on type 'Query'. Source: document `usersJaredWorkspaceGithubJaredpalmerTheundefinedSrcPagesIndexTsx3249400678` file: `GraphQL request`

  GraphQL request (3:5)
  2:   {
  3:     allEpisode(sort: { fields: [date], order: DESC }, limit: 100) {
         ^
  4:       edges {


success extract queries from components — 0.224 s
success run graphql queries — 0.014 s — 5/5 483.25 queries/second
success write out page data — 0.007 s
success write out redirect data — 0.001 s
success onPostBootstrap — 0.001 s

info bootstrap finished - 5.33 s

Starting type checking and linting service...
Using 1 worker with 2048MB memory limit
Starting type checking and linting service...
Using 1 worker with 2048MB memory limit
Watching: /Users/jared/workspace/github/jaredpalmer/theundefined/src
Browserslist: caniuse-lite is outdated. Please run next command `yarn upgrade caniuse-lite browserslist`
 DONE  Compiled successfully in 4118ms                                          11:45:37 AM

Type checking and linting in progress...

You can now view undefined.fm in the browser.

  http://localhost:5000/

View GraphiQL, an in-browser IDE, to explore your site's data and schema

  http://localhost:5000/___graphql

Note that the development build is not optimized.
To create a production build, use npm run build

ℹ 「wdm」:
ℹ 「wdm」: Compiled successfully.
No type errors found
No lint errors found
Version: typescript 2.9.2, tslint 5.12.1
Time: 6294ms

In the above 100 lines of logs, it states that something is wrong with my episodes query. What's not immediately clear is what and where the mistake is.

  TypeError: Cannot read property 'allEpisode' of undefined

  - gatsby-node.js:81 graphql.then.result
    /Users/jared/workspace/github/jaredpalmer/theundefined/gatsby-node.js:81:21

This error says that there is something wrong in gatsby-node.js on line 81 (where I map through the episodes returned from GraphQL).

hmmmm....let's enter the mind of Jared...dooodoodleooop....:

Is there an error in my source plugin? Maybe? Is it on all of the episodes? Possibly? Is it on a single field of an episode? Or maybe on a specific field of a specific episode? Which one though? Seems fine? Maybe I made a typo in my gatsby-node.js GraphQL query? Based on the rest of the terminal output, it seems like some of my other GraphQL queries (located somewhere in my source code) are failing too? Where are those? The ones that are failing? I forgot? Shoot. My plugin seemed to work just a second ago? Now none of my GraphQL queries seem to work? F@(&#)@&%$)Q@()!@#)&#$)!@#)(%$@#)%&K!

Turns out that on one sneaky line that you probably missed, it says this lil' gem:

warning The gatsby-source-simplecast-rss plugin has generated no Gatsby nodes. Do you need it?

Hmmmmmm. So now if you're me, you start commenting out / uncommenting bits of createChildren until you figure out what is throwing via trial and error. Fun times.
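
One way to shortcut that trial and error is to transform each feed item inside its own try/catch and record which one failed. A minimal sketch: collectNodes and toNode are hypothetical names, not part of the plugin above, and the two feed items are made up to mimic the xml2js shape used earlier:

```javascript
// Hypothetical helper: run `toNode` on every feed item and keep track of
// the items that throw, instead of letting one bad entry hide all of them.
function collectNodes(items, toNode) {
  const nodes = [];
  const failures = [];
  items.forEach((item, index) => {
    try {
      nodes.push(toNode(item));
    } catch (error) {
      failures.push({ index, message: error.message });
    }
  });
  return { nodes, failures };
}

// Made-up example: the second item has no <enclosure>, so reading its URL throws.
const items = [
  { title: ['Episode 1'], enclosure: [{ $: { url: 'https://example.com/1.mp3' } }] },
  { title: ['Trailer'] },
];
const toNode = (n) => ({ title: n.title[0], audioUrl: n.enclosure[0].$.url });

const { nodes, failures } = collectNodes(items, toNode);
failures.forEach((f) =>
  console.warn(`Feed item ${f.index} failed: ${f.message}`)
);
```

The good nodes could still be passed to createNode, and the failures tell you exactly which entry to look at.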

But wait! There's more!

Gatsby might be too smart for its own good

My Dad (@shellypalmer) launched his own tech-focused podcast with Ross Martin a few weeks back called Think About This. It's awesome and you should totally listen to it. Being a good son, I forked undefined.fm's source code, changed the graphics and copy, and tossed up thinkaboutthis.fm. While it uses Megaphone.fm (instead of Simplecast), the two sites share 99.99% of the same code. Everything with the site was working swimmingly for the trailer/preview episode. I even automated redeployment of the site with a GitHub Action cron job every Wednesday morning.

On the morning that the first full episode premiered, though, my Dad pings me on Slack,

thinkaboutthis.fm did not pick up the RSS feed for the premiere episode

This didn't make any sense at all! My setup had been working perfectly for undefined.fm for months. Why didn't the new episode show up? The site seemed to deploy just fine. No errors in logs. Dafuq?

Well, it turns out that Gatsby/GraphQL is really too smart for its own good sometimes. Gatsby infers the GraphQL schema based on the source data, which is usually a good thing. This works well when everything is always defined, but not so much when things are nullable. It turns out that the first entry in my Dad's podcast's RSS feed was a trailer, so it didn't have a value for <itunes:episode>. This didn't matter until the first real episode came out (second entry in the feed), which did have a value for <itunes:episode> of 1. This caused Gatsby to explode because it could not properly infer the value of episodeNumber. Who knew? (Not me.) After rooting through the docs, I learned that you can override this behavior using the @dontInfer GraphQL annotation (read more).

While this may seem like an easy fix and a tiny bug, because of the challenging debugging experience, it actually took me several hours to figure out. In the interim, I just removed the episode number from the site altogether.

Final Thoughts on GraphQL for Static sites

At the end of the day, yes, the filesystem is a graph. GraphQL may be the technically "correct" universal data API. However, the above workflow is over-engineered and wayyyy overly complicated for fetching an RSS feed and generating some HTML. It just doesn't need to be this hard, kids.

Plugins

Another controversial aspect of Gatsby is its plugin system. There are a metric shit-ton of plugins on npm for all kinds of wild and wacky stuff. The problem I have with Gatsby is that it doesn't come with the battery pack included. The gatsby package itself just orchestrates the plugins and GraphQL stuff. All other functionality is offloaded to plugin-land. I hate this passionately. While this seems like a great maintenance strategy, it isn't that much fun as a consumer. The result is that you need to install around 7-8 plugins every time you start a new site. Or, you can be cool and make your own gatsby-theme (which is a group of plugins).

Again, my issue is that having a crazy-slim core means that every single Gatsby site ends up being a snowflake. I have around 4 or 5 Gatsby sites, and they're all kind of similar, but not similar enough that I can blindly copy and paste code between them. I just feel like they should be sharing a lot more code. While this issue is somewhat solved by Gatsby themes, it still feels like Gatsby core is mostly useless out of the box.

Furthermore, because so much stuff is offloaded to plugin-land, switching between Gatsby codebases has more overhead than if the core did a bit more for you. For example, almost every single gatsby-node.js file inevitably becomes a snowflake. Let's just start with the most downloaded plugin: gatsby-source-filesystem. This plugin requires ~20-50 lines of code in gatsby-node.js. Because that code is duplicated in every project, it all diverges.

"But Jared, you schmuck, how is this different than razzle.config.js or next.config.js?" Well, the GraphQL part and the Gatsby abstractions. Both Next.js and Razzle just give you direct access to the Webpack config. If you know Webpack, then you know Next.js and Razzle. Period. With Gatsby, you have lifecycle methods and frameworky functions that you must use to augment functionality. GraphQL is, again, the source of complexity for Gatsby and why gatsby-node.js files are so much more complex than your average next.config.js file. Put differently, gatsby-node.js is like OG WordPress functions.php on steroids and I'm not a fan.

Next.js is lit for static sites

Next.js isn't perfect for everything. However, it is really really good for static sites. First, Next.js's filesystem routing is phenomenal for static sites: ./pages/about.tsx => /about. Instead of being forced to query a central GraphQL API, you can just write a function called getStaticProps which gets executed at build time. Anything that is returned is injected as props into your page component. Even better, you can write any Node.js code in this function as well and it will be removed from the client, so you can use the filesystem.

Doing epic stuff with getStaticProps

TBH I didn't really see the light around Next.js until I read the source code of the Expo docs. Buried in it is some rad code for generating the sidebar and statically analyzing the filesystem. It all works because of babel-plugin-preval. This nifty plugin by Kent C. Dodds pre-evaluates JavaScript code at build time. This in turn can be utilized to pre-evaluate the contents of the filesystem using good ol' fs. However, now with Next.js 9.3 and later, you don't even need preval anymore. You can just export a function from a page called getStaticProps and it just works.

For example, the new Formik docs are going to have a blog. All the articles are written in MDX. Every article gets a .mdx file inside of the ./pages/blog/ directory and has the same front matter: title, description, date, etc. To generate the data for the blog index, I do the simplest possible thing: I read .mdx files in the ./pages/blog/ directory, parse their front matter with the front-matter package, toss them onto an array, and then sort them by date. Since I am using getStaticProps, all of this happens at build time, so the result is still a static page.

// ./pages/blog/index.js
import React from 'react';
import path from 'path';
import fm from 'front-matter';
import fs from 'fs-extra';
import compareDesc from 'date-fns/compareDesc';

export default function BlogList({ posts }) {
  return (
    <>
      {posts.map(({ title }) => (
        <div key={title}>{title}</div>
      ))}
    </>
  );
}

export function getStaticProps() {
  const blogDir = './pages/blog';
  const posts = [];
  const items = fs.readdirSync(blogDir);
  for (let i = 0; i < items.length; i++) {
    const filePath = path.join(blogDir, items[i]);
    const { ext, name } = path.parse(filePath);
    // Only process markdown/mdx files that are not the index page
    if (ext.startsWith('.md') && name !== 'index') {
      try {
        const { attributes } = fm(fs.readFileSync(filePath, 'utf8'));
        posts.push({
          ...attributes,
          // Props must be JSON-serializable, so store the date as an ISO string
          date: new Date(attributes.date).toISOString(),
          href: filePath
            .replace(/^pages\/blog/, '/blog')
            .replace(/\.mdx?$/, ''),
        });
      } catch (e) {
        console.log(`Error reading frontmatter of ${filePath}`, e);
      }
    }
  }
  // Newest posts first
  posts.sort((a, b) => compareDesc(new Date(a.date), new Date(b.date)));
  return { props: { posts } };
}

So that's it. We mimicked 100% of gatsby-source-filesystem without any GraphQL magic whatsoever. We just used fs and front-matter packages.

If we wanted to add stuff like tags or categories, again, all we would need to do is write more getStaticProps and getStaticPaths. Absolutely zero GraphQL would be necessary.
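
A sketch of the tag case, under the assumption that each post's front matter has a tags array (the real posts may not). The helpers are plain functions with hypothetical names; the comments at the bottom show where they would plug into getStaticPaths and getStaticProps in a hypothetical ./pages/blog/tags/[tag].js page:

```javascript
// Hypothetical helpers for a ./pages/blog/tags/[tag].js page.

// Collect every distinct tag across all posts, sorted alphabetically.
function getAllTags(posts) {
  const tags = new Set();
  posts.forEach((post) => (post.tags || []).forEach((tag) => tags.add(tag)));
  return [...tags].sort();
}

// Shape the tags the way getStaticPaths is expected to return them.
function getTagPaths(posts) {
  return {
    paths: getAllTags(posts).map((tag) => ({ params: { tag } })),
    fallback: false,
  };
}

// Pick the posts for a single tag; this would back getStaticProps.
function getPostsByTag(posts, tag) {
  return posts.filter((post) => (post.tags || []).includes(tag));
}

// In the page file (with `posts` loaded via fs, as in the blog index):
//   export const getStaticPaths = () => getTagPaths(posts);
//   export const getStaticProps = ({ params }) => ({
//     props: { tag: params.tag, posts: getPostsByTag(posts, params.tag) },
//   });
```

Everything here is ordinary array code that runs at build time, so it can be unit tested without spinning up the site.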

I'm done with Gatsby

So yeah, I'm done with Gatsby. I didn't get into Gatsby build performance, or its cache strategy, or incremental builds, because none of that matters to me. At the end of the day, I don't see any value for the added complexity or indirection of Gatsby's usage of GraphQL. I'm pretty happy with Next.js for static sites. Obviously, everything is a tradeoff. If you like Gatsby, and it works for you and your team, that's awesome. I'm happy for you. To me, Next.js feels like the right abstraction. If I do need GraphQL, I can use it if I want to, but it's not forced down my throat.

[^1]: I have a GitHub Action that redeploys the site every night by pinging a Vercel deploy hook. When I post a new episode, once every week or so, I usually just redeploy it manually via Vercel's dashboard.

[^2]: Podcasts have a very specific RSS format that Apple invented years ago. Every podcast app (Pocket Casts, Overcast, etc.) is actually just a fancy RSS reader in disguise.


👋 Hey! I'm Jared Palmer. I'm currently the VP of AI at Vercel. I joined Vercel after they acquired my build system startup Turborepo in late 2021.

Jared Palmer © 2024