At the time, I had decided to use Drupal 8 to run the homepage vorsfelde-faustball.de. Drupal 8 reached its end of life on November 2, 2021, so I needed to either upgrade to Drupal 9 or build something new.

For multiple reasons, I decided to build something new:

  • Have a project to try out new frameworks and techniques.
  • I have not really benefited from all the features Drupal provides. The homepage I built was mainly a bunch of static pages, news articles, and a self-developed module to fetch results from faustball.com.
  • I spent a considerable amount of time updating Drupal’s core and modules whenever another vulnerability was found. Somewhat shortened, my solution was to:
    1. Locally start a Docker container running the homepage,
    2. exec into the container and update the dependencies (I needed to run the container with 6 GB of RAM so that Composer would not run out of memory 😱),
    3. docker cp the updated dependency files (composer.json and composer.lock) out of the container, and
    4. commit and push the updated dependency files and let a GitLab pipeline build a new image.

    This process took at least an hour, but it worked. And it was a great improvement compared to the old way of uploading files via FTP.
  • Writing a custom module to fetch the current league table and do a bit of processing was quite a hassle for me, as someone without any Drupal development experience. Some changes were necessary, and I wanted to avoid spending too much time on them.

Change from classic CMS to headless CMS + Frontend

In the past years, I have learned a lot about different, more up-to-date ways of building homepages: for example, headless CMSes that provide content via an API, static homepages built by querying those APIs at build time, frontends deployed to an object storage like AWS S3, and pages served via a content delivery network like AWS CloudFront.

Classic CMS approaches have the benefit that modified content is live immediately (maybe after clearing the cache).

CMS modification immediately leads to an updated page

Separating the content part into a static build requires automation. This can be achieved, for example, by pipelines that the headless CMS triggers.

CMS modification triggers a pipeline to build a new frontend version

Like a lot of other developers, I love trying out new technologies. Even though it is overkill for this use case, I’m happy to have a project to apply my theoretical knowledge and get my hands dirty.

Tech stack

For my tech stack of choice I needed to find (1) a framework to build a static frontend, (2) a headless CMS, (3) a place to store and provide the build and (4) something to connect the individual parts.

Frontend

GatsbyJS is used to build a static frontend. It is a React-based framework and I’m very pleased with its fast page navigation.

“Preloading is used to prefetch page resources so that the resources are available by the time the user navigates to the page.” (Source)

GatsbyJS also allows adding content dynamically. In my case, I wanted to display the current league results from an external resource. It is not reasonable to integrate the table results into the build process, since they change independently of my content. To the best of my knowledge, the only way would be to develop an application that monitors changes and calls a webhook on each change to trigger a new build. So it is definitely possible, but out of scope for now.

Another external resource to fetch content from is the REST API of MTV Vorsfelde. They provide articles from local newspapers that I would like to show on our homepage as well. So the idea was to fetch those articles at build-time.

The main source of articles is, of course, ourselves. Articles written by us are stored in a headless CMS and published via an API.

Headless CMS

I chose Strapi for managing articles, players, pictures, and so on. The main reason is that Strapi can be self-hosted. It also integrates nicely with GatsbyJS via the Strapi Source Plugin, which allows querying Strapi’s API.

I wrote a little migration tool in Go that queries Drupal’s REST API to fetch all news articles, including images, and then imports them into Strapi.
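
As a rough sketch of what that migration does (shown here in TypeScript rather than the original Go; the endpoint paths and field names are assumptions, not the actual ones):

// Hypothetical sketch of the Drupal-to-Strapi migration flow.
// The real tool is written in Go; endpoints and field names are assumptions.
async function migrateArticles(drupalBase: string, strapiBase: string, token: string) {
  // Drupal 8 can expose content via its JSON:API module, e.g. /jsonapi/node/article
  const res = await fetch(`${drupalBase}/jsonapi/node/article`);
  const { data } = await res.json();

  for (const node of data) {
    // Strapi (v4) collection types accept POSTs wrapped in a `data` object
    await fetch(`${strapiBase}/api/articles`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify({
        data: {
          title: node.attributes.title,
          body: node.attributes.body?.value,
          publishedAt: node.attributes.created,
        },
      }),
    });
  }
}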

Deployment

Having a frontend consisting of prebuilt files (.html, .css, .js, …) and some assets, I just needed a place to store those files. One option is to run a custom-built Docker image, for example nginx, serving those files. This Docker image would then be “orchestrated” by my Nomad cluster, just like some other applications I’m hosting for myself.

Every now and then I study to become an AWS Certified Developer - Associate, although I’m not sure I will ever take the exam. Nevertheless, I took this as an opportunity to learn hands-on how to serve static sites from AWS S3 via AWS CloudFront.

Connecting the pieces

After choosing tools and services for the main building blocks, a few essential parts are still missing to connect them.

Infrastructure as Code

Although this setup requires only a few resources on AWS, requesting and configuring cloud resources manually feels very wrong. My tool of choice is HashiCorp Terraform.

Version control

Almost not worth mentioning… I’m using my self-hosted Gitea, which is able to trigger pipelines.

Automated building

Gitea does not provide CI/CD features itself but relies on external services. I’m using a self-hosted instance of Drone. A Drone pipeline, of course configured as code, is triggered to build and deploy a new version of the frontend to a DEV or PROD environment.

Implementation

Having the explanation of all requirements, tools, and services out of the way, let’s complete the picture.

Hooking into Gatsby’s build

Gatsby’s build phase is quite interesting. Of course, it does all this NodeJS mumbo jumbo like building and compressing CSS and JS. To be honest, I don’t know exactly what happens there, and I’m not that interested. What is interesting from my perspective are the extension points that can optionally be defined in gatsby-node.ts.

Fetching articles from multiple sources

My goal was to combine articles from two sources: our own articles, managed and provided by Strapi, and articles published on MTV Vorsfelde’s own homepage, which are accessible via WordPress’ REST API.

Having fetched all articles during the build phase, some minor processing steps are necessary: for example, merging and sorting them by date, and creating redirects for the article URLs formerly provided by Drupal.

Steps to build the page

To be able to fetch articles via GraphQL, they must be available in Gatsby’s GraphQL schema. This is done by two Gatsby plugins.

Firstly, the already mentioned Strapi Source Plugin. With some configuration parameters, articles, players, etc. are fetched and made available in Gatsby’s GraphQL schema.
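
Roughly, the relevant part of gatsby-config.ts looks like this (a sketch; the exact options depend on the plugin version, and the collection type names are assumptions):

// gatsby-config.ts (excerpt) - a sketch; collection type names are assumptions
export default {
  plugins: [
    {
      resolve: "gatsby-source-strapi",
      options: {
        apiURL: process.env.API_URL,        // the Strapi instance
        accessToken: process.env.API_TOKEN, // API token created in Strapi
        collectionTypes: ["article", "player"],
      },
    },
  ],
};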

Secondly, a plugin for generic GraphQL APIs. I know about the combination of two comprehensive plugins for providing WordPress data via GraphQL, but I have no influence over MTV Vorsfelde’s site and have to live with what is already provided.

So a workaround was necessary. WordPress’ REST API is activated and available by default, which means I could use a generic Gatsby plugin for REST APIs. But there was a challenge: articles are paginated, which makes sense, and I did not find a way to fetch articles across multiple pages. Or maybe I did not try hard enough. Instead, I wrote an AWS Lambda function in Go that queries the WordPress REST API and handles the pagination. To be able to query these articles, I set up a GraphQL schema using AWS AppSync. When articles are requested via GraphQL, AWS AppSync runs the Lambda function, which in turn sends requests to the WordPress API. Quite some overhead, now that I’m writing about it.
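
Sketched in TypeScript (the real function is written in Go), the pagination handling looks roughly like this; the WordPress endpoint and header are standard, everything else is simplified:

// Sketch of the pagination logic; the real Lambda is written in Go.
// WordPress reports the number of pages in the X-WP-TotalPages header.
async function fetchAllPosts(baseUrl: string): Promise<unknown[]> {
  const posts: unknown[] = [];
  let page = 1;
  let totalPages = 1;

  do {
    const res = await fetch(`${baseUrl}/wp-json/wp/v2/posts?per_page=100&page=${page}`);
    totalPages = Number(res.headers.get("X-WP-TotalPages") ?? "1");
    posts.push(...(await res.json()));
    page++;
  } while (page <= totalPages);

  return posts;
}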

Gatsby fetching data from sources and providing it internally to gatsby-node.ts
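
AppSync is then wired into Gatsby’s schema with the generic gatsby-source-graphql plugin; a sketch, with the endpoint and API key being placeholders (the fieldName matches the faustball field used in the query below):

// gatsby-config.ts (excerpt) - sketch; URL and API key are placeholders
{
  resolve: "gatsby-source-graphql",
  options: {
    typeName: "Faustball",
    fieldName: "faustball", // exposed as `faustball { ... }` in Gatsby queries
    url: process.env.APPSYNC_URL, // the AppSync GraphQL endpoint
    headers: { "x-api-key": process.env.APPSYNC_API_KEY },
  },
},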

But now we have all articles within Gatsby’s GraphQL schema, and we can start creating pages from them.

Creating pages dynamically at build time

We are provided with APIs to build a Gatsby site:

“Code in the file gatsby-node.ts is run once in the process of building your site. You can use its APIs to create pages dynamically, add data into GraphQL, or respond to events during the build lifecycle.” (Source)

I make use of the createPages extension point, which provides two functions via its Actions: createPage and createRedirect.

To get a list of articles I query Gatsby’s GraphQL schema:

const result = await graphql(`
  {
    allStrapiArticle {
      edges {
        node {
          id
          slug
          # more fields
        }
      }
    }
    faustball {
      mtvArticles {
        id
        date
        # more fields
      }
    }
  }
`)

With all articles in result, I can perform some processing steps. Afterwards, createPage is called for each processed article. If it is a Drupal-imported article, createRedirect is called as well.
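
A simplified sketch of that loop in gatsby-node.ts (the URL scheme, the template path, and the drupalPath field are assumptions for illustration):

// gatsby-node.ts (excerpt) - simplified sketch; paths and field names are assumptions
import path from "path";
import type { GatsbyNode } from "gatsby";

export const createPages: GatsbyNode["createPages"] = async ({ graphql, actions }) => {
  const { createPage, createRedirect } = actions;
  const result: any = await graphql(`{ ... }`); // the query shown above

  // merge both sources and sort them by date, newest first
  const strapiArticles = result.data.allStrapiArticle.edges.map((e: any) => e.node);
  const mtvArticles = result.data.faustball.mtvArticles;
  const articles = [...strapiArticles, ...mtvArticles].sort(
    (a: any, b: any) => new Date(b.date).getTime() - new Date(a.date).getTime()
  );

  for (const article of articles) {
    createPage({
      path: `/artikel/${article.slug}`, // assumed URL scheme
      component: path.resolve("./src/templates/article.tsx"), // assumed template
      context: { id: article.id },
    });

    // articles imported from Drupal keep their old URLs via a redirect
    if (article.drupalPath) {
      createRedirect({
        fromPath: article.drupalPath, // assumed field holding the old Drupal URL
        toPath: `/artikel/${article.slug}`,
        isPermanent: true,
      });
    }
  }
};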

Now that Gatsby is able to import articles and build a static frontend, the next step was to automate this process in a pipeline.

Pipeline details

Assume a new article is written in Strapi. Strapi allows calling webhooks for certain events, for example when an article is published. So I created a webhook that calls Drone, and Drone then runs a pipeline.

Strapi calling Drone pipeline

The pipeline running in Drone is where almost everything happens. Of course, everything starts with fetching the source code. Not much to add here; it is just the newest version on the main branch.

Since GatsbyJS is a NodeJS project, a ton of dependencies need to be installed. Maybe I would prefer Fresh today, who knows.

The build step is based on Gatsby’s CLI command gatsby build, which is configured in package.json. But before building, a .env.production file is created and filled with some environment variables that are required at build time. For example, API_URL and API_TOKEN define the connection to Strapi’s API to fetch articles, images, and so on. Additional variables for the later deployment to S3 are SITE_ADDRESS and BUCKET_NAME. Gatsby’s build step includes all the steps described above: querying sources to fetch articles, processing them, and building a static site.
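
For reference, the common pattern for loading such a file at the top of gatsby-config.ts looks like this (a sketch of the usual dotenv approach):

// gatsby-config.ts (top) - common pattern for loading .env.production
import dotenv from "dotenv";

// NODE_ENV is "production" during `gatsby build`, so this loads .env.production
dotenv.config({ path: `.env.${process.env.NODE_ENV}` });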

On success, the build needs to be pushed to AWS S3. The easiest way for me was to use another Gatsby plugin, gatsby-plugin-s3. With a bit of configuration, running gatsby-plugin-s3 deploy --yes is the only command required to upload a new build.
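
That bit of configuration is minimal; a sketch (bucketName is the required option, the other two are assumptions for running behind CloudFront):

// gatsby-config.ts (excerpt) - gatsby-plugin-s3; only bucketName is required
{
  resolve: "gatsby-plugin-s3",
  options: {
    bucketName: process.env.BUCKET_NAME, // see the pipeline environment variables
    protocol: "https", // assumption: rewrite redirect URLs for the CloudFront setup
    hostname: process.env.SITE_ADDRESS, // assumption: the public domain
  },
},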

Although all files on S3 are updated, the page is not updated yet: a CloudFront distribution caches the files, and I need to explicitly tell CloudFront to refresh them. I chose the easiest way, which is invalidating the entire distribution. There are definitely better ways, but I usually update the page only once or twice a month, so invalidating everything is okay for me.
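
One way to script such an invalidation (a sketch using the AWS SDK for JavaScript v3; the distribution ID environment variable is an assumption):

// Sketch: invalidate the whole CloudFront distribution after a deployment
import { CloudFrontClient, CreateInvalidationCommand } from "@aws-sdk/client-cloudfront";

const client = new CloudFrontClient({});

await client.send(
  new CreateInvalidationCommand({
    DistributionId: process.env.DISTRIBUTION_ID, // assumption: provided via env
    InvalidationBatch: {
      CallerReference: Date.now().toString(), // must be unique per request
      Paths: { Quantity: 1, Items: ["/*"] }, // "/*" invalidates everything
    },
  })
);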

Fetching live league results

Almost everything the page now shows is static content, except for the current league table. To fetch league table data, I need to send a request to faustball.com. The new version of vorsfelde-faustball.de runs in the user’s browser (as a single-page application), and requests are sent using the browser’s Fetch API. Hence, CORS (Cross-Origin Resource Sharing) adds additional complexity.

In short, the problem is that requests from the user’s browser to faustball.com will be blocked for security reasons, because faustball.com does not send the CORS headers that would allow cross-origin requests from my domain.

Gatsby build is not allowed to send a request to faustball.com

To solve this problem, I wrote an AWS Lambda function in Go that sends the request to faustball.com. As an advantage, I can also preprocess the data there: the way faustball.com provides its data is not ideal, I think, but at least the table data comes as JSON, so that’s an acceptable start.

The Lambda function essentially proxies requests to faustball.com. I added an additional query for league table data to the AWS AppSync GraphQL schema; this query calls the Lambda function as its resolver.
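
The handler itself is simple; sketched in TypeScript (the real function is written in Go, and the faustball.com endpoint and field names below are placeholders):

// Sketch of the proxying resolver; the real function is written in Go.
// The faustball.com endpoint and response fields are placeholders.
export async function handler(event: { arguments: { leagueId: string } }) {
  // server-to-server request, so no CORS restrictions apply
  const res = await fetch(`https://faustball.com/api/leagues/${event.arguments.leagueId}/table`);
  const raw: any = await res.json();

  // preprocess the data into the shape defined in the AppSync schema
  return raw.rows.map((row: any) => ({ team: row.team, points: row.points }));
}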

Using @apollo/client, I was now able to query the league table data.
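
A sketch of such a query in a React component (query and field names are assumptions matching the sketches above):

// Sketch: querying the league table via @apollo/client in a React component.
import * as React from "react";
import { gql, useQuery } from "@apollo/client";

const LEAGUE_TABLE = gql`
  query LeagueTable($leagueId: ID!) {
    leagueTable(leagueId: $leagueId) {
      team
      points
    }
  }
`;

export function LeagueTable({ leagueId }: { leagueId: string }) {
  const { data, loading, error } = useQuery(LEAGUE_TABLE, { variables: { leagueId } });

  if (loading) return <p>Loading table…</p>;
  if (error) return <p>Table currently unavailable.</p>;

  return (
    <ol>
      {data.leagueTable.map((row: { team: string; points: number }) => (
        <li key={row.team}>
          {row.team}: {row.points}
        </li>
      ))}
    </ol>
  );
}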

Gatsby sends requests to AWS AppSync, gets forwarded to Lambda, new request to faustball.com

Summary

Some months have already passed between finishing the new page and writing this article. Recapping what I did back then, I’m still pretty happy. Sure, there are many spots to improve, but most of these techniques were new to me and I wanted to get to know them. And I did. And I like what I did.

Especially when looking at the initial reasons for changing the tech stack.

Instead of having a possibly outdated Drupal instance running on my server, I’m now shipping a bunch of static files. Updating NodeJS dependencies is still always necessary and does not differ that much from Composer, except that in my setup I no longer have to spin up a Docker container locally to update the dependency files.

With Drupal I had the ability to do so much that I never would have done: forums, user management, detailed permission management, creating and customizing content types, … I like those features, but I don’t need them. Now I have something more suited to my needs.