Windmill on ECS

January 6, 2024

With the recent news about Airplane being acquired by Airtable and shutting down March 1, 2024, I imagine some platform teams are scrambling to find a replacement. I have spent the better part of a week researching alternatives and the emerging favorite is Windmill.

Windmill is fully open source, can be self-hosted (even with an Enterprise license), and has a lot of interesting functionality that places it roughly on par with Airplane. As part of a team that relied heavily on Airplane and perhaps did a bit too much with it, I need its replacement to offer a solid code-based approach to internal tooling and automation, and Windmill seems to fit the bill (with some small caveats which another post will undoubtedly discuss).

While an entire post could easily be dedicated to Windmill’s functionality, I wanted to document how to get Windmill up and running in ECS rather than using the provided docker-compose file (since most teams will want to run this in production). This post will initially be fairly brief and light on explicit detail; however, I plan to update it regularly as our Windmill self-hosted journey evolves.

Architecture/Manual Deployment #

The Windmill deployment architecture is fairly straightforward. The desired outcome is Windmill running on ECS while being accessible via HTTPS with a valid SSL certificate. Windmill can be deployed with a postgres container; however, this is not well suited to production workloads. To address this, we’ll also want an Aurora RDS cluster to persist our data.

This post will generally outline the necessary configuration and deployment of the various AWS resources required for a “production-ready” Windmill deployment. Future updates will add graphics and IaC but for now all of the content will be text-based.

All of these resources can be created in a region of your choosing.

VPC #

The ECS cluster will need to be placed into a VPC. Either an existing VPC or new VPC can be used – this example will assume that a new VPC is being created.

The VPC can use any CIDR block of your choosing; the default (10.0.0.0/16) works as long as it does not overlap with an existing VPC.

The VPC will also need:

  • public and private subnets
  • an Internet Gateway attached to the VPC
  • a NAT Gateway residing in one of the public subnets
  • a public route table for the public subnets with a default route to the Internet Gateway
  • a private route table for the private subnets with a default route to the NAT Gateway
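As a rough planning aid, the subnet layout can be sketched with Python's standard ipaddress module. The /20 subnet size, the two zones, and the function name plan_subnets are my own assumptions for illustration; nothing here is required by Windmill or AWS.

```python
import ipaddress


def plan_subnets(vpc_cidr="10.0.0.0/16", new_prefix=20, zones=("a", "b")):
    """Carve one public and one private subnet per zone out of the VPC CIDR.

    The prefix length and zone suffixes are illustrative assumptions.
    """
    # subnets() yields consecutive, non-overlapping blocks of the requested size.
    blocks = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    plan = {"public": [], "private": []}
    for tier in ("public", "private"):
        for zone in zones:
            plan[tier].append({"zone": zone, "cidr": str(next(blocks))})
    return plan


if __name__ == "__main__":
    for tier, subnets in plan_subnets().items():
        for subnet in subnets:
            print(tier, subnet["zone"], subnet["cidr"])
```

Using at least two zones is deliberate: both an ALB and an Aurora subnet group expect subnets in more than one Availability Zone.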

Security Group #

A security group will be needed for the ECS cluster, ALB, and RDS cluster.

The security group will need to allow the following ports:

  • All Traffic ingress from the security group ID itself (ephemeral ports, RDS traffic, etc. from the ECS cluster)
  • 5432/tcp ingress from the VPC CIDR (10.0.0.0/16)
  • 5432/tcp ingress from a trusted IP (EIP or otherwise) in order to bootstrap the RDS instance with Windmill’s SQL script
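  • 443/tcp ingress globally (0.0.0.0/0)
  • 80/tcp ingress globally (0.0.0.0/0), if the HTTP-to-HTTPS redirect listener described later is used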
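For reference, here is a minimal sketch of those rules expressed in the IpPermissions shape that boto3's authorize_security_group_ingress accepts. The helper name build_ingress_rules, the security group ID, and the trusted address (taken from a documentation-only range) are placeholders I made up; swap in your own values.

```python
def build_ingress_rules(security_group_id, vpc_cidr, trusted_cidr, public_ports=(443,)):
    """Return ingress rules in the shape of boto3's IpPermissions parameter."""
    rules = [
        # All traffic from members of the security group itself.
        {"IpProtocol": "-1", "UserIdGroupPairs": [{"GroupId": security_group_id}]},
        # Postgres from inside the VPC and from the trusted bootstrap address.
        {
            "IpProtocol": "tcp",
            "FromPort": 5432,
            "ToPort": 5432,
            "IpRanges": [{"CidrIp": vpc_cidr}, {"CidrIp": trusted_cidr}],
        },
    ]
    # Ports that are open to the world.
    for port in public_ports:
        rules.append(
            {
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
            }
        )
    return rules


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for rule in build_ingress_rules("sg-0123456789abcdef0", "10.0.0.0/16", "203.0.113.10/32"):
        print(rule)
```

Once the database has been bootstrapped, the trusted-address entry can be dropped from the 5432 rule.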

RDS Cluster #

Aurora is used for the external Windmill database. For the purposes of this document, Aurora Serverless for Postgres is used.

Create the RDS cluster with any user/password you desire and ensure that it is created in the VPC created above along with the security group created above.

Depending on how the Windmill SQL script will be executed, the RDS cluster may need to be publicly-accessible. If running the script locally via psql, ensure that the cluster is publicly-accessible so that a remote connection is allowed.

Use the following command as an example:

> psql "postgresql://<RDS primary username>:<RDS primary password>@<RDS DNS endpoint>/postgres?sslmode=disable" -f <SQL script>

Per Windmill’s documentation, RDS requires a few extra steps. Once the SQL script has been executed, create the windmill database and grant the appropriate windmill roles to the RDS user specified when the cluster was created:

CREATE DATABASE windmill OWNER <RDS user>;
GRANT windmill_admin TO <RDS user>;
GRANT windmill_user TO <RDS user>;

and grant schema permissions to the windmill roles:

GRANT USAGE ON SCHEMA public TO windmill_admin;
GRANT USAGE ON SCHEMA public TO windmill_user;

ECS Cluster #

ECS clusters can be created using EC2 instances or AWS Fargate (serverless). For this example, the latter is used. Specify a name and description along with the Fargate Infrastructure option.

ECS Task Definition #

The Task Definition will specify the actual Windmill container image, port, environment variables, and other configuration.

For the container name, specify windmill and for the image, specify ghcr.io/windmill-labs/windmill:main (for Enterprise, ghcr.io/windmill-labs/windmill-ee:main would be used)

For the container port, specify 8000, TCP and HTTP (leaving the Port name blank to be auto-generated).

Using the following environment variables as an example, add each to the windmill container’s environment variables:

DATABASE_URL=postgresql://<RDS username from above>:<RDS password from above>@<RDS DNS endpoint>/windmill?sslmode=disable
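One gotcha worth sketching: if the RDS password contains characters such as @, / or :, it has to be percent-encoded before it goes into the URL. The helper below is a small illustration using only the standard library; the function name, the explicit port, and the sample values are my own assumptions.

```python
from urllib.parse import quote


def build_database_url(user, password, host, database="windmill",
                       sslmode="disable", port=5432):
    """Assemble a Postgres connection URL with percent-encoded credentials."""
    # safe="" forces reserved characters such as "@" and "/" to be encoded too.
    encoded_user = quote(user, safe="")
    encoded_password = quote(password, safe="")
    return (
        f"postgresql://{encoded_user}:{encoded_password}"
        f"@{host}:{port}/{database}?sslmode={sslmode}"
    )


if __name__ == "__main__":
    # Hypothetical credentials and endpoint, for illustration only.
    print(build_database_url("windmill_owner", "p@ss/word", "db.example.internal"))
```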

Other environment variables like the following can also be specified:

PUID=1000
PGID=1000
TZ=America/New_York
RUST_LOG=info
NUM_WORKERS=1
DISABLE_SERVER=false
METRICS_ADDR=false

The environment variables can also be loaded from a .env file in S3; however, this is out of scope for this post.

One large caveat here is that passing the RDS credentials in plaintext via an environment variable is an operational no-no. For production, consider storing the connection string in Secrets Manager and having ECS inject it into the container instead.
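To make the Secrets Manager idea a bit more concrete, here is a sketch of a container definition that uses the ECS secrets field (a name plus a valueFrom ARN) instead of a plaintext DATABASE_URL. The helper name and the secret ARN are placeholders I invented, and this assumes the whole connection string is stored as a single secret value.

```python
def build_container_definition(database_url_secret_arn,
                               image="ghcr.io/windmill-labs/windmill:main"):
    """Return an ECS container definition that pulls DATABASE_URL from a secret."""
    return {
        "name": "windmill",
        "image": image,
        "essential": True,
        "portMappings": [{"containerPort": 8000, "protocol": "tcp"}],
        # Non-sensitive settings stay as plain environment variables.
        "environment": [
            {"name": "RUST_LOG", "value": "info"},
            {"name": "NUM_WORKERS", "value": "1"},
            {"name": "DISABLE_SERVER", "value": "false"},
            {"name": "METRICS_ADDR", "value": "false"},
        ],
        # ECS resolves the secret at task start and exposes it as DATABASE_URL.
        "secrets": [{"name": "DATABASE_URL", "valueFrom": database_url_secret_arn}],
    }


if __name__ == "__main__":
    # Placeholder ARN for illustration only.
    arn = "arn:aws:secretsmanager:us-east-1:123456789012:secret:windmill/database-url-AbCdEf"
    print(build_container_definition(arn))
```

For this to work, the task execution role needs permission to read the secret (secretsmanager:GetSecretValue).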

ECS Service #

Configure a new ECS Service to consume the Task Definition created above. The service should:

  • use the VPC created above
  • use the public subnets (so that the container can be reached from outside of the VPC)
  • use the security group created above

The other configuration options are at your discretion. For this example, a replica value of 1 can be used.

ACM Certificate #

An ACM certificate will be needed in order for HTTPS requests to validate successfully. This ACM certificate is used by the ALB.

Application Load Balancer #

An ALB will be needed so that traffic can be sent to the windmill container without needing to know its IP.

Create an internet-facing ALB in the public subnets that uses the security group created above.

Target Group #

Before creating the HTTP listeners, create a target group in the VPC created above that forwards traffic to 8000/tcp. Use the IP target type, which Fargate tasks require.
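As a sketch, these are the parameters I would expect to pass to boto3's elbv2 create_target_group for this setup. The helper name, the target group name, and the VPC ID are placeholders; health check settings are left at their defaults here.

```python
def build_target_group_params(vpc_id, name="windmill", port=8000):
    """Return keyword arguments in the shape of elbv2 create_target_group."""
    return {
        "Name": name,
        "Protocol": "HTTP",
        "Port": port,
        "VpcId": vpc_id,
        # Fargate tasks run in awsvpc network mode, so targets are registered by IP.
        "TargetType": "ip",
    }


if __name__ == "__main__":
    # Placeholder VPC ID for illustration only.
    print(build_target_group_params("vpc-0123456789abcdef0"))
```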

Listener(s) #

Create two listeners:

  1. 80/tcp that redirects to HTTPS (443/tcp)
  2. 443/tcp that uses the ACM certificate and target group created above
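The two listeners can be sketched as parameter sets in the shape of boto3's elbv2 create_listener. The helper name and all three ARNs are placeholders, and the permanent (301) redirect is my own choice rather than anything Windmill requires.

```python
def build_listeners(load_balancer_arn, certificate_arn, target_group_arn):
    """Return keyword arguments for the HTTP redirect and HTTPS forward listeners."""
    http_redirect = {
        "LoadBalancerArn": load_balancer_arn,
        "Protocol": "HTTP",
        "Port": 80,
        # Send every plain HTTP request to the HTTPS listener.
        "DefaultActions": [
            {
                "Type": "redirect",
                "RedirectConfig": {
                    "Protocol": "HTTPS",
                    "Port": "443",
                    "StatusCode": "HTTP_301",
                },
            }
        ],
    }
    https_forward = {
        "LoadBalancerArn": load_balancer_arn,
        "Protocol": "HTTPS",
        "Port": 443,
        "Certificates": [{"CertificateArn": certificate_arn}],
        # Terminate TLS at the ALB and forward to the windmill target group.
        "DefaultActions": [{"Type": "forward", "TargetGroupArn": target_group_arn}],
    }
    return [http_redirect, https_forward]


if __name__ == "__main__":
    # Placeholder ARNs for illustration only.
    for listener in build_listeners("alb-arn", "certificate-arn", "target-group-arn"):
        print(listener)
```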

Route53 Record #

A Route53 A record pointing to the ALB DNS endpoint can be used in conjunction with the ACM certificate validation FQDNs to serve a valid, HTTPS-friendly DNS entry.

The Route53 record can be created in any domain and it must be an A-type alias record that “points to IPv4 addresses and some AWS resources” – specify an Application and Classic Load Balancer, the region where the ALB was created, and the entire DNS name for the ALB. Once the record is created, create the CNAME record using the validation FQDNs provided by the ACM certificate.
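For anyone scripting this step, the alias record can be sketched as a change batch in the shape of boto3's route53 change_resource_record_sets. The helper name, hostname, ALB DNS name, and zone ID below are all placeholders; the real ALB values can be read from the load balancer's description (its DNS name and canonical hosted zone ID).

```python
def build_alias_change_batch(record_name, alb_dns_name, alb_hosted_zone_id):
    """Return a Route53 change batch that aliases record_name to the ALB."""
    return {
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    # Alias records point at the ALB by name, not at fixed IPs.
                    "AliasTarget": {
                        "HostedZoneId": alb_hosted_zone_id,
                        "DNSName": alb_dns_name,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(
        build_alias_change_batch(
            "windmill.example.com",
            "windmill-alb-1234567890.us-east-1.elb.amazonaws.com",
            "ZEXAMPLE123456",
        )
    )
```

Note that the HostedZoneId inside AliasTarget belongs to the load balancer, not to your own domain's hosted zone.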

Future #

Going forward, automating these steps via CloudFormation or Terraform is on the table. While fairly straightforward to implement, the process still requires a lot of Console navigation and clicking/typing, so IaC would speed things up dramatically now that the process has been ironed out.