I’ve been running Compiler Explorer for over four years now, and a lot has changed in that time. In addition to C++, it now supports Go, Rust and D. It scales up and down to support demand. It has many different compilers, many self-built.
I’ve been asked by a couple of people recently how everything works, and so I thought I’d put some notes down here, in case it should help anyone else considering something similar.
In brief: Compiler Explorer runs on some Amazon EC2 instances, behind a load balancer. DNS routes to the load balancer, which then picks one of the instances to actually send the request to. In fairness, most of the time I only have one instance running, but during rolling updates (and high load) I can have two or so.
The number of instances running is controlled by some Amazon alerts based on CPU load: as CPU load on the instances goes up, more are added. The autoscaling stuff is something I only put in after I got “Hacker Newsed” (the new Slashdotted) a few times: it’s nice to stay up even if you get a 30x traffic spike!
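The scaling rule described above can be sketched roughly as follows. This is only an illustration: the real decision is made by Amazon's alerts and autoscaling, not by code like this, and the thresholds, instance limits and the `desired_instances` name are all hypothetical.

```python
def desired_instances(current, avg_cpu_percent,
                      scale_up_at=70.0, scale_down_at=20.0,
                      min_instances=1, max_instances=4):
    """Return how many instances should run, given the average CPU load.

    All thresholds and limits here are made-up illustrative values.
    """
    if avg_cpu_percent >= scale_up_at:
        target = current + 1   # load is high: add an instance
    elif avg_cpu_percent <= scale_down_at:
        target = current - 1   # load is low: remove an instance
    else:
        target = current       # in between: leave things alone
    # Never go below the minimum or above the maximum fleet size.
    return max(min_instances, min(max_instances, target))
```

Stepping by one instance at a time keeps the sketch simple; a 30x spike would need several rounds of this before the fleet caught up.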
Spot the Hacker News posts...
The EC2 instances boot up a custom AMI (Amazon Machine Image) I prepare offline. That AMI has a base Ubuntu, plus Docker and a bunch of web server stuff. It’s configured to boot up and then pull down some Docker images from a private repository. These images are the “VMs” for each of the sites (two for gcc.godbolt.org, d.godbolt.org, rust.godbolt.org and go.godbolt.org). Each contains either Ubuntu or Alpine Linux with node.js and a bunch of compilers.
The reasons for using Docker (basically extremely lightweight VMs) are manifold, even though I dislike the tooling it provides:
The drawbacks are:
The Dockerfile you write is hard to maintain. I have to use some Makefile trickery to avoid repeating parts of the Dockerfile in each of the containers.
The reason the Docker images have dropped in size recently is that I’ve started migrating the compilers I can build myself to EFS: Amazon’s cloud filesystem. Basically it appears like an NFS mount and I put the majority (8GBish) of compilers there now, then mount that read-only into each Docker container. When testing locally, I put the same data in the same place (/opt/gcc-explorer) but it’s stored locally and not on EFS.
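As a rough sketch of what mounting the compilers read-only into a container looks like, the helper below builds a `docker run` command line with a `:ro` bind mount. The `docker_run_command` helper and the image name are hypothetical, and the real scripts may pass other options; only the /opt/gcc-explorer path comes from the description above.

```python
import shlex

# Path where the compilers live, on EFS in production or on local disk when testing.
COMPILER_DIR = "/opt/gcc-explorer"


def docker_run_command(image):
    """Build a `docker run` argument list that mounts the compilers read-only.

    `image` is whatever Docker image should be started (hypothetical here).
    """
    return [
        "docker", "run", "--detach",
        # host path : container path : ro (read-only inside the container)
        "--volume", f"{COMPILER_DIR}:{COMPILER_DIR}:ro",
        image,
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so this is safe to try anywhere.
    print(shlex.join(docker_run_command("example/gcc-explorer")))
```

Because the path is the same on the host and in the container, the same image works unchanged against EFS or a local copy.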
On the EC2 instance itself I run nginx. nginx is configured to route traffic to each of the sites to ports 10240, 10241, 10242 etc (where each site’s node.js listens). This allows all my sites to live on the same box(es). nginx also serves up my static content, like this blog and the jsbeeb website.
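The routing nginx does here amounts to a lookup from the requested hostname to a local port. A minimal Python sketch of that idea follows; the real thing is nginx configuration, not Python, and which site listens on which port (including 10243) is an assumption made for illustration.

```python
# Hypothetical assignment of sites to local ports.
SITE_PORTS = {
    "gcc.godbolt.org": 10240,
    "d.godbolt.org": 10241,
    "rust.godbolt.org": 10242,
    "go.godbolt.org": 10243,
}


def upstream_for(host_header):
    """Return the local URL a request should be proxied to, or None if unknown."""
    # Drop any ":port" suffix and normalise case before the lookup.
    host = host_header.split(":", 1)[0].strip().lower()
    port = SITE_PORTS.get(host)
    if port is None:
        return None
    return f"http://127.0.0.1:{port}"
```

For example, `upstream_for("rust.godbolt.org")` gives `"http://127.0.0.1:10242"` under this assumed mapping, and an unrecognised host gives `None`.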
To build the AMI I use packer, which remote-controls an EC2 instance, running all the commands I need to set it up, then does a snapshot to build an AMI.
The Docker images are built using Dockerfiles with some Makefile stuff to script everything up. That all runs locally, so as I said before, I can test everything before pushing the images.
To do upgrades I actually just start up a fresh instance: as the instance boots it picks up the latest of everything (via some scripts it runs on bootup). Then I wait for it to become “healthy” (i.e. for the load balancer’s internal checks to pass) and then I kill the old instance. This gives me a fresh-start VM, plus it also tests the scale-up/scale-down behaviour. It also means the site stays up during upgrades.
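The upgrade sequence above (start a new instance, wait for it to be healthy, then kill the old one) can be sketched like this. The `launch`, `is_healthy` and `terminate` callables are hypothetical stand-ins for whatever talks to EC2 and the load balancer, and giving up after a timeout is an added assumption, not something described here.

```python
import time


def rolling_upgrade(launch, is_healthy, terminate, old_instance,
                    timeout_s=600.0, poll_s=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Replace old_instance with a freshly launched one, without downtime.

    The old instance is only terminated once the new one reports healthy.
    """
    new_instance = launch()
    deadline = clock() + timeout_s
    while not is_healthy(new_instance):
        if clock() >= deadline:
            # Assumption: on timeout, discard the new instance and keep the old one.
            terminate(new_instance)
            raise TimeoutError("new instance never became healthy")
        sleep(poll_s)
    terminate(old_instance)
    return new_instance
```

Passing `sleep` and `clock` as parameters is just a convenience so the logic can be exercised without real waiting.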
The updates are all triggered by GitHub checkin hooks: if I check in to a “release” branch, then the whole process is driven automatically. Which is pretty cool.
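The core of such a hook is deciding whether an incoming event is a push to the release branch. A minimal sketch of that check is below; it assumes the branch is literally named `release`, the `should_deploy` name is hypothetical, and the actual hook script may look quite different. GitHub push events do carry the pushed ref in a `ref` field of the form `refs/heads/<branch>`.

```python
import json

# Assumption: the release branch is literally called "release".
RELEASE_REF = "refs/heads/release"


def should_deploy(event_name, payload_body):
    """Return True if this webhook delivery is a push to the release branch.

    event_name is the value of the X-GitHub-Event header;
    payload_body is the raw JSON body of the request.
    """
    if event_name != "push":
        return False
    payload = json.loads(payload_body)
    return payload.get("ref") == RELEASE_REF
```

A real hook would also want to verify that the request genuinely came from GitHub before kicking off a deployment.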
All of this stuff is actually available in a very messy repo on GitHub, including the packer configuration, the Dockerfiles, the Makefile I use to drive the whole show, the GitHub hook (which I run on my home computer), the scripts I use to build compilers and update the EFS…the whole lot. It’s totally undocumented but picking through there may give some further clues as to what’s going on.
Compiler Explorer started out as me manually running a single instance with no load balancer: I’d just ssh on and git pull to upgrade. As things grew and I wanted to be less susceptible to accidentally breaking it, I automated more and more.
It’s taken many years to get to this level of sophistication – indeed the EFS only went in last weekend. I’m pretty happy with how it now works.