Today’s newsletter was going to be about how (and why) I once stole a laptop from one of the world’s largest IT consultancy companies. But I feel it’s more important to tell you about what happened to me this morning - and how you can stop it from happening to you.
You might already know I’m a big fan of GitHub Pages. I’ve written several blog posts about how I use them. If you’ve visited a static website that I run over the last five years or so, there’s a very strong chance that it’s run on GitHub Pages. But you probably won’t realise it’s on GitHub Pages because I often set up custom domains for a site. My CPAN Dashboard is running on GitHub Pages, for example. So is TwittElection. And Perl School. And many more. In fact one of the results of the story I’m about to tell is that I’m rather surprised at just how many domains I have pointing to GitHub Pages sites.
It’s easy enough to set up. You just configure your DNS provider so the domain points to GitHub’s servers. And then you add a file called CNAME to the repo that contains the web site code. That’s so GitHub knows where to direct requests for that domain. So, for example, my Perl School repo contains a file called CNAME which literally just contains the text “perlschool.com”.
You should also note that the DNS configuration is the same no matter who is doing it. That is, GitHub publishes four IP addresses that you should point your domain to. So if you’re configuring a GitHub Pages site, you’ll use exactly the same IP addresses as I do.
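As a rough illustration of what "pointing at GitHub" means in practice, here is a minimal Python sketch that checks whether a set of addresses includes one of the four IPv4 addresses GitHub publishes for Pages. The function names `resolve_ipv4` and `points_at_github_pages` are my own inventions for this example, and the check only covers IPv4.

```python
import socket

# The four IPv4 addresses GitHub publishes for GitHub Pages custom domains.
GITHUB_PAGES_IPS = frozenset({
    "185.199.108.153",
    "185.199.109.153",
    "185.199.110.153",
    "185.199.111.153",
})


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses a hostname resolves to (empty if none)."""
    try:
        infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (address, port) pair.
    return {info[4][0] for info in infos}


def points_at_github_pages(addresses):
    """True if any of the given addresses is one of GitHub's Pages servers."""
    return bool(GITHUB_PAGES_IPS & set(addresses))
```

You would call it as `points_at_github_pages(resolve_ipv4("perlschool.com"))`. Keeping the lookup and the comparison separate means the comparison can be tested without touching the network.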
So far so good. The problem comes when you start changing things around. And that's what I had done.
In my last newsletter, I mentioned my static site builder called Aphra. And I said I should create a web site for it. I had forgotten that I had already started to create a site some years ago. I had, of course, used GitHub Pages and the site was at aphra.davecross.co.uk. I rediscovered this just after sending the last newsletter and had a bit of a think about the domain. In the end, I decided that as Aphra was a Perl project, it should really live at aphra.perlhacks.com instead. So I added a DNS entry in the perlhacks.com zone and changed the domain in the CNAME file.
At that point, I had introduced the problem. But I’d be really impressed if you can work out what it is.
I had a new DNS record for aphra.perlhacks.com. But I also had an old DNS record for aphra.davecross.co.uk. That was pointing at the GitHub servers, but there was no repo with a CNAME file configured to accept that connection.
If a bad person can find a DNS record that points at the GitHub servers but there’s no repo configured to accept connections on that domain, then they can create their own repo, add a CNAME that references that DNS record and serve their own content on your domain name.
That’s what had happened. Someone had spotted the “dangling” DNS record for aphra.davecross.co.uk and had created a GitHub Pages site containing that domain in its CNAME file and GitHub happily started serving their dubious content.
I assume there’s a way that you can automatically search for DNS records pointing at GitHub with no matching repo. I also assume there are enough people who make the same mistake as me that it’s worthwhile searching for this particular misconfiguration.
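I don't know what tooling the hijackers use, but the same idea works from the defender's side: compare the hostnames in your DNS zone against the domains your own repos claim. The sketch below assumes you have already collected both lists by hand or by script. The function name `find_dangling`, the addresses given for the two aphra hostnames, and the `mail.example.com` entry are all illustrative rather than real data.

```python
# The four IPv4 addresses GitHub publishes for GitHub Pages custom domains.
GITHUB_PAGES_IPS = frozenset({
    "185.199.108.153",
    "185.199.109.153",
    "185.199.110.153",
    "185.199.111.153",
})


def find_dangling(dns_records, claimed_domains):
    """Return hostnames that point at GitHub Pages but that none of your repos claim.

    dns_records: dict mapping hostname -> iterable of IPv4 address strings
    claimed_domains: iterable of the domains found in your repos' CNAME files
    """
    claimed = {domain.strip().lower() for domain in claimed_domains}
    dangling = []
    for hostname, addresses in dns_records.items():
        on_github = bool(GITHUB_PAGES_IPS & set(addresses))
        if on_github and hostname.lower() not in claimed:
            dangling.append(hostname)
    return sorted(dangling)


# Illustrative data modelled on the situation described above.
records = {
    "aphra.perlhacks.com": ["185.199.108.153"],
    "aphra.davecross.co.uk": ["185.199.108.153"],
    "mail.example.com": ["192.0.2.10"],  # hypothetical non-GitHub host
}
claimed = ["aphra.perlhacks.com"]

print(find_dangling(records, claimed))  # prints ['aphra.davecross.co.uk']
```

Anything this reports is a record that a stranger could claim with a CNAME file of their own.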
Of course, given that I’d forgotten about the existence of aphra.davecross.co.uk, I might never have found out about it - but I had a bit of dumb luck.
A while ago I did some work in SEO. And that got me in the habit of mildly obsessing about SEO for my web sites. One of the ways my obsession manifests is that I add all of my web sites to Google’s Search Console. You register your domain and it will tell you how you’re doing on SEO. It also tells you other interesting things like which search queries bring visitors to your site.
So davecross.co.uk is registered there. And yesterday, I got some weird emails from Google about the site. They told me that a new owner had been added to the site. I followed the link to Search Console but there was no sign of the new owner. Finally, this morning, I worked out they were talking about the “aphra” subdomain and, when I visited the site, I saw the new content and quickly worked out what had happened.
How do we fix the problem? My first attempt was to set up a new repo and try to override the CNAME in the spurious repo. But GitHub works on a “first come, first served” basis and it kept telling me that I couldn’t configure my repo to use that domain as it was already taken.
The easiest fix would be to just delete the DNS record so the new site stops working. But that’s only a temporary fix. What happens if I change my mind again and want to move the site back to the davecross.co.uk domain?
In the end, I did two things. My DNS provider allows you to configure web redirection rules so that requests are redirected to a different address before any web server even gets to see the request. I set that up so that aphra.davecross.co.uk redirects to aphra.perlhacks.com. It took a while for the DNS to propagate, but that’s working now - which solves the immediate problem.
But I couldn’t help thinking there must be a more permanent way to fix this problem. A way that prevents a random hacker from hijacking my domain in the first place. And it turns out that GitHub has thought of that. You can now verify your custom domains so that only you (or people working in your GitHub organisation) can configure a CNAME file that points at domains (or subdomains) that you own. It’s a simple process - you just add DNS records to the domain. The only slight downside is that, now I’ve verified davecross.co.uk, the hijackers are given seven days to respond to my claim (they won’t be able to, of course, as they don’t control the DNS). Until that time is up, I won’t get control of the domain. Not that it matters in this case, as I’ve stopped requests getting to their repo.
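For a sense of what the verification record looks like: as I understand GitHub's documentation, it asks for a TXT record named `_github-pages-challenge-<owner>` under the domain, with a code that GitHub generates for you. This small Python sketch formats such a record as a zone-file line. The helper name `challenge_txt_record`, the `your-username` owner, the TTL and the code value are placeholders of mine, so copy the real name and value from GitHub's settings page rather than from here.

```python
def challenge_txt_record(owner, domain, code, ttl=3600):
    """Format a zone-file style TXT record for GitHub Pages domain verification.

    owner:  GitHub user or organisation name (placeholder in the example below)
    domain: the domain being verified
    code:   the challenge value GitHub shows you (placeholder in the example below)
    ttl:    time to live in seconds; 3600 is an arbitrary choice
    """
    name = f"_github-pages-challenge-{owner}.{domain}"
    return f'{name}. {ttl} IN TXT "{code}"'


# Placeholder values only; use the ones GitHub gives you.
print(challenge_txt_record("your-username", "davecross.co.uk", "0123456789abcdef"))
```

Most DNS providers have a web form rather than a raw zone file, in which case the record name and the quoted value go into separate fields.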
All in all, it was an interesting morning and lessons have been learned. If you want to benefit from my misfortune then I think there are a couple of important points:
Verify the domains you’re going to use with GitHub Pages, so nobody else can claim them in a CNAME file.
Register all of your domains with Search Console. Even if you don’t care about SEO, it’s still useful to know that Google are watching out for any weirdness going on with your site.
What else is going on?
If you’re thinking about getting a trainer in to bring your team up to speed with Modern Perl best practices or GitHub Actions then now would be a really good time to speak to me
I’m also interested in talking about freelancing with you. Have a look at my services page and see if there’s anything I offer that could be useful to you
I’ve also set up a GitHub sponsorship scheme if you just find my work useful and would like to find a way to say thank you
Here are a few interesting links I’ve seen in the last week or so.
I’ve had a lot of success (and fun!) using GitHub Copilot to make my coding more efficient. Now it looks like GitHub Copilot Workspace will be taking that a step further
At FOSDEM in February, Scott Chacon gave a talk called So You Think You Know Git. I learned things (but you won’t get me to admit what they were)
The Sora hype continues (and maybe gives your nightmares rather too many new ideas)
An annotated presentation from Simon Willison illustrating how journalists could be using AI right now
That’s all for now. I need to get back to verifying custom domains :-)
All the best,
Dave…
p.s. One last plea for help with my GitHub Pages problem. As I’ve said, the problem is now fixed. But I still don’t know which repo contains the CNAME file that hijacked my domain. If anyone can suggest how I could work that out, it would make me very happy.