Ramblings of a Tampa engineer
Anubis - Open source and amazing.

I started a blog post last week as I patched my Leaf hobby project to ban more aggressively after my requests abruptly grew from 50k/day to 2 million/day. I thought that was going to be the end of it, but then I got more downtime alerts and saw my hits had grown to over 4 million a day. There is no natural growth of a Halo Infinite stat site from 50k to 4 million hits a day in any world.

This no longer looked like a single out-of-control bot: thousands of those requests just opened and closed the connection incredibly fast. Even once my PHP workers were exhausted and 502s were being returned, this vast set of IPs kept hammering away. It just didn't seem like something regular spiders or bots would do.

So once I had some downtime I started digging into this again.

Hits spiking to 4 million/day

I pulled some logs locally to re-run some analytics and was surprised to see a major jump from 2 million hits a day to over 4 million. Unfortunately, due to a naming mix-up I wiped out some of the past week's logs, which created the odd dip you see above. With Halo Infinite presently only pulling about 2,000 unique players a day, there is no chance in hell that a unique set of 2 million visitors was visiting Leaf.

💡
HTTP requests containing the same IP, same date, and same user agent are considered a unique visitor in GoAccess.
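
To make that definition concrete, here is a minimal sketch of counting unique visitors the same way. It is not GoAccess's actual code; the `Hit` tuple, the `count_unique_visitors` helper, and the sample IPs and user agents are all made up for illustration.

```python
from collections import namedtuple

# Hypothetical, already-parsed log entry: just the three fields that matter here.
Hit = namedtuple("Hit", ["ip", "date", "user_agent"])


def count_unique_visitors(hits):
    """Count distinct (ip, date, user_agent) combinations."""
    return len({(h.ip, h.date, h.user_agent) for h in hits})


# Made-up sample data using documentation IP ranges.
hits = [
    Hit("203.0.113.5", "2025-01-01", "Mozilla/5.0"),
    Hit("203.0.113.5", "2025-01-01", "Mozilla/5.0"),   # repeat request, same visitor
    Hit("203.0.113.5", "2025-01-01", "curl/8.0"),      # same IP, different user agent
    Hit("203.0.113.5", "2025-01-02", "Mozilla/5.0"),   # same IP and agent, next day
    Hit("198.51.100.7", "2025-01-01", "Mozilla/5.0"),  # different IP
]

print(count_unique_visitors(hits))
```

Under this definition, a botnet rotating through thousands of IPs or user agents inflates the "unique visitor" count very quickly.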

Unfortunately, I was out and about when these alarms went off, so I suffered a few hours of downtime as my server took 90-120 heavy requests a second for hours.

For a properly cached and tuned server that would be nothing, but I built Leaf using Laravel Livewire which heavily depends on cookies. I could not identify an easy way to rip cookies out so I could serve cached pages purely from NGINX. That meant every single request (except static files) would invoke PHP to render said page and consume some resources.

Linode stats of server pre/post fix.

So now that I had exhausted my tweaks to my robots.txt and fail2ban rules, I set off to introduce a tool I had been noticing on a lot more sites recently. Anubis is a tool that sits in front of your website and acts as a little bodyguard, deciding what can or cannot view your site.

In the era of AI agents and aggressive scrapers taking as much data as they can, Anubis found success introducing a variety of challenges to make their access incredibly difficult. So I decided to give it a shot and configure it in front of Leaf. With one quick copy I had my configuration in place.

# cat /etc/anubis/leafapp.env 
BIND=:8923
BIND_NETWORK=tcp
DIFFICULTY=4
METRICS_BIND=:9090
METRICS_BIND_NETWORK=tcp
SERVE_ROBOTS_TXT=0
TARGET=http://127.0.0.1:3001
COOKIE_DOMAIN=leafapp.co
ED25519_PRIVATE_KEY_HEX=redacted

Configuration file for Anubis for Leaf

This tells Anubis where to forward traffic (my internal listener) once a visitor passes its checks. I switched NGINX over to point at Anubis and, as my Linode graph shows above, watched my load drop all the way to under 1.

Now when you visit Leaf without a stored cookie proving validity, Anubis shows up briefly to run some challenges.
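
For a rough idea of what such a challenge costs a client, here is a minimal proof-of-work sketch. It assumes the challenge is a SHA-256 proof of work where `DIFFICULTY` is the number of leading zero hex digits required in the hash, which is my understanding of the Anubis default. The challenge string and the `solve` and `verify` helpers are hypothetical and are not Anubis's real code or wire format.

```python
import hashlib


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Check that sha256(challenge + nonce) starts with `difficulty` zero hex digits."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Brute-force the smallest nonce that satisfies the difficulty."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce


# Hypothetical challenge value; a real server would hand out its own.
challenge = "example-challenge"
nonce = solve(challenge, 4)  # 4 mirrors DIFFICULTY=4 in the config above
print(nonce, verify(challenge, nonce, 4))
```

At a difficulty of 4 a solver needs around 65,000 hash attempts on average (16^4), which is trivial for one real browser but adds up for a scraper making millions of requests without keeping cookies.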

Intermediate page before Leaf

The page above is tough to catch on faster hardware, but you may see it briefly on mobile devices. It isn't my cup of tea in terms of design, but this is what the open source edition offers. An enterprise option exists with a more boring name (BotStopper) for those in the industry that don't want to show an anime-like avatar before the page loads.

So I wanted to dive into the analytics Anubis offers after it had been running for a few hours. With a simple query to the /metrics route, I found Prometheus metrics spat out.

# curl -s http://127.0.0.1:9090/metrics | grep ^anubis
anubis_challenges_issued{method="embedded"} 454593
anubis_challenges_validated{method="fast"} 1479
anubis_policy_results{action="ALLOW",rule="bot/bingbot"} 6368
anubis_policy_results{action="ALLOW",rule="bot/common-crawl"} 191
anubis_policy_results{action="ALLOW",rule="bot/duckduckbot"} 16
anubis_policy_results{action="ALLOW",rule="bot/favicon"} 546
anubis_policy_results{action="ALLOW",rule="bot/googlebot"} 60021
anubis_policy_results{action="ALLOW",rule="bot/robots-txt"} 57
anubis_policy_results{action="ALLOW",rule="bot/well-known"} 24
anubis_policy_results{action="ALLOW",rule="bot/yandexbot"} 681
anubis_policy_results{action="ALLOW",rule="threshold/minimal-suspicion"} 20076
anubis_policy_results{action="CHALLENGE",rule="threshold/extreme-suspicion"} 195
anubis_policy_results{action="CHALLENGE",rule="threshold/moderate-suspicion"} 467875
anubis_policy_results{action="DENY",rule="bot/ai-catchall"} 11081
anubis_policy_results{action="DENY",rule="bot/ai-clients"} 11
anubis_policy_results{action="DENY",rule="bot/ai-crawlers-search"} 28138
anubis_policy_results{action="DENY",rule="bot/ai-crawlers-training"} 1373
anubis_policy_results{action="DENY",rule="bot/alibaba-cloud"} 1271
anubis_policy_results{action="DENY",rule="bot/huawei-cloud"} 10731
anubis_policy_results{action="WEIGH",rule="bot/deny-aggressive-brazilian-scrapers"} 195
anubis_policy_results{action="WEIGH",rule="bot/generic-browser"} 469549
anubis_proxied_requests_total{host="96.126.124.217"} 14
anubis_proxied_requests_total{host="leafapp.co"} 51992
anubis_proxied_requests_total{host="www.leafapp.co"} 49450
anubis_time_taken_sum{method="fast"} 543398
anubis_time_taken_count{method="fast"} 1479

Anubis metrics output.

As I examined these results, it was interesting to see how effective Anubis had been in just roughly 6 hours.

  • I had ~468,000 Anubis challenges issued and only 1,479 were solved.
    • A crazy 0.32% pass rate.
  • I had ~52,600 requests denied immediately for falling into a bucket of "bad bots".
  • I had ~101,000 requests proxied through to Leaf, largely Googlebot (60k) plus a bunch of other traffic deemed legitimate.
  • The 1,479 solved challenges averaged a 367ms delay prior to loading Leaf.

Leaf server rebounding with no load post Anubis
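
The figures in the list can be recomputed from the raw counters. The sketch below parses a subset of the metrics output shown earlier and derives the challenge count, the denied count, the pass rate, and the average solve time. The `parse` and `total` helpers are made up for illustration, and treating `anubis_time_taken_sum` as milliseconds is an assumption.

```python
import re

# Subset of the /metrics output shown earlier, values copied verbatim.
SAMPLE = """\
anubis_challenges_validated{method="fast"} 1479
anubis_policy_results{action="CHALLENGE",rule="threshold/extreme-suspicion"} 195
anubis_policy_results{action="CHALLENGE",rule="threshold/moderate-suspicion"} 467875
anubis_policy_results{action="DENY",rule="bot/ai-catchall"} 11081
anubis_policy_results{action="DENY",rule="bot/ai-clients"} 11
anubis_policy_results{action="DENY",rule="bot/ai-crawlers-search"} 28138
anubis_policy_results{action="DENY",rule="bot/ai-crawlers-training"} 1373
anubis_policy_results{action="DENY",rule="bot/alibaba-cloud"} 1271
anubis_policy_results{action="DENY",rule="bot/huawei-cloud"} 10731
anubis_time_taken_sum{method="fast"} 543398
anubis_time_taken_count{method="fast"} 1479
"""

LINE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse(text):
    """Turn labelled Prometheus text lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            name, raw_labels, value = match.groups()
            samples.append((name, dict(LABEL.findall(raw_labels)), float(value)))
    return samples


def total(samples, name, **labels):
    """Sum every sample with this metric name whose labels match the filters."""
    return sum(
        value
        for sample_name, sample_labels, value in samples
        if sample_name == name
        and all(sample_labels.get(k) == v for k, v in labels.items())
    )


samples = parse(SAMPLE)
challenged = total(samples, "anubis_policy_results", action="CHALLENGE")
denied = total(samples, "anubis_policy_results", action="DENY")
solved = total(samples, "anubis_challenges_validated")
pass_rate = solved / challenged * 100
# Assumption: the time_taken sum is reported in milliseconds.
avg_ms = total(samples, "anubis_time_taken_sum") / total(samples, "anubis_time_taken_count")

print(f"challenged={challenged:.0f} denied={denied:.0f} solved={solved:.0f}")
print(f"pass rate={pass_rate:.2f}% average solve time={avg_ms:.0f}ms")
```

The same helpers work on the full output of the `curl` command if you pipe it into a file and read that instead of `SAMPLE`.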

Time will tell if this solution is enough, but I'm happy at the moment with an acceptable amount of load on the server while still accepting traffic for the few humans left still leveraging Leaf and playing Halo Infinite.
