• r00ty@kbin.life
    2 days ago

    I thought I’d reply to this with an update, because I saw how far the free tier goes, and it’s pretty far.

    I had another huuuuge influx of AI/other bots scraping my instance at top speed. Hundreds of requests per second, and it was putting some load on the postgres server.

    What I found was a mixture of traffic. Some was coming from a handful of AS numbers (each hosting hundreds of large IP blocks) controlled by the same few names, so I was able to block those outright by AS number.

    But then I found a very large number of random requests coming in bursts (and definitely not humans), all from mobile or ISP customer blocks… I assume it’s some kind of botnet being used? But they were all valid requests for posts and comments.

    I looked at the custom ruleset on Cloudflare and it’s quite powerful. I settled on the following.

    1: Allow known fediverse software by user agent (yes, the bots could eventually spoof these, but right now they are not).
    2: Allow known instances by IP blocks.
    3: Allow access to the fediverse inbox specifically, which is where most inter-instance traffic goes.
    4: Allow access to LOCAL users and well known services/other standard ActivityPub URLs.
    5: Everything else, for everyone else: managed challenge.
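For anyone who wants to see the ordering spelled out, here is a rough Python sketch of the same first-match-wins logic. It is not Cloudflare rule syntax, and every value in it (the user agent names, the IP block, the path prefixes) is a made-up placeholder rather than what is actually configured.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# All values below are hypothetical placeholders, not the real rule contents.
FEDIVERSE_AGENTS = ("mastodon", "lemmy", "mbin")      # assumed user agent substrings
KNOWN_INSTANCE_NETS = [ip_network("192.0.2.0/24")]    # documentation-only range
INBOX_SUFFIX = "/inbox"                               # assumed inbox path shape
OPEN_PREFIXES = ("/.well-known/", "/nodeinfo/", "/u/")  # assumed open paths


@dataclass
class Request:
    user_agent: str
    ip: str
    path: str


def decide(req: Request) -> str:
    """Return "allow" or "managed_challenge"; the first matching rule wins."""
    ua = req.user_agent.lower()
    if any(name in ua for name in FEDIVERSE_AGENTS):      # rule 1: fediverse software
        return "allow"
    addr = ip_address(req.ip)
    if any(addr in net for net in KNOWN_INSTANCE_NETS):   # rule 2: known instances
        return "allow"
    if req.path.endswith(INBOX_SUFFIX):                   # rule 3: inbox
        return "allow"
    if req.path.startswith(OPEN_PREFIXES):                # rule 4: users / standard URLs
        return "allow"
    return "managed_challenge"                            # rule 5: everyone else
```

The point of the ordering is that the allow rules are checked first, so federation traffic never sees the challenge, and only whatever falls through all four gets it.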

    The bot traffic just completely stopped dead. The fediverse traffic continued unfettered, and the traffic still coming in was legitimate (it’s mostly me and a handful of others, so very little traffic).

    All it adds is the interstitial page with the “are you human?” checkbox, which for most people checks itself automatically. The user then moves on fine and can interact normally with the site. So for people it’s a very minor inconvenience, and it stops bot traffic completely in its tracks.

    What is annoying is that I could make this MUCH better with regex matches. But not only do they not allow free accounts to use regex (I understand), “Pro” users cannot either. It’s only for Business or above… and Business accounts are eye-wateringly expensive for a hobbyist!
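To give an idea of what regex would buy here, this is a small Python sketch (again, not Cloudflare syntax) where one pattern replaces several separate prefix and suffix checks, and can also tell a local user apart from a remote one. The path shapes are assumptions for illustration only.

```python
import re

# Hypothetical path shapes; the real URLs and rules are not spelled out above.
ALLOWED_PATH = re.compile(
    r"^/(?:"
    r"\.well-known/.+"                 # well-known services
    r"|nodeinfo/.+"                    # nodeinfo documents
    r"|u/[A-Za-z0-9_]+(?:/inbox)?"     # local users only (no @ in the name)
    r"|inbox"                          # shared inbox
    r")$"
)


def path_is_open(path: str) -> bool:
    """True when the path matches one of the assumed always-allowed shapes."""
    return ALLOWED_PATH.match(path) is not None
```

With plain prefix matching, something like `/u/` lets through every user path, local or remote. The pattern above only accepts names without an `@`, which is the kind of tightening that needs regex.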