How to Build a Powerful Reverse Proxy Firewall for Blocking the Evil Web-Scraping Robot Hordes from Hell
(cheapskatesguide.org)
from kjo@discuss.tchncs.de to selfhosted@lemmy.world on 01 Sep 02:45
https://discuss.tchncs.de/post/43969984
My goals for this firewall were mostly to provide better robot blocking and perhaps some more powerful DDoS protection than my Raspberry Pi 3 web server is capable of delivering. I still have to do some testing before I will know if my new firewall actually provides either of those, but at least I now have the additional ability to run multiple physical web servers on my LAN. Exploring that should be fun, and fun is a very important component of running a home web server.
Not my article. Just sharing.
#selfhosted
All this effort to block clankers? Absolutely worth it. Fuck clankers.
I absolutely love the term clankers. It’s the perfect blend of dystopian cyberpunk and the very real threat of AI.
It seems goofy to me — I wish we had collectively picked a term with more oomph.
I’m struggling to come up with an alternative though.
I’m still trying to make ‘sloppers’ happen. Perfectly describes the lack of thought that goes into what they produce.
Yeah, and they make a mess of network traffic that slows everything down.
Sludgers works too, but I like slop for the LLM output, so it makes sense as the bot term of derision.
clankers make the end result that sloppers (meatbags) eat up ;)
I like ‘sloppers’ as a term for the morons distributing and consuming the shit that the clankers are excreting.
Stealing a slur from Star Wars and engaging in traditional name calling to show we disapprove of uncreative slop.
We can’t even think of an original term. We can’t think of a novel way to shit on AI. We just copy what everyone else is doing to make fun of the plagiarism machine.
No no - it’s not plagiarism; it’s standardization.
Why does everyone think the term came from Star Wars? I know it was used in steampunk before then, and google suggests it goes back to a 1958 article about robots. Sorry, not trying to be pedantic, just feels like a lot of people give Star Wars unjust credit for things they didn’t actually create.
toasters
So say we all.
so say we all
All of this has happened before.
does that mean we get to burn down textile mills?
This page isn't loading for me.
found the bot
I'm getting a 404 error. I'm using Cloudflare DNS, which ironically has the best commercial clanker protection in the world; otherwise half the world's internet wouldn't use them.
Bot
Not a bot. Both of you can go fuck yourselves with an ENTIRE can of bear mace.
Using a VPN that you forgot is on?
Yep. Why do you ask?
Because the VPN might be the reason you’re being blocked from the page.
Why would a post on a .org domain blog site about blocking AI bots be relevant to my VPN?
When you use a VPN, multiple people share the same IP. If one person is a bad actor and causes the IP to be blacklisted, it will affect you too.
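To make that concrete, here is a minimal sketch of a per-IP blocklist. It is not how the site in the post is implemented; the addresses are from the reserved documentation ranges and the function names are made up for illustration.

```python
import ipaddress

# Hypothetical in-memory blocklist keyed only by source IP address.
blocked = set()

def block(ip: str) -> None:
    """Add an address to the blocklist, e.g. after it was seen scraping."""
    blocked.add(ipaddress.ip_address(ip))

def is_blocked(ip: str) -> bool:
    """Return True if requests from this address would be refused."""
    return ipaddress.ip_address(ip) in blocked

# Assumed example addresses (documentation ranges, not real hosts).
VPN_EXIT = "203.0.113.7"      # shared by every customer of the VPN
HOME_IP = "198.51.100.20"     # a regular residential connection

block(VPN_EXIT)               # one scraper on the VPN gets the exit IP listed

print(is_blocked(VPN_EXIT))   # any other user behind the same exit is refused too
print(is_blocked(HOME_IP))    # a user on their own IP is unaffected
```

The blocklist only ever sees the exit address, so it has no way to tell the scraper from everyone else behind it.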
Uranibab said.
Try turning off your VPN.
Sounds like something a bot would say
Cloudflare is a protection racket. They cover so many websites because it’s easier to pay the mafia.
I'm not a Cloudflare dick rider, so if you have a suggestion for a better service with commensurate features, I'm all ears.
Well, it can't be that good, because it thinks I'm a bot.
Anubis works pretty well for me so far in blocking clankers.
It seems Anubis’ GitHub issues show many false positives with smartphone browsers. Depending on OP’s target audience, it’s worth hunting for false positives.
I just wish I could read it. It seems to block based on my IP, which isn’t really a good way to identify bots.
Here is a mirror: git.qiuwen.net.cn/Mirror/anubis (handle with care)
Thanks, but I meant the site in the original post: cheapskatesguide.org/…/debian-netinstall-waf.html
It says
Your IP address has been blocked. This MAY be because you have made yourself look like a robot by using an unknown VPN or Tor exit node.
Blocking Tor is pretty bold. That network is too slow to use for anything but straight-up privacy.
How ironic
I guess you will have to resort to online translators or actual web proxies to read these pages 🙄
I just used a bot to read it: web.archive.org/web/…/debian-netinstall-waf.html
Here you go: web.archive.org/web/…/debian-netinstall-waf.html
They seem to block archive.today but not archive.org.
Oh interesting! Thank you.
And then you have to fill a block list with something like github.com/…/nginx-ultimate-bad-bot-blocker
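For anyone wondering what "filling a block list" amounts to, here is a rough sketch of user-agent matching. The patterns are invented placeholders, not entries from that project, and this is not its actual matching logic, just the general idea.

```python
import re

# Hypothetical block list entries; a real list would be far longer.
BAD_AGENT_PATTERNS = ["ExampleScraper", "BadBot"]

# Compile one case-insensitive regex that matches any listed substring.
_bad_agents = re.compile(
    "|".join(re.escape(p) for p in BAD_AGENT_PATTERNS),
    re.IGNORECASE,
)

def is_bad_agent(user_agent: str) -> bool:
    """Return True if the User-Agent header contains a listed pattern."""
    return bool(_bad_agents.search(user_agent or ""))

print(is_bad_agent("Mozilla/5.0 (compatible; examplescraper/1.0)"))
print(is_bad_agent("Mozilla/5.0 (X11; Linux x86_64) Firefox/130.0"))
```

A scraper that sends a browser-like User-Agent walks straight past this kind of check, which is presumably why people combine it with IP-based blocking.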
Why not just use a network level firewall like pfsense?
Opnsense > pfsense
The fact that I have to go through a fucking purchase page, even though pfsense is free (for now), is sketch as hell. First step in their inevitable enshittification.
Opnsense is funded by European non-profits, and it has a better UI.
Not to mention they tried running a disinformation campaign against OPNsense for a few years, which was resolved in court.
Also they implemented a WireGuard module that after a review upstream on FreeBSD was found to be completely hocus.
Should we just move everything to Tor and start the Internet over again?
Wait till you hear about betanet
Isn’t that the new Freenet?
No, that’s alphanet
Damn it, with their confusing naming!
surely you mean Pipernet
I was interested until I saw the crypto stakes to vote on changes baked in.
*i2p
look if i can’t browse from my fridge i don’t want to know about it
Not sure about Tor but your fridge definitely supports mesh networks whether it wants to or not.
Aka “how to harm marginalized folks” and prevent them from accessing your content too
403 error. Congrats, you built a broken system of false-positives.