Cloudflare Turnstile requiring fingerprintable WebGL

(hacktivis.me)

88 points | by HypnoticOcelot an hour ago ago

37 comments

Cloudflare is known to use fingerprinting to detect scrapers For example, they use JA3 fingerprints and match them against the UA to block stuff like cURL while allowing OkHttp (Android clients) - but this can be easily be spoofed with packages such as CycleTLS [1].

I don't want to defend them, because they gate away a good chunk of the internet with their "bot protection", but unless you do PoW (which is also ecologically a nightmare), probably fingerprinting is the way to go - completely destroying the privacy of everyone involved.

Cromite, a privacy conscious fork of Chromium for Android, has constantly issues with CloudFlare Turnstile [2] because they (Cloudflare) try to fingerprint it in multiple ways in order to pass the challenge. The only way to get it to work would be to join the CloudFlare Browser Developer program - which requires signing an NDA. Rightfully so, the project maintainer didn't want to do it.

If you want to see the extent of what CloudFlare does to fingerprint the browsers, just have a look in the issue [2] and see which flags need to be disabled in order to allow CloudFlare to pass the challenge.

I understand both sides, but at least CloudFlare could be flexible enough to fall back to PoW instead of just blocking people from sending forms or accessing websites...

[1]: https://github.com/Danny-Dasilva/CycleTLS

[2]: https://github.com/uazo/cromite/issues/2365

[-]

PearlRiver 8 minutes ago

This is why I have two separate browsers. If you want to do official stuff like paying for things you need to get through cloudflare.

avallach 15 minutes ago

Doesn't this mean we just need to make the webgl fingerprint resistance implementation smarter? Instead of explicitly rejecting webgl access or responding with dummy data, respond with data that is random within space of N common and reproducible patterns. E.g. emulate webgl implementation of some low spec but actually popular devices.

malka1986 35 minutes ago

Thanks, i did not know about `privacy.resistfingerprinting`

I'll make sure to fail all cloudflare turnshit in the future.

[-]

gruez 24 minutes ago

I have it enabled and turnstile works fine.

adamtaylor_13 22 minutes ago

So if you need to prevent bot abuse, but also don't want an ugly captcha every time someone goes to sign up, is there a better option?

[-]

ribtoks a minute ago

Use proof-of-work captchas, many are private by default. Look into Private Captcha or Cap captcha.

ImPostingOnHN a minute ago

The tool "Anubis" uses proof of work instead

gruez 13 minutes ago

This blog post is filled with false assumptions.

>Turns out it's because Cloudflare wants to have a fingerprint of your device via WebGL, the only reason for doing this would be tracking.

> So Cloudflare just banned all WebKitGTK browsers as I guess they put an exception for Safari.

This is false. I ran firefox with:

* hardware acceleration disabled (so software renderer, nothing to fingerprint)

* resistfingerprinting enabled, including letterboxing with default window size

* webgl disabled

* VPN enabled

* In a Windows VM

By all accounts this should be the most suspicious fingerprint ever, but turnstile happily lets me through. If they want to track people, they're doing a pretty bad job. My guess is that OP's browser is getting banned because his WebKitGTK has a weird fingerprint, not because of webgl or whatever.

> Such things are blocked in WebKit, and have been for years. Meaning it's tracking so awful that even Apple would block it, and as far as I can tell it's not the kind of privacy protection you can easily disable in it.

This is also false. Webgl fingerprinting works just fine on Safari. They might try to mitigate it by adding some noise, but that's not so different than what firefox does, and is certainly not "blocked".

Wowfunhappy 34 minutes ago

...in the age of AI, does anyone have an actual solution for keeping out bots while preserving the privacy of humans?

Obviously this is terrible, but I think there's a possibility it's the least terrible option? Another option is IP reputation, which I think is worse. Or scanning a code with a non-rooted phone, which I think is even worse than that!

[-]

thisislife2 7 minutes ago

The only solution is regulation. If all content created by anyone has a copyright, how does an implicit opt-in (which is what happens if you don't create robots.txt file for your website) for scraping make any sense? Moreover, even if you have a robots.txt, AI (or whatever) bots often don't respect it (or use workarounds - they outsource scraping of such "restricted" sites to third-parties to get the data). So clearly, the logic and the "honour system" has failed.

Cloudflare, Google Captcha, HCaptcha etc. are all shitty technical solutions because, as we are all discovering, it comes at the cost of our privacy (i.e. our personal data may monetise these services) and / or our computing resource and time. If current copyright laws aren't sufficient to prevent this, we have to acknowledge the system is broken. The answer could be enhancing it with some kind of Digital Millennium Copyright Act (DMCA) -like laws, but in favour of the creators against BigTech or rogue actors.

- Web-scraping and copyright law - https://www.neudata.co/blog/web-scraping-and-copyright-law

- Why DMCA Claims Against Web Scrapers Face Long Odds - https://capstonedc.com/insights/why-dmca-claims-against-web-...

fidotron 31 minutes ago

> ...in the age of AI, does anyone have an actual solution for keeping out bots while preserving the privacy of humans?

There isn't one, and pretending otherwise is nonsense because humans will always provide their credentials to something to act on their behalf.

In the limit you end up with Chinese phone farms.

[-]

tardedmeme 22 minutes ago

Right. Botnet operators love cloudflare because they make so much money renting out compromised machines to pass their tests.

Gander5739 9 minutes ago

You don't need a non-rooted phone to pass captcha checks, I have a rooted phone and can pass the captchas that ask you to scan a qr code. But I doubt phones without google services would manage.

cr125rider 23 minutes ago

And identifying a bot that is acting on my behalf. Claude go search this topic is basically the same as Googling something and clicking on the results. Human driven AI searching needs to be in a different box than AI scraping for training data.

Which sounds extremely difficult to differentiate

spacedoutman 15 minutes ago

Private invite only internets

csomar 20 minutes ago

They are not a problem unless you "believe" it is a problem. I estimate around 20-25K hits to my website from bots per day and I have all cloudflare protections disabled. Any decently optimized server should be able to easily handle that. (it's roughly 1 request every 3 seconds).

[-]

specialp 6 minutes ago

Yes and that is just the bot background radiation of the internet. I run a primary source of information site and these botnets are aggressive to a DDOS level. All to do some sort of scraping. Because they have sophisticated enough tactics to DDOS us if they wanted to. However I am not sure their objective as they have wasted enough of our resources to have scraped all our content 1000s of times over. That 25k traffic is a couple of minutes for us. And that adds up. 80-90pct of our traffic is this

thisislife2 4 minutes ago

True. But it still wastes your server resources, right? And it's sad that you have to accept that as part of the "cost" of hosting a site ...

malka1986 30 minutes ago

> keeping out bot

You can forget about it. It is not possible. Simple as that.

[-]

Wowfunhappy 27 minutes ago

Let's say I'm selling concert tickets. How do I prevent bots from buying up all the tickets and scalping them?

[-]

MyMemoryfails 16 minutes ago

I'd simply check filling speed, even with browser's autocomplete humans are slow due needing click submit.

Then when it's "processing", do them in bulk and prioritize slower users. There's huge opportunity do bot checks after checkout without affecting user experience.

Also on product launches you could add unique field which requires user to input, for example that way bots can't prepare for launches.

luckylion 25 minutes ago

Tie them to the buyer's identity, offer at-value buy-backs until X weeks before event, disallow resale.

doctorpangloss 20 minutes ago

web environment integrity

nulledy an hour ago

As turnstile users on several of our sites, I think we need to revisit that decision.

[-]

sammy2255 34 minutes ago

Out of curiosity, why did you have it on in the first place?

Fokamul 43 minutes ago

Please, anyone from EU (US is doomed rofl) create a petition to ban browser-fingerprinting in EU, across all existing browsers.

I'm not good at creating petitions but can happily sign it. Also with stop killing games and anti-chat control.

I can imagine this can get a traction, if it's explained in youtube video to "normal" people.

[-]

fidotron 29 minutes ago

A better solution would be to make webgl, webgpu and (especially) webrtc have some sort of prompt before they can be in any way used in that fashion, but this will absolutely destroy web ux Windows Vista style.

[-]

richwater 19 minutes ago

You mean the "Accept Cookies" banner that has become a complete joke? Pass

[-]

MyMemoryfails 5 minutes ago

I think he means browser permissions, for example when browsers want notify or record your mic theres a permission check something similar for webgl.

koolala 33 minutes ago

a. Accept All

b. Accept Only Necessary Fingerprinting

kykat an hour ago

What? Big tech company is evil? No way! I thought cloudflare were good guys...

[-]