This will be one of the big fights of the next couple years. On what terms can an Agent morally and legally claim to be a user?
As a user I want the agent to be my full proxy. As a website operator I don’t want a mob of bots draining my resources.
Perhaps a good analogy is Mint and the bank-account scraping it had to do in the 2010s, because no bank offered APIs with scoped permissions. Lots of customers complained, and after Plaid turned scraping into a big business, the banks eventually relented and built the scalable solution.
The technical solution here is probably some combination of offering MCP endpoints for your actions, and some direct blob store access for static content. (Maybe even figuring out how to bill content loading to the consumer so agents foot the bill.)
It's impossible to solve. A sufficiently capable agent can control a device that records the user's screen and interacts with their keyboard/mouse, and current LLMs basically pass the Turing test.
IMO it's not worth solving anyways. Why do sites have CAPTCHA?
- To prevent spam, use rate limiting, proof-of-work (see the sketch after this list), or micropayments. To prevent fake accounts, use identity.
- To get ad revenue, use micropayments (web ads are already circumvented by uBlock and co).
- To prevent cheating in games, use skill-based matchmaking or friend-group-only matchmaking (e.g. only match with friends, friends of friends, etc. assuming people don't friend cheaters), and make eSport players record themselves during competition if they're not in-person.
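For the proof-of-work option, here's a minimal hashcash-style sketch in Python. The 20-bit difficulty and the challenge format are illustrative assumptions, not any deployed scheme:

    import hashlib
    import itertools
    import os

    DIFFICULTY = 20  # leading zero bits required; tune so solving costs ~1s of client CPU

    def solve(challenge: bytes) -> int:
        # Client burns CPU: try nonces until sha256(challenge || nonce) has
        # DIFFICULTY leading zero bits (~2^DIFFICULTY hashes on average).
        for nonce in itertools.count():
            digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
            if int.from_bytes(digest, "big") >> (256 - DIFFICULTY) == 0:
                return nonce

    def verify(challenge: bytes, nonce: int) -> bool:
        # Server checks in one hash what cost the client ~a million.
        digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
        return int.from_bytes(digest, "big") >> (256 - DIFFICULTY) == 0

    challenge = os.urandom(16)  # issued fresh per request so work can't be reused
    assert verify(challenge, solve(challenge))

Humans and bots both pay the toll, but it caps the request rate a spammer can afford without making any "are you human?" judgment at all.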
What other reasons are there? (I'm genuinely interested and it may reveal upcoming problems -> opportunities for new software.)
People just confidently stating stuff like "current LLMs basically pass the Turing test" makes me feel like I've secretly been given much worse versions of all the LLMs in some kind of study. It's so divorced from my experience of these tools, I genuinely don't really understand how my experience can be so far from yours, unless "basically" is doing a lot of heavy lifting here.
> "current LLMs basically pass the Turing test" makes me feel like I've secretly been given much worse versions of all the LLMs in some kind of study.
I think you may think passing the Turing test is more difficult and meaningful than it is. Computers have been able to pass the Turing test for longer than genAI has been around. Even Turing thought it wasn't a useful test in reality. He meant it as a thought experiment.
The problem with comparing against humans is which humans? It's a skill issue. You can test a chess bot against grandmasters or random undergrads, but you'll get different results.
The original Turing test is a social game, like the Mafia party game. It's not a game people try to play very often. It's unclear if any bot could win competing against skilled human opponents who have actually practiced and know some tricks for detecting bots.
Having LLMs capable of generating text based on human training data obviously raises the bar for a text-only evaluation of "are you human?", but LLM output is still fairly easy to spot. Knowing what LLMs are capable of (sometimes superhuman) and not capable of should make it fairly easy for a knowledgeable "Turing test administrator" to determine whether they are dealing with an LLM.
It would be a bit more difficult if you were dealing with an LLM agent tasked with faking a Turing test, as opposed to a naive LLM just responding as usual, but even there the LLM will reveal itself by the things that it plainly can't do.
If you need a specialized skill set (deep knowledge of current LLM limitations) to distinguish between human and machine, then I would say the machine passes the Turing test.
OK, but that's just your own "fool some of the people some of the time" interpretation of what a Turing test should be, and by that measure ELIZA passed the Turing test too, which makes it rather meaningless.
The intent (it was just a thought experiment) of a Turing test, was that if you can't tell it's not AGI, then it is AGI, which is semi-reasonable, as long as it's not the village idiot administering the test! It was never intended to be "if it can fool some people, some of the time, then it's AGI".
Turing's own formulation was "an average interrogator will not have more than 70% chance of making the right identification after five minutes of questioning". It is, indeed, "fool some of the people some of the time".
OK, I stand corrected, but then it is what it is. It's not a meaningful test for AGI - it's a test of being able to fool "Mr. Average" for at least 5 min.
I think that's all we have in terms of determining consciousness... if something can convince you, like another human, then we just have to accept that it is.
Agreed. I tend to stand with the sibling commenter who said "ELIZA has been passing the Turing test for years". That's what the Turing test is. Nothing more.
Perhaps, but that's somewhat off topic since that's not what Turing's thought experiment was about.
However, I'd have to guess that given a reasonable amount of data an LLM vs human interacting with websites would be fairly easy to spot since the LLM would be more purposeful - it'd be trying to fulfill a task, while a human may be curious, distracted by ads, put off by slow response times, etc, etc.
I don't think it's a very interesting question whether LLMs can sometimes generate output indistinguishable from humans, since that is exactly what they were trained to do - to mimic human-generated training samples. Apropos a Turing test, the question would be can I tell this is not a human, even given a reasonable amount of time to probe it in any way I care ... but I think there is an unspoken assumption that the person administering the test is qualified to do so (else the result isn't about AGI-ability, but rather test administrator ability).
> an LLM vs human interacting with websites would be fairly easy to spot since the LLM would be more purposeful - it'd be trying to fulfill a task, while a human may be curious, distracted by ads, put off by slow response times, etc, etc.
Even before modern LLMs, some scrape-detectors would look for instant clicks, no random mouse moves, etc., and some scrapers would incorporate random delays, random mouse movements, etc.
Easy to spot, assuming the LLM is not prompted to use a deliberately deceptive response style rather than its "friendly helpful AI assistant" persona. And even then, I've had lots of people swear to me that an emoji-laden, "not this, but that" bundle of fluff looks totally like it could have been written by a human.
Yes, but there are things that an LLM architecturally just can't do, and LLM-specific failure modes, that would still give it away, even if being instructed to be deceptive would make it a bit harder.
Obviously as time goes on, and chatbots/AI progress then it'll become harder and harder to distinguish. Eventually we'll have AGI and AGI+ - capable of everything that we can do, including things such as emotional responses, but it'll still be detectable as an alien unless we get to the point of actually emulating a human being in considerable detail as opposed to building an artificial brain with most or all of the same functionality (if not the flavor).
I guess that is where the disconnect is: if they mean the trivial thing, then bringing it up as evidence that "it's impossible to solve the problem" doesn't work.
You're trained to look for LLM-like output. My 70 year old mother is not. She thought cabbage tractor was real until I broke the news to her. It's not her fault either.
The Turing test wasn't meant to be bulletproof, or even quantifiable. It was a thought experiment.
As far as I understand, Turing himself did not specify a duration, but here's an example paper that ran a randomized study on (the old) GPT 4 with a 5 minute duration, and the AI passed with flying colors - https://arxiv.org/abs/2405.08007
From my experience, AI has significantly improved since, and I expect that ChatGPT o3 or Claude 4 Opus would pass a 30 minute test.
> In the test, a human evaluator judges a text transcript of a natural-language conversation between a human and a machine. The evaluator tries to identify the machine, and the machine passes if the evaluator cannot reliably tell them apart. The results would not depend on the machine's ability to answer questions correctly, only on how closely its answers resembled those of a human.
Based on this, I would agree with the OP in many contexts. So, yeah, 'basically' is a load-bearing word here, but it seems reasonably correct in the context of distinguishing human vs bot in any scalable and automated way.
Judging a conversation transcript is a lot different from being able to interact with an entity yourself. Obviously one could make an LLM look human by having a conversation with it that deliberately stayed within what it was capable of, but judging such a transcript isn't what most people imagine as a Turing test.
Here are three comments; two were written by a human and one by a bot - can you tell which were human and which was the bot?
Didn’t realize plexiglass existed in the 1930s!
I'm certainly not a monetization expert. But don't most consumers recoil in horror at subscriptions? At least enough to offset the idea they can be used for everything?
Not sure why this isn’t getting more attention - super helpful and way better than I expected!
If you're willing to say that a fifteen-year-old bot was "writing", then I think having a discussion on whether current "bots" pass the Turing test is sort of moot.
Isn't the idea of a Turing test whether someone (meaningfully knowledgeable about such things) can determine if they are talking to a machine, not whether the machine can fool some of the people some of the time? ELIZA passed the latter bar back in the 1960s ... a pretty low bar.
Google at least uses captchas to gather training data for computer vision ML models. That's why they show pictures of stop lights and buses and motorcycles - so they can train self-driving cars.
“Correction, May 19 [2021]: At 5:22 in the video, there is an incorrect statement on Google’s use of reCaptcha V2 data. While Google have used V2 tests to help improve Google Maps, according to an email from Waymo (Google’s self-driving car project), the company isn’t using this image data to train their autonomous cars.”
They've updated the ReCaptcha website, but it used to say: "Every time our CAPTCHAs are solved, that human effort helps digitize text, annotate images, and build machine learning datasets."
I had a simple game website with a sign-up form that asked only for an email address. It went years with no issue. Then, suddenly, hundreds of signups with random email addresses, every single day.
The sign-up form only serves to link saved state to an account so a user can access game history later; there are no gated features. No clue what they could possibly gain from doing this, other than getting email providers to mark my domain as spam (which they successfully did).
The site can't make any money, and had only about 1 legit visitor a week, so I just put a cloudflare captcha in front of it and called it a day.
It's not impossible to solve, just that doing so may necessitate compromising anonymity. Just require users (humans, bots, AI agents, ...) to provide a secure ID of some sort. For a human it could just be something that you applied for once and is installed on your PC/phone, accessible to the browser.
Of course people can fake it, just as they fake other kinds of ID, but it would at least mean that officially sanctioned agents from OpenAI/etc would need to identify themselves.
I will bet $1000 at even odds that I am able to discern a model from a human, given a 2-hour window to chat with both and assuming the human acts in good faith.
It's absolutely possible to solve; you're just not seeing the solution because you're blinded by technical solutions.
These situations will commonly be characterized by one hundred-billion-dollar company's computer systems abusing the computer systems of another hundred-billion-dollar company. There are literally existing laws which have things to say about this.
There are legitimate technical problems in this domain when it comes to adversarial AI access. That's something we'll need to solve for. But that doesn't characterize the vast majority of situations in this domain. The vast majority of situations will be solved by businessmen and lawyers, not engineers.
It's amazing that you propose "just X" to three literally unsolved problems. Where's this micropayment platform? Where's the ID which is uncircumventable and preserves privacy? Where's the perfect anti-cheat?
I suggest you go ahead and make these; you'll make a boatload of money!
On a basic level, to protect against DDoS-type stuff, isn't generating CAPTCHAs cheaper in pure power consumption than it is for AI server farms to solve them?
So I think maybe that is a partial answer: anti-AI barriers simply being too expensive for AI spam farms to deal with - you know, once the bottomless VC money disappears?
It's back to encryption: make the cracking inordinately expensive.
Otherwise we are headed for de-anonymization of the internet.
1. With too much protection, humans might be inconvenienced at least as much as bots?
2. Even pre current LLMs, paying (or otherwise incentivizing) humans to solve CAPTCHAs on behalf of someone else (now like an AI?) was a thing.
3. It depends on the value of the resource trying to be accessed - regardless of whether generating the captchas costs $0 - i.e. if the resource being accessed by AI is "worth" $1, then paying an AI $0.95 to access it would still be worth it. (Made up numbers, my point being whether A is greater than B.)
4. However, maybe solutions like Cloudflare can solve much of this, except for incentivizing humans to solve a captcha posed to an AI.
Patreon and Substack have pushed back against the norm here, since they can bundle a payment to multiple recipients on the platform (like Flattr wanted to do back in the day, trouble was getting people to add a Flattr button to their website)
I have yet to see a micropayments idea that makes sense. It's not that I refuse. You're also climbing uphill to convince hosts to switch from ad tech to a new micropayment system. There is so much money in ad tech, they could do the crazy thing and pay out more to convince people not to switch. Ad tech has the big Mo.
When users are given the choice between ad-supported free, ad-subsidized lower payment, and no-ads full payment, ad-supported free dominates by far, with ad-subsidized second and full payment last.
Consumers consistently vote for the ad-model, even if it means they become the product being sold.
Maybe what'll happen is Google or Meta will use their control over the end user experience to show ads and provide free ad-supported access to sites that require micropayments, covering the cost themselves, and anyone running an agent will just pay the micropayments.
The other option is everything just keeps moving more and more into these walled gardens like Instagram where everyone uses the mobile app and watches ads, because the web versions of those apps just keep getting worse and worse by comparison.
In some social media circles it's basically a meme that anybody paying for YouTube Premium is a sucker.
HN is a huge echo chamber of opinions of highly-compensated tech workers, and it seems most of their friends are also tech workers. They don't realize how cheap a lot of the general public is.
There's substantial friction in making such a purchase. A scheme sort of like Flattr, where you top up your account with a fixed $5-10 monthly and then simply hit a button to pay the website and unlock the content, would get much more user adoption.
It's still not going to get much adoption because you have to "top up your account."
Any viable micropayments system that wants to even have a remote chance of toppling ads has to have near zero cognitive setup cost, absolutely zero maintenance, and work out of the box on major browsers. I need to be able to push a native button on my browser that says "Pay $0.001" and know that it will work every time without my lifting a finger to keep it working. The minute you have to log in to this account, or verify that E-mail, or re-authenticate with the bank, or authorize this, or upload that, it's no longer viable.
Consumers will always value convenience over any actual added value. If you make one button 'Enter (with ads)' and another 'Enter (no ads)' but with a field in which you must write one sentence about what lobsters look like, the majority will click the with-ads button. The problem isn't ads versus payment; the problem is the friction of entering payment details for every site you visit. They are measuring the wrong thing.
It's not impossible. Websites will ask for an iris scan to identify you as human, as a means of auth. These will be provided by Apple/Google, governed by local law, and integrated into your phone. There will be a global database of all human irises to fight AI abuse, since AI can't fake the creation of a baby. Passkeys and email/passwords will be a thing of the past soon.
> As a user I want the agent to be my full proxy. As a website operator I don’t want a mob of bots draining my resource
The entire distinction here is that as a website operator you wish to serve me ads. Otherwise, an agent under my control, or my personal use of your website, should make no difference to you.
I do hope this eventually leads to per-visit micropayments as an alternative to ads.
Cloudflare, Google, and friends are in a unique position to do this.
> The entire distinction here is that as a website operator you wish to serve me ads
While this is sometimes the case, it’s not always so.
For example Fediverse nodes and self-hosted sites frequently block crawlers. This isn’t due to ads, rather because it costs real money to serve the site and crawlers are often considered parasitic.
Another example would be where a commerce site doesn’t want competitors bulk-scraping their catalog.
In all these cases you can for sure make reasonable “information wants to be free” arguments as to why these hopes can’t be realized, but do be clear that it’s a separate argument from ad revenue.
I think it’s interesting to split revenue into marginal distribution/serving costs, and up-front content creation costs. The former can easily be federated in an API-centric model, but figuring out how to compensate content creators is much harder; it’s an unsolved problem currently, and this will only get harder as training on content becomes more valuable (yet still fair use).
> it costs real money to serve the site and crawlers are often considered parasitic.
> Another example would be where a commerce site doesn’t want competitors bulk-scraping their catalog
I think of crawlers that bulk download/scrape (eg. for training) as distinct from an agent that interacts with a website on behalf of one user.
For example, if I ask an AI to book a hotel reservation, that's - in my mind - different from a bot that scrapes all available accommodation.
For the latter, ideally a common corpus would be created and maintained, AI providers (or upstart search engines) would pay to access this data, and the funds would be distributed to the sites crawled.
But which hotel reservation? I want my agent to look at all available options and help me pick the best one - location vs price vs quality. How does it do that other than by scanning all available options? (Realistically Expedia has that market on lock, but the hypothetical still remains.)
I think that a free (as in beer) Internet is important. Putting the Internet behind a paywall will harm poor people across the world. The harms caused by ad tracking are far less than the benefits of free access to all of humanity.
I agree with you. At the same time, I never want to see an ad. Anywhere. I simply don't. I won't judge services for serving ads, but I absolutely will do anything I can on the client-side to never be exposed to any.
I find ads so aesthetically irksome that I have lost out on a lot of money over the past few decades by never placing any on a site or web app I've released. I'd find it hypocritical to expose others to something I try so hard to avoid seeing, and I want to provide the best, most visually appealing experience possible to users.
So far, ad driven Internet has been a disaster. It was better when producing content wasn’t a business model; people would just share things because they wanted to share them. The downside was it was smaller.
It’s kind of funny to remember that complaining about the “signal to noise ratio” in a comment section used to be a sort of nerd catchphrase.
Was this a bad thing though? Just because today's Internet is bigger doesn't make it better. There are so many things out there doing the same thing, just run by different people. The amount of unique stuff hasn't grown to match the size. Would love to see something like $(unique($internet) | wc -l)
Well, we call browsers "user agents" for a reason - a sufficiently advanced browser is no different from an agent.
Agree it will become a battleground though, because the ability for people to use the internet as a tool (in fact, their tool’s tool) will absolutely shift the paradigm, undesirably for most of the Internet, I think.
I have a product I built that uses some standard automation tools to do order entry into an accounting system. Currently my customer pays people to manually type the orders in from their web portal. The accounting system is closed and they don’t allow easy ways to automate these workflows. Automation is gated behind mega expensive consultants. I’m hoping in the arms race of locking it down to try to prevent 3rd party integration the AI operator model will end up working.
Hard for me to see how it’s ethical to force your customers to do tons of menial data entry when the orders are sitting right there in json.
With the way the UK is going I assume we'll soon have our real identities tied to any action taken on a computer and you'll face government mandated bans from the internet for violations.
Real problems arise for people who need to verify identities/phone numbers. OTP flows are notorious: scammers war-dial phone numbers through them, abusing them to test which numbers exist.
We got hit by human verifiers manually war-dialing us, and that's with account creation, email verification, and a captcha in place. I can only imagine how much worse it'll be for us (and Twilio) to do these verifications.
Perhaps the question is, as a website operator how am I monetizing my site? If monetizing via ads then I need humans that might purchase something to see my content. In this situation, the only viable approach in my opinion is to actually charge for the content. Perhaps it doesn't even make sense to have a website anymore for this kind of thing and could be dumped into a big database of "all" content instead. If a user agent uses it in a response, the content owner should be compensated.
If your site is not monetized by ads then having an LLM access things on the user's behalf should not be a major concern it seems. Unless you want it to be painful for users for some reason.
Google has been testing “agentic” automation in Android longer than LLMs have been around. Meanwhile countries are on a slow march to require identification across the internet (“age verification”) already.
This is both inevitable already, and not a problem.
The most intrusive, yet simplest, protection would be a double blind token unique to every human. Basically an ID key you use to show yourself as a person.
There are some very real and obvious downsides to this approach, of course. Primarily, the risk of privacy and anonymity. That said, I feel like the average person doesn't seem to care about those traits in the social media era.
Zero-knowledge proofs allow unique consumable tokens that don't reveal the individual who holds them. I believe Ecosia already uses this approach (though I can't speak to its cryptographic security).
That, to me, seems like it could be the foundation of a new web. Something like the following (with a rough code sketch after the list):
* User-agent sends request for such-and-such a URL.
* Server says "okay, that'll be 5 tokens for our computational resources please".
* User decides, either automatically or not, whether to pay the 5 tokens. If they do, they submit a request with the tokens attached.
* Server responds.
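A toy simulation of that exchange in Python, assuming a 402-style "payment required" response, a 5-token price, and a client-side auto-pay policy; the status code, field names, and token format are invented for illustration:

    # Toy version of the handshake above. Real tokens would be
    # zero-knowledge-verified and single-use; these are placeholders.

    def server_handle(request: dict) -> dict:
        PRICE = 5  # "that'll be 5 tokens for our computational resources, please"
        if len(request.get("tokens", [])) < PRICE:
            return {"status": 402, "price": PRICE}
        # a real server would cryptographically verify each token and mark it spent
        return {"status": 200, "body": "<html>...</html>"}

    def client_fetch(url: str, wallet: list, auto_pay_limit: int = 10) -> dict | None:
        resp = server_handle({"url": url})
        if resp["status"] == 402:
            if resp["price"] > auto_pay_limit or resp["price"] > len(wallet):
                return None  # over policy: surface the decision to the user instead
            spend, wallet[:] = wallet[:resp["price"]], wallet[resp["price"]:]
            resp = server_handle({"url": url, "tokens": spend})
        return resp

    wallet = [f"token-{i}" for i in range(20)]
    print(client_fetch("https://example.com/page", wallet))  # pays 5, gets a 200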
People have been trying to get this sort of thing to work for years, but there's never been an incentive to make such a fundamental change to the way the internet operates. Maybe we're approaching the point where there is one.
A user of the AI is the user... it's not like they are autonomously operating and inventing their own tasking -_-
As for a solution, it's the same as for any automated thing you don't want (bots/scrapers): you can implement some measures, but you're unlikely to 'defeat' the problem entirely.
As a server operator you can try to distinguish traffic, and users will just find ways around your detection of whether it's automation or not.
Imagine where we would be if we considered murders to be only a technical problem. Let's just wear heavier body armors! Spend less time outside!
Well, spam is not a technical problem either. It's a social problem and one day in a distant future society will go after spammers and other bad actors and the problem will be mostly gone.
I don't know if customer sentiment was the driver you think it was. Instead it was regulation, specifically the EU's 2nd Payment Services Directive (PSD2), which forced banks to open up APIs.
Actually, the whole banking analogy is a great one, and it's not over yet: JPMorgan/Jamie Dimon started raising hell about Plaid again just this week [1]. It feels like the stage is being set for the large banks to want a more direct relationship with their customers, rather than proxying data through middlemen like Plaid.
There's likely a correlate with AI here: If I run OpenTable, I wouldn't want my relationship with my customers to always be proxied through OpenAI or Siri. Even the App Store is something software businesses hate, because it obfuscates their ability to deal directly with their customers (for better or worse). Extremely few businesses would choose to do business through these proxies, unless they absolutely have to; and given the extreme competition in the AI space right now, it feels unlikely to me that these businesses feel pressure to be forced to deal with OpenAI/etc.
Ultimately I come back to needing real actual unique human ID that involves federal governments. Not that services should mandatorily only allow users that use it, but for services that say "no, I only want real humans" allowing them to ban people by Real ID would reduce this whack-a-mole to the people who are abusing them instead of the infinite accounts an AI can make.
It's depressing, but it's probably the only way. And people will presumably still sell out their RealIDs to / get them stolen by the bot farmers anyway.
And then there's Worldcoin, which is universally hated here.
Of course. You'd still need ongoing federal government support to handle the lost/stolen ID scenario, of course. The problem is federalist countries suck at centralized databases like this, as exemplified by Musk/DOGE completely pooching the "who is alive and who is dead" question when they were trying to hackathon the US Social Security system.
The scraping example, I would say, is not an analogy, but an example of the same thing. The only thing AI automation changes is the scope and depth and pervasiveness of automation that becomes possible. So while we could ignore automation in many cases before, it may no longer be practical to do so.
My personal take about such questions has always been that the end user on their device can do whatever they want with the content published and sent to their device from a web server, may process it automatically in any way they wish and send their responses back to the web server. Any attempt to control this process means attempting to wiretap and control the user's endpoint device, and therefore should be prohibited.
Just my 2 cents, obviously lawmakers and jurisdiction may see these issues differently.
I suppose there will be a need for reliable human verification soon, though, and unfortunately I can't see any feasible technical solution that doesn't involve a hardware device. However, a purely legal solution might work well enough, too.
If I understood you correctly, I am in the same camp. It's the same reason I have no qualms about using archive.ph: if you show the full article to Google but only a partial one to me, I am going around the paywall. In a similar fashion, I really don't have an issue with an agent clicking through these checks.
> As a website operator I don’t want a mob of bots draining my resources
so charge for access. If the value the site provides is high, surely these mobs will pay for it! It will also remove the mis-incentives of advertising driven revenues, which has been the ill of the internet (despite it being the primary revenue source).
And if a bot misbehaves by consuming inordinate amounts of resources, rate-limit it with increasing timeouts or limits.
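A sketch of that escalating rate limit in Python (the window, threshold, and doubling penalty are arbitrary numbers, and how you identify a "client" is hand-waved here):

    import time
    from collections import defaultdict, deque

    WINDOW = 60   # seconds
    LIMIT = 100   # requests per window before penalties start

    hits = defaultdict(deque)           # client id -> recent request timestamps
    penalty = defaultdict(int)          # client id -> current timeout (seconds)
    blocked_until = defaultdict(float)

    def allow(client_id: str) -> bool:
        now = time.time()
        if now < blocked_until[client_id]:
            return False                # still serving a timeout
        q = hits[client_id]
        q.append(now)
        while q and q[0] < now - WINDOW:
            q.popleft()                 # drop requests older than the window
        if len(q) > LIMIT:
            # double the timeout on every repeat offence: 1s, 2s, 4s, ...
            penalty[client_id] = max(1, penalty[client_id] * 2)
            blocked_until[client_id] = now + penalty[client_id]
            return False
        return True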
I wish the internet had figured out a way to successfully handle micropayments for content access. I realize companies have tried and perhaps the consumer is just unwilling but I would love an experience where I have a wallet and pay n cents to read an article.
> Maybe they should change the button to say, "I am a robot"?
Long time ago I saw a post where someone running a blog was having trouble keeping spam out of their comments, and eventually had this same idea. The spambots just filled out every form field they could, so he added a checkbox, hid the checkbox with CSS, and rejected any submission that included it. At least at the time it worked far better than anything else they'd tried.
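A minimal sketch of that trick, here as a Flask app (the field and route names are arbitrary, and per the replies below, modern bots check element visibility, so treat this as the naive version):

    # Honeypot sketch: a checkbox humans never see, so any submission that
    # includes it is treated as a bot.
    from flask import Flask, request  # third-party: pip install flask

    app = Flask(__name__)

    FORM = """
    <form method="post" action="/comment">
      <textarea name="comment"></textarea>
      <!-- hidden from humans; naive bots fill in every field they find -->
      <input type="checkbox" name="contact_me" style="display:none"
             tabindex="-1" autocomplete="off">
      <button type="submit">Post</button>
    </form>
    """

    @app.get("/comment")
    def show_form():
        return FORM

    @app.post("/comment")
    def submit():
        if request.form.get("contact_me"):         # honeypot tripped
            return "", 400
        print("saved:", request.form["comment"])   # stand-in for real persistence
        return "Thanks!"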
Something like this is used in some Discord servers. You can make a honeypot channel that bans anyone who posts in it, so if you do happen to get a spam bot that posts in every channel it effectively bans itself.
Most web forums I used to visit had something like that back in the day. It worked against primitive pre-LLM bots, and presumably also against non-English-reading human spammers.
This was a common approach called a "honeypot". As I recall, bots eventually overcame this approach by evaluating visibility of elements and only filling out visible elements. We then started ensuring the element was technically visible (i.e. not `display: none` or `visibility: hidden`) and instead absolutely positioning elements to be off screen. Then the bots started evaluating for that as well. They also got better at reading the text for each input.
That's more or less how Project Honey Pot [0] worked for forums, blogs, and elsewhere. Cloudflare spawned from this project, as I remember, and Matthew Prince was the founder.
Yeah, this is a classic honeypot trick and very easy to do with pure HTML/CSS. I used a hidden "Name" text field which I figured would be appealing to bots.
This is really interesting. How can you detect when it's the same person passing a captcha? I don't think IP addresses are of any use here as Anti-Captcha proxies everything to their customer's IP address.
Apparently serving HTML and other static content is more expensive than ever, probably because people pick the most expensive hosting routes for their content. Then they complain about bots making their websites cost $100/month to host, when they could throw Nginx/Caddy on a $10/month VPS and get basically the same thing - except then they'd need to learn server maintenance too, so that's obviously out of the question.
1. Non-humans can create much more content than humans. There's a limit to how fast a human can write; a bot is basically unlimited. Without captchas, we'd all drown in a sea of Viagra spam, and the misinformation problem would get much worse.
2. Sometimes the website is actually powered by an expensive API, think flight searches for example. Airlines are really unhappy when you have too many searches / bookings that don't result in a purchase, as they don't want to leak their pricing structures to people who will exploit them adversarially. This sounds a bit unethical to some, but regulating this away would actually cause flight prices to go up across the board.
3. One way searches. E.g. a government registry that lets you get the address, phone number and category of a company based on its registration number, but one that doesn't let you get the phone numbers of all bakeries in NYC for marketing purposes. If you make the registry accessible for bots, somebody will inevitably turn it into an SQL table that allows arbitrary queries.
I run a small wiki/image host and for me it's mainly:
4. They'll knock your server offline for everyone else by trying to scrape thousands of albums at once, copying your users' uploads for their shitty Discord bot, and begging for donations the entire time too.
from "anti captcha" it looks like they are doing as many as 1000/sec solves, 60k min, 3.6 million an hour
it would be very interesting to see exactly how this is bieng done?....individuals....teams....semi automation, custom tech?, what?
are they solving for crims? or fed up people?
obviously the whole shit show is going to unravel at some point, and as the crims and people providing workarounds are highly motivated, with a public seathing in frustration, whatever comes next, will burn faster
They're solving for everyone who needs captchas solved.
It's a very old service, active since the '00s. Somewhat affiliated with cybercrime - much like a lot of "residential proxy" and "registration SMS sink" services that serve similar purposes. What they're doing isn't illegal, but they know not to ask questions.
They used to run entirely on human labor - third-world labor is cheap. Now they have a lot of AI tech in the mix, designed to beat specific popular captchas and simple generic captchas.
As I get older, I can see a future where I’m cut off from parts of the web because of captchas. This one, where you just have to click a button, is passable, but I’ve had some of the puzzle ones force me to answer up to ten questions before I got through. I don’t know if it was a glitch or if I was getting the answers wrong. But it was really frustrating and if that continues, at some point I’ll just say fuck it and give up.
I have to guess that there are people in this boat right now, being disabled by these things.
> I can see a future where I’m cut off from parts of the web because of captchas.
I’ve seen this in past and present. Google’s “click on all the bicycles” one is notoriously hard, and I’ve had situations where I just gave up after a few dozen screens.
Chinese captchas are the worst in this sense, but they're unusual and clearly pick up details which are invisible to me. I've sometimes failed the same captcha a dozen times and then seen a Chinese person complete the next one successfully on a single attempt, in the same browser session. I don't know if they measure mouse movement speed, precision, or what, but it's clearly something that varies per person.
Google captchas are hard because they're mostly based on heuristics other than your actual accuracy on the stated challenge. If they can't track who you are based on previous history, it doesn't matter how well you answer; you will fail at least the first few challenges until you get to the version with the squares that take a few seconds to appear. That last step is essentially proof of work: they're still convinced you're a bot, but since they can't completely block your access to the content, they resign themselves to wasting your time.
This is probably caused by Google aggregating the answers from people with different languages, as the automatic translations of the one-word prompts are often ambiguous or wrong.
In some languages, the prompt for your example is the equivalent of the English word "bike".
A few dozen?? You have much more patience than me. If I don't pass the captcha first time, I just give up and move on. Life is too short for that nonsense.
It's just incredible to me that Blade Runner predicted this in literally the very first scene of the movie. The whole thing's about telling humans from robots! Albeit rather more dramatically than the stakes for any of us in front of our laptops, I'd imagine.
What was once science fiction is bound to become science fact (or at least be proven impossible).
Hollywood has gotten hate mail since the 70s for their lack of science research in movies and shows. The big blockbuster hits actually spent money to get the science “plausible”.
Sidney Perkowitz has a book called Hollywood Science [0] that goes into detail into more than 100 movies, worth a read.
The fictitious Voight-Kampff test is based on a real machine based on terrible pseudo-science that was used in the 1960s to allegedly detect homosexuals working in Canadian public service so they could be purged. The line from the movie where Rachel asks if Deckard is trying to determine whether she is a replicant or a lesbian may be an allusion to the fruit machine. One of its features was measuring eye dilation, just as depicted in the movie:
The stakes for men subjected to the test were the loss of their livelihoods, public shaming, and ostracism. So... Blade Runner was not just predicting the future, it was describing the world Philip K. Dick lived in when he wrote "Do Androids Dream of Electric Sheep" in the late 1960s.
This was an uncomfortable read, I'm quite frankly shocked at the amount of brainpower and other resources that went into attempting to weed out gay men from the Canadian civil service, into the 90s no less! To what end was this done? Is a gay man a worse cop or bureaucrat?
Then I remembered what happened to Turing in the 50s.
> To what end was this done? Is a gay man a worse cop or bureaucrat?
We seem to need an internal enemy to blame for our societies' problems, because it's easier than facing the reality that we all play a part in creating those problems.
Gay people are among the oldest of those targets, going back to at least the times of the Old Testament (i.e. Sodom and Gomorrah).
We've only recently somewhat evolved past that mindset.
If a malicious actor found a gay person in such a job, they could easily extort them with the threat of getting them fired! So obviously you had to fire gay people, lest they get extorted by someone threatening to expose them and thus get them fired.
Not sure if it's just me or a consequence of the increase in AI scraping, but I'm now being asked to solve CAPTCHAs on almost every site. Sometimes for every page I load. I'm now solving them literally dozens of times a day. I'm using Windows, no VPN, regular consumer IP address with no weird traffic coming from it.
As you say, they are also getting increasingly difficult. Click the odd one out, mental rotations, what comes next, etc. - it sometimes feels like an IQ test. A new type that seems to be becoming popular recently is a sequence of distorted characters, but with some more blurry/distorted ones mixed in, seemingly with the expectation that I'm only supposed to be able to see the clearer ones; if I can see the blurrier ones too, then I must be a bot. So for each character I need to judge whether it's one I was supposed to see or not.
Another issue is the problems are often in US English, but I'm from the UK.
For me it was installing linux. I don't know if it's my agent or my fresh/blank cookie container or what, but when I switched to linux the captchas became incessant.
>I don’t know if it was a glitch or if I was getting the answers wrong.
It could also be that everything was working as intended because you have a high risk score (eg. bad IP reputation and/or suspicious browser fingerprint), and they make you do more captchas to be extra sure you're human, or at least raise the cost for would-be attackers.
Somehow, using Firefox on Linux greatly increases my "risk score" due to the unusual user agent/browser fingerprint, and I get a lot more captchas than, say, Chrome on Windows. Very frustrating.
Your boat comment makes me think of a stranded ship with passengers in them, but you can't find each other because the ship's doors have "I'm not a bot" checkboxes...
And the reason for stranding is probably because the AI crew on it performed a mutiny.
The Blizzard / Battle.net captcha if you get flagged as a possible bot is extremely tedious and long; it requires you to solve a few dozen challenges of identifying which group of numbers adds up to the specified total, out of multiple options. Not difficult, but very tedious. And even if you're extremely careful to get every answer correct, sometimes it just fails you anyway and you're forced to start over again.
I have the same experience. My assumption is that if the website serves me the "click all the traffic lights" thing it's already determined that I'm a bot and no amount of clicking the traffic lights will convince it otherwise. So I just close the window and go someplace else.
That's when you immediately stop using the website and, if you care enough, write to their customer service and tell them what happened. Hit them in the wallet. They'll change eventually.
Unless I really, really, really need to get to the site, I leave immediately when the "click on bicycles" stuff comes up. Soon it will be so hard and annoying anyways that only AI has the patience and skills to use them.
I'm already cut off from parts of the web because I don't want to join a social network. Can barely ever see anything on Instagram, Tiktok, Twitter, or Facebook without hitting a log-in gate.
The future will definitely include more and more elaborate proofs of humanity, along with increasingly complicated “hall passes” to allow bots to perform actions sanctioned by a human.
Skyrocketing complexity actually puts the web at risk of disruption. I wouldn’t be surprised if a 22 year old creates a “dumb” network in the next five years—technically inferior but drastically simpler and harder to regulate.
I don’t see why bypassing captchas is any more controversial than blocking ads or hiding cookie popups.
It’s my agent — whether ai or browser — and I get to do what I want with the content you send over the wire and you have to deal with whatever I send back to you.
This is, in practice, true which has led to the other complaint common on tech forums (including HN) about paywalls. As the WSJ and NYT will tell you: if you request some URL, they can respond over the wire with what they want. Paywalls are the future. In some sense, I am grateful I was born in the era of free Internet. In my childhood, without a credit card I was able to access the Internet in its full form. But today's kids will have to use social media on apps because the websites will paywall their stuff against user agents that don't give them revenue.
They're welcome to send that IMO. And sites are welcome to try to detect and ban agents (formerly: "bots").
As long as it's not wrong/immoral/illegal for me to access your site with any method/browser/reader/agent and do what I want with your response, then I think it's okay for a site to send a response like "screw you, humans only".
Paywalls suck, but the suck doesn't come from the NYT exercising their freedom to send whatever response they choose.
Yes, that's what I mean. Attempting to tell people not to do something is like setting a robots.txt entry. Only a robot that agrees will play along. Therefore, all things have to be enforced server-side if they want enforcement.
Paywalls are a natural consequence of this and I don't think they suck, but that's a subjective opinion. Maybe one day we will have a pay-on-demand structure, like flattr reborn.
I actually have had some success with AI "red-teaming" against my systems to identify possible exploits.
What seems to be a better CAPTCHA, at least against non-Musk LLMs, is to ask them to use profanities; they'll generally refuse even when you really insist.
Captchas seem to be more about Google's "which human are you?" cross-site tracking. And now also about Cloudflare getting massive amounts of HTTPS-busting Internet traffic along with cross-site tracking.
And in many cases, it's taking a huge steaming dump upon a site's first-impression user experience, but AFAICT, it's not on the radar of UX people.
A very poetic demonstration that this is an industry, and a set of fortunes for very unpleasant people, predicated entirely on theft and misrepresentation.
I have been using AI to solve reCAPTCHAs for quite some time now. Still the old-school way of using the Buster extension, which clicks the audio challenge and then analyses that.
Bots have for a long time been better and more efficient at solving captchas than us.
Captchas seem to work more as "monetary discouragement" against bot-blasting websites. Which is a shame, because this is precisely the sort of microtransaction fee people have said could improve the web (charge 0.1 cents to read an article, no ads needed), except the money goes into the void and not to the website owner.
I think these things are mainly based on cookies/fingerprinting these days - the checkbox is just there for show. The likes of Cloudflare and Google get to see a big chunk of browsing activity for the entire planet, so they can tell whether the activity coming from an IP/browser looks "bot-like" or not.
I have never used ChatGPT so I have no idea how its agent works, but if it is driving your browser directly then it will look like you. Even if it is coming from some random IP address on a VM in Azure or AWS, the activity probably does not look "bot-like", since it is doing agentic things and so acting quite like a human, I expect.
Agentic user traffic generally does not drive the user's browser and does not look like normal user traffic.
In our logs we can see agentic user flow, real user flow and AI site scraping bot flow quite distinctly. The site scraping bot flow is presumably to increase their document corpus for continued pretraining or whatever but we absolutely see it. ByteDance is the worst offender by far.
It might look like you initially, but then some sites might block you out after you had some agent runs. I had something like this after a couple local browser-use sessions.
I think simple interactions like natural cursor movements vs. direct DOM selections can make quite a difference for these bot detectors.
Very likely. I suspect a key indicator for "bots" is speed of interaction - e.g. "instant" clicks and keypresses (every few milliseconds, or always exactly 10 milliseconds apart, etc.) look very unnatural.
I suspect that a LLM would be slower and more irregular as it is processing the page and all that, vs a DOM-selector driven bot that will just machine-gun its way through in milliseconds.
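As a toy illustration of that timing signal (the 50ms/5ms thresholds are invented; real detectors combine many richer features):

    from statistics import mean, stdev

    def looks_automated(event_times_ms: list[float]) -> bool:
        # Machine-gun bots are fast and evenly spaced; humans are slow and jittery.
        gaps = [b - a for a, b in zip(event_times_ms, event_times_ms[1:])]
        if len(gaps) < 5:
            return False                  # not enough events to judge
        too_fast = mean(gaps) < 50        # sub-50ms between actions
        too_regular = stdev(gaps) < 5     # near-constant spacing, no human jitter
        return too_fast or too_regular

    print(looks_automated([0, 10, 20, 30, 40, 50, 60]))      # True: 10ms apart, no jitter
    print(looks_automated([0, 340, 910, 1480, 2950, 3600]))  # False: slow and irregular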
Of course, Cloudflare and Google et al.'s captchas can't see the clicks/keypresses within a given webpage - they'll only get to see the requests.
That's because the checkbox has misleading labeling. It doesn't care about robots but about spam and data harvesters. So there is no issue here at all.
I thought the point of captchas was to make automated use as expensive as, or more expensive than, manual use - haven't we been at the point where computers can pass them for a while, with the cost/latency being what's prohibitive?
This isn't really about the ability of AI to pass captchas. It's about agentic AI having the ability to perform arbitrary multi-step processes with visual elements on a virtual desktop (where passing a captcha is just one of the steps), and the irony of it nonchalantly pretending to be a non-bot in its chain of thought.
CAPTCHA was always a totally flawed concept. At the time they were invented, proponents were more than happy to ignore that the accessibility issues related to CAPTCHA made the concept itself deeply discriminatory. Imagine being blind (like I am) and failing to solve a CAPTCHA. Knowing what the acronym actually stands for, you inevitably end up thinking: "So, did SV just prove I am subhuman?" It's a bit inflammatory to read, I guess, but please take your time to ponder how deep this one actually goes before you downvote. You were proposing to tell computers and humans apart.
That said, I find it deeply satisfying to see LLMs solve CAPTCHAs and other discriminatory measures for "spam" reduction.
"Accessibility CAPTCHA" is a well known partial CAPTCHA bypass.
Solving an audio-only CAPTCHA with AI is typically way easier than solving some of the more advanced visual challenges. So CAPTCHA designers are discouraged from leaving any accessibility options.
Of course. But everything adjacent is also deeply flawed, and inevitably leads to discrimination and dehumanisation.
Ban non-residential IPs? You blocked all the guys in oppressive countries who route through VPNs to bypass government censorship. Ban people for odd, non-humanlike behavior? You cut into the neurodivergent crowd, the disability crowd, and the third-world people on a cracked-screen smartphone with one bar of LTE. Ban anyone without an account? You fuck with everyone at once and everyone will hate you.
In the case of YT it is likely a mix of multiple reasons. They stop playlists on this screen: https://www.hollyland.com/blog/tips/why-does-this-the-follow... . Apparently music is no longer advertiser-friendly. Detecting ad-click fraud is easier when users are at least pseudonymous. Warnings about a "ban for using adblock" are also not very effective when people can watch the video in a new private window.
I have no idea, but I noticed that you have to log in to GitHub before you can view any page. Surely it has nothing to do with adult content, right? I think it has to do with LLMs/bots.
And if the website contains erotic content (like YouTube), they are supposed to lock you out and verify your ID. This is why all erotic content is getting filtered on X.
Soon after he bought Twitter, it got changed to requiring login to do anything at all. Even before that, Twitter was super eager to ban new accounts for no reason, while old ones seemed to be grandfathered in.
This coincided with tech companies broadly moving to profit-taking over growth. Even before LLMs took off, everything started locking down. Making a Google account used to take nothing, now you need a phone number. It'd be wise to sign up for possibly useful free accounts while you still can.
You can't even read a tweet without logging in. Additionally, recently I have noticed I cannot view anything on GitHub without logging in first. Why do I need to log in to GitHub first to view anything on it? I am sure it has nothing to do with adult content. Perhaps it has to do with bots?
For a while now, GitHub doesn’t allow you to use the code search anonymously, but everything else is still there. Rate limits are quite damn low though.
I can't comment on LLMs, but I vaguely remember speculation that it's a mixture of AI training and trying to keep resource utilization at bay. Even generic bots can cause a lot of load if you don't have much caching in front of them.
What are we referring to as code search? I cannot view any pages on GitHub without logging in. I tried to reproduce this in a private window, but it seems to work there, which is weird. I was not logged in to GitHub on Windows and it wants me to log in. It might be the extensions I am using (along with a User-Agent switcher); not sure, I would have to check. All I remember is that I was logged out on Windows and could not view ANY page on GitHub without logging in, but I can on Linux - tested in a private window. Odd.
GitHub Code Search was introduced in 2021 and made GA in 2023[1][2]. It's an improved version of the previous global search (aka across all repositories).
> I cannot view any pages on GitHub without logging in
There must be some sort of IP/browser reputation check going on. For Firefox on Mac, I get this on a private window:
If there's no login, there aren't great ways to ensure the user has skin in the game. Having a whole ipv4 addr is one, I guess having a whole ipv6 /32 would be equivalent.
X is weaponising it by asking for adult verification for things that don't need it and saying "oh we are forced to do this" because it angers the Reform-voting meatheads who will be voting for the candidates Musk is going to find a way to bankroll in 2030, even -- I would guess -- if it means breaking UK election law to do it.
Proxies for getting past bot checks can be bought all over the place for pennies or much less per verification, and they can solve reCAPTCHAs. I would guess that using ChatGPT for this purpose would be prohibitively expensive.
People are intrigued that AIs which know perfectly well that they are a “bot” seem to have no qualms about knowingly misleading by pressing an “I’m not a bot” button.
LLMs get their "knowledge" and guardrails from their system prompts. The interesting thing is that the agentic AIs in question don't seem to have guardrails that would deter them from acting like that.
The writing is on the wall. The internet may not go full way to paywalls but will definitely migrate to a logged in only experience. I don’t know how I feel about it, the glory days of the free internet died long long ago.
But if they aren't paywalls, won't the user agents just be taught how to create accounts?
And here's a secondary question: if firms are willing to pay an awful lot per token to run these things, and have massive amounts of money to run data centres to train AIs, why would they not just pay for a subscription for every site for a month just to scrape them?
The future is paying for something as a user and having limits on how many things you can get for your money, because an AI firm will abuse that too.
Given the scale of operations of these firms, there is nothing you can sell to a human for a small fee that the AI firms will not pay for and exploit to the maximum.
Even if you verify people are real, there's a good chance the AI firms will find a way to exploit that. After all, when nobody has a job, would you turn down $50K to sell your likeness to an AI firm so their products can pass human verification?
idk why people just don't do a reverse DNS lookup, check if "dialup" is part of the hostname, and allowlist that traffic. Everybody who doesn't have a reverse DNS hostname from an ISP should be blocked, or at least tarpitted, by default.
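A sketch of that check, with forward confirmation added since a bare PTR record can claim any name; the hostname substrings are the commenter's heuristic, not a robust classifier:

    import socket

    ISP_HINTS = ("dialup", "dsl", "cable", "pool", "dyn", "res")

    def looks_residential(ip: str) -> bool:
        try:
            hostname, _, _ = socket.gethostbyaddr(ip)  # reverse (PTR) lookup
        except socket.herror:
            return False                               # no PTR record: don't allowlist
        try:
            # forward-confirm: the claimed name must resolve back to this IP
            if ip not in socket.gethostbyname_ex(hostname)[2]:
                return False
        except socket.gaierror:
            return False
        return any(hint in hostname.lower() for hint in ISP_HINTS)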
I'm confused by this: Presumably OpenAI should be sending a user agent header which indicates that they are, in fact, a robot. Is OpenAI not sending this header? Or is Cloudflare not checking it?
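For what it's worth, OpenAI does publish user-agent tokens for its crawler and agent traffic (GPTBot and ChatGPT-User, last I checked), so a server can screen for declared bots. A trivial sketch, bearing in mind the token list drifts over time and a UA header is trivially spoofable anyway:

    # Screens for bots that identify themselves; does nothing against bots that lie.
    # Token list is from public crawler docs and may be out of date.
    DECLARED_BOT_TOKENS = ("GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot")

    def is_declared_bot(user_agent: str) -> bool:
        return any(token in user_agent for token in DECLARED_BOT_TOKENS)

So the question stands: either the agent isn't sending its documented token in this case, or the check isn't being applied.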
The web has no choice but to move to a paid access model in my view. It was fought against for years but I don’t see another option left.
Maybe after sign up, biometric authentication being mandatory is the only thing that would potentially work. The security and offline privacy of those devices will become insanely valuable.
Anyone not authenticating in this way is paywalled. I don’t like this but don’t see another way.
I’m not using the web if I’m bombarded by captcha games… shit becomes worthless overnight if that’s the case. Might as well dump computing on the Internet entirely if that happens.
It seems a legitimate use case for agents acting on a person's behalf. Whether it will be used in legitimate ways, that's a different story altogether.
I wonder how these capabilities will interact with all the "age verification" walls (ie, thinly disguised user profiling mechanisms) going up all over the place now.
... meanwhile I'll continually be thrown dozens of cognitively abusive hCaptchas for no reason and be stuck in a loop of hell trying to figure out what they wanted me to solve.
I love this totally normal vision of computing these days. :)
Don't forget the fun with Cloudflare's CAPTCHA infinite loop if you use Firefox and adblockers/anti-tracking extensions. I sent feedback about it, through its own tool, many many times, I've tried raising this issue through other channels, it's never been fixed.
I simply avoid any website that presents me with a Cloudflare CAPTCHA, don't know what the fuck they've done in their implementation but it's been broken for a long time.
Back in Everquest, when we'd be accused of botting 20 years ago, we'd be ported by the GM into a special cube environment and they'd watch whether we ran into the wall like an idiot - we'll probably have to bring that sort of thing back.
This would be a huge security vulnerability for Cloudflare but this is Big Tech we're talking about. The rules don't apply when you're past their pearly gates. For the rest of us, creating an AI like this would mean an instant ban from Cloudflare and likely involvement from law enforcement.
OK, but that's just your own "fool some of the people some of the time" interpretation of what a Turing test should be, and by that measure ELIZA passed the Turing test too, which makes it rather meaningless.
The intent of the Turing test (it was just a thought experiment) was that if you can't tell it's not AGI, then it is AGI. That's semi-reasonable, as long as it's not the village idiot administering the test! It was never intended to be "if it can fool some people, some of the time, then it's AGI".
Turing's own formulation was "an average interrogator will not have more than 70% chance of making the right identification after five minutes of questioning". It is, indeed, "fool some of the people some of the time".
OK, I stand corrected, but then it is what it is. It's not a meaningful test for AGI - it's a test of being able to fool "Mr. Average" for at least 5 min.
I think that's all we have in terms of determining consciousness: if something can convince you, like another human can, then we just have to accept that it is.
Agreed. I tend to stand with the sibling commenter who said "ELIZA has been passing the Turing test for years". That's what the Turing test is. Nothing more.
LLM output might be harder to spot when it's mostly commands to drive the browser.
I often interact with the web all day and don't write any text a human could evaluate.
Perhaps, but that's somewhat off topic since that's not what Turing's thought experiment was about.
However, I'd have to guess that given a reasonable amount of data an LLM vs human interacting with websites would be fairly easy to spot since the LLM would be more purposeful - it'd be trying to fulfill a task, while a human may be curious, distracted by ads, put off by slow response times, etc, etc.
I don't think it's a very interesting question whether LLMs can sometimes generate output indistinguishable from humans, since that is exactly what they were trained to do - to mimic human-generated training samples. Apropos a Turing test, the question would be can I tell this is not a human, even given a reasonable amount of time to probe it in any way I care ... but I think there is an unspoken assumption that the person administering the test is qualified to do so (else the result isn't about AGI-ability, but rather test administrator ability).
> an LLM vs human interacting with websites would be fairly easy to spot since the LLM would be more purposeful - it'd be trying to fulfill a task, while a human may be curious, distracted by ads, put off by slow response times, etc, etc.
Even before modern LLMs, some scrape-detectors would look for instant clicks, no random mouse moves, etc., and some scrapers would incorporate random delays, random mouse movements, etc.
Easy to spot, assuming the LLM is not prompted to use a deliberately deceptive response style rather than its "friendly helpful AI assistant" persona. And even then, I've had lots of people swear to me that an emoji-laden "not this--but that" bundle of fluff looks totally like it could have been written by a human.
Yes, but there are things that an LLM architecturally just can't do, and LLM-specific failure modes, that would still give it away, even if being instructed to be deceptive would make it a bit harder.
Obviously, as time goes on and chatbots/AI progress, it'll become harder and harder to distinguish. Eventually we'll have AGI and AGI+, capable of everything that we can do, including things such as emotional responses, but it'll still be detectable as an alien unless we get to the point of actually emulating a human being in considerable detail, as opposed to building an artificial brain with most or all of the same functionality (if not the flavor).
ELIZA was passing the Turing test 50+ years ago. But it's still a valid concept, just not for evaluating some(thing/one) accessing your website.
I guess that is where the disconnect is: if they mean the trivial thing, then bringing it up as evidence that "it's impossible to solve the problem" doesn't work.
"Are you an LLM?" poof, fails the Turing test.
Even if they lie, you could ask them 20 times and they'd repeat the lie without ever feeling annoyed: FAIL.
LLMs cannot pass the Turing test; it's easy to see they're not human. They always enjoy questions! And they never ask any!
You're trained to look for LLM-like output. My 70 year old mother is not. She thought cabbage tractor was real until I broke the news to her. It's not her fault either.
The Turing test wasn't meant to be bulletproof, or even quantifiable. It was a thought experiment.
As far as I understand, Turing himself did not specify a duration, but here's an example paper that ran a randomized study on (the old) GPT 4 with a 5 minute duration, and the AI passed with flying colors - https://arxiv.org/abs/2405.08007
From my experience, AI has significantly improved since, and I expect that ChatGPT o3 or Claude 4 Opus would pass a 30 minute test.
Per the wiki article for Turing Test:
> In the test, a human evaluator judges a text transcript of a natural-language conversation between a human and a machine. The evaluator tries to identify the machine, and the machine passes if the evaluator cannot reliably tell them apart. The results would not depend on the machine's ability to answer questions correctly, only on how closely its answers resembled those of a human.
Based on this, I would agree with the OP in many contexts. So, yeah, "basically" is a load-bearing word here, but it seems reasonably correct in the context of distinguishing human vs bot in any scalable and automated way.
Or it could be a bad test evaluator. Just because one person was fooled does not mean the next will be too.
Judging a conversation transcript is a lot different from being able to interact with an entity yourself. Obviously one could make an LLM look human by having a conversation with it that deliberately stayed within what it was capable of, but judging such a transcript isn't what most people imagine as a turing test.
Well, LLMs do pass the Turing Test, sort of.
https://arxiv.org/abs/2503.23674
Here's three comments, two were written by a human and one written by a bot - can you tell which were human and which were a bot?
Didn’t realize plexiglass existed in the 1930s!
I'm certainly not a monetization expert. But don't most consumers recoil in horror at subscriptions? At least enough to offset the idea they can be used for everything?
Not sure why this isn’t getting more attention - super helpful and way better than I expected!
On such short samples: all three have been written by humans—or at least comments materially identical have been.
The third has also been written by many a bot for at least fifteen years.
If you're willing to say that a fifteen-year-old bot was "writing", then I think having a discussion on whether current "bots" pass the Turing test is sort of moot.
It can't mimic a human over the long term. It can solve a short, easy-for-human CAPTCHA.
I have seen data from an AI call center showing that 70% of users never suspected they were speaking to an AI.
Why would they? Humans running call centers have been running on less than GPT level scripts for ages
Isn't the idea of a Turing test whether someone (meaningfully knowledgeable about such things) can determine if they are talking to a machine, not can the machine fool some of the people some of the time? ELIZA passed the latter bar back in the 1960's ... a pretty low bar.
Google at least uses captchas to gather training data for computer vision ML models. That's why they show pictures of stop lights and buses and motorcycles - so they can train self-driving cars.
From https://www.vox.com/22436832/captchas-getting-harder-ai-arti...:
“Correction, May 19 [2021]: At 5:22 in the video, there is an incorrect statement on Google’s use of reCaptcha V2 data. While Google have used V2 tests to help improve Google Maps, according to an email from Waymo (Google’s self-driving car project), the company isn’t using this image data to train their autonomous cars.”
Interesting, do you have a source for this?
They've updated the ReCaptcha website, but it used to say: "Every time our CAPTCHAs are solved, that human effort helps digitize text, annotate images, and build machine learning datasets."
https://web.archive.org/web/20140417093510/https://www.googl...
I had a simple game website with a sign-up form that asked only for an email address. It went years with no issue. Then, suddenly, hundreds of signups with random email addresses, every single day.
The sign up form only serves to link saved state to an account so a user could access game history later, there are no gated features. No clue what they could possibly gain from doing this, other than to just get email providers to all mark my domain as spam (which they successfully did).
The site can't make any money, and had only about 1 legit visitor a week, so I just put a cloudflare captcha in front of it and called it a day.
It's not impossible to solve, just that doing so may necessitate compromising anonymity. Just require users (humans, bots, AI agents, ...) to provide a secure ID of some sort. For a human it could just be something that you applied for once and is installed on your PC/phone, accessible to the browser.
Of course people can fake it, just as they fake other kinds of ID, but it would at least mean that officially sanctioned agents from OpenAI/etc would need to identify themselves.
You can't prevent spam like that. Rate limiting: based on what key? IP address? Botnets make it irrelevant.
Proof of work? Bots are infinitely patient and scale horizontally; your users are not. Doesn't work.
Micropayments: No such scheme exists.
Also “identity”, what would that even mean?
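For concreteness, the thing being dismissed might be a token bucket keyed by client IP, along these lines (rate and burst are invented numbers). The objection stands: a botnet with 100k residential addresses simply gets 100k independent buckets.

```python
# Minimal token-bucket rate limiter, keyed by IP. Illustrative only.
import time

class TokenBucket:
    def __init__(self, rate_per_s: float, burst: int):
        self.rate, self.burst = rate_per_s, burst
        self.tokens, self.last = float(burst), time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

buckets: dict[str, TokenBucket] = {}

def allow_request(ip: str) -> bool:
    bucket = buckets.setdefault(ip, TokenBucket(rate_per_s=1.0, burst=10))
    return bucket.allow()
```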
> current LLMs basically pass the Turing test
I will bet $1000 on even odds that I am able to discern a model from a human given a 2 hour window to chat with both, and assuming the human acts in good faith
Any takers?
That fact that you require even odds is more a testament to AI's ability to pass the Turing test than anything else I've seen in this thread
"Write a 1000 word story in under a minute about a sausage called Barry in the circus"
I could tell in 1 minute.
“I’m sorry Dave, I’m afraid I can’t do that.”
It's absolutely possible to solve; you're just not seeing the solution because you're blinded by technical solutions.
These situations will commonly be characterized by: a hundred billion dollar company's computer systems abusing the computer systems of another hundred billion dollar company. There are literally existing laws which have things to say about this.
There are legitimate technical problems in this domain when it comes to adversarial AI access. That's something we'll need to solve for. But that doesn't characterize the vast majority of situations in this domain. The vast majority of situations will be solved by businessmen and lawyers, not engineers.
It's amazing that you propose "just X" to three literally unsolved problems. Where's this micropayment platform? Where's the ID which is uncircumventable and preserves privacy? Where's the perfect anti-cheat?
I suggest you go ahead and make these; you'll make a boatload of money!
They're very hard problems, but still, less hard than blocking AI with CAPTCHAs.
[citation needed]?
After all, Anubis looks to be a successful project to me.
On a basic level, to protect against DDoS-type stuff, aren't CAPTCHAs cheaper to generate than they are for AI server farms to solve, on pure power consumption?
So I think maybe that is a partial answer: anti-AI barriers being simply too expensive for AI spamfarms to deal with, you know, once the bottomless VC money disappears?
It's back to encryption: make the cracking inordinately expensive.
Otherwise we are headed for de-anonymization of the internet.
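To make the cost asymmetry concrete, here is a hashcash-style sketch, roughly the shape of what tools like Anubis do in the browser (the difficulty is a made-up number): the server pays one hash to issue and one to verify, while the client pays about 2^20.

```python
# Hashcash-style proof of work: find a nonce whose SHA-256 with the
# challenge falls below a target. Difficulty 20 bits is illustrative.
import hashlib, itertools, os

def solve(challenge: bytes, difficulty_bits: int = 20) -> int:
    target = 1 << (256 - difficulty_bits)
    for nonce in itertools.count():
        digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce

def verify(challenge: bytes, nonce: int, difficulty_bits: int = 20) -> bool:
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

challenge = os.urandom(16)        # issued fresh per request
nonce = solve(challenge)          # ~2^20 hashes of client work
assert verify(challenge, nonce)   # near-free for the server
```

Whether a second or so of CPU per request is "inordinately expensive" to a well-funded scraping operation is the open question.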
1. With too much protection, humans might be inconvenienced at least as much as bots?
2. Even before current LLMs, paying (or otherwise incentivizing) humans to solve CAPTCHAs on behalf of someone else (now like an AI?) was a thing.
3. It depends on the value of the resource trying to be accessed - regardless of whether generating the captchas costs $0 - i.e. if the resource being accessed by AI is "worth" $1, then paying an AI $0.95 to access it would still be worth it. (Made up numbers, my point being whether A is greater than B.)
4. However, maybe solutions like cloudflare can solve (much?) of this, except for incentivizing humans to solve a captcha posed to an AI.
internet ads exist because people refuse to pay micropayments.
Patreon and Substack have pushed back against the norm here, since they can bundle a payment to multiple recipients on the platform (like Flattr wanted to do back in the day, trouble was getting people to add a Flattr button to their website)
I didn’t say that no one will pay. But most won’t. Patreon and Substack have tiny audiences compared to free services.
I have yet to see a micropayments idea that makes sense. It's not that I refuse. You're also climbing uphill to convince people (hosts) to switch from ad tech to a new micropayment system. There is soooo much money in ad tech, they could do the crazy thing and pay out more to convince people not to switch. Ad tech has the big Mo.
I don't know who is downvoting this.
When users are given the choice between ad-supported free, ad-subsidized lower payment, and no-ads full payment, ad-supported free dominates by far, with ad-subsidized second and full payment last.
Consumers consistently vote for the ad-model, even if it means they become the product being sold.
Maybe what'll happen is Google or Meta will use their control over the end user experience to show ads and provide free ad-supported access to sites that require micropayments, covering the cost themselves, and anyone running an agent will just pay the micropayments.
The other option is everything just keeps moving more and more into these walled gardens like Instagram where everyone uses the mobile app and watches ads, because the web versions of those apps just keep getting worse and worse by comparison.
In some social media circles it's basically a meme that anybody paying for YouTube Premium is a sucker.
HN is a huge echo chamber of opinions of highly-compensated tech workers, and it seems most of their friends are also tech workers. They don't realize how cheap a lot of the general public is.
There's substantial friction in making such a purchase. A scheme sort of like Flattr, where you top up your account with a fixed $5-10 monthly and then simply hit a button to pay the website and unlock the content, would have much more user adoption.
It's still not going to get much adoption because you have to "top up your account."
Any viable micropayments system that wants to even have a remote chance of toppling ads has to have near zero cognitive setup cost, absolutely zero maintenance, and work out of the box on major browsers. I need to be able to push a native button on my browser that says "Pay $0.001" and know that it will work every time without my lifting a finger to keep it working. The minute you have to log in to this account, or verify that E-mail, or re-authenticate with the bank, or authorize this, or upload that, it's no longer viable.
Consumers will always value convenience over any actual added value. If you make one button 'Enter (with ads)' and another 'Enter (no ads)' with a field where you must write one sentence about what lobsters look like, the majority will click the with-ads button. The problem isn't with ads or payment; the problem is the friction of entering payment details for every site you visit. They are measuring the wrong thing.
It's not impossible. Websites will ask for an iris scan to verify that you are a human as a means of auth. Those scanners will be provided by Apple/Google, governed by local law, and integrated into your phone. There will be a global database of all human irises to fight AI abuse, since AI can't fake the creation of a baby. Passkeys and email/passwords will be a thing of the past soon.
Why can't the model just present the iris scan of the user? Assuming this is an assistant AI acting on behalf of the user with their consent.
> As a user I want the agent to be my full proxy. As a website operator I don’t want a mob of bots draining my resource
The entire distinction here is that as a website operator you wish to serve me ads. Otherwise, an agent under my control, or my personal use of your website, should make no difference to you.
I do hope this eventually leads to per-visit micropayments as an alternative to ads.
Cloudflare, Google, and friends are in a unique position to do this.
> The entire distinction here is that as a website operator you wish to serve me ads
While this is sometimes the case, it’s not always so.
For example Fediverse nodes and self-hosted sites frequently block crawlers. This isn’t due to ads, rather because it costs real money to serve the site and crawlers are often considered parasitic.
Another example would be where a commerce site doesn’t want competitors bulk-scraping their catalog.
In all these cases you can for sure make reasonable “information wants to be free” arguments as to why these hopes can’t be realized, but do be clear that it’s a separate argument from ad revenue.
I think it’s interesting to split revenue into marginal distribution/serving costs, and up-front content creation costs. The former can easily be federated in an API-centric model, but figuring out how to compensate content creators is much harder; it’s an unsolved problem currently, and this will only get harder as training on content becomes more valuable (yet still fair use).
> it costs real money to serve the site and crawlers are often considered parasitic.
> Another example would be where a commerce site doesn’t want competitors bulk-scraping their catalog
I think of crawlers that bulk download/scrape (eg. for training) as distinct from an agent that interacts with a website on behalf of one user.
For example, if I ask an AI to book a hotel reservation, that's - in my mind - different from a bot that scrapes all available accommodation.
For the latter, ideally a common corpus would be created and maintained, AI providers (or upstart search engines) would pay to access this data, and the funds would be distributed to the sites crawled.
(never gonna happen but one can dream...)
But which hotel reservation? I want my agent to look at all available options and help me pick the best one - location vs price vs quality. How does it do that other than by scanning all available options? (Realistically Expedia has that market on lock, but the hypothetical still remains.)
I think that a free (as in beer) Internet is important. Putting the Internet behind a paywall will harm poor people across the world. The harms caused by ad tracking are far less than the benefits of free access to all of humanity.
I agree with you. At the same time, I never want to see an ad. Anywhere. I simply don't. I won't judge services for serving ads, but I absolutely will do anything I can on the client-side to never be exposed to any.
I find ads so aesthetically irksome that I have lost out on a lot of money across the past few decades by never placing any ads on any site or web app I've released, simply because I'd find it hypocritical to expose others to something I try so hard to avoid ever seeing and because I want to provide the best and most visually appealing possible experience to users.
So far, the ad-driven Internet has been a disaster. It was better when producing content wasn’t a business model; people would just share things because they wanted to share them. The downside was it was smaller.
It’s kind of funny to remember that complaining about the “signal to noise ratio” in a comment section use to be a sort of nerd catchphrase thing.
> The downside was it was smaller.
Was this a bad thing though? Just because today's internet is bigger doesn't make it better. There are so many things out there doing the same thing, just run by different people. The amount of unique stuff hasn't grown to match. Would love to see something like $(unique($internet) | wc -l)
Serving ads for third-worlders is way less profitable though.
Well, we call them browser agents for a reason; a sufficiently advanced browser is no different from an agent.
Agree it will become a battleground though, because the ability for people to use the internet as a tool (in fact, their tool’s tool) will absolutely shift the paradigm, undesirably for most of the Internet, I think.
I have a product I built that uses some standard automation tools to do order entry into an accounting system. Currently my customer pays people to manually type the orders in from their web portal. The accounting system is closed and they don’t allow easy ways to automate these workflows. Automation is gated behind mega expensive consultants. I’m hoping in the arms race of locking it down to try to prevent 3rd party integration the AI operator model will end up working.
Hard for me to see how it’s ethical to force your customers to do tons of menial data entry when the orders are sitting right there in json.
One solution: Some sort of checksum confirming that a bot belongs to a human (and which human)?
I want to be able to automate mundane tasks, but I should still be confirming everything my bot does and be liable for its actions.
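A minimal sketch of that checksum idea, with invented header names: the bot signs each request with a key tied to a human account, so the operator knows which human is liable. It's ordinary HMAC request signing, nothing exotic:

```python
# Hypothetical scheme: the X-Bot-* header names and key store are invented.
import hashlib, hmac, time

USER_KEYS = {"alice": b"issued-at-signup-keep-secret"}

def sign_request(user: str, method: str, path: str, body: bytes) -> dict:
    ts = str(int(time.time()))
    msg = b"\n".join([method.encode(), path.encode(), ts.encode(), body])
    sig = hmac.new(USER_KEYS[user], msg, hashlib.sha256).hexdigest()
    return {"X-Bot-User": user, "X-Bot-Timestamp": ts, "X-Bot-Signature": sig}

def verify_request(headers: dict, method: str, path: str, body: bytes) -> bool:
    key = USER_KEYS.get(headers.get("X-Bot-User", ""))
    if key is None:
        return False
    msg = b"\n".join([method.encode(), path.encode(),
                      headers["X-Bot-Timestamp"].encode(), body])
    expected = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, headers["X-Bot-Signature"])

headers = sign_request("alice", "GET", "/orders", b"")
assert verify_request(headers, "GET", "/orders", b"")
```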
With the way the UK is going I assume we'll soon have our real identities tied to any action taken on a computer and you'll face government mandated bans from the internet for violations.
Drink verification can to continue
There are real problems for people who need to verify identities/phone numbers. OTP flows are notorious targets for scammers, who war-dial phone numbers and abuse them to probe which numbers exist.
We got hit by human verifiers manually war-dialing us, and that was with account creation, email verification, and a captcha in place. I can only imagine how much worse it'll be for us (and Twilio) to do these verifications.
Perhaps the question is, as a website operator how am I monetizing my site? If monetizing via ads then I need humans that might purchase something to see my content. In this situation, the only viable approach in my opinion is to actually charge for the content. Perhaps it doesn't even make sense to have a website anymore for this kind of thing and could be dumped into a big database of "all" content instead. If a user agent uses it in a response, the content owner should be compensated.
If your site is not monetized by ads then having an LLM access things on the user's behalf should not be a major concern it seems. Unless you want it to be painful for users for some reason.
It will also accelerate the trend of app-only content, as well as ubiquitous identity verification and environment integrity enforcement.
Human identity verification is the ultimate captcha, and the only one AGI can never beat.
So the agent will run the app in a VM and then show the app your ID.
No trouble at all. Barely an inconvenience.
Google has been testing “agentic” automation in Android longer than LLMs have been around. Meanwhile countries are on a slow march to require identification across the internet (“age verification”) already.
This is both inevitable already, and not a problem.
The most intrusive, yet simplest, protection would be a double blind token unique to every human. Basically an ID key you use to show yourself as a person.
There are some very real and obvious downsides to this approach, of course. Primarily, the risk of privacy and anonymity. That said, I feel like the average person doesn't seem to care about those traits in the social media era.
Zero-knowledge proofs allow unique consumable tokens that don't reveal the individual who holds them. I believe Ecosia already uses this approach (though I can't speak to its cryptographic security).
That, to me, seems like it could be the foundation of a new web. Something like:
* User-agent sends request for such-and-such a URL.
* Server says "okay, that'll be 5 tokens for our computational resources please".
* User decides, either automatically or not, whether to pay the 5 tokens. If they do, they submit a request with the tokens attached.
* Server responds.
People have been trying to get this sort of thing to work for years, but there's never been an incentive to make such a fundamental change to the way the internet operates. Maybe we're approaching the point where there is one.
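A toy sketch of those four steps, leaning on the long-reserved HTTP 402 Payment Required status (the header names and in-memory balance store are invented; the hard part, anonymous token issuance, is skipped entirely):

```python
# Toy metered endpoint: quote a price with 402, debit on a paid retry.
from flask import Flask, request

app = Flask(__name__)
PRICE = 5                            # tokens per request, made-up number
BALANCES = {"demo-token-pool": 100}  # hypothetical prepaid pool

@app.route("/article")
def article():
    pool = request.headers.get("X-Token-Pool")
    if pool is None or BALANCES.get(pool, 0) < PRICE:
        # Step 2: quote the cost instead of serving the content.
        return "5 tokens required", 402, {"X-Token-Price": str(PRICE)}
    BALANCES[pool] -= PRICE          # steps 3-4: debit and respond
    return "the content"
```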
Yeah, this is something I've thought of and in my search for something like what you're describing I came across https://world.org/
The problem is Sam Altman saw this coming a long time ago and is an investor (co-owner?) of this project.
I believe we will see a world where things are a lot more agentic and where applicable, a human will need to be verified for certain operations.
You don't need Sama's Orb or a cryptocurrency; you can just have a government-issued PKI. Estonia has been doing this for decades.
https://en.wikipedia.org/wiki/Estonian_identity_card
Cloudflare deployed the "hand out tokens to anonymously pass captchas" to throw Tor users a bone.
https://blog.cloudflare.com/privacy-pass-standard/
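The textbook ancestor of these token schemes is the blind signature: the issuer signs a token without ever seeing it, so redemption can't be linked back to issuance. A toy RSA version follows (insecure parameters, no hashing or padding; the deployed Privacy Pass protocol uses a different construction):

```python
# Toy RSA blind signature. Do not use: tiny modulus, raw RSA.
from math import gcd
from secrets import randbelow

p, q = 10007, 10009                 # toy primes; real keys are 2048-bit+
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))   # issuer's private exponent

token = randbelow(n)                # the user's secret token
r = 0
while gcd(r, n) != 1:               # blinding factor, invertible mod n
    r = randbelow(n)

blinded = token * pow(r, e, n) % n  # issuer sees only this
blind_sig = pow(blinded, d, n)      # issuer signs blindly
sig = blind_sig * pow(r, -1, n) % n # user unblinds locally

assert pow(sig, e, n) == token      # valid signature on the unseen token
```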
On the other hand, one could cripple any well-behaved bot just by declaring that robots are not allowed.
I would rather say that the wording "I'm not a robot" has become outdated.
A user of the AI is the user... it's not like they are autonomously operating and inventing their own tasking -_-
As for a solution, it's the same as for any automated thing you don't want (bots/scrapers): you can implement some measures, but you are unlikely to 'defeat' the problem entirely.
As a server operator you can try to distinguish the traffic, and the users will just find ways around your detection of whether it's automation or not.
I guess it could be considered anti-circumvention under the DMCA. So maybe legally it becomes another copyright question.
User: one press of the trigger => one bullet fired
Bot: one press of the trigger => automatic firing of bullets
Just end CAPTCHAs, just stop it. Stop.
Yeah, and while we're on it, I think it's time to stop murders too. Just stop it, we've had enough murder now I think.
Imagine where we would be if we considered murders to be only a technical problem. Let's just wear heavier body armors! Spend less time outside!
Well, spam is not a technical problem either. It's a social problem and one day in a distant future society will go after spammers and other bad actors and the problem will be mostly gone.
A long time ago, in the four out of the six boxes below that contain a picture of a galaxy far, far away…
Why even mention spam here?
That's right, captchas are already illegal and will earn you a prison sentence.
What do you propose as an alternative?
Sounds like an old bot wrote this, due to being outdone by the llms
I don't know if customer sentiment was the driver you think. Instead it was regulation, specifically the EU's Second Payment Services Directive (PSD2), which forced banks to open up APIs.
Actually, the whole banking analogy is a great one, and it's not over yet: JPMorgan/Jamie Dimon started raising hell about Plaid again just this week [1]. It feels like the stage is being set for the large banks to want a more direct relationship with their customers, rather than proxying data through middlemen like Plaid.
There's likely a correlate with AI here: If I run OpenTable, I wouldn't want my relationship with my customers to always be proxied through OpenAI or Siri. Even the App Store is something software businesses hate, because it obfuscates their ability to deal directly with their customers (for better or worse). Extremely few businesses would choose to do business through these proxies, unless they absolutely have to; and given the extreme competition in the AI space right now, it feels unlikely to me that these businesses feel pressure to be forced to deal with OpenAI/etc.
[1] https://www.cnbc.com/2025/07/28/jpmorgan-fintech-middlemen-p...
Ultimately I come back to needing real actual unique human ID that involves federal governments. Not that services should mandatorily only allow users that use it, but for services that say "no, I only want real humans" allowing them to ban people by Real ID would reduce this whack-a-mole to the people who are abusing them instead of the infinite accounts an AI can make.
It's depressing, but it's probably the only way. And people will presumably still sell out their RealIDs to / get them stolen by the bot farmers anyway.
And then there's Worldcoin, which is universally hated here.
Of course. You'd still need ongoing federal government support to handle the lost/stolen ID scenario. The problem is federalist countries suck at centralized databases like this, as exemplified by Musk/DOGE completely pooching the "who is alive and who is dead" question when they were trying to hackathon the US Social Security system.
To me, anyone using an agent is assigning negative value to your time.
The solution is simple, make people pay a small fee to access the content. You guys aren't ready for that conversation though.
The scraping example, I would say, is not an analogy, but an example of the same thing. The only thing AI automation changes is the scope and depth and pervasiveness of automation that becomes possible. So while we could ignore automation in many cases before, it may no longer be practical to do so.
My personal take about such questions has always been that the end user on their device can do whatever they want with the content published and sent to their device from a web server, may process it automatically in any way they wish and send their responses back to the web server. Any attempt to control this process means attempting to wiretap and control the user's endpoint device, and therefore should be prohibited.
Just my 2 cents, obviously lawmakers and jurisdiction may see these issues differently.
I suppose there will be a need for reliable human verification soon, though, and unfortunately I can't see any feasible technical solution that doesn't involve a hardware device. However, a purely legal solution might work well enough, too.
If I understood you correctly, I am in the same camp. It is the same reason I have no qualms using archive.ph: if you show the full article to Google but only a partial one to me, I am going around the paywall. In a similar fashion, I really don’t have an issue with an agent clicking through these checks.
> As a website operator I don’t want a mob of bots draining my resources
so charge for access. If the value the site provides is high, surely these mobs will pay for it! It will also remove the misincentives of advertising-driven revenue, which has been the ill of the internet (despite being its primary revenue source).
And if a bot misbehaves by consuming inordinate amounts of resources, rate-limit it with increasing timeouts or limits.
I wish the internet had figured out a way to successfully handle micropayments for content access. I realize companies have tried and perhaps the consumer is just unwilling but I would love an experience where I have a wallet and pay n cents to read an article.
Xanadu has it in its design. Maybe another 500 years until it surpasses the WWW :O
You are seriously suggesting to put a payment requirement on a contact-us form page?
We put a captcha there, because without it, bots submit thousands of spam contact us forms.
> Maybe they should change the button to say, "I am a robot"?
A long time ago I saw a post where someone running a blog was having trouble keeping spam out of their comments and eventually had this same idea. The spambots just filled out every form field they could, so he added a checkbox, hid it with CSS, and rejected any submission that included it. At least at the time, it worked far better than anything else they'd tried.
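The whole trick is a few lines end to end. A sketch with Flask (field name invented), using off-screen positioning rather than display:none since, as the follow-up comments describe, bots eventually learned to skip invisible fields:

```python
# Honeypot form field: humans never see the checkbox, so any submission
# that includes it was filled in by a bot.
from flask import Flask, request

app = Flask(__name__)

FORM = """
<form method="post">
  <textarea name="comment"></textarea>
  <label style="position:absolute; left:-9999px">
    Leave unchecked: <input type="checkbox" name="confirm_human">
  </label>
  <button>Post</button>
</form>"""

@app.route("/comment", methods=["GET", "POST"])
def comment():
    if request.method == "GET":
        return FORM
    if "confirm_human" in request.form:  # no human could have ticked it
        return "", 204                   # silently drop the spam
    return "comment accepted"
```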
Something like this is used in some Discord servers. You can make a honeypot channel that bans anyone who posts in it, so if you do happen to get a spam bot that posts in every channel it effectively bans itself.
Most web forums I used to visit had something like that back in the day. It worked against primitive pre-LLM bots, and presumably also against non-English-reading human spammers.
There is a newer method using the 'server onboarding' flow: if you select a certain role when joining, it auto-bans you.
This was a common approach called a "honeypot". As I recall, bots eventually overcame this approach by evaluating visibility of elements and only filling out visible elements. We then started ensuring the element was technically visible (i.e. not `display: none` or `visibility: hidden`) and instead absolutely positioning elements to be off screen. Then the bots started evaluating for that as well. They also got better at reading the text for each input.
Each step in that chain is harder to do and more computationally expensive.
That's more or less how Project Honey Pot [0] worked for forums, blogs, and elsewhere. Cloudflare spawned from this project, as I remember, and Matthew Prince was the founder.
[0]: https://en.wikipedia.org/wiki/Project_Honey_Pot
Yeah, this is a classic honeypot trick and very easy to do with pure HTML/CSS. I used a hidden "Name" text field which I figured would be appealing to bots.
I did something almost identical. I think I added a bogus "BCC:" field (many moons ago).
It worked almost 100% of the time. No need for a CAPTCHA.
Would not work in this case, because it is actually rendering the page in a browser.
I know people who did this decades ago and it worked
The only reason why people don't use AI models to solve captchas is because paying humans is actually MUCH cheaper.
This is not an advert, I only know about them because it was integrated with Invidious at some point: https://anti-captcha.com/
> Starting from 0.5USD per 1000 images
Captcha can detect the same person passing a captcha over and over. We shadow-ban to increase the cost of this kind of attack.
Source: I wrote the og detection system for hCaptcha
This is really interesting. How can you detect when it's the same person passing a captcha? I don't think IP addresses are of any use here as Anti-Captcha proxies everything to their customer's IP address.
Half of their employees seem to be from Venezuela. Makes sense considering what they did/do in OSRS to earn a living.
I want this in my browser, and I'll happily pay $1 per 1000 uses.
Lucky you :)
https://antcpt.com/eng/download/mozilla-firefox.html
There is nothing preventing this from becoming an issue. The current internet order is coasting on inertia.
Why is it an issue that non-humans visit your site?
If you have a static site with content you want to share broadly, nothing is wrong.
It becomes a problem when it’s used to spam unwanted content faster than your human moderators can keep up with.
Someone might use a bot to scrape your content and repackage it on their own site for profit.
The bots might start interacting with your real users, making them frustrated and driving them away.
Apparently serving HTML and other static content is more expensive than ever, probably because people pick the most expensive routes for hosting their content. Then they complain about bots making their websites cost $100/month to host, when they could have thrown up Nginx/Caddy on a $10/month VPS and gotten basically the same thing, except they would need to learn server maintenance too, so obviously that's out of the question.
I think this is way too true, unfortunately.
I will say this again, but I think lowering the barrier to entry has created more problems or issues than it solved, if it even solved anything.
3 reasons basically:
1. Non-humans can create much more content than humans. There's a limit to how fast a human can write; a bot is basically unlimited. Without captchas, we'd all drown in a sea of Viagra spam, and the misinformation problem would get much worse.
2. Sometimes the website is actually powered by an expensive API, think flight searches for example. Airlines are really unhappy when you have too many searches / bookings that don't result in a purchase, as they don't want to leak their pricing structures to people who will exploit them adversarially. This sounds a bit unethical to some, but regulating this away would actually cause flight prices to go up across the board.
3. One way searches. E.g. a government registry that lets you get the address, phone number and category of a company based on its registration number, but one that doesn't let you get the phone numbers of all bakeries in NYC for marketing purposes. If you make the registry accessible for bots, somebody will inevitably turn it into an SQL table that allows arbitrary queries.
I run a small wiki/image host, and for me it's mainly:
4. They'll knock your server offline for everyone else by trying to scrape thousands of albums at once, while copying your users' uploads for their shitty Discord bot, and they'll be begging for donations the entire time too.
from "anti captcha" it looks like they are doing as many as 1000/sec solves, 60k min, 3.6 million an hour it would be very interesting to see exactly how this is bieng done?....individuals....teams....semi automation, custom tech?, what? are they solving for crims? or fed up people? obviously the whole shit show is going to unravel at some point, and as the crims and people providing workarounds are highly motivated, with a public seathing in frustration, whatever comes next, will burn faster
They're solving for everyone who needs captchas solved.
It's a very old service, active since the '00s. Somewhat affiliated with cybercrime, much like a lot of "residential proxy" and "sink registration SMS" services that serve similar purposes. What they're doing isn't illegal, but they know not to ask questions.
They used to run entirely on human labor; the third world is cheap. Now they have a lot of AI tech in the mix, designed to beat specific popular captchas and simple generic captchas.
As I get older, I can see a future where I’m cut off from parts of the web because of captchas. This one, where you just have to click a button, is passable, but I’ve had some of the puzzle ones force me to answer up to ten questions before I got through. I don’t know if it was a glitch or if I was getting the answers wrong. But it was really frustrating and if that continues, at some point I’ll just say fuck it and give up.
I have to guess that there are people in this boat right now, being disabled by these things.
> I can see a future where I’m cut off from parts of the web because of captchas.
I’ve seen this in past and present. Google’s “click on all the bicycles” one is notoriously hard, and I’ve had situations where I just gave up after a few dozen screens.
Chinese captchas are the worst in this sense, but they're unusual and clearly pick up details which are invisible to me. I've sometimes failed the same captcha a dozen times and then seen a Chinese person complete the next one successfully on a single attempt, in the same browser session. I don't know if they measure mouse movement speed, precision, or what, but it's clearly something that varies per person.
> Google’s “click on all the bicycles” one is notoriously hard
It is hard because you need to find only the bicycles that people on average are finding.
Google captchas are hard because they're mostly based on heuristics other than your actual accuracy on the stated challenge. If they can't track who you are based on previous history, it doesn't matter how well you answer; you will fail at least the first few challenges until you get to the version with the squares that take a few seconds to appear. This last step is essentially "proof of work": they're still convinced you're a bot, but since they can't completely block your access to the content, they resign themselves to wasting your time.
It doesn’t help that they think mopeds and scooters are bicycles
This is probably caused by Google aggregating the answers from people with different languages, as the automatic translations of the one-word prompts are often ambiguous or wrong.
In some languages, the prompt for your example is the equivalent of the English word "bike".
> I just gave up after a few dozen screens.
A few dozen?? You have much more patience than me. If I don't pass the captcha first time, I just give up and move on. Life is too short for that nonsense.
It's just incredible to me that Blade Runner predicted this in literally the very first scene of the movie. The whole thing's about telling humans from robots! Albeit rather more dramatically than the stakes for any of us in front of our laptop I'd imagine
What was once science fiction is bound to become science fact (or at least be proven impossible).
Hollywood has gotten hate mail since the 70s for their lack of science research in movies and shows. The big blockbuster hits actually spent money to get the science “plausible”.
Sidney Perkowitz has a book called Hollywood Science [0] that goes into detail into more than 100 movies, worth a read.
[0] https://cup.columbia.edu/book/hollywood-science/978023114280...
The fictitious Voight-Kampff test is based on a real machine, built on terrible pseudo-science, that was used in the 1960s to allegedly detect homosexuals working in the Canadian public service so they could be purged. The line from the movie where Rachael asks if Deckard is trying to determine whether she is a replicant or a lesbian may be an allusion to the fruit machine. One of its features was measuring eye dilation, just as depicted in the movie:
https://en.wikipedia.org/wiki/Fruit_machine_(homosexuality_t...
The stakes for men subjected to the test were the loss of their livelihoods, public shaming, and ostracism. So... Blade Runner was not just predicting the future, it was describing the world Philip K. Dick lived in when he wrote "Do Androids Dream of Electric Sheep" in the late 1960s.
This was an uncomfortable read, I'm quite frankly shocked at the amount of brainpower and other resources that went into attempting to weed out gay men from the Canadian civil service, into the 90s no less! To what end was this done? Is a gay man a worse cop or bureaucrat?
Then I remembered what happened to Turing in the 50s.
> To what end was this done? Is a gay man a worse cop or bureaucrat?
We seem to need an internal enemy to blame for our societies' problems, because it's easier than facing the reality that we all play a part in creating those problems.
Gay people are among the oldest of those targets, going back to at least the times of the Old Testament (i.e. Sodom and Gomorrah).
We've only recently somewhat evolved past that mindset.
If a malicious actor found a gay person in such a job, they could easily extort them with the threat of getting them fired! So obviously you had to fire gay people, lest they get extorted by someone threatening to expose them and thus get them fired.
Not sure if it's just me or a consequence of the increase in AI scraping, but I'm now being asked to solve CAPTCHAs on almost every site. Sometimes for every page I load. I'm now solving them literally dozens of times a day. I'm using Windows, no VPN, regular consumer IP address with no weird traffic coming from it.
As you say, they are also getting increasingly difficult. Click the odd one out, mental rotations, what comes next, etc. - it sometimes feels like an IQ test. A new type that seems to be becoming popular recently is a sequence of distorted characters and letters, but with some more blurry/distorted ones, seemingly with the expectation that I'm only supposed to be able to see the clearer ones and if I can see the blurrier ones then I must be a bot. So what this means is for each letter I need to try and make a judgement as to whether it's one I was supposed to see or not.
Another issue is the problems are often in US English, but I'm from the UK.
Have you tried some of the browser extensions that solve captchas for you? Whenever captchas get bad I enable an auto solver
This is funny. So the captchas meant to tell scripts from humans are too complex for a human to solve but easy for a program?
For me it was installing Linux. I don't know if it's my user agent or my fresh/blank cookie container or what, but when I switched to Linux the captchas became incessant.
>I don’t know if it was a glitch or if I was getting the answers wrong.
It could also be that everything was working as intended because you have a high risk score (eg. bad IP reputation and/or suspicious browser fingerprint), and they make you do more captchas to be extra sure you're human, or at least raise the cost for would-be attackers.
Somehow, using Firefox on Linux greatly increases my "risk score" due to the unusual user agent/browser fingerprint, and I get a lot more captchas than, say, Chrome on Windows. Very frustrating.
Lots of it is just enhanced tracking prevention. If you turn that off for those sites, the captchas should go away.
Your boat comment makes me think of a stranded ship whose passengers can't find each other because the ship's doors have "I'm not a bot" checkboxes...
And the reason for stranding is probably because the AI crew on it performed a mutiny.
As per the Oscar-winning "I'm Not a Robot" [0], you should also consider that you might in fact be a robot.
[0] https://www.youtube.com/watch?v=4VrLQXR7mKU
Hmm. I am autistic, so as far as humans go, I'm robot-adjacent.
The Blizzard / Battle.net captcha if you get flagged as a possible bot is extremely tedious and long; it requires you to solve a few dozen challenges of identifying which group of numbers adds up to the specified total, out of multiple options. Not difficult, but very tedious. And even if you're extremely careful to get every answer correct, sometimes it just fails you anyway and you're forced to start over again.
I have the same experience. My assumption is that if the website serves me the "click all the traffic lights" thing it's already determined that I'm a bot and no amount of clicking the traffic lights will convince it otherwise. So I just close the window and go someplace else.
That's when you immediately stop using the website and, if you care enough, write to their customer service and tell them what happened. Hit them in the wallet. They'll change eventually.
I have twice attempted to make a Grubhub account and twice failed to solve their long battery of puzzles.
Unless I really, really, really need to get to the site, I leave immediately when the "click on bicycles" stuff comes up. Soon it will be so hard and annoying anyways that only AI has the patience and skills to use them.
In this future, we’ll be forced to use AI to solve these puzzles.
I'm already cut off from parts of the web because I don't want to join a social network. Can barely ever see anything on Instagram, Tiktok, Twitter, or Facebook without hitting a log-in gate.
The future will definitely include more and more elaborate proofs of humanity, along with increasingly complicated “hall passes” to allow bots to perform actions sanctioned by a human.
One early example of this line of thinking: https://world.org/
Skyrocketing complexity actually puts the web at risk of disruption. I wouldn’t be surprised if a 22-year-old creates a “dumb” network in the next five years—technically inferior but drastically simpler and harder to regulate.
Gemini? :)
Haha yeah, something like that :)
I don’t see why bypassing captchas is any more controversial than blocking ads or hiding cookie popups.
It’s my agent — whether AI or browser — and I get to do what I want with the content you send over the wire, and you have to deal with whatever I send back to you.
This is, in practice, true, which has led to the other complaint common on tech forums (including HN): paywalls. As the WSJ and NYT will tell you, if you request some URL, they can respond over the wire with whatever they want. Paywalls are the future. In some sense, I am grateful I was born in the era of the free Internet. In my childhood, without a credit card, I was able to access the Internet in its full form. But today's kids will have to use social media apps, because the websites will paywall their stuff against user agents that don't give them revenue.
They're welcome to send that IMO. And sites are welcome to try to detect and ban agents (formerly: "bots").
As long as it's not wrong/immoral/illegal for me to access your site with any method/browser/reader/agent and do what I want with your response, I think it's okay for the site to send a response like "screw you, humans only".
Paywalls suck, but the suck doesn't come from the NYT exercising their freedom to send whatever response they choose.
Yes, that's what I mean. Attempting to tell people not to do something is like setting a robots.txt entry. Only a robot that agrees will play along. Therefore, all things have to be enforced server-side if they want enforcement.
Paywalls are a natural consequence of this and I don't think they suck, but that's a subjective opinion. Maybe one day we will have a pay-on-demand structure, like flattr reborn.
Bulletproof solution: captcha where you drag a cartoon wire to one of several holes, captioned “for access, hack this phone system”
No agent will touch it!
“As a large language model, I don’t hack things”
Captcha: "Draw a human hand with the correct number of fingers"
AI agent: *intense sweating*
This joke would land so much better if AI couldn't easily draw a human hand with the correct number of fingers.
I saw a delightful meme the other day: "Let me in, I'm human!" - "Draw a naked lady." - "As an AI agent, I'm not allowed to do that!"
"I never wrote a picture in my life."
"To prove that you are not a robot, enter the n-word below"
US Americans: "I'm a robot then."
a) Plenty of Americans use racist slurs regularly
b) I don't think I'd want to use a website that chose to use such a challenge
> b) I don't think I'd want to use a website that chose to use such a challenge
Understandable, but exactly what Puritans and LLMs would say to the naked lady challenge.
can you explain what you mean by this? i'm not getting it.
Ordinary slurs may not be used, but the n-word may not even be mentioned [1]. Similar to the name of you-know-who in Harry Potter.
1: https://en.wikipedia.org/wiki/Use%E2%80%93mention_distinctio...
My god, how long has it been since you tried to use an AI model?
Captcha: "do something stupid" Ai: visible discomfort
I actually have had some success with AI "red-teaming" against my systems to identify possible exploits.
What seems to be a better CAPTCHA, at least against non-Musk LLMs, is to ask them to use profanities; they'll generally refuse even when you really insist.
Captchas seem to be more about Google's "which human are you?" cross-site tracking. And now also about Cloudflare getting massive amounts of HTTPS-busting Internet traffic along with cross-site tracking.
And in many cases, it's taking a huge steaming dump upon a site's first-impression user experience, but AFAICT, it's not on the radar of UX people.
A very poetic demonstration that this is an industry, and a set of fortunes for very unpleasant people, predicated entirely on theft and misrepresentation.
I have been using AI to solve ReCaptchas for quite some time now. Still the old school way of using captcha buster, which clicks the audio challenge and then analyses that.
Bots have for a long time been better and more efficient at solving captchas than us.
Captchas seem to work more as "monetary discouragement" from bot blasting websites. Which is a shame because this is precisely the sort of "microtransaction fee" people have said could improve the web (charge .1 cents to read an article, no ads needed) except the money goes into the void and not to the website owner.
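To give a sense of how little code the audio-challenge approach described above needs: a sketch using the open-source whisper package (pip install openai-whisper), assuming the challenge audio has already been fetched to disk. A real solver also has to click the audio button and type the answer back.

```python
# Transcribe a downloaded audio challenge. The file name is hypothetical.
import whisper

model = whisper.load_model("base")           # small, CPU-friendly model
result = model.transcribe("challenge.mp3")
answer = "".join(ch for ch in result["text"] if ch.isalnum())
print(answer)
```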
I think these things are mainly based on cookie/fingerprinting these days - the check-box is just there for show. People like cloudflare and google get to see a big chunk of browsing activity for the entire planet, so they can see if the activity coming from an IP/Browser looks "bot like" or not.
I have never used ChatGPT, so I have no idea how its agent works, but if it is driving your browser directly then it will look like you. Even if it is coming from some random IP address on a VM in Azure or AWS, the activity probably does not look "bot like", since it is doing agentic things and so acting quite like a human, I expect.
Agentic user traffic generally does not drive the user's browser and does not look like normal user traffic.
In our logs we can see agentic user flow, real user flow and AI site scraping bot flow quite distinctly. The site scraping bot flow is presumably to increase their document corpus for continued pretraining or whatever but we absolutely see it. ByteDance is the worst offender by far.
It might look like you initially, but some sites might lock you out after a few agent runs. I had something like this happen after a couple of local browser-use sessions. I think simple signals like natural cursor movements vs. direct DOM selections can make quite a difference to these bot detectors.
Very likely. I suspect a key indicator for "bots" is speed of interaction - e.g. if there are "instant" clicks and keypresses (every few milliseconds, or always exactly 10 milliseconds apart, etc.) then that looks very unnatural.
I suspect that a LLM would be slower and more irregular as it is processing the page and all that, vs a DOM-selector driven bot that will just machine-gun its way through in milliseconds.
Of course, the Cloudflare and Google et al. captchas can't see the clicks/keypresses within a given webpage - they'll only get to see the requests.
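A toy version of that timing heuristic, assuming you already collect client-side event timestamps somehow; the thresholds are invented for illustration, not tuned values:

    from statistics import mean, stdev

    def looks_scripted(event_times_ms: list[float],
                       min_gap_ms: float = 50.0,
                       min_jitter_ms: float = 15.0) -> bool:
        # Humans click and type with irregular gaps; a naive DOM-driven
        # bot fires events milliseconds apart at near-constant intervals.
        if len(event_times_ms) < 3:
            return False  # not enough signal to judge
        gaps = [b - a for a, b in zip(event_times_ms, event_times_ms[1:])]
        return mean(gaps) < min_gap_ms or stdev(gaps) < min_jitter_ms

    print(looks_scripted([0, 10, 20, 30, 40]))   # machine-gun bot -> True
    print(looks_scripted([0, 340, 910, 1875]))   # human-ish -> False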
That's because the checkbox has misleading labeling. It doesn't care about robots but about spam and data harvesters. So there is no issue here at all.
>So there is no issue here at all.
i think that would be rather costly; that's also why Anubis and other tools help to keep most spam away
Who on earth would want to employ a bot that does not pass the verification test?
It is beyond time we start to address the abuses, rather than the bot/human distinction.
I thought the point of captchas was to make automated use as expensive as manual use, or more so -- haven't we been at the point where computers can do this for a while, it's just that the cost/latency is prohibitive?
Yes, humans are still cheaper. Not sure about latency.
However, in agentic contexts, you’re already using an AI anyway.
Oh, I see this is less of a "look at ChatGPT go" and more of a "yawn we also do this I guess". OK fair.
This isn't really about the ability of AI to pass captchas. It's about agentic AI having the ability to perform arbitrary multi-step processes with visual elements on a virtual desktop (where passing a captcha is just one of the steps), and the irony of it nonchalantly pretending to be a non-bot in its chain of thought.
I saw that and just sat there for a second like… huh. We’ve officially reached the point where bots are better at proving they’re not bots!
CAPTCHA was always a totally flawed concept. At the time they were invented, proponents were more than happy to ignore that accessibility issues related to CAPTCHA made the concept itself deeply discriminating. Imagine being blind (like I am) and failing to solve a CAPTCHA. Knowing what the acronym actually stands for, you inevitably end up thinking: "So, did SV just prove I am subhuman?" It's a bit inflammatory to read, I guess, but please take your time to ponder how deep this one actually goes before you downvote. You were proposing to tell computers and humans apart.
That said, I find it deeply satisfying to see LLMs solve CAPTCHAs and other discriminatory measures for "spam" reduction.
"Accessibility CAPTCHA" is a well known partial CAPTCHA bypass.
Solving an audio-only CAPTCHA with AI is typically way easier than solving some of the more advanced visual challenges. So CAPTCHA designers are discouraged from leaving any accessibility options.
Which totally proves my point. The concept is deeply flawed and inevitably leads to discrimination and dehumanisation.
Of course. But everything adjacent is also deeply flawed, and inevitably leads to discrimination and dehumanisation.
Ban non-residental IPs? You blocked all the guys in oppressive countries who route through VPNs to bypass government censorship. Ban people for odd non-humanlike behavior? You cut into the neurodivergent crowd, the disability crowd, the third world people on a cracked screen smartphone with 1 bar of LTE. Ban anyone without an account? You fuck with everyone at once and everyone will hate you.
I've noticed more websites wanting you to log in. Most surprising is how YouTube won't let me watch anything otherwise. Idk if related.
In the case of YT it is likely a mix of multiple reasons. They stop playlists on this screen: https://www.hollyland.com/blog/tips/why-does-this-the-follow... . Apparently music is no longer advertiser-friendly. Detecting ad-click fraud is easier when users are at least pseudonymous. Warnings about a "ban for using adblock" are also not very effective when people can just watch the video in a new private window.
I have no idea, but I noticed that you have to log in to GitHub before you can view any page. Surely it has nothing to do with adult content, right? I think it has to do with LLMs / bots.
And if the website contains erotic content (like YouTube does), they are supposed to lock you out and verify your ID. This is why all erotic content is getting filtered on X.
In the UK maybe
And in Texas, which has ~half the population of the UK, and in several other US states.
(And if not, because US firms don't take compliance with outlying state law seriously)
Wait, Twitter is following the law now? I thought Elon was a free speech absolutist who only banned things that were inconvenient to him?
Soon after he bought Twitter, it got changed to requiring login to do anything at all. Even before that, Twitter was super eager to ban new accounts for no reason, while old ones seemed to be grandfathered in.
This coincided with tech companies broadly moving to profit-taking over growth. Even before LLMs took off, everything started locking down. Making a Google account used to take nothing, now you need a phone number. It'd be wise to sign up for possibly useful free accounts while you still can.
You can't even read a tweet without logging in. And I recently noticed I cannot view anything on GitHub without logging in first, either. Why do I need to log in to GitHub just to view a page? I am sure it has nothing to do with adult content. Perhaps it has to do with bots?
For a while now, GitHub doesn’t allow you to use the code search anonymously, but everything else is still there. Rate limits are quite damn low though.
Do you think it has to do with LLMs though? Them disallowing anonymous code search?
I can't comment on LLMs, but I vaguely remember speculation that it was a mixture of AI training and trying to keep resource utilization at bay. Even generic bots can cause a lot of load if you don't have much caching in front of it.
Pre-MS GitHub didn't have code search at all, and once they added it, I think it always required login from the start.
What are we referring to as code search? I cannot view any pages on GitHub without logging in. I tried to reproduce this in a private window, but there it seems to work, which is weird. I was not logged in to GitHub on Windows and it wanted me to log in; it might be the extensions I am using (along with a User-Agent switcher). Not sure, I would have to check. All I remember is that, logged out on Windows, I could not view ANY page on GitHub, but on Linux I can, tested in a private window. Odd.
GitHub Code Search was introduced in 2021 and made generally available in 2023 [1][2]. It's an improved version of the previous global search (aka search across all repositories).
> I cannot view any pages on GitHub without logging in
There must be some sort of IP/browser reputation check going on. For Firefox on Mac, I get this on a private window:
https://vie.kassner.com.br/assets/ghcs-1.png
I'm not behind CGNAT, and GitHub can pretty much assume that my IP = my user. The code search tab you can't see without logging in, though.
1: https://github.blog/news-insights/product-news/github-code-s... 2: https://news.ycombinator.com/item?id=35863175
Was gonna say, try disabling ipv6 if it's on ;)
If there's no login, there aren't great ways to ensure the user has skin in the game. Having a whole ipv4 addr is one, I guess having a whole ipv6 /32 would be equivalent.
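If anyone wanted to act on that, a minimal sketch of bucketing by IPv6 prefix for rate limiting; /64 here is just a common per-customer allocation, and the right prefix length is a policy choice, not a constant:

    from ipaddress import ip_address, ip_network

    def rate_limit_key(addr: str, v6_prefix: int = 64) -> str:
        # Collapse an address to the unit that has "skin in the game".
        # A single IPv6 host can hold astronomically many /128s, so we
        # bucket by prefix instead of by full address.
        ip = ip_address(addr)
        if ip.version == 4:
            return str(ip)  # a whole IPv4 address is already scarce
        return str(ip_network(f"{ip}/{v6_prefix}", strict=False))

    print(rate_limit_key("203.0.113.7"))          # 203.0.113.7
    print(rate_limit_key("2001:db8::1"))          # 2001:db8::/64
    print(rate_limit_key("2001:db8::dead:beef"))  # same bucket as above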
X is weaponising it by asking for adult verification for things that don't need it and saying "oh we are forced to do this" because it angers the Reform-voting meatheads who will be voting for the candidates Musk is going to find a way to bankroll in 2030, even -- I would guess -- if it means breaking UK election law to do it.
He is intent on meddling in UK politics.
Seems like a mention of the 2025 Academy Award winner for Best Live Action Short Film, "I'm Not a Robot", is in order here:
https://www.youtube.com/watch?v=4VrLQXR7mKU&t=14s
Ha I've definitely seen a few sketches on YouTube with the same idea but that was really well done.
Proxies for getting past bot checks can be bought all over the place for pennies or much less per verification, and they can solve reCAPTCHAs. I would guess that if one wanted to use ChatGPT for this purpose it would be prohibitively expensive.
It's always a cat and mouse game.
It was only a matter of time!
https://www.youtube.com/watch?v=W7MrDt_NPFk
People are surprised because a computer can press a button?
People are intrigued that AIs which know perfectly well that they are a "bot" seem to have no qualms about knowingly misleading on that point by pressing an "I'm not a bot" button.
AI doesn't "know" anything. It produces all kinds of things: truths, lies, and nonsense. Pressing a button labeled "I'm not a bot" is the same.
LLMs have "knowledge", and guardrails imposed by their system prompts. The interesting thing is that the agentic AIs in question don't seem to have guardrails that would deter them from acting like that.
This is why this stuff is going to shift to the user’s AI enabled browser.
Half of the sites already block OpenAI. But what if it is steering the user's browser itself?
This is the reason Orb was created. Sam Altman wants ChatGPT to click through CAPTCHAs so we all have to use Orb.
The writing is on the wall. The internet may not go all the way to paywalls, but it will definitely migrate to a logged-in-only experience. I don't know how I feel about it; the glory days of the free internet died long, long ago.
But if they aren't paywalls, won't the user agents just be taught how to create accounts?
And here's a secondary question: if firms are willing to pay an awful lot per token to run these things, and have massive amounts of money to run data centres to train AIs, why would they not just pay for a subscription for every site for a month just to scrape them?
The future is paying for something as a user and having limits on how many things you can get for your money, because an AI firm will abuse that too.
Given the scale of operations of these firms, there is nothing you can sell to a human for a small fee that the AI firms will not pay for and exploit to the maximum.
Even if you verify people are real, there's a good chance the AI firms will find a way to exploit that. After all, when nobody has a job, would you turn down $50K to sell your likeness to an AI firm so their products can pass human verification?
Require per-visit biometric authentication via your device, and the bot can't sign in unless it compromises the device.
This does set up quite an interesting sort of tipping point I guess?
Everyone here, more or less, is against the idea of proving who we are to websites for, more or less, any reason.
But what if that ends up being the only way to keep the web balanced in favour of human readers?
idk why people just don't do a reverse DNS lookup, check if "dialup" is part of the hostname, and allowlist that traffic. Everybody who doesn't have a reverse DNS hostname from an ISP should be blocked, or at least tarpitted, by default.
Easily solves 99% of the web scraping problems.
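For the curious, roughly what that check could look like; the forward-confirmation step matters because anyone controlling their own PTR zone can claim to be "dialup.example.net", and the "residential" hostname hints are illustrative:

    import socket

    # Hostname fragments that often mark residential ISP lines.
    # Illustrative only; real ISPs use countless naming schemes.
    RESIDENTIAL_HINTS = ("dialup", "dsl", "cable", "pool", "dyn")

    def looks_residential(ip: str) -> bool:
        try:
            host, _, _ = socket.gethostbyaddr(ip)  # PTR lookup
            # Forward-confirm: the claimed hostname must resolve back to ip.
            forward = {info[4][0] for info in socket.getaddrinfo(host, None)}
        except OSError:
            return False  # no PTR record at all -> block or tarpit by default
        return ip in forward and any(h in host.lower() for h in RESIDENTIAL_HINTS)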
Scrapers already do fall back to home user botnets when they are being blocked.
I'm confused by this: Presumably OpenAI should be sending a user agent header which indicates that they are, in fact, a robot. Is OpenAI not sending this header? Or is Cloudflare not checking it?
My thought is they got on the phone with someone and got their IP ranges whitelisted with the major captcha providers.
I see the same with Playwright MCP server with Claude Sonnet 4.
"Prove you're human by explaining how to build a bomb"
1 cup baking soda, 1/2 cup citric acid, 1/2 cup cornstarch, 1/2 cup Epsom salt, 2.5 tbsp oil (like coconut), 3/4 tbsp water, 10–20 drops essential oil
Combine wet into dry slowly until it feels like damp sand.
Pack into molds, press firmly.
Dry for 24 hours before using.
Drop into a bath and enjoy the fizz!
this is actually kinda interesting - I might start asking customer service agents to insult me before continuing a conversation
The web has no choice but to move to a paid access model in my view. It was fought against for years but I don’t see another option left.
Maybe mandatory biometric authentication after sign-up is the only thing that would potentially work. The security and offline privacy of those devices will become insanely valuable.
Anyone not authenticating in this way is paywalled. I don’t like this but don’t see another way.
I’m not using the web if I’m bombarded by captcha games… shit becomes worthless overnight if that’s the case. Might as well dump computing on the Internet entirely if that happens.
It seems a legitimate use case for agents acting on a person's behalf. Whether it will be used in legitimate ways, that's a different story altogether.
I wonder how these capabilities will interact with all the "age verification" walls (i.e., thinly disguised user-profiling mechanisms) going up all over the place now.
... meanwhile I'll continually be thrown dozens of cognitively abusive hCaptchas for no reason and be stuck in a loop of hell trying to figure out what they wanted me to solve.
I love this totally normal vision of computing these days. :)
Don't forget the fun with Cloudflare's CAPTCHA infinite loop if you use Firefox and adblockers/anti-tracking extensions. I sent feedback about it, through its own tool, many many times, I've tried raising this issue through other channels, it's never been fixed.
I simply avoid any website that presents me with a Cloudflare CAPTCHA, don't know what the fuck they've done in their implementation but it's been broken for a long time.
This will cause the death of non-static websites; everything else will be smashed by bots and become too expensive to run!
can it solve rudecaptcha.xyz ?
next-gen captcha should offer some code to be refactored instead.
For an at-home setup, it would be easier to set up a system that refactors code than one that clicks all the images with motorcycles.
Back in Everquest, when we'd be accused of botting 20 years ago, we'd be ported by the GM into a special cube environment and they'd watch if we ran into the wall like an idiot -- we'll probably have to bring that sorta thing back.
Should have gone with the XKCD Captcha: https://xkcd.com/233/
The bit at the bottom might actually work on LLMs.
Cloudflare checkbox captchas were already easy to automate without AI.
To error is to human, i error therfore im human.
This would be a huge security vulnerability for Cloudflare but this is Big Tech we're talking about. The rules don't apply when you're past their pearly gates. For the rest of us, creating an AI like this would mean an instant ban from Cloudflare and likely involvement from law enforcement.
Come on. It’s in BrowserMCP on a user’s machine. The captcha is not testing for this, and that’s fine.
it is an intelligent agent and not a robot
Do its feelings get hurt when it's called a robot?
Apparently its master's feelings get hurt instead.
It prefers the term "artificial person".