I'm not sure this is something I really worry about. Whenever I use an LLM I feel dumber, not smarter; there's a sensation of relying on a crutch instead of having done the due diligence of learning something myself. I'm less confident in the knowledge and less likely to present it as such. Is anyone really cocksure on the basis of LLM received knowledge?
> As a ChatGPT user I notice that I’m often left with a sense of certainty.
They have almost the opposite effect on me.
Even with knowledge from books or articles I've learned to multi-source and question things, and my mind treats the LLMs as a less reliable averaging of sources.
I remember back when I was in secondary school, something commonly heard was
"Don't just trust wikipedia, check it's resources, because it's crowdsourced and can be wrong".
Now, almost two decades later, I rarely hear this stance, and I see people relying on Wikipedia as an authoritative source of truth, i.e., linking to Wikipedia instead of the underlying sources.
In the same sense, I can see that "Don't trust LLMs" will slowly fade away and people will blindly trust them.
> "Don't just trust wikipedia, check it's resources, because it's crowdsourced and can be wrong"
This comes from decades of teachers misremembering what the rule was; eventually it morphed into the Wikipedia-specific form we see today. The actual rule is that you cannot cite an encyclopaedia in an academic paper. Full stop.
Wikipedia is an encyclopaedia and therefore should not be cited.
Wikipedia is the only encyclopaedia most people have used in the last 20 years, therefore Wikipedia = encyclopaedia in most people's minds.
There's nothing wrong with using an encyclopaedia for learning or introducing yourself to a topic (in fact, this is what teachers told students to do). And there's nothing specifically wrong with Wikipedia either.
> Now, almost two decades later, I rarely hear this stance, and I see people relying on Wikipedia as an authoritative source of truth, i.e., linking to Wikipedia instead of the underlying sources.
That's a different scenario. You shouldn't _cite wikipedia in a paper_ (instead you should generally use its sources), but it's perfectly fine in most circumstances to link it in the course of an internet argument or whatever.
There’s also the fact that both Wikipedia and LLMs are non-stationary. The quality of Wikipedia has grown immensely since its inception, and LLMs will get more accurate (if not explicitly “smarter”).
I'm not entirely convinced that the quality of Wikipedia has improved substantially in the last decade.
I don't think the cases are really the same. With Wikipedia, people have learned to trust that the probability of the information being at least reasonably good is pretty high, because there's an editing crucible around it and the ability to correct misinformation surgically. No one can hotpatch an LLM in five minutes.
Wikipedia is usually close enough and most users don't require perfection for their "facts"
I've noticed that things like Gemini summaries on Google searches are also generally close enough.
This captures my experience quite well. I can "get a lot more done," but it's not really me doing the things, and I feel like a bit of a fraud. As the workday and the workweek roll on, I find myself needing to force myself to look things up and experiment rather than just asking the LLM. It's quite clear that LLMs will make most people more dependent. People with better discipline, I think, will really benefit in big ways, and you'll see this become a new luxury belief; the disciplined geniuses around us will be genuinely perplexed why people say that LLMs have made them less capable, much in the same way they wonder why people can't just limit their drug use recreationally.
>it's not really me doing the things, and I feel like a bit of a fraud
I've been thinking about this a bit. We don't really think this way in other areas, is it appropriate to think this way here?
My car has an automatic transmission, am I a fraud because the machine is shifting gears for me?
My tractor plows a field, am I a fraud because I'm not using draft horses or digging manually?
Spell check caught a word, am I a fraud because I didn't look it up in a dictionary?
I've been thinking about that comparison as well. A common fantasy is that civilization will collapse and the guy who knows how to hunt and start a fire will really excel. In practice, this never happens and he's sort of left behind unless he also has other skills relevant to the modern world.
And, for instance, I have barely any knowledge of how my computer works, but it's a tool I use to do my job. (and to have fun at home.)
Why are these different from using LLMs? I think at least for me the distinction is whether something enables me to perform a task, or whether it's just doing the task for me. If I had to write my own OS and word processor just to write a letter, it'd never happen. The fact that the computer does this for me facilitates my task. I could write the letter by hand, but doing it in a word processor is way better, especially if I want to print multiple copies of the letter.
But for LLMs, my task might be something like "Setting up Apache is easy, but I've never done it, so just tell me how to do it so I don't fumble through learning and make it take way longer." The task was setting up Apache. The task was assigned to me, but I didn't really do it. There wasn't necessarily some higher-level task that I merely needed Apache for. Apache was the whole task! And I didn't do it!
Now, this will not be the case for all LLM-enabled tasks, but I think this distinction speaks to my experience. In the previous word processor example, the LLM would just write my document for me. It doesn't allow me to write my document more efficiently. It's efficient only in the sense that I no longer need to actually do it myself, except maybe to act as an editor (and most people don't even do much of that work). My skill in writing either atrophies or never fully develops, since I don't actually need to spend any time doing it or thinking about it.
In a perfect world, I use self-discipline to have the LLM show me how to set up Apache, then take notes, and then research, and then set it up manually in subsequent runs; I'd have benefited from learning the task much more quickly than if I'd done it alone, but also used my self-discipline to make sure I actually really learned something and developed expertise as well. My argument is that most people will not succeed in doing this, and will just let the LLM think for them.
I remember seeing a tweet a while back about how modernity separated work from physicality, so now you have to do exercise on purpose. I think the Internet plus car-driven societies had done something similar to being social, and LLMs are doing something similar to thinking, as well as to the kind of virtue that enables one to master a craft.
So, while it's an imperfect answer that I haven't really nailed down yet, maybe the answer is just to realize this and make sure we're doing hard things on purpose sometimes. This stuff has freed up time; we just can't spend it doomscrolling.
>Internet plus car-driven societies had done something similar to being social,
That's an interesting take on the loneliness crisis that I had not considered. I think you're really onto something. Thanks for sharing. I don't want to dive into this topic too much since it's political and really off-topic for the thread, but thank you for suggesting this.
> Is anyone really cocksure on the basis of LLM received knowledge?
Some people certainly seem to be. You see this a lot on webforums; someone spews a lot of confident superficially plausible-looking nonsense, then when someone points out that it is nonsense, they say they got it from a magic robot.
I think this is particularly common for non-tech people, who are more likely to believe that the magic robots are actually intelligent.
Most of the time it feels like a crutch to me. There have been a few moments where it unlocked deep motivation (by giving me a feel for the size of a solution based on ChatGPT output), and one research project where, for any crazy idea I threw at it, it would imagine what that idea would entail in terms of semantics, and then I was inspired even more.
The jury is still out on what value these things will bring.
Nah, I feel smart when I use it in a smart way to get stuff done faster than before.
Unfortunately I'm like you, and we are in the minority. The manager class loves the LLM and doesn't seem to consider its flaws like that.
> Is anyone really cocksure on the basis of LLM received knowledge?
Yeah, the stupid.
If you feel dumber, it’s because you’re using the LLM to do raw work instead of using it for research. It should be a google/stackoverflow replacement, not a really powerful intellisense. You should feel no dumber than using google to investigate questions.
I don't think this is entirely accurate. If you look at this: https://www.media.mit.edu/publications/your-brain-on-chatgpt..., it shows that search engines do engage your brain _more_ than LLM usage. So you'll remember more through search engine use (and crawling the web 'manually') than by just prompting a chatbot.
I feel like when I talk to someone and they tell me a fact, that fact goes into a kind of holding space, where I apply a filter of 'who is this person telling me this thing, and what does that mean for the thing they're telling me?'. There's how well I know them, there are the other beliefs I know they have, there's their professional experience and their personal experience. That fact then gets marked as 'probably a true fact' or 'Mark believes in aliens'.
When I use ChatGPT I do the same before I've asked for the fact: how common is this problem? How well known is it? How likely is it that ChatGPT both knows it and can surface it? Afterwards I don't feel like I know something; I feel like I've got a faster broad idea of what facts might exist and where to look for them, a good set of things to investigate, etc.
Reminds me of "default to null":
> The mental motion of “I didn’t really parse that paragraph, but sure, whatever, I’ll take the author’s word for it” is, in my introspective experience, absolutely identical to “I didn’t really parse that paragraph because it was bot-generated and didn’t make any sense so I couldn’t possibly have parsed it”, except that in the first case, I assume that the error lies with me rather than the text. This is not a safe assumption in a post-GPT2 world. Instead of “default to humility” (assume that when you don’t understand a passage, the passage is true and you’re just missing something) the ideal mental action in a world full of bots is “default to null” (if you don’t understand a passage, assume you’re in the same epistemic state as if you’d never read it at all.)
https://www.greaterwrong.com/posts/4AHXDwcGab5PhKhHT/humans-...
The important part of this is the "I feel like" bit. There's a fair and growing body of research suggesting that the "fact" is more durable in your memory than the context, and over time, across a lot of information, you will lose some of the mappings and integrate things you "know" to be false into your model of the world.
This fits our models of cognition more closely anyway. There is nothing in the human mind that works much like a filter, though there are things that feel like one.
Maybe, but then that's the same whether I talk to ChatGPT or a human, isn't it? Except with ChatGPT I can instantly verify what I'm looking for, whereas with a human I can't do that.
The effect of LLMs strikes me as similar to reading the newspaper: when I learn about something I have no knowledge base in, I come away feeling like I learned a lot. When I interact with a newspaper or an LLM in an area where I have real domain expertise, I realize they don't know what they're talking about, which is concerning for the information I get from them about topics where I don't have that level of domain expertise.
And why stop at newspapers? It's been a while since one could say books have any integrity; pretty much anyone can get anything into print these days, from political shenanigans to self-help books designed to confirm people's biases to sell more units. Video is by far the hardest to fake, but that's changing as well.
Regardless of what media you get your info from you have to be selective of what sources you trust. It's more true today than ever before, because the bar for creating content has never been lower.
Speaking of uncertainty, I wish more people would accept their uncertainty with regards to the future of LLMs rather than dash off yet another cocksure article about how LLMs are {X}, and therefore {completely useless}|{world-changing}.
Quantity has a quality of its own. The first chess engine to beat Garry Kasparov wasn't fundamentally different from earlier ones--it just had a lot more compute power.
The original Google algorithm was trivial: rank web pages by incoming links--its superhuman power at giving us answers ("I'm feeling lucky") was/is entirely due to a massive trove of data.
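(For anyone who hasn't seen it spelled out: below is a toy sketch of that link-based ranking idea, a simplified PageRank-style power iteration. The link graph, damping factor, and iteration count are invented purely for illustration; this is not Google's actual implementation.)

    # Toy PageRank by power iteration: pages gain rank from the pages that link to them.
    links = {
        "a": ["b", "c"],   # page "a" links out to "b" and "c"
        "b": ["c"],
        "c": ["a"],
        "d": ["c"],
    }
    damping = 0.85
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}

    for _ in range(50):  # iterate until the ranks settle
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outgoing in links.items():
            share = damping * rank[page] / len(outgoing)
            for target in outgoing:
                new_rank[target] += share  # pass a share of rank along each outgoing link
        rank = new_rank

    print(sorted(rank.items(), key=lambda kv: -kv[1]))  # "c" has the most incoming links and ranks highest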
And remember all the articles about how unreliable Wikipedia was? How can you trust something when anyone can edit a page? But again, the power of quantity--thousands or millions of eyeballs identifying errors--swamped any simple attacks.
Yes, LLMs are literally just matmul. How can anything useful, much less intelligent, emerge from multiplying numbers really fast? But then again, how can anything intelligent emerge from a wet mass of brain cells? After all, we're just meat. How can meat think?
>I think LLMs should not be seen as knowledge engines but as confidence engines.
This is a good line, and I think it tempers the "not just misinformed, but misinformed with conviction" observation quite a bit, because sometimes moving forward with an idea at less than 100% accuracy will still bring the best outcome.
Obviously that's a less-than-ideal thing to say, but IMO (and in my experience as a former gifted student who struggles to ship) intelligent people tend to underestimate the importance of doing stuff with confidence.
Confidence has multiple benefits. But one of those benefits is social - appearing confident triggers others to trust you, even when they shouldn’t.
Seeing others get burned by that pattern over and over can encourage hesitation and humility, and discourage confident action. It’s essentially an academic attitude and can be very unfortunate and self-defeating.
>“the problem with the world is that the stupid are cocksure, while the intelligent are full of doubt.”
Is it just me, or does everyone find that dumb people seem to use this statement more than ever?
It appears to be a paraphrasing of William Butler Yeats https://en.wikipedia.org/wiki/The_Second_Coming_(poem)
Ugh. You can be cocksure of your doubts. It's still confidence, duh.
Everyone thinks they're the intelligent ones, of course. Which reinforces the repetition ad nauseam of Dunning-Kruger. Which is in itself dumb AF, because the effect described by Dunning and Kruger has been repeatedly exaggerated and misinterpreted. Which in turn is even dumber, because the Dunning-Kruger effect is debatable and its reproducibility is weak at best.
Yeah, nobody who ever mentions the DK effect (myself included) ever stops to consider they might be in the "dumb" cohort ;)
We are all geniuses!
I very much agree. I've been telling folks in trainings that I do that the term "artificial intelligence" is a cognitohazard, in that it pre-consciously steers you to conceptualize an LLM as an entity.
LLMs are cool and useful technology, but if you approach them with the attitude you're talking with an other, you are leaving yourself vulnerable to all sorts of cognitive distortions.
I don't think that is actually a problem. For decades people have believed that computers can't be wrong. Why, now, suddenly, would it be worse if they believed the computer wasn't a computer?
The larger problem is cognitive offloading. The people for whom this is a problem were already not doing the cognitive work of verifying facts and forming their own opinions. Maybe they watched the news, read a Wikipedia article, or listened to a TED talk, but the results are the same: an opinion they felt confident in without a verified basis.
To the extent this is on 'steroids', it is because they see it as an expert (in everything) computer and because it is so much faster than watching a TED talk or reading a long form article.
It certainly isn't helped by the RLHF and chat interface encouraging this. LLM providers have every incentive to make their users engage with it like an other.
Use an agent to create something with a non-negotiable outcome, e.g. software that does something useful (or fails to) in a language you don’t program in. This is a helpful way to calibrate your own understanding of what LLMs are capable of.
8 months or so ago, my quip regarding LLMs was “stochastic parrot.”
The term I’ve been using of late is “authority simulator.” My formative experience of “authority figures” was of people who could speak with breadth and depth about a subject and who seemed to have internalized it, because they could answer quickly and thoroughly. Because LLMs do this so well, it’s really easy to feel like you’re talking to an authority on a subject. And even though my brain intellectually knows this isn’t true, emotionally, the simulation of authority is comforting.
It's possible that the Dunning-Kruger effect is not real, only a measurement or statistical artefact [1]. So it probably needs more and better studies.
[1] https://www.mcgill.ca/oss/article/critical-thinking/dunning-...
The title makes this incomprehensible. The author seemingly defines Dunning-Kruger as the... opposite of the Dunning-Kruger effect.
From the title I thought this was a repost of 'AI is Dunning-Kruger as a service ' https://news.ycombinator.com/item?id=45851483
It is not.
There are so many guardrails now that are being improved daily. This blog post is a year out of date. Not to mention that people know how to prompt better these days.
To make his point, you need specific examples from specific LLMs.
> I feel like LLMs are a fairly boring technology. They are stochastic black boxes. The training is essentially run-of-the-mill statistical inference. There are some more recent innovations on software/hardware-level, but these are not LLM-specific really.
This is pretty ironic, considering the subject matter of that blog post. It's a super-common misconception that's gained very wide popularity due to reactionary (and, imo, rather poor) popular science reporting.
The author parroting that with confidence in a post about Dunning-Krugering gives me a bit of a chuckle.
What's the misconception? LLMs are probabilistic next-token prediction based on current context, right?
Yeah, but that's their interface. That informs surprisingly little about their inner workings.
ANNs are arbitrary function approximators. The training process uses statistical methods to identify a set of parameters that approximate the function as best as possible. That doesn't necessarily mean that the end result is equivalent to a very fancy multi-stage linear regression. It's a possible outcome of the process, but it's not the only possible outcome.
Looking at an LLM's I/O structure and training process is not enough to conclude much of anything. And that's the misconception.
> Yeah, but that's their interface. That informs surprisingly little about their inner workings.
I'm not sure I follow. LLMs are probabilistic next-token prediction based on the current context; that is a factual, foundational statement about the technology that runs all LLMs today.
We can ascribe other things to that, such as reasoning or knowledge or agency, but that doesn't change how they work. Their fundamental architecture is well understood, even if we allow for the idea that maybe there are some emergent behaviors that we haven't described completely.
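(To make "probabilistic next-token prediction conditioned on context" concrete, here's a deliberately tiny sketch. The bigram table below is a made-up stand-in for a real model, which would condition on the whole context window over subword tokens, but the generation loop has the same shape: predict a distribution, sample, append, repeat.)

    import random

    # Toy stand-in for a language model: P(next word | previous word).
    next_word_probs = {
        "the": {"cat": 0.6, "dog": 0.4},
        "cat": {"sat": 0.7, "ran": 0.3},
        "dog": {"ran": 0.8, "sat": 0.2},
        "sat": {"down": 1.0},
        "ran": {"away": 1.0},
    }

    def generate(prompt, max_tokens=5):
        tokens = prompt.split()
        for _ in range(max_tokens):
            dist = next_word_probs.get(tokens[-1])
            if dist is None:  # no known continuation: stop
                break
            words, probs = zip(*dist.items())
            tokens.append(random.choices(words, weights=probs)[0])  # sample the next token
        return " ".join(tokens)

    print(generate("the"))  # e.g. "the cat sat down"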
> It's a possible outcome of the process, but it's not the only possible outcome.
Again, you can ascribe these other things to it, but to say that these external descriptions of outputs call into question the architecture that runs these LLMs is a strange thing to say.
> Looking at an LLM's I/O structure and training process is not enough to conclude much of anything. And that's the misconception.
I don't see how that's a misconception. We evaluate pretty much everything by inputs and outputs, and we use those to infer internal state, because that's all we're capable of in the real world.
I also find it hard to get excited about black boxes - imo there's no real meat to the insights they give, only the shell of a "correct" answer
I'm not sure what claim you're disputing or making with this.
What more are LLMs than statistical inference machines? I don't know that I'd confidently assert that's all they are, but all the configuration options I can play with during generation (top-k, top-p, temperature, etc.) are ways to _not_ select the most likely next token, which leads me to believe that they are, in fact, just statistical inference machines.
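(For reference, here's a rough sketch of what those knobs usually do, following the common definitions of temperature, top-k, and top-p/nucleus sampling rather than any particular library's implementation; sample_next and the toy logits are invented for illustration.)

    import math
    import random

    def sample_next(logits, temperature=1.0, top_k=None, top_p=None):
        # Temperature rescales the scores: <1 sharpens toward the top token, >1 flattens.
        scaled = [l / temperature for l in logits]
        biggest = max(scaled)
        exps = [math.exp(s - biggest) for s in scaled]  # softmax, shifted for numerical stability
        total = sum(exps)
        probs = [e / total for e in exps]

        # Sort token ids from most to least likely, then trim the tail.
        candidates = sorted(range(len(probs)), key=lambda i: -probs[i])
        if top_k is not None:        # top-k: keep only the k most likely tokens
            candidates = candidates[:top_k]
        if top_p is not None:        # top-p: keep the smallest set covering probability mass p
            kept, mass = [], 0.0
            for i in candidates:
                kept.append(i)
                mass += probs[i]
                if mass >= top_p:
                    break
            candidates = kept

        # The final pick is still a weighted random draw, not always the argmax.
        weights = [probs[i] for i in candidates]
        return random.choices(candidates, weights=weights)[0]

    # Toy logits over a 5-token vocabulary; the numbers are made up for illustration.
    print(sample_next([2.0, 1.5, 0.3, 0.1, -1.0], temperature=0.8, top_k=3, top_p=0.9))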
Humans broadly have a tenuous grasp of “reality” and “truth.” Propagandists, spies and marketers know what philosophers of mind prove all too well: most humans do not perceive or interact with reality as it is, rather their perception of it as it contributes or contradicts their desired future.
Provide a person confidence in their opinion and they will not challenge it, as that would risk the reward of feeling that you live in a coherent universe.
The majority of people have never heard the term “epistemology,” despite the concept being central to how people derive coherence. Yet all these trite pieces written about AI and its intersection with knowledge claim some important technical distinction.
I’m hopeful that a crisis of epistemology is coming, though that’s probably too hopeful. I’m just enjoying the circus at this point
I hate to comment on just a headline—though I did read the article—but it's wrong enough to warrant correcting.
This is not what the Dunning-Kruger effect is. The effect is about lacking the metacognitive ability to accurately assess one's own skill level; overconfidence resulting from ignorance isn't the same thing. Joe Rogan propagated the version of this phenomenon that infiltrated public consciousness, and we've been stuck with it ever since.
Ironically, you can plug this story into your favorite LLM and it will tell you the same thing. And, also ironically, the LLM will generally know more than you in most contexts, so anyone with a degree of epistemic humility is better served taking it at least as seriously as their own thoughts and intuitions, if not at face value.