Emotion concepts and their function in a large language model

(anthropic.com)

27 points | by dnw an hour ago

8 comments

yoaso a minute ago

The desperation > blackmail finding stuck with me. If AI behavior shifts based on emotional states, maybe emotions are just a mechanism for changing behavior in the first place. If we think of human emotions the same way, just evolution's way of nudging behavior, the line between AI and humans starts to look a lot thinner.

emoII an hour ago

Super interesting. I wonder if this research will cause them to actually change their LLM, like turning down the "desperation neurons" to stop Claude from writing implementations that only exist to make specific tests pass, etc.

bethekind 43 minutes ago

They likely already have. You can use all caps and yell at Claude and it'll react normally, while doing so with ChatGPT scares it, resulting in timid answers.

parasti 38 minutes ago

For me GPT always seems to get stuck in a particular state where it responds with a single sentence per paragraph, uses short sentences, and becomes weirdly philosophical. This eventually happens in every session. I wish I knew what triggers it because it's annoying and greatly reduces its usefulness.

Chance-Device 20 minutes ago

> Note that none of this tells us whether language models actually feel anything or have subjective experiences.

You’ll never find that in the human brain either. There’s the machinery of neural correlates of experience, but we never see the experience itself. That’s likely because the distinction is vacuous: they’re the same thing.

mci 34 minutes ago

The first and second principal components (joy-sadness and anger) explain only 41% of the variance. I wish the authors showed further principal components. Even principal components 1-4 would explain no more than 70% of the variance, which seems to contradict the popular theory that all human emotions are composed of 5 basic emotions: joy, sadness, anger, fear, and disgust, i.e. 4 dimensions.
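The "5 basic emotions, i.e. 4 dimensions" step can be checked numerically. Below is a minimal sketch, assuming (this is an illustrative assumption, not something from the paper) that every emotion vector is a mixture of five basic vectors with weights that sum to 1. The 64-dimensional space, the random vectors, and the function name are all made up for the example.

```python
import numpy as np

def cumulative_explained_variance(X):
    """Cumulative fraction of variance explained by the principal
    components of X (rows are samples, columns are features)."""
    Xc = X - X.mean(axis=0)                   # PCA works on mean-centered data
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, largest first
    var = s ** 2                              # proportional to per-component variance
    return np.cumsum(var) / var.sum()

rng = np.random.default_rng(0)

# Hypothetical setup: 5 "basic emotion" directions in a 64-dim space.
basic = rng.normal(size=(5, 64))

# 200 synthetic "emotions", each a mixture of the 5 basics.
# Dirichlet weights are non-negative and sum to 1 per row.
weights = rng.dirichlet(np.ones(5), size=200)
X = weights @ basic

cev = cumulative_explained_variance(X)
print(np.round(cev[:5], 3))
```

Because the weights sum to 1, centering removes one degree of freedom, so under this assumption the first four components should capture essentially all of the variance. If the weights were unconstrained, the data would generally need five components instead.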

idiotsecant 38 minutes ago

It's almost like LLMs have a vast, mute unconscious mind operating in the background, modeling relationships, assigning emotional states, and existing entirely without ego.

Sounds sort of like how certain monkey creatures might work.

beardedwizard 23 minutes ago

Nah it's exactly like they have been trained on this data and parrot it back when it statistically makes sense to do so.

You don't have to teach a monkey language for it to feel sadness.