Slightly tangential, but this is the reason I'm baffled why people think that AI-driven podcasts would ever be worth listening to.
If you can find an LLM+TTS generated podcast with even a FRACTION of the infectious energy of something like Good Job Brain, Radiolab, or Ask Me Another, then I'll eat my hat. I don't even own one. I'll drive to the nearest hat store, purchase the tallest stovepipe hat that I can afford and eat it.
I totally agree, but I also think it's just not geared towards us because of our age and preexisting beliefs of "what podcasts/news/videos are supposed to be". Think of kids under 12 or so. If they'll keep consuming AI-driven media, they'll take that as normal and won't blink twice if that becomes the de-facto standard in about 10 years.
It's the same as short-video format for me. Sure, I can watch some TikToks from time to time, but making them, or continuously getting my news from them? Yeah, that's not gonna fly for me. However, for my niblings (age 8-17) that's basically where they're getting all the "current affairs" from. Microtransactions are probably one of the easiest examples as well. 15 years ago, anyone who bought games would laugh at you if you said every game that you paid $80 for would also have endless amount of small items that you can buy for real money. Right now? Well, kids that grew up with Fortnite and Roblox just think that's the norm.
The younger generation loves short-form, to-the-point stuff. Which is the exact opposite of what the current crop of GenAI makes. In a TikTok video, every sentence, every word counts because there's a time limit. If people don't engage with your content in the first 3 seconds, it's worthless. The video linked in another post starts off with 15 seconds of complete fluff. You'd have better engagement if you have a guy opening with BIIIG NEWS!! LABOR DAY BREAKFAST GOES HAYWIRE!!! and hook people.
GenAI is great at generating "stuff", but what makes good content isn't the quantity. What makes good content is when there's nothing left to take away.
Isn't it just a matter of time until AI gets trained to generate attention-grabbing videos? Also, "the first three seconds" isn't exactly the case anymore. There's a push for the algorithm to favour videos that are longer than a minute. Which, to my understanding, is TikTok's way of fighting for YouTube's userbase.
The videos are longer in total length, but you've never seen the average TikTok/Insta user if you think people are letting videos play for more than a few seconds before scrolling onto the next one. This is why movie trailer videos now have a "trailer for the trailer" in the opening seconds with like "THE TRAILER FOR SONIC 3... STARTS NOW" with all of the most attention-grabbing scenes frontloaded.
> Isn't it just a matter of time until AI gets trained to...
blah blah yes "it's a matter of time" for every one of the myriad shortcomings of the technology to be resolved. If you're a true believer everything is "a matter of time". I'll believe it when I see it.
I'm also pretty sure you'll see it eventually... Consider the possibility that this is both a bubble waiting to pop, as well as the stuff that will shape the future. Kind of like the Internet around the year 2000.
Depends on the technology. It’s hard to look at the progress from December of 2022 till today, and think we won’t go further. Image generation is getting better every day. Parts of the video generation pipelines are also advancing.
Is that true of tiktoks in general? I feel like a lot of the short form videos out there purposefully bait the watcher and drag on for 3/4 of their runtime.
If you want people to watch your content there's definitely a time limit. I don't mean anything imposed by the platform; I just mean if people aren't interested within 3 seconds, they're scrolling to the next video in their feed.
Novelty grabs people's attention. A system based on the statistical analysis of past content won't do novelty. This seems like a very basic issue to me.
Novelty itself is easy, the hard part is the kind of novelty that is familiar enough to be engaging while also unusual enough to attract all the people bored by the mainstream.
Worse, as people attempt to automate novelty, they will be (and have been) repeatedly thwarted by the fact that the implicit patterns of the automation system themselves become patterns to be learned and recognised… which is why all modern popular music sounds so similar that this video got made 14 years ago: https://www.youtube.com/watch?v=5pidokakU4I
(This is already a thing with GenAI images made by people who just prompt-and-go, though artists using it as a tool can easily do much better).
But go too soon, be too novel, and you're in something like uncanny valley: When Saint-Saëns' Danse macabre was first performed, it was poorly received for violating then-current expectations; now it's considered a masterpiece.
> If people don't engage with your content in the first 3 seconds, it's worthless.
Is that based on rigorous experience as a long-form content creator on TikTok, or are you going off what you've heard about TikTok, or is it what your feed is full of? (which says more about you/your feed than it does about TikTok)
All of those immediately have something attention-grabbing within the first few seconds: picture of Mars' surface, run map showing funny human shape, text with "Gen Z programmers are crazy" prefacing the anecdote, going immediately into the IT-related rap, guy holding a big and cool-looking stick, "dealing with your 10x coworker" immediately showing the point of the video. The only one that doesn't is the SQL one I guess but that's a very low view count (relatively) on a niche channel.
So thanks for providing a bunch of examples that prove my point, I guess?
none of them are breathless "BIIIG NEWS!! LABOR DAY BREAKFAST GOES HAYWIRE!!!" attention demanding within the first three seconds imo, but, sure, whatever, you're totally right
I've never used tiktok and this post was... enlightening. The ClickUP HR guys are actually pretty funny... but wtf is this from that first channel you posted... lol? https://www.tiktok.com/@luckypassengers/video/74395775119539... - I mean, she's not wrong.
The idea that "TikTok is for Gen Z" seems like a very stale meme, although I only have anecdata to back that up.
Microtransactions are way older than 15 years. Wizards of the Coast was selling randomized MtG booster packs in 1993. I'm guessing that the earliest loot boxes for kids were baseball card packs, with very similar psychological purpose to today's game cosmetic collectibles.
I totally agree, but it’s just easier for people to accept something if they grew up with it. Sure there will always be people from older generations dabbling with new stuff. But quite a few people refuse to change their behaviour as they age. I wrote them as examples, because it is the biggest contrast I can see in online behaviour between myself and my nephews/nieces plus their circles.
I think the main difference is digital vs physical goods. I know it's minor since a card is just a cheap piece of cardboard, but it's still something tangible (unless the game cosmetics also include a physical item, in which case...I'm dumb).
"If they'll keep consuming AI-driven media, they'll take that as normal and won't blink twice if that becomes the de-facto standard in about 10 years."
There is a sense in which that is true.
However, we all develop taste, and in a hypothetical world where current AI ends up being the limit for another 10 or 20 years, eventually a lot of people would figure out that there's not as much "there" there as they supposed.
The wild card is that we probably don't live in that world, and it's difficult to guess how good AI is going to get.
Even now, the voice of AI that people are complaining about is just the current default voice, which will probably eventually be looked on about as favorably as bell-bottomed jeans or beehive hairdos. It is driven less by the technology itself than a complicated set of desires around not wanting to give the media nifty soundbites about how mean (or politically incorrect) AI can be, and not wanting to be sued. It's minimal prompt engineering even now to change it a lot. "Make a snappy TikTok video about whatever" is not something the tech is going to struggle with. In fact given the general poverty of the state space I would guess it'll outcompete humans pretty quickly.
"eventually a lot of people would figure out that there's not as much 'there' there as they supposed."
Let's be honest: most of us here know there is more 'there' on the myriad university-press books available free on Anna's Archive, than on HN. The reason we still hang out here is desire for socializing, laziness, or pathological doomscrolling; information density doesn't really factor into our choices.
Personally I don’t think it has anything to do with normalcy.
I don’t consume AI media because it’s not very good.
I watched a lot of bad movies and read a lot of bad books as a kid that I can’t stomach now because I’ve read better books and watched better movies. My guess is that kids today would do the same, assuming AI doesn’t improve.
> anyone who bought games would laugh at you if you said every game that you paid $80 for would also have endless amount of small items that you can buy for real money.
> kids that grew up with Fortnite and Roblox just think that's the norm.
How so? Fortnite and Roblox don’t cost $80, they are free to play with in game purchases.
They’re F2P, but the social pressure from your friends to buy the next skin, mini game, etc. completely normalizes the behaviour as you grow up. Then you take it as a “usual thing” and don’t bat an eye when a game you buy also has different skins in the game.
At least that was my perception when I played Call of Duty with my younger family members.
Agreed. I finally got around to listening to a NotebookLM generated podcast this weekend, and found it absolutely unlistenable.
For some reason the LLM seemed to latch on to the idea that one host should say something, and the other host should just repeat a key phrase in the sentence for emphasis -- and say nothing else, and they'd do it like multiple times in a sentence, over and over again throughout the entire thing.
Slightly less weird -- but it seems like the LLM caught on that a good narrative structure for a two-host podcast is that one host is the 'expert' on the topic, and the other host plays dumb and asks questions. Not an unreasonable narrative structure. Except that the hosts would seamlessly, and very weirdly, switch roles constantly throughout the podcast.
And ultimately the result was just a high level summary of the article I had provided. They told me in the intro and the outro about the interesting parts they were going to dive into, but they never actually got around to diving into those parts.
I tried it with something fairly abstract about a decision I was working on making. Fed it a bunch of information, details about me, my background, the factors I'm considering in the decision and the impact of getting it right/wrong.
It was interesting. I certainly wouldn't say it was useless, I think the contrived dialogue actually touched on some angles I hadn't considered and I think it was useful. Not in a 'oh shit it's clear to me now' kind of way but it definitely advanced my thinking.
Go back to the 1990s (even probably the 2000s?) and ask literally anyone what they think of the idea of people spending huge quantities of their time on this mortal plane watching videos of other people talking about how they do their make up and pick their outfits.
Are those videos "worth watching"? Are videos of people playing video games "worth watching"? Are videos of people opening products and saying out loud the information written on the box and also easily accessible on the public internet "worth watching"?
I'm not happy about these developments, but that isn't a factor of much concern to the people driving and following these trends, it turns out.
It's been a long time since "quality" and "success" have been decoupled. Which is to say - I hope you like the taste of hats.
Bad TV is a lot older than that. It probably seems weird nowadays to watch “I Dream of Jeannie” and “Gilligan’s Island” reruns because that’s what was on TV. Or how about game shows and soap operas? Daytime TV was the worst. But people watched.
At least in the past people were celebrities for a reason other than the number of followers they had on social. It'd be nice if we could return to a time when people were part of the public discourse because they were good at something (or their parents were rich, sadly).
No dog in this fight, I don't know what exactly you're doing, but I'd cautiously point out that it is, in fact, novel to the internet era to watch rando microcelebrities doing makeup step by step, no matter how long we delay acknowledging that with microquibbles.
I could throw in an example of how I'll watch boring videos of a couple playing with their birds for 90 minutes on YouTube. You can link me to the Wikipedia page on slow TV (via Norway), and it won't erase the simple, boring, straightforward fact that it is a phenomenon.
I didn’t interpret the original comment to be about micro celebrities, but about people supposedly wasting time today in ways they didn’t beforehand. I agree that micro celebrities are a new phenomenon somewhat (although they are also sort of a return to more regional distribution of fame.) But that wasn’t the point being made.
> At least in the past people were celebrities for a reason other than the number of followers they had on social.
“Other than”? Obviously, as social media didn't exist. “Better than” or “more relevant to the things for which their celebrity status was used to direct attention”? Not particularly.
I'm pretty sure humans have been finding ways to unproductively waste time for millennia...
I'm not sure how watching lightweight videos on subjects you're curious about is any worse than how people wasted time in the past?
Personally I waste time watching videos on outdoor gear, coffee making equipment and PC hardware. I certainly don't regret it because I had no plans to do anything productive with that time either.
> It's been a long time since "quality" and "success" have been decoupled.
I think so, too.
I guess quality was a property of interest in the old days, because the path e.g. for commercial music was: Maximize profit -> Maximize sales -> Maximize what the target audience likes -> Maximize quality.
For TikTok etc. they bypass the market sales stuff and replace it by an 'algorithm', that optimizes for retention, which is tightly coupled to ad revenue. I imagine the algorithm as a function of many arguments.
Just relying on quality is an inefficient approximation in contrast to that.
I'm not disputing that people waste inordinate amounts of time running out the clock before they shuffle off this mortal coil. (cough every X demographic reacts to Y video cough).
Agreement to eat said head apparel is predicated upon "infectious energy" (i.e. quality) - NOT success. I'll draft up a more officious document later.
Note: I am the sole arbiter of what constitutes quality.
To me, unboxing videos are documentation, not entertainment. When I need to know exactly what's in a box and how it's packaged, an unboxing video is the only source of that information.
I have probably written this comment about a dozen times on HN already, but: I agree completely, because people don't listen to podcasts purely for information, they listen for information plus community, personality, or just a basic human connection.
If you're a content creator today, the best thing you can do to "AI-proof" your work is to inject your personality into it as much as possible. Preferably your physical personality, on video. The future of human content is being as human as possible. AI isn't going to replace that in our lifetimes, if ever.
> The future of human content is being as human as possible. AI isn't going to replace that in our lifetimes, if ever.
I have been working on machine learning algorithms for a long time. Since the time when telling someone I worked on AI was a conversation killer, even with technical people.
AI's are going to understand people better than people understand people. That could be in five years, maybe - many things are happening faster than expected. Or in 15 years. But that might be the outside range.
There is something about human psychology where the faster something changes, the less we are aware of the rate of change. We don't review the steps and their increasing rate that happened before we cared about that tech. We just accept the new thing that finally gets our attention like it was a one off event, instead of an accelerating compounding flood, and imagine it isn't really going to change much soon.
--
I know this isn't a popular view.
But what is happening is on the order of the transition to the first multi-cellular creatures, or the first bodies with specialized cells, the first nervous systems, the first brains, the first creatures to use language. Far bigger than advances such as writing or even the Internet. This is a transition to a completely new mode for the substrate of life and intelligence. The lack of dependency on any particular substrate.
"We", the new generation of things we are building, will have none of the limits and inefficiencies of our ancient slow DNA-style life. Or our stark physical bottlenecks on action, communication, or scaling, and our inability to understand or directly edit our own internal traits.
We will have to face many challenges in the coming years. It can't hurt to mindfully face them earlier than later.
Yes, this seems possible. I only wonder if it is too fragile to be self perpetuating. But we're here after all, and this place used to be just a wet rock.
What machine learning algorithm have you worked on that leads you to believe they are capable of having a rich internal cognitive representation anywhere close to any sentient conscious animal?
> AI's are going to understand people better than people understand people.
Maybe, but very little of the “data” that humans use to build their understanding of humans is recorded and available. If it were it’s not obvious it would be economical to train on. If it were economical it’s not obvious that current techniques would actually work that well and by definition no future techniques are known to work yet. I’m not inclined to say it will never happen but there are a few reasons to predict it’ll prove to be significantly harder to build AI that gets out of the uncanny valley that it’s currently in.
You are describing the current state of AI as if it were a stable point.
AI today is far ahead of two years ago. And for many years before that breakout, deep learning models broke benchmark after benchmark, year after year.
There is no indication of any slow down. The complete reverse - we are seeing dramatic acceleration of an already fast moving field.
Both research and resources are pouring into major improvements in multi-modal learning and learning via other means than human data. Such as reinforcement learning, competitive learning, interacting with systems they need to understand via simulated environments and directly.
This makes me think about attention span. Scenes in movies, sound bites, everything has been getting shorter over the decades. I know mine has gotten shorter, because I now watch highly edited and produced videos at 2x speed. Sometimes when I watch at 1x speed I find myself thinking "why does this person speak so slowly?"
Algorithmic content is likely to be even more densely packed with stimuli. At some point, will we find ourselves unable to attend to content produced by a human being because algorithmic content has wrecked our attention span?
People are judging AI by what abusers of AI put out there for lols and what they themselves can wring out of it. They haven't yet seen what a bunch of AAA professionals with nothing to lose can build and align.
Billions of dollars and all of Silicon Valley's focus has been spent over the last 2 years trying to get AI to work, the "AAA professionals" are already working on AI and I still have yet to see an AI generated product that's interesting or compelling.
Sorry but I don’t think this is much evidence of anything. The point at which an AI can imitate a live-streaming human being is decades away. By then, we will almost certainly have developed a “real Human ID” system that verifies one’s humanity. I wrote about this more here:
The idea that AI is just going to eat all human creative activity because technology accelerates quickly is not a real argument, nor does it stand up to any serious projections of the future.
AI is already eating its way up the creative ladder, this is 100% irrefutable. Interns, junior artists, junior developers, etc are all losing jobs to AI now.
The main problem for AI is it doesn't have a coherent creative direction or real output consistency. The second problem is that creativity thrives on novelty but AI likes to output common things. The first is solvable, probably within 5 years, and is going to hollow out creative departments everywhere. The second is effectively unsolvable, though you might find algorithms that mask it temporarily (I'm not sure if this is any different than what humans do).
We're going to end up with "rock star" teams of creative leads who have more agility in discovering novelty and curating aesthetics than AI models. They'll work with a small department comprised of a mix of AI wranglers and artisans that can put manual finishing touches on AI generated output. Overall creative department sizes are probably going to shrink to 20% of current levels but output will increase by 200%+.
Of course it’s possible I’m wrong. But if we make any sort of projection based on current developments, it would certainly seem that live-streaming AI indistinguishable from a human being is vastly beyond the capabilities of anything out today, and given current expenses and development times, seems to be at least a few decades in the future. To me that is an optimistic assumption, especially assuming that live presence or video quality will continue to improve as well (making it harder to fake.)
If you have a projection that says otherwise, I’d be glad to hear it. But if you don’t, then this idea is merely science fiction.
Making predictions about the future that are based on current accelerating developments is how you get people in the 1930s predicting flying cars by 2000.
For many people current AI created videos are already confused for real videos and vice versa.
When you follow up on technology by browsing HN, and see the latest advancements, it's easier to see or hear the differences, because you know what to look for.
If I see on TV some badly encoded video, especially in fog or water surfaces, it immediately stands out, because I was working with video decoding back when it was of much lower quality. Most people will not notice.
Infectious energy is something I expect faked relatively easily, though I don't know your examples and doing it in AI might be a "first 90%" situation just like self driving cars; for me the problem is that they're fairly mediocre at the actual script — based on me putting a blog post I wrote into one and listening to what came out.
Given how many podcasts exist, I think you need to be at least 2 standard deviations above mean to even get noticed, 3 to be a moderate success, and 4 to be in the charts.
I'd guess AI is "good enough" to be 1 above average, as the NotebookLM voices sound like people speaking clearly and with some joy into decent microphones in sound isolating studios.
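To put rough numbers on those standard-deviation thresholds, here is a minimal sketch. It assumes podcast "quality" is normally distributed, which is my assumption for illustration and not something the comment above claims; `fraction_above` is a hypothetical helper name.

```python
import math


def fraction_above(z: float) -> float:
    """Upper-tail probability of a standard normal: P(Z > z)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Thresholds from the comment: 1 SD (the AI guess), 2 (noticed),
# 3 (moderate success), 4 (in the charts).
for z in (1, 2, 3, 4):
    frac = fraction_above(z)
    print(f"{z} SD above the mean: top {frac:.4%} of podcasts")
```

Under that assumption, 1 SD above the mean is only about the top 16%, while 2 SD is roughly the top 2.3% and 4 SD is on the order of a few in 100,000, so "1 above average" would still leave a lot of ground to cover.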
As someone who's struggled to really get into podcasts, I'm convinced that most people who enjoy podcasts don't really actively listen to them, they just like that extra bit of noise in the background while they do something else, with the added bonus that they might just listen at just the right time to pick up some interesting factoid.
Don't get me wrong, I like podcasts, Fall of Civilizations being a personal favorite of mine, it's just that my desire to try and actively listen to them requires that I carve out an hour (or much longer in the case of FoC, worth it) of my free-time to eliminate any distractions and focus. Pretty hard to do when there are so many other things I could be doing with my time besides sitting there trying my best to listen.
Anyway, I bring it up because I'm convinced that the people promoting AI-podcasts are mostly made up of the aforementioned people who just listen to them for noise.
I switched to improvised pods for this reason. Podcasts are for doing the dishes and mowing the lawn and playing Factorio. I don't turn on anything with 'meat' unless I have a long car trip ahead of me. I just am not able to sit still and only listen without feeling like I could be doing something else, and when I'm doing something else I'm gonna start tuning out eventually.
Hey Riddle Riddle, Hello from the Magic Tavern, Artists on Artists on Artists on Artists are my picks for now.
I used to listen to podcasts during my commute, when they took the place of listening to NPR or talk radio. Now that I work from home full time, I listen to them when I'm doing dishes or similar chores, going on walks around the city, and when I'm doing any significant amount of driving. I listen to a combination of comedy/history (love The Dollop), politics, and sci-fi commentary.
For the most part, I'm listening pretty actively, but if I'm just sitting there listening without a fairly mindless activity going on, I'll get distracted pretty quickly and find myself looking at my phone...
I agree. I think I listen to podcasts just for a bit of "social noise" when I'm doing something by myself and for occasionally picking up on something new I hadn't heard of before. For pure information content, I think they're actually very poor. It's not unlike listening to an old AM radio talk show. Hosts repeat themselves, engage in banter, and often oversimplify topics for the sake of a narrative.
I also think AI podcasts could become popular one day for people who just want some background noise and bits of trivia every once in a while. I would argue that a lot of YouTube channels I sometimes have in the background just summarize Wikipedia articles and don't have much of a personal touch anyway, so an AI could do the same thing.
I figured people listened to them while commuting. It seemed the best fit: lots of time, mostly little attention needed (except when you alight or are driving in adverse conditions).
I don't get them either since the hosts are typically "regular people" talking about often complicated subjects that they are by no means domain experts in.
I find the format of "Dumdum Host A read some articles about something last night and Dumdum Host B asks questions about it" especially grating, unless it's purely opinion-driven and then I still probably won't care unless I read about the hosts and find out they are probably people with opinions worth listening to.
I'd rather read a book or be with my own thoughts without having them be even more crowded by some randos telling me stuff they think they know
> I don't get them either since the hosts are typically "regular people" talking about often complicated subjects that they are by no means domain experts in.
Huh? Most podcasts I listen to feature either experts in the field in question or very skilled journalists compiling the opinions of experts.
This is partially true for me, but when I listen to podcasts it’s generally in what would otherwise be dead time for me such as when grocery shopping or exercising.
> Don't get me wrong, I like podcasts, Fall of Civilizations being a personal favorite of mine, it's just that my desire to try and actively listen to them requires that I carve out an hour (or much longer in the case of FoC, worth it) of my free-time to eliminate any distractions and focus.
I don't see why that should be true. Podcasts are great when coupled with mindless work like mowing the lawn, weeding the garden, stacking firewood, walking to the shop, driving etc. You can get virtually 100% out of a podcast while performing tasks like this.
I don't understand enjoying podcasts while just using them for background noise. Maybe because I actually listen, and most podcasts I listen to seem to have an engaged audience. But then I like listening to people talk about philosophy, politics, sports, true crime, science, history.
> I'm convinced that most people who enjoy podcasts don't really actively listen to them,
Or you know not everyone is the same. Just because you struggle with something doesn’t mean that it is not easy and effortless for others.
I can’t dribble a basketball and walk at the same time, still I won’t make the claim that everyone who claims to enjoy playing basketball is somehow fudging it.
Having this strong of an opinion on something you admittedly don't understand (the appeal of podcasts) is not a great choice. Maybe ask people instead of making up stuff.
Excuse me for being an old geezer but at least AI bots don't tend to pepper their sentences with frequent utterances of 'like'. I don't normally find this speech mannerism annoying, and I do it myself too, but when 'like' is overused I switch off.
For this reason I don't listen to the otherwise highly entertaining CineFix podcast. Example: the recent episode discussing Kill Bill vol 1 contains 624 utterances in 75 mins (8.3 per min). IANAL I know.
The value in AI podcasts at the moment isn't in replacing human content, but in filling the niches that human content just doesn't cover. Doesn't matter if it's not the best podcast ever, when it's literally the only podcast on this planet discussing the topic you want to listen to.
I've heard so many people say that podcasts are just something they have playing in the background while they do other things that I have no trouble believing that they'd play an AI podcast in the same way.
I personally would not bother with AI generated podcasts, they're such a low bar, why waste time when there's so much other great content to catch up on? But I think you may be right, I wouldn't be surprised if people take them in with no fuss. But then what do I care? What I care most is that they'll pollute the search space. I'd filter out all GenAI content if I had the option and I'm guessing that will become an option soon.
From 2020 until now, we've gone from crude blurry or clearly generative artefacts to being able to create full professional illustrations based upon textual prompts. That is huge. Classic generative art techniques look like cave paintings compared to what the latest image generation models put out (and I'm not talking about "AI slop" type stuff that DALL-E does).
Similarly, tools could fabricate podcasts years ago that sounded terrible. Now we have NotebookLM doing a "reasonable" job with two cliched-sounding "hosts". In a few years, will they potentially be able to create something akin to a professionally produced podcast given some smart prompting? The progress made so far points to yes, and I haven't seen any evidence so far to be pessimistic about it happening.
I've heard amazing catchy songs from Suno and Udio. So much so they're still stuck in my head as earworms several months later. If they'd been streaming on YouTube or Spotify I wouldn't have given it a thought that they might be AI generated.
So, I can certainly imagine a podcast doing the same to some degree. Maybe not a podcast where AI wrote the script, but, a podcast where AI read a story dramatically doesn't seem too far off or, easier, a podcast that read news to me.
I think we will get a surprising amount of AI generated content in the future. During the first year of the Ukraine invasion there was an enormous amount of AI voiced and scripted video content on YouTube. I think AI content will take over in the easy parts first. And over time take up more and more views.
Businesses/CEOs want to show profitability by spending less on human employees. Human consumers don't want to lose the human touch. Will be really interesting to see how many of the consumer facing AI startups actually make it.
Again - think of where we were two years ago. I never understand this hubris people have to think AI can never do X while being proved repeatedly wrong.
GPT styled LLMs were introduced back in 2018 so SIX years ago.
Have they gotten more COHERENT? Absolutely. Is coherence the same thing as NOVELTY? NOT EVEN REMOTELY. I've played with markov chains in the 90s that were capable of producing surprising content.
Unless there is a radical advancement in the underlying tech, I don't see any indication that they'll be capable of genuine novelty any time in the near future.
Take satire for example. I have yet to see anything come out of an LLM that felt particularly funny. I suppose if the height of humor for you is dad jokes, Reddit-level word punnery, and the backs of Snapple lids, though, that might be different.
If you have a particular style of witty observational humor that you prefer, providing the model some examples of that will help it generate better output. It's capable of generating pretty much anything if you prompt it the right way. For truly nuanced or novel things, you have to give it a nucleus of novelty and axis of nuance for it to start generating things in your desired space.
The main reason I'm not concerned about AI-based entertainment is the same reason I watch human chess players. It's not only about technical capabilities. I can't fully explain why, though.
People think they want to use the Minority Report computer interface because it looks cool and advanced but they don't put the least amount of thought into it and realize it's terribly impractical. Our arms would get tired very quickly. A mouse resting on a table isn't further from ideal just because it was invented earlier.
Fooling people with the promises of AI is pretty simple. People are easy to fool. They like shiny objects.
> If you can find an LLM+TTS generated podcast with even a FRACTION of the infectious energy as something like Good Job Brain, Radio Lab, or Ask Me Another, then I'll eat my hat.
Come on, we can all see how much faster these things are getting better. In a few years it will be impossible to distinguish them from a real person.
Another few years after that, video will be the same.
I'm not saying it's a good thing, but clearly it's a thing that will happen.
I don't think LLM + TTS generated podcasts even make sense. The whole reason for long form content and podcasts is that people dislike fake and impersonal content.
I think there are a few niche users who just want to listen to the news as an audio book but the whole idea of an LLM generated podcast totally misunderstands why people want a podcast over the normal corporate drivel media.
Incredibly, no videos linked in an article about a video newscast. I think this is an example. The AI doesn't even pronounce "AI" correctly. Interestingly, it looks slightly offscreen just the way real newsreaders do when they're on prompter.
They do a lot right. There's interaction between the bots. They look kind of professional but not Los Angeles/New York quality, which is what you'd expect from a smallish market. Their movement is also kind of stiff and amateurish, which I believe is intentional.
Newscast teleprompters are directly in front of the camera lens specifically to not have them looking away from the lens. This has been a solved technology for decades. Perhaps you're thinking of cue cards or the teleprompters speakers use in a speech live audience type of setting?
Well you got me. I haven't watched broadcast TV for decades. I do see that phenomenon a lot with vloggers at present.
Am I also incorrect that they appear not to be looking directly at the camera? Looking back after your comment I still think it feels like they aren't.
The AI characters? Nothing about them feels right. The audio looks out of sync with the fake lip flaps. The dude's arm gestures are horrendous. It's AI/cgi, yet the fake background looks like a bad chromakey. You already pointed out some of the audio/voice issues.
> It's AI/cgi, yet the fake background looks like a bad chromakey.
I genuinely wonder if that's intentional. Maybe that looks more "realistic", and gives the audience something to stumble over that's not other AI artefacts?
The result to me looked more like a Zoom background replacement rather than a weather chromakey. That's what really looked bad to me. Even the full studio chromakey looks much better where the anchors are at a desk in front of a color vs a full studio.
I could only find this video. James' arms go up and down in an alarming manner. Rose has more natural movements but the voice you hear when her mouth moves is worse than the worst foreign film voice-over. Somehow the person and the voice mismatched in "tone" in a way that's hard to describe.
Looks like they're using something like motion matching to recover fragments of the presenter's motion that match the pronounced phonemes. The actors were probably instructed to avoid almost all movement to make sure it was blendable. That would explain why the guy's hands have such erratic and unnatural movement.
I was surprised at how game the AI was to pronounce the Hawaiian place names, it was confident enough that I assumed the pronunciation was correct. The article notes that it is butchering the placenames though.
To me this illustrates a common cognitive mismatch when evaluating AI, it can be confident in a way that most humans can't, and that misleading social cue is another reason we trust its output.
The first thing I thought of when I saw this is that some mid-tier dictatorships could replace a lot of their newscasters with this approach. They can always guarantee they’ll say what they need to say, and a lack of emotion is a plus maybe? Except when the dear leader passes; then you bring out a real person for the emotions.
The way the mouths move is so far off from the words they're speaking that my first impression would be that they're just playing a video loop of these people talking about other random things and dubbing over it.
The problem with such "videocasts" (as opposed to "podcasts") is that there is another channel that the AI has to control: the video. Generating convincing video is much harder than generating convincing audio.
"James began his tenure as lead anchor, at which point he was unable to blink and his hands were constantly vibrating. He was demoted to second anchor in mid-October, where he began blinking more regularly and his odd hand vibration was replaced by a single emphatic gesture."
Human presenters aren't too expensive, are quite flexible, are easily replaced, and can make or break a show. Yeah, there's the novelty factor now, but I'm not sure how long it'll take until GenAI on broadcasts signals a second-rate, subpar knockoff.
Love this so much, not in the way intended. It's just so strange! I can't put my finger on it, but it feels like something Tim and Eric, or Tim Robinson, or even Alan Resnick would have a hand in.
There is a kind of aesthetic immanence to the whole thing, everything is right on the surface. The voices are only just embodied "enough," their unearned confidence, their "affectations." The deadpan delivery on an absurd stage. The colors all feel like a cake that is too sweet. Like approximating a memory of a broadcast.
Yeah I can see all of it, but the problem with me is that I bet I would have watched it a few seconds and clicked off out of boredom, never suspecting they were AI. I really want to claim I would have figured it out instantly, but I can't. If I were a regular consumer I think I'd notice.
They mention right up front they're "powered by AI" but to me that implies they had help with article writing. I would not immediately assume from that statement that the actual newsreaders themselves were AI.
Ooh wow I hate this. Totally soulless appearance and delivery - and the robot fidgeting the dude is doing with his hands completely distracts from everything else. It’s totally normal to do that movement while speaking for emphasis - but whatever he’s doing does not look normal. (The mouths look nightmarish as well)
While I enjoyed the article, it’s just another in a line of the same article with different flavors and authors that all have the same fundamental error.
The prevailing counterargument against AI consistently hinges on the present state of AI rather than the trajectory of its development: but what matters is not the capabilities of AI as they exist today, but the astonishing velocity at which those capabilities are evolving.
We've been through this song and dance before. AI researchers make legitimately impressive breakthroughs in specific tasks, people extrapolate linear growth, the air comes out of the balloon after a couple years when it turns out we couldn't just throw progressively larger models at the problem to emulate human cognition.
I'm surprised that tech workers who should be the most skeptical about this kind of stuff end up being the most breathlessly hyperbolic. Everyone is so eager to get rich off the trend they discard any skepticism.
One problem is that people assume the end goal is to create a human-cognition-capable AI. I think it's pretty obvious by this point that that's not going to happen. But there is no need for that at all to still cause a huge disruption; let's say most current workers in roles that benefit from AI (copilot, writing, throwaway clipart, repetitive tasks, summarizing, looking up stuff, etc.) lead not even to job loss but fewer future jobs created - what does that mean for the incoming juniors? What does that mean for the people looking for that kind of work? It's not obvious at all how big of a problem that will create.
Two things can both be true. I keep arguing both sides because:
1 Unless you’re aware of near term limits you think AI is going to the stars next year.
2 Architectures change. The only thing that doesn’t change is that we generally push on, temporary limits are usually overcome and there’s a lot riding on this. It’s not a smart move to bet against progress over the medium term. This is also where the real benefits and risks lie.
Is AI in general more like going to space, or string theory? One is hard but doable. Other is a tar pit for money and talent. We are all currently placing our bets.
point 2 is the thing that i think is most important to point out:
"architectures change"
sure, that's a fact. let me apply this to other fields:
"there could be a battery breakthrough that gives electric cars a 2,000 mile range."
"researchers could discover a new way to build nanorobots that attacks cancer directly and effectively cures all versions of it."
"we could invent a new sort of aviation engine that is 1,000x more fuel efficient than the current generation."
i mean, yeah, sure. i guess.
the current hype is built on LLMs, and being charitable "LLMs built with current architecture." there are other things in the works, but most of the current generation of AI hype are a limited number of algorithms and approaches, mixed and matched in different ways, with other features bolted on to try and corral them into behaving as we hope. it is much more realistic to expect that we are in the period of diminishing returns as far as investing in these approaches than it is to believe we'll continue to see earth-shattering growth. nothing has appeared that had the initial "wow" factor of the early versions of suno, or gpt, or dall-e, or sora, or whatever else.
this is clearly and plainly a tech bubble. it's so transparently one, it's hard to understand how folks aren't seeing it. all these tools have been in the mainstream for a pretty substantial period of time (relatively) and the honest truth is they're just not moving the needle in many industries. the most frequent practical application of them in practice has been summarization, editing, and rewriting, which is a neat little parlor trick - but all the same, it's indicative of the fact that they largely model language, so that's primarily what they're good at.
you can bet on something entirely new being discovered... but what? there just isn't anything inching closer to that general AI hype we're all hearing about that exists in the real world. i'm sure folks are cooking on things, but that doesn't mean they're near production-ready. saying "this isn't a bubble because one day someone might invent something that's actually good" is kind of giving away the game - the current generation isn't that good, and we can't point to the thing that's going to overtake it.
I think the mistake is that the media extrapolates linear growth, but in practice it is a wobbly path. And this wobbly path allows anyone to create whatever narrative they want.
It reminds me of seeing headlines last week that NVDA is down because investors were losing faith after the last earnings. Then you look at the graph and NVDA is only like 10% off its all-time high and still in and out of being the most valuable company in the world.
Advancement is never linear. But I believe AI trends will continue up and to the right, and even in 20 years, when AI can do remarkably advanced things that we can barely comprehend, there will be internet commentary about how it's all just hype.
There's a reason why so many of the people on the crypto grift in 2020-2022 have jumped to the AI grift. Same logic of "revolution is just around the corner", with the added mix of AGI millenarianism which hits a lot of nerds' soft spots.
This is confusing. We've never had a ChatGPT-like innovation before to compare to. Yes, there have been AI hype cycles for decades, but the difference is that we now have permanent invaluable and society-changing tools out of the current AI cycle, combined with hundreds of billions of dollars being thrown at it in a level of investment we've never seen before. Unless you're on the bleeding edge of AI research yourself, or one of the people investing billions of dollars, it is really unclear to me how anyone can have confidence of where AI is not going
There's a big difference between something that benefits productivity versus something that benefits humanity.
I think a good test for if it genuinely has changed society is if all gen AI were to disappear overnight. I would argue that nothing would really fundamentally change.
Contrast that with the sudden disappearance of the internet, or the combustion engine.
Because the hype will always outdistance the utility, on average.
Yes, you'll get peaks where innovation takes everyone by surprise.
Then the salesbots will pivot, catch up, and ingest the innovation into the pitch machine as per usual.
So yes, there is genuine innovation and surprise. That's not what is being discussed. It's the hype that inevitably overwhelms the innovation, and also inevitably pollutes the pool with increasing noise. That's just human nature, trying to make a quick buck from the new-hotness.
What is your time horizon? We're already at a date where people were saying these jobs would be gone. The people most optimistic about the trajectory of this technology were clearly wrong.
If you tell me AI newscasters will be fully functional in 10 or 15 years, I'll believe it. But that far in the future, I'd also believe news will be totally transformed due to some other technology we aren't thinking about yet.
But we don't know if AI development is following an exponential or sigmoid curve (actually we do kind of, now, but that's beside the point for this post.)
A wise institution will make decisions based on current capabilities, not a prognostication.
If investors didn't invest based on expected future performance, the share market would look completely different than it actually does today. So, I can't understand how anyone can claim that.
I honestly believe this specific case is a Pareto situation where the first 80% came at breakneck speeds, and the final 20% just won't come in a satisfactory way. And the uncanny valley effect demands a percentage that's extremely close to 100% before it has any use. Neural networks are great at approximations, but an approximate person is just a nightmare.
Isn't this essentially the same argument as "there are only 10 covid cases in this area, nothing to worry about"?
It's really missing the point. The point is whether or not exponential growth is happening. It doesn't with husbands, it does with covid, and time will tell about AI.
Transformers have been around for 7 years, ChatGPT for 2. This isn't the first few samples of what could be exponential growth. These are several quarters of overpromise and underdelivery. The chatbot is cool and it outperforms what awful product search has become. But is it enough to support a 3.5 trillion dollar sized parts supplier?
Unfortunately, I see this as only a small temporary setback in the unstoppable quest to replace humans with AI to cut corporate costs.
The aspect of AI that isn't discussed enough is not what are those formerly employed going to do next, but rather that it will potentially represent the largest transfer of wealth in history as money which was going into employees' salaries is instead going to shareholders via those large companies who have the ability to produce the AIs (even if there are a host of small companies who act as an intermediate layer, such as the Israeli firm in this case).
I do think there is going to be a backlash because of our need for and desire for human connections, which AI won't provide. But that will become more expensive, and not the norm. Just like we can still buy farmer-fresh food but only a minority segment of the population can afford it because it costs 3x what is pushed as "food" at Walmart.
hmm no one has discussed this "carpenter group", who appear to be on a spree in 2024 of snatching up smaller news orgs, and cutting 50% of the staff. seems like a pretty big gamble on AI.
> Carpenter Media Group announced earlier this month it had acquired another group of newspapers, Pamplin Media Group, in Oregon. The company now owns and manages 180 newspapers in the United States and Canada.
> “We are committed to Everett, The Herald and all who have a stake in its success,” Chairman Todd Carpenter said. “We have deep sympathy for those affected by these changes and will work hard with each of them to see they are well-compensated through a transition period that helps them move forward in a positive way.
> “Our responsibility to the community and our readers requires us to make difficult business decisions, and then invest in and organize our team to move forward to produce a product that continues to improve and serve. Our track record in this process is good.”
Give it time y'all. This is the first inning, and I'm already terrified.
I encourage you to have a conversation with ChatGPT's advanced audio for a taste of what's to come. If you can, have someone talk to it in a relatively unpopular language like Afrikaans or even Icelandic--they will shit their pants.
I'm not so convinced. A lot of people have been noting the rapid development of ML systems in the past few years and projecting continued exponential improvements based on previous growth rates, but unbounded exponential improvement doesn't happen. This is an S curve and I think we're already well into the diminishing returns part of the curve. I think future growth is going to require increasingly impractical amounts of hardware for ever smaller levels of improvement.
People saying that this version is flawed but still amazing, so the next version is going to be perfect and mind blowing are going to be disappointed I think. The next version will be slightly better but still flawed. The version after that will be a touch better but annoyingly still not quite there. Constantly teasing you that full success is just around the corner while never quite getting there.
The article might give the impression of some kind of LLM SOTA model being slotted in, but visiting the website of the company they used, Caledo, it looks like they are using 2014 level technology.
Poorly animated CGI newscaster reading an "AI" written script. Really their tech looks awful and dated.
This is an extreme example. Also, this person would never lose their job that easily if there were laws protecting workers.
But what I have seen here in Sweden is AI voiceover used in news reports where people have hidden identities. So clearly this has taken a job from someone who used to do this voiceover. But it's working, we have to adapt to these small changes that AI is bringing.
the CocaCola commercial that everyone "hated" that "destroyed the brand equity" according to numerous news outlets solidified to me that we are passing a mark and this is the last desperate gasp of multiple industries coping with the rapid advancements AI is bringing.
Sure the Coke ad was a bit cringe. But the reality is in 5 years most ads you see will be AI and nobody will care.
That, along with the Ben Affleck rant that AI won't replace Hollywood. Watching Ben Affleck talk about how you can't replace the chemistry of actors working on set made me imagine what the conversations of famous stage actors were like when film came along. You can argue film didn't replace the stage, after all Broadway is still there, but it is not comparable to the economic influence of TV and film.
In 10-20 years, human-made film and TV will still exist. But they will most likely account for a small share of the economic activity compared with AI-produced media.
It doesn't matter if you care how terrible or how cringe AI is, or that AI quality is worse than what existed before. What matters is what 90% of a new generation cares about in 5-20 years, and whether AI has more utility than not using AI. In the author's case, in the short term and for that particular medium, AI was not more useful. But in the long term AI will dominate every facet of our media.
This isn't an economically driven platform that is directly revenue driven like ads, movies, news articles, podcasts, etc for one.
go to reddit or even linkedin which is revenue driven and it is a sea of bots parroting the same tired points with comments that are 1 or 2 words different from other comments directly above them.
My entire point is that human production is not going to zero, but is moving to a minority.
I still go to the theatre and listen to live orchestras but I would be a fool to say that other mass media types didn't replace the cultural role of entertainment that those forms once had.
That's irrelevant to what I am saying. You want all or nothing. I am saying AI will dominate in the majority and human generated content will still survive in the minority.
It doesn't matter what luddites like myself think. It matters what is economically viable to the masses.
Perhaps with such excellent examples of human thought such as your comment eventually ChatGPT comments won't seem so bad.
Is it just me or does everyone feel this way? It's an instant put-off for me when I hear that an article was generated by AI. I would lose trust in that publication and move on.
People think AI is the magic answer for shitty content. In some cases AI is only adding speed not quality.
Clickbait title for a decent write-up. The author left Hawaii, and as indicated in the article the local press resorted to AI presenters because it was hard to keep talent on board. This is not a "job replacement" story so much as a "backup plan" story in my opinion.
The disruptive technology adoption process is at least somewhat predictable, including:
* Most people react negatively to the disruption because of the risk, fear of the unknown, and also because people don't like change.
* Also, early in the development and adoption of a new, immature technology, there is a lot of trial and error regarding applications, mostly error. Sometimes those failures are because the application isn't a good match for the technology; often they are because the technology isn't mature and will still improve and add major features, or because the details of the interface between technology and application are still being worked out.
* The people reacting negatively will point out those errors as signs that the technology is hopeless. Often they are wrong: the tech will mature and improve, and those people will be eclipsed.
The good news is, they won't remember it that way: First they laugh at you, then they tell you it isn't in the Bible (i.e., it violates the orthodoxy, the way things always have been done), then they say they knew it all along. AI is in stage 2.
At least this is funny in the way it looks robotic and dorky. When it gets better it will become increasingly scary to go on this road. Why would anyone want to watch anything gen AI other than a one off curiosity?
Slightly tangential, but this is the reason I'm baffled why people think that AI-driven podcasts would ever be worth listening to.
If you can find an LLM+TTS generated podcast with even a FRACTION of the infectious energy as something like Good Job Brain, Radio Lab, or Ask Me Another, then I'll eat my hat. I don't even own one. I'll drive to the nearest hat store, purchase the tallest stovepipe hat that I can afford and eat it.
I totally agree, but I also think it's just not geared towards us because of our age and preexisting beliefs of "what podcasts/news/videos are supposed to be". Think of kids around age <12. If they'll keep consuming AI-driven media, they'll take that as normal and won't blink twice if that becomes the de-facto standard in about 10 years.
It's the same as the short-video format for me. Sure, I can watch some TikToks from time to time, but making them, or continuously getting my news from them? Yeah, that's not gonna fly for me. However, for my niblings (age 8-17) that's basically where they're getting all their "current affairs" from. Microtransactions are probably one of the easiest examples as well. 15 years ago, anyone who bought games would have laughed at you if you said every game that you paid $80 for would also have an endless number of small items that you can buy for real money. Right now? Well, kids that grew up with Fortnite and Roblox just think that's the norm.
The younger generation loves short-form, to-the-point stuff. Which is the exact opposite of what the current crop of GenAI makes. In a tiktok video, every sentence, every word counts because there's a time limit. If people don't engage with your content in the first 3 seconds, it's worthless. The video linked in another post starts off with 15 seconds of complete fluff. You'd have better engagement if you have a guy opening with BIIIG NEWS!! LABOR DAY BREAKFAST GOES HAYWIRE!!! and hook people.
GenAI is great at generating "stuff", but what makes good content isn't the quantity. What makes good content is when there's nothing left to take away.
What's funny is every GenAI "incredible email/essay" would be better communicated with the prompt used to generate it.
"Your essay must be at least X words" has always been an impediment to truly good writing skills, but now it's just worthless.
Isn't it just a matter of time until AI gets trained to generate attention-grabbing videos? Also, "the first three seconds" isn't exactly the case anymore. There's a push for the algorithm to favour videos that are longer than 1+ minutes. Which, to my understanding, is TikTok's way of fighting for YouTube's userbase.
The videos are longer in total length, but you've never seen the average TikTok/Insta user if you think people are letting videos play for more than a few seconds before scrolling onto the next one. This is why movie trailer videos now have a "trailer for the trailer" in the opening seconds with like "THE TRAILER FOR SONIC 3... STARTS NOW" with all of the most attention-grabbing scenes frontloaded.
> Isn't it just a matter of time until AI gets trained to...
blah blah yes "it's a matter of time" for every one of the myriad shortcomings of the technology to be resolved. If you're a true believer everything is "a matter of time". I'll believe it when I see it.
I'm also pretty sure you'll see it eventually... Consider the possibility that this is both a bubble waiting to pop, as well as the stuff that will shape the future. Kind of like the Internet around the year 2000.
Depends on the technology. It’s hard to look at the progress from December of 2022 till today, and think we won’t go further. Image generation is getting better every day. Parts of the video generation pipelines are also advancing.
Is that true of tiktoks in general? I feel like a lot of the short form videos out there purposefully bait the watcher and drag on for 3/4 of their runtime.
I ... don't think there's a time limit on TikToks unless you mean that 60 minutes is a time limit. Are you thinking of Vine?
If you want people to watch your content there's definitely a time limit. I don't mean anything imposed by the platform; I just mean if people aren't interested within 3 seconds, they're scrolling to the next video in their feed.
That's a pretty new thing though, in 2020 it was 1 min I believe, and most people skipped after 15-30s
Then it got increased to 3 min and now 10/60 min
Counterpoint, GenAI is great at copying styles and typically works best with shorter content.
For example, I could very easily see GenAI being able to produce 1 million TikTok dance challenges.
Which will make them completely worthless by dilution and not stand out. Oops.
Novelty grabs people's attention. A system based on the statistical analysis of past content won't do novelty. This seems like a very basic issue to me.
Novelty itself is easy, the hard part is the kind of novelty that is familiar enough to be engaging while also unusual enough to attract all the people bored by the mainstream.
Worse, as people attempt to automate novelty, they will be (and have been) repeatedly thwarted by the fact that the implicit patterns of the automation system themselves become patterns to be learned and recognised… which is why all modern popular music sounds so similar that this video got made 14 years ago: https://www.youtube.com/watch?v=5pidokakU4I
(This is already a thing with GenAI images made by people who just prompt-and-go, though artists using it as a tool can easily do much better).
But go too soon, be too novel, and you're in something like uncanny valley: When Saint-Saëns' Danse macabre was first performed, it was poorly received for violating then-current expectations; now it's considered a masterpiece.
> If people don't engage with your content in the first 3 seconds, it's worthless.
Is that based on rigorous experience as a long form content creator on TikTok, or are you going off what you've heard about TikTok, or on what your feed is full of? (which says more about you/your feed than it does about TikTok)
https://www.tiktok.com/t/ZTYheCgBq/ 1.1 M views
https://www.tiktok.com/t/ZTYhephfK/ 1.2 M views
or for some more niche stuff which isn't "BIIIG NEWS!! LABOR DAY BREAKFAST GOES HAYWIRE!!!" level of intro:
https://www.tiktok.com/t/ZTYheTVsh/ 54.6 k views
https://www.tiktok.com/t/ZTYheE9m2/ 36.3 k views
https://www.tiktok.com/t/ZTYheoK2G/ 416 k views
This voice-over definitely isn't going for attention grabbing
https://www.tiktok.com/t/ZTYhe7qTT/ 16 k views
https://www.tiktok.com/t/ZTYhdmJxq/ 64 k views
okay finally found something that's 5 mins long https://www.tiktok.com/t/ZTYhdAfNj/ 270 k views
All of those immediately have something attention-grabbing within the first few seconds: picture of Mars' surface, run map showing funny human shape, text with "Gen Z programmers are crazy" prefacing the anecdote, going immediately into the IT-related rap, guy holding a big and cool-looking stick, "dealing with your 10x coworker" immediately showing the point of the video. The only one that doesn't is the SQL one I guess but that's a very low view count (relatively) on a niche channel.
So thanks for providing a bunch of examples that prove my point, I guess?
none of them are breathless "BIIIG NEWS!! LABOR DAY BREAKFAST GOES HAYWIRE!!!" attention demanding within the first three seconds imo, but, sure, whatever, you're totally right
I've never used tiktok and this post was... enlightening. The ClickUP HR guys are actually pretty funny... but wtf is this from that first channel you posted... lol? https://www.tiktok.com/@luckypassengers/video/74395775119539... - I mean, she's not wrong.
The idea that "TikTok is for Gen Z" seems like a very stale meme, although I only have anecdata to back that up.
Microtransactions are way older than 15 years. Wizards of the Coast was selling randomized MtG booster packs in 1993. I'm guessing that the earliest loot boxes for kids were baseball card packs, with very similar psychological purpose to today's game cosmetic collectibles.
I totally agree, but it’s just easier for people to accept something if they grew up with it. Sure there will always be people from older generations dabbling with new stuff. But quite a few people refuse to change their behaviour as they age. I wrote them as examples, because it is the biggest contrast I can see in online behaviour between myself and my nephews/nieces plus their circles.
I think the main difference is digital vs physical goods. I know it's minor since a card is just a cheap piece of cardboard, but it's still something tangible (unless the game cosmetics also include a physical item, in which case...I'm dumb).
"If they'll keep consuming AI-driven media, they'll take that as normal and won't blink twice if that becomes the de-facto standard in about 10 years."
There is a sense in which that is true.
However, we all develop taste, and in a hypothetical world where current AI ends up being the limit for another 10 or 20 years, eventually a lot of people would figure out that there's not as much "there" there as they supposed.
The wild card is that we probably don't live in that world, and it's difficult to guess how good AI is going to get.
Even now, the voice of AI that people are complaining about is just the current default voice, which will probably eventually be looked on about as favorably as bell-bottomed jeans or beehive hairdos. It is driven less by the technology itself than a complicated set of desires around not wanting to give the media nifty soundbites about how mean (or politically incorrect) AI can be, and not wanting to be sued. It's minimal prompt engineering even now to change it a lot. "Make a snappy TikTok video about whatever" is not something the tech is going to struggle with. In fact given the general poverty of the state space I would guess it'll outcompete humans pretty quickly.
"eventually a lot of people would figure out that there's not as much 'there' there as they supposed."
Let's be honest: most of us here know there is more 'there' on the myriad university-press books available free on Anna's Archive, than on HN. The reason we still hang out here is desire for socializing, laziness, or pathological doomscrolling; information density doesn't really factor into our choices.
Personally I don’t think it has anything to do with normalcy.
I don’t consume AI media because it’s not very good.
I watched a lot of bad movies and read a lot of bad books as a kid that I can’t stomach now because I’ve read better books and watched better movies. My guess is that kids today would do the same, assuming AI doesn’t improve.
Blipverts are probably the future and not just for ads but for content as well
I regularly watch 10 minute or longer TikToks and I'm sure you've turned off a YouTube video in the first 10 seconds.
Fortnite and roblox are f2p though.
> anyone who bought games would laugh at you if you said every game that you paid $80 for would also have endless amount of small items that you can buy for real money.
> kids that grew up with Fortnite and Roblox just think that's the norm.
How so? Fortnite and Roblox don’t cost $80, they are free to play with in game purchases.
They’re F2P, but the social pressure from your friends to buy the next skin, mini game, etc. completely normalizes the behaviour as you grow up. Then you take it as a “usual thing” and don’t bat an eye when a game you buy also has different skins in the game.
At least that was my perception when I played Call of Duty with my younger family members.
Agreed. I finally got around to listening to a NotebookLM generated podcast this weekend, and found it absolutely unlistenable.
For some reason the LLM seemed to latch on to the idea that one host should say something, and the other host should just repeat a key phrase in the sentence for emphasis -- and say nothing else, and they'd do it like multiple times in a sentence, over and over again throughout the entire thing.
Slightly less weird -- but it seems like the LLM caught on that a good narrative structure for a two-host podcast is that one host is the 'expert' on the topic, and the other host plays dumb and asks questions. Not an unreasonable narrative structure. Except that the hosts would seamlessly, and very weirdly, switch roles constantly throughout the podcast.
And ultimately the result was just a high level summary of the article I had provided. They told me in the intro and the outro about the interesting parts they were going to dive into, but they never actually got around to diving into those parts.
I tried it with something fairly abstract about a decision I was working on making. Fed it a bunch of information, details about me, my background, the factors I'm considering in the decision and the impact of getting it right/wrong.
It was interesting. I certainly wouldn't say it was useless, I think the contrived dialogue actually touched on some angles I hadn't considered and I think it was useful. Not in a 'oh shit it's clear to me now' kind of way but it definitely advanced my thinking.
Hmm, does Google not use the Claude or ChatGPT API for it? I still don't hear of people chatting with Gemini nearly as often as with the other two.
Edit: looks like the subreddit is still called Bard. Well played, Internets. https://www.reddit.com/r/Bard/comments/1g0egad/gemini_vs_not...
There are a lot of "uh huh", "yeah", "agree" types of statements in odd places as well.
Go back to the 1990s (even probably the 2000s?) and ask literally anyone what they think of the idea of people spending huge quantities of their time on this mortal plane watching videos of other people talking about how they do their make up and pick their outfits.
Are those videos "worth watching"? Are videos of people playing video games "worth watching"? Are videos of people opening products and saying out loud the information written on the box and also easily accessible on the public internet "worth watching"?
I'm not happy about these developments, but that isn't a factor of much concern to the people driving and following these trends, it turns out.
It's been a long time since "quality" and "success" have been decoupled. Which is to say - I hope you like the taste of hats.
Bad TV is a lot older than that. It probably seems weird nowadays to watch “I Dream of Jeannie” and “Gilligan’s Island” reruns because that’s what was on TV. Or how about game shows and soap operas? Daytime TV was the worst. But people watched.
Every now and then I play the Gilligan's Island theme song because it's so catchy. Still no idea what the show is about
And pulp fiction is even older!
> videos of other people talking about how they do their make up and pick their outfits.
Doesn't seem that much different from a fashion magazine interview about what X celebrity likes to wear. Those have been around for quite a long time.
At least in the past people were celebrities for a reason other than the number of followers they had on social. It'd be nice if we could return to a time when people were part of the public discourse because they were good at something (or their parents were rich, sadly).
That’s not true at all. Fame for no reason or dumb reasons is hardly a 21st century phenomenon.
The scale isn't even close, and it's become normalized now. I think it bears talking about as a 21st century phenomenon.
No dog in this fight, I don't know what exactly you're doing, but I'd cautiously point out that it is, in fact, novel to the internet era to watch rando microcelebrities doing makeup step by step, no matter how long we delay acknowledging that with microquibbles.
I could throw in an example of how I'll watch boring videos of a couple playing with their birds for 90 minutes on Youtube. You can link me to the Wikipedia page on slow TV (via Norway), and it won't erase the simple, boring, straightforward, fact that it is a phenomenon.
I didn’t interpret the original comment to be about micro celebrities, but about people supposedly wasting time today in ways they didn’t beforehand. I agree that micro celebrities are a new phenomenon somewhat (although they are also sort of a return to more regional distribution of fame.) But that wasn’t the point being made.
> At least in the past people were celebrities for a reason other than the number of followers they had on social.
“Other than”? Obviously, as social media didn't exist. “Better than” or “more relevant to the things for which their celebrity status was used to direct attention”? Not particularly.
Why does it matter if the celebrity was good at something or not if you are just discussing what they wear?
I'm pretty sure humans have been finding ways to unproductively waste time for millennia...
I'm not sure how watching lightweight videos on subjects you're curious about is any worse than how people wasted time in the past?
Personally I waste time watching videos on outdoor gear, coffee making equipment and PC hardware. I certainly don't regret it because I had no plans to do anything productive with that time either.
> It's been a long time since "quality" and "success" have been decoupled.
I think so, too.
I guess quality was a property of interest in the old days, because the path e.g. for commercial music was: Maximize profit -> Maximize sales -> Maximize what the target audience likes -> Maximize quality.
For TikTok etc. they bypass the market sales stuff and replace it by an 'algorithm', that optimizes for retention, which is tightly coupled to ad revenue. I imagine the algorithm as a function of many arguments.
Just relying on quality is an inefficient approximation in contrast to that.
I'm not disputing that people waste inordinate amounts of time running out the clock before they shuffle off this mortal coil. (cough every X demographic reacts to Y video cough).
Agreement to eat said head apparel is predicated upon "infectious energy" (i.e. quality) - NOT success. I'll draft up a more officious document later.
Note: I am the sole arbiter of what constitutes quality.
You also seem to be the sole arbiter of what constitutes worthy activities.
To me, unboxing videos are documentation, not entertainment. When I need to know exactly what's in a box and how it's packaged, an unboxing video is the only source of that information.
What's in the box is almost always on the box, and how it's packaged seems to be very useless information.
I have probably written this comment about a dozen times on HN already, but: I agree completely, because people don't listen to podcasts purely for information, they listen for information plus community, personality, or just a basic human connection.
If you're a content creator today, the best thing you can do to "AI-proof" your work is to inject your personality into it as much as possible. Preferably your physical personality, on video. The future of human content is being as human as possible. AI isn't going to replace that in our lifetimes, if ever.
> The future of human content is being as human as possible. AI isn't going to replace that in our lifetimes, if ever.
I have been working on machine learning algorithms for a long time. Since the time when telling someone I worked on AI was a conversation killer, even with technical people.
AI's are going to understand people better than people understand people. That could be in five years, maybe - many things are happening faster than expected. Or in 15 years. But that might be the outside range.
There is something about human psychology where the faster something changes, the less we are aware of the rate of change. We don't review the steps and their increasing rate that happened before we cared about that tech. We just accept the new thing that finally gets our attention like it was a one off event, instead of an accelerating compounding flood, and imagine it isn't really going to change much soon.
--
I know this isn't a popular view.
But what is happening is on the order of the transition to the first multi-cellular creatures, or the first bodies with specialized cells, the first nervous systems, the first brains, the first creatures to use language. Far bigger than advances such as writing or even the Internet. This is a transition to a completely new mode for the substrate of life and intelligence. The lack of dependency on any particular substrate.
"We", the new generation of things we are building, will have none of the limits and inefficiencies of our ancient slow DNA-style life. Or our stark physical bottlenecks on action, communication, or scaling, and our inability to understand or directly edit our own internal traits.
We will have to face many challenges in the coming years. It can't hurt to mindfully face them earlier than later.
Yes, this seems possible. I only wonder if it is too fragile to be self perpetuating. But we're here after all, and this place used to be just a wet rock.
What machine learning algorithm have you worked on that leads you to believe they are capable of having a rich internal cognitive representation anywhere close to any sentient conscious animal?
> AI's are going to understand people better than people understand people.
Maybe, but very little of the “data” that humans use to build their understanding of humans is recorded and available. If it were it’s not obvious it would be economical to train on. If it were economical it’s not obvious that current techniques would actually work that well and by definition no future techniques are known to work yet. I’m not inclined to say it will never happen but there are a few reasons to predict it’ll prove to be significantly harder to build AI that gets out of the uncanny valley that it’s currently in.
You are describing the current state of AI as if it were a stable point.
AI today is far ahead of two years ago. And for many years before that breakout, deep learning models broke benchmark after benchmark every year.
There is no indication of any slow down. The complete reverse - we are seeing dramatic acceleration of an already fast moving field.
Both research and resources are pouring into major improvements in multi-modal learning and learning via other means than human data. Such as reinforcement learning, competitive learning, interacting with systems they need to understand via simulated environments and directly.
This makes me think about attention span. Scenes in movies, sound bites, everything has been getting shorter over the decades. I know mine has gotten shorter, because I now watch highly edited and produced videos at 2x speed. Sometimes when I watch at 1x speed I find myself thinking "why does this person speak so slowly?"
Algorithmic content is likely to be even more densely packed with stimuli. At some point, will we find ourselves unable to attend to content produced by a human being because algorithmic content has wrecked our attention span?
People are judging AI by what abusers of AI put out there for lols and what they themselves can wring out of it. They haven't yet seen what a bunch of AAA professionals with nothing to lose can build and align.
Billions of dollars and all of Silicon Valley's focus has been spent over the last 2 years trying to get AI to work, the "AAA professionals" are already working on AI and I still have yet to see an AI generated product that's interesting or compelling.
I find your ideas intriguing, and I wish to subscribe to your newsletter
Sorry but I don’t think this is much evidence of anything. The point at which an AI can imitate a live-streaming human being is decades away. By then, we will almost certainly have developed a “real Human ID” system that verifies one’s humanity. I wrote about this more here:
https://news.ycombinator.com/item?id=42154928
The idea that AI is just going to eat all human creative activity because technology accelerates quickly is not a real argument, nor does it stand up to any serious projections of the future.
AI is already eating its way up the creative ladder, this is 100% irrefutable. Interns, junior artists, junior developers, etc are all losing jobs to AI now.
The main problem for AI is it doesn't have a coherent creative direction or real output consistency. The second problem is that creativity thrives on novelty but AI likes to output common things. The first is solvable, probably within 5 years, and is going to hollow out creative departments everywhere. The second is effectively unsolvable, though you might find algorithms that mask it temporarily (I'm not sure if this is any different than what humans do).
We're going to end up with "rock star" teams of creative leads who have more agility in discovering novelty and curating aesthetics than AI models. They'll work with a small department comprised of a mix of AI wranglers and artisans that can put manual finishing touches on AI generated output. Overall creative department sizes are probably going to shrink to 20% of current levels but output will increase by 200%+.
How do you know it is decades away? A few years ago did you think LLMs would be where they are today?
Is it possible you’re wrong?
Of course it’s possible I’m wrong. But if we make any sort of projection based on current developments, it would certainly seem that live-streaming AI indistinguishable from a human being is vastly beyond the capabilities of anything out today, and given current expenses and development times, seems to be at least a few decades in the future. To me that is an optimistic assumption, especially assuming that live presence or video quality will continue to improve as well (making it harder to fake.)
If you have a projection that says otherwise, I’d be glad to hear it. But if you don’t, then this idea is merely science fiction.
Making predictions about the future that are based on current accelerating developments is how you get people in the 1930s predicting flying cars by 2000.
For many people current AI created videos are already confused for real videos and vice versa.
When you follow up on technology by browsing HN, and see the latest advancements, it's easier to see or hear the differences, because you know what to look for.
If I see some badly encoded video on TV, especially in fog or water surfaces, it immediately stands out, because I worked with video decoding back when the quality was much lower. Most people will not notice.
Infectious energy is something I expect faked relatively easily, though I don't know your examples and doing it in AI might be a "first 90%" situation just like self driving cars; for me the problem is that they're fairly mediocre at the actual script — based on me putting a blog post I wrote into one and listening to what came out.
Given how many podcasts exist, I think you need to be at least 2 standard deviations above mean to even get noticed, 3 to be a moderate success, and 4 to be in the charts.
I'd guess AI is "good enough" to be 1 above average, as the NotebookLM voices sound like people speaking clearly and with some joy into decent microphones in sound isolating studios.
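For a rough sense of what those thresholds would mean, here is a minimal sketch. It assumes podcast quality is normally distributed, which is purely an assumption of this sketch and not something anyone has measured; `upper_tail` is a made-up helper name.

```python
import math


def upper_tail(k: float) -> float:
    """Share of a standard normal distribution lying more than k SDs above the mean."""
    # For a standard normal, P(Z > k) = 0.5 * erfc(k / sqrt(2)).
    return 0.5 * math.erfc(k / math.sqrt(2))


# Thresholds from the comment: 1 (the AI guess), 2 (noticed), 3 (moderate success), 4 (charts).
for k in (1, 2, 3, 4):
    share = upper_tail(k)
    print(f"{k} SD above mean: top {share:.4%} of podcasts, about 1 in {round(1 / share):,}")
```

Under that assumption, each extra standard deviation shrinks the pool a lot, so "1 above average" still leaves a large share of podcasts ahead of it.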
As someone who's struggled to really get into podcasts, I'm convinced that most people who enjoy podcasts don't really actively listen to them, they just like that extra bit of noise in the background while they do something else, with the added bonus that they might just listen at just the right time to pick up some interesting factoid.
Don't get me wrong, I like podcasts, Fall of Civilizations being a personal favorite of mine, it's just that my desire to try and actively listen to them requires that I carve out an hour (or much longer in the case of FoC, worth it) of my free-time to eliminate any distractions and focus. Pretty hard to do when there are so many other things I could be doing with my time besides sitting there trying my best to listen.
Anyway, I bring it up because I'm convinced that the people promoting AI-podcasts are mostly made up of the aforementioned people who just listen to them for noise.
I switched to improvised pods for this reason. Podcasts are for doing the dishes and mowing the lawn and playing factorio. I don't turn on anything with 'meat' unless I have a long car trip ahead of me. I just am not able to sit still and only listen without feeling like I could be doing something else, and when I'm doing something else I'm gonna start tuning out eventually.
Hey Riddle Riddle, Hello from the Magic Tavern, Artists on artists on artists on artists are my picks for now.
I used to listen to podcasts during my commute, when they took the place of listening to NPR or talk radio. Now that I work from home full time, I listen to them when I'm doing dishes or similar chores, going on walks around the city, and when I'm doing any significant amount of driving. I listen to a combination of comedy/history (love The Dollop), politics, and sci-fi commentary.
For the most part, I'm listening pretty actively, but if I'm just sitting there listening without a fairly mindless activity going on, I'll get distracted pretty quickly and find myself looking at my phone...
I agree. I think I listen to podcasts just for a bit of "social noise" when I'm doing something by myself and for occasionally picking up on something new I hadn't heard of before. For pure information content, I think they're actually very poor. It's not unlike listening to an old AM radio talk show. Hosts repeat themselves, engage in banter, and often oversimplify topics for the sake of a narrative.
I also think AI podcasts could become popular one day for people who just want some background noise and bits of trivia every once in a while. I would argue that a lot of YouTube channels I sometimes have in the background just summarize Wikipedia articles and don't have much of a personal touch anyway, so an AI could do the same thing.
I figured people listened to them while commuting. It seemed the best fit: lots of time, mostly little attention needed (except when you alight or are driving in adverse conditions).
I don't get them either since the hosts are typically "regular people" talking about often complicated subjects that they are by no means domain experts in.
I find the format of "Dumdum Host A read some articles about something last night and Dumdum Host B asks questions about it" especially grating, unless it's purely opinion-driven and then I still probably won't care unless I read about the hosts and find out they are probably people with opinions worth listening to.
I'd rather read a book or be with my own thoughts without having them be even more crowded by some randos telling me stuff they think they know
Not all podcasts are Joe Rogan
> I don't get them either since the hosts are typically "regular people" talking about often complicated subjects that they are by no means domain experts in.
Huh? Most podcasts I listen to feature either experts in the field in question or very skilled journalists compiling the opinions of experts.
This is partially true for me, but when I listen to podcasts it’s generally in what would otherwise be dead time for me such as when grocery shopping or exercising.
> Don't get me wrong, I like podcasts, Fall of Civilizations being a personal favorite of mine, it's just that my desire to try and actively listen to them requires that I carve out an hour (or much longer in the case of FoC, worth it) of my free-time to eliminate any distractions and focus.
I don't see why that should be true. Podcasts are great when coupled with mindless work like mowing the lawn, weeding the garden, stacking firewood, walking to the shop, driving etc. You can get virtually 100% out of a podcast while performing tasks like this.
I don't understand enjoying podcasts while just using them for background noise. Maybe because I actually listen, and most podcasts I listen to seemed to have an engaged audience. But then I like listening to people talk about philosophy, politics, sports, true crime, science, history.
> I'm convinced that most people who enjoy podcasts don't really actively listen to them,
Or you know not everyone is the same. Just because you struggle with something doesn’t mean that it is not easy and effortless for others.
I can’t dribble a basketball and walk at the same time, still I won’t make the claim that everyone who claims to enjoy playing basketball is somehow fudging it.
Having this strong of an opinion on something you admittedly don't understand (the appeal of podcasts) is not a great choice. Maybe ask people instead of making up stuff.
Excuse me for being an old geezer but at least AI bots don't tend to pepper their sentences with frequent utterances of 'like'. I don't normally find this speech mannerism annoying, and I do it myself too, but when 'like' is overused I switch off.
For this reason I don't listen to the otherwise highly entertaining CineFix podcast. Example: the recent episode discussing Kill Bill vol 1 contains 624 utterances in 75 mins (8.3 per min). IANAL I know.
Ha that’s not what IANAL means but perhaps should be
The value in AI podcasts at the moment isn't in replacing human content, but in filling the niches that human content just doesn't cover. Doesn't matter if it's not the best podcast ever, when it's literally the only podcast on this planet discussing the topic you want to listen to.
I've heard so many people say that podcasts are just something they have playing in the background while they do other things that I have no trouble believing that they'd play an AI podcast in the same way.
I personally would not bother with AI generated podcasts, they're such a low bar, why waste time when there's so much other great content to catch up on? But I think you may be right, I wouldn't be surprised if people take them in with no fuss. But then what do I care? What I care most is that they'll pollute the search space. I'd filter out all GenAI content if I had the option and I'm guessing that will become an option soon.
I've yet to hear an LLM-generated podcast about an article that wouldn't have been better if the AI had simply read the article aloud.
Someone in 2020: "If you can find a computer program that can generate an image with even a FRACTION of the aesthetic appeal of human created art...."
Give it time. Progress is fast.
> Give it time. Progress is fast.
Why pick 2020 as your starting point? That is simply around the time the current set of techniques came about.
We had generative art back in the late 90's - my screensaver has been generative art for over 20 years now.
Obviously generative art has come a long way but people have been working on various approaches to it for at least 30 years.
From 2020 until now, we've gone from crude blurry or clearly generative artefacts to being able to create full professional illustrations based upon textual prompts. That is huge. Classic generative art techniques look like cave paintings compared to what the latest image generation models put out (and I'm not talking about "AI slop" type stuff that DALL-E does).
Similarly, tools could fabricate podcasts years ago that sounded terrible. Now we have NotebookLM doing a "reasonable" job with two cliched-sounding "hosts". In a few years, will they potentially be able to create something akin to a professionally produced podcast given some smart prompting? The progress made so far points to yes, and I haven't seen any evidence so far to be pessimistic about it happening.
Can current techniques be scaled/improved/optimized to do this or do we need new techniques?
It took 30 years in the generative art world to move from cave paintings to the level that we have today because we needed new techniques.
For podcasts we are at the cave paintings level.
If we can get to professional level quality podcasts with the current techniques then we might only be a few years away.
I think it is more likely we will need new techniques which puts us potentially decades away.
If we look at LLMs, the improvements over the last 18+ months since GPT-4 was released have been minor despite incredible levels of investment.
“Generate” does not have the same connotation that “create” does. Are they really that interchangeable?
I think it's quite easy for AI to surpass the median podcast. The bar is fairly low there.
I guess that matches the state of AI in general, better than a novice; worse than an expert.
We will have to wait and see what future AIs deliver. Insight and nuance is what I look for in media like that, that's a much harder nut to crack.
I've heard amazing catchy songs from Suno and Udio. So much so they're still stuck in my head as earworms several months later. If they'd been streaming on youtube or spotify I wouldn't have given it a thought that they might be AI generated.
So, I can certainly imagine a podcast doing the same to some degree. Maybe not a podcast where AI wrote the script, but, a podcast where AI read a story dramatically doesn't seem too far off or, easier, a podcast that read news to me.
We had this recently, yes.
https://news.ycombinator.com/item?id=41693087
It's kind of horrible.
> Slightly tangential, but this is the reason I'm baffled why people think that AI-driven podcasts would ever be worth listening to
the only real people that believe in that that I’ve seen are ones who are heavily invested in it succeeding
I think we will get a surprising amount of AI generated content in the future. During the first year of the Urkain invasion there was an enormous amount of AI voiced and scripted video content on YouTube. I think AI content will take over in the easy parts first. And over time take up more and more views.
Do you not use autocorrect?
Businesses/CEOs want to show profitability by spending less on human employees. Human consumers don't want to lose the human touch. Will be really interesting to see how many of the consumer facing AI startups actually make it.
Many don't even care if they make it in the long run. Making it is cashing in in the short term, screwing the investors.
Again - think of where we were two years ago. I never understand this hubris people have to think AI can never do X while being proved repeatedly wrong.
Everybody trots out this argument.
GPT styled LLMs were introduced back in 2018 so SIX years ago.
Have they gotten more COHERENT? Absolutely. Is coherence the same thing as NOVELTY? NOT EVEN REMOTELY. I've played with markov chains in the 90s that were capable of producing surprising content.
Unless there is a radical advancement in the underlying tech, I don't see any indication that they'll be capable of genuine novelty any time in the near future.
Take satire for example. I have yet to see anything come out of an LLM that felt particularly funny. I suppose if the height of humor for you is Dad jokes, reddit level word punnery, and the backs of snapple lids though that might be different.
If you have a particular style of witty observational humor that you prefer, providing the model some examples of that will help it generate better output. It's capable of generating pretty much anything if you prompt it the right way. For truly nuanced or novel things, you have to give it a nucleus of novelty and axis of nuance for it to start generating things in your desired space.
Two years ago we had ChatGPT and Midjourney. Now... we also have those?
The accelerationism argument made a little bit of sense two years ago, but now, after two years of marginal improvements at best? Really?
How would you go about eating said stovetop hat?
Would you eat it raw or would you have to boil it down to soften it up and then drink/eat its hat soup?
Cook it like stovetop stuffing, I'd imagine....
Isn't it a stovePIPE hat?
The main reason I'm not concerned about AI-based entertainment is the same reason I watch human chess players. It's not only about technical capabilities. I can't fully explain why, though.
>I'm baffled why people think that AI-driven podcasts would ever be worth listening to
It doesn't take much to entertain a substantial population of imbeciles. Reference: https://www.youtube.com/watch?v=SYkbqzWVHZI
Now for the hat...
ah, a motte and bailey lunch invitation, "would ever be worth listening to", so "if you can find [now]", then I'll eat my hat
People think they want to use the Minority Report computer interface because it looks cool and advanced but they don't put the least amount of thought into it and realize it's terribly impractical. Our arms would get tired very quickly. A mouse resting on a table isn't further from ideal just because it was invented earlier.
Fooling people with the promises of AI is pretty simple. People are easy to fool. They like shiny objects.
I agree with your overall statement, but
> If you can find an LLM+TTS generated podcast with even a FRACTION of the infectious energy as something like Good Job Brain, Radio Lab, or Ask Me Another, then I'll eat my hat.
Come on, we can all see how much faster these things are getting better. In a few years it will be impossible to distinguish them from a real person.
Another few after that video will be the same.
I'm not saying it's a good thing, but clearly it's a thing that will happen.
I don't think LLM + TTS generated podcasts even make sense. The whole reason for long form content and podcasts is that people dislike fake and impersonal content.
I think there are a few niche users who just want to listen to the news as an audio book but the whole idea of an LLM generated podcast totally misunderstands why people want a podcast over the normal corporate drivel media.
Incredibly, no videos linked in an article about a video newscast. I think this is an example. The AI doesn't even pronounce "AI" correctly. Interestingly, it looks slightly offscreen just the way real newsreaders do when they're on prompter.
https://www.youtube.com/watch?v=Aa7Q2S7VWUk
They do a lot right. There's interaction between the bots. They look kind of professional but not Los Angeles/New York quality, which is what you'd expect from a smallish market. Their movement is also kind of stiff and amateurish, which I believe is intentional.
Newscast teleprompters are directly in front of the camera lens specifically to not have them looking away from the lens. This has been a solved technology for decades. Perhaps you're thinking of cue cards or the teleprompters speakers use in a speech live audience type of setting?
Well you got me. I haven't watched broadcast TV for decades. I do see that phenomenon a lot with vloggers at present.
Am I also incorrect that they appear not to be looking directly at the camera? Looking back after your comment I still think it feels like they aren't.
The AI characters? Nothing about them feels right. The audio looks out of sync with the fake lip flaps. The dude's arm gestures are horrendous. It's AI/cgi, yet the fake background looks like a bad chromakey. You already pointed out some of the audio/voice issues.
> It's AI/cgi, yet the fake background looks like a bad chromakey.
I genuinely wonder if that's intentional. Maybe that looks more "realistic", and gives the audience something to stumble over that's not other AI artefacts?
The result to me looked more like a Zoom background replacement rather than a weather chromakey. That's what really looked bad to me. Even the full studio chromakey looks much better where the anchors are at a desk in front of a color vs a full studio.
I could only find this video. James' arms go up and down in an alarming manner. Rose has more natural movements but the voice you hear when her mouth moves is worse than the worst foreign film voice-over. Somehow the person and the voice mismatched in "tone" in a way that's hard to describe.
https://www.facebook.com/watch/?v=1040710730435452
Uncanny, I'd say
Looks like they're using something like motion matching to recover fragments of the presenter's motion that match the pronounced phonemes. The actors were probably instructed to avoid almost all movement to make sure it was blendable. That would explain why the guy's hands have such erratic and unnatural movement.
I was surprised at how game the AI was to pronounce the Hawaiian place names, it was confident enough that I assumed the pronunciation was correct. The article notes that it is butchering the placenames though.
To me this illustrates a common cognitive mismatch when evaluating AI: it can be confident in a way that most humans can't, and that misleading social cue is another reason we trust its output.
The first thing I thought of when I saw this is that some mid-tier dictatorships could replace a lot of their newscasters with this approach. They can always guarantee they’ll say what they need to say, and a lack of emotion is a plus, maybe? Except when the dear leader passes; then you bring out a real person for the emotions.
Well if the article is to be believed, they actually can't guarantee they'll say what they need to say, but I think your larger point holds.
AI's not mandatory for that
The way the mouths move is so far off from the words they're speaking that my first impression would be that they're just playing a video loop of these people talking about other random things and dubbing over it.
James' lips don't seem to move at all.
The problem with such "videocasts" (as opposed to "podcasts") is that there is another channel that the AI has to control: the video. Generating convincing video is much harder than generating convincing audio.
>> The AI doesn't even pronounce "AI" correctly.
You can call me Al.
The guy looks like a deceased person being used as a marionette, and the girl speaks like a tenor.
But I guess it will get better. TV will turn so soulless, even compared to today.
Imagine Rakuten Dog Does Funny Stuff channel with this added as some filler. Dystopic.
"James began his tenure as lead anchor, at which point he was unable to blink and his hands were constantly vibrating. He was demoted to second anchor in mid-October, where he began blinking more regularly and his odd hand vibration was replaced by a single emphatic gesture."
What problem are they trying to solve though?
Paying human presenters.
Human presenters aren't too expensive, are quite flexible, are easily replaced, and can make or break a show. Yeah, there's the novelty factor now, but I'm not sure how long it'll take until GenAI on broadcasts signals a second-rate, subpar knockoff.
"James" arm motion in a loop.
Love this so much, not in the way intended. It's just so strange! I can't put my finger on it, but it feels like something Tim and Eric, or Tim Robinson, or even Alan Resnick would have a hand in.
There is a kind of aesthetic immanence to the whole thing; everything is right on the surface. The voices are only just embodied "enough," their unearned confidence, their "affectations." The deadpan delivery on an absurd stage. The colors all feel like a cake that is too sweet. Like approximating a memory of a broadcast.
It is hilarious and beautiful. No notes.
Yes! There is something massive and beautiful available in this direction.
Yeah I can see all of it, but the problem with me is that I bet I would have watched it a few seconds and clicked off out of boredom, never suspecting they were AI. I really want to claim I would have figured it out instantly, but I can't. If I were a regular consumer I think I'd notice.
They mention right up front they're "powered by AI" but to me that implies they had help with article writing. I would not immediately assume from that statement that the actual newsreaders themselves were AI.
Ooh wow I hate this. Totally soulless appearance and delivery - and the robot fidgeting the dude is doing with his hands completely distracts from everything else. It’s totally normal to do that movement while speaking for emphasis - but whatever he’s doing does not look normal. (The mouths look nightmarish as well)
While I enjoyed the article, it’s just another in a line of the same article with different flavors and authors that all have the same fundamental error.
The prevailing counterargument against AI consistently hinges on the present state of AI rather than the trajectory of its development: but what matters is not the capabilities of AI as they exist today, but the astonishing velocity at which those capabilities are evolving.
We've been through this song and dance before. AI researchers make legitimately impressive breakthroughs in specific tasks, people extrapolate linear growth, the air comes out of the balloon after a couple years when it turns out we couldn't just throw progressively larger models at the problem to emulate human cognition.
I'm surprised that tech workers who should be the most skeptical about this kind of stuff end up being the most breathlessly hyperbolic. Everyone is so eager to get rich off the trend they discard any skepticism.
One problem is that people assume the end goal is to create a human-cognition-capable AI. I think it's pretty obvious by this point that that's not going to happen. But there is no need for that at all to still cause a huge disruption. Let's say AI in the roles that benefit from it (copilot, writing, throwaway clipart, repetitive tasks, summarizing, looking up stuff, etc.) leads not even to job loss but to fewer future jobs being created. What does that mean for the incoming juniors? What does that mean for the people looking for that kind of work? It's not obvious at all how big of a problem that will create.
Two things can both be true. I keep arguing both sides because:
1. Unless you’re aware of near term limits you think AI is going to the stars next year.
2. Architectures change. The only thing that doesn’t change is that we generally push on; temporary limits are usually overcome, and there’s a lot riding on this. It’s not a smart move to bet against progress over the medium term. This is also where the real benefits and risks lie.
Is AI in general more like going to space, or string theory? One is hard but doable. The other is a tar pit for money and talent. We are all currently placing our bets.
point 2 is the thing that i think is most important to point out:
"architectures change"
sure, that's a fact. let me apply this to other fields:
"there could be a battery breakthrough that gives electric cars a 2,000 mile range." "researchers could discover a new way to build nanorobots that attacks cancer directly and effectively cures all versions of it." "we could invent a new sort of aviation engine that is 1,000x more fuel efficient than the current generation."
i mean, yeah, sure. i guess.
the current hype is built on LLMs, and being charitable "LLMs built with current architecture." there are other things in the works, but most of the current generation of AI hype are a limited number of algorithms and approaches, mixed and matched in different ways, with other features bolted on to try and corral them into behaving as we hope. it is much more realistic to expect that we are in the period of diminishing returns as far as investing in these approaches than it is to believe we'll continue to see earth-shattering growth. nothing has appeared that had the initial "wow" factor of the early versions of suno, or gpt, or dall-e, or sora, or whatever else.
this is clearly and plainly a tech bubble. it's so transparently one, it's hard to understand how folks aren't seeing it. all these tools have been in the mainstream for a pretty substantial period of time (relatively) and the honest truth is they're just not moving the needle in many industries. the most frequent practical application of them in practice has been summarization, editing, and rewriting, which is a neat little parlor trick - but all the same, it's indicative of the fact that they largely model language, so that's primarily what they're good at.
you can bet on something entirely new being discovered... but what? there just isn't anything inching closer to that general AI hype we're all hearing about that exists in the real world. i'm sure folks are cooking on things, but that doesn't mean they're near production-ready. saying "this isn't a bubble because one day someone might invent something that's actually good" is kind of giving away the game - the current generation isn't that good, and we can't point to the thing that's going to overtake it.
It's like con artists and management consultants. They are the most susceptible because they drink the koolaid.
I think the mistake is that the media extrapolates linear growth, but in practice it is a wobbly path. And this wobbly path allows anyone to create whatever narrative they want.
It reminds me of seeing headlines last week that NVDA is down after investors were losing faith after the last earnings. Then you look at the graph and NVDA is only like 10% off its all-time high and still in and out of being the most valuable company in the world.
Advancement is never linear. But I believe AI trends will continue up and to the right, and even in 20 years, when AI can do remarkably advanced things that we can barely comprehend, there will be internet commentary about how it's all just hype.
>> I'm surprised that tech workers ... end up being the most breathlessly hyperbolic.
We're not.
There's a reason why so many of the people on the crypto grift in 2020-2022 have jumped to the AI grift. Same logic of "revolution is just around the corner", with the added mix of AGI millenarianism which hits a lot of nerds' soft spots.
You articulated my view perfectly. I just don't get the buy in from people who should know better than trust vc funded talking heads.
This comment summarizes my thoughts in the best way.
This is confusing. We've never had a ChatGPT-like innovation before to compare to. Yes, there have been AI hype cycles for decades, but the difference is that we now have permanent invaluable and society-changing tools out of the current AI cycle, combined with hundreds of billions of dollars being thrown at it in a level of investment we've never seen before. Unless you're on the bleeding edge of AI research yourself, or one of the people investing billions of dollars, it is really unclear to me how anyone can have confidence of where AI is not going
I don't agree with this.
There's a big difference between something that benefits productivity versus something that benefits humanity.
I think a good test for if it genuinely has changed society is if all gen AI were to disappear overnight. I would argue that nothing would really fundamentally change.
Contrast that with the sudden disappearance of the internet, or the combustion engine.
It will take time though. If the internet had completely disappeared in the mid 90s, nothing would have fundamentally changed.
Because the hype will always outdistance the utility, on average.
Yes, you'll get peaks where innovation takes everyone by surprise.
Then the salesbots will pivot, catch up, and ingest the innovation into the pitch machine as per usual.
So yes, there is genuine innovation and surprise. That's not what is being discussed. It's the hype that inevitably overwhelms the innovation, and also inevitably pollutes the pool with increasing noise. That's just human nature, trying to make a quick buck from the new-hotness.
What is your time horizon? We're already at a date where people were saying these jobs would be gone. The people most optimistic about the trajectory of this technology were clearly wrong.
If you tell me AI newscasters will be fully functional in 10 or 15 years, I'll believe it. But that far in the future, I'd also believe news will be totally transformed due to some other technology we aren't thinking about yet.
But we don't know if AI development is following an exponential or sigmoid curve (actually we do kind of, now, but that's beside the point for this post.)
A wise institution will make decisions based on current capabilities, not a prognostication.
If investors didn't invest based on expected future performance, the share market would look completely different than it actually does today. So, I can't understand how anyone can claim that.
I honestly believe this specific case is a Pareto situation where the first 80% came at breakneck speeds, and the final 20% just won't come in a satisfactory way. And the uncanny valley effect demands a percentage that's extremely close to 100% before it has any use. Neural networks are great at approximations, but an approximate person is just a nightmare.
It's getting old but there's an xkcd for your kind of reasoning:
https://xkcd.com/605/
Isn't this essentially the same argument as "there are only 10 covid cases in this area, nothing to worry about"?
It's really missing the point. The point is whether or not exponential growth is happening. It doesn't with husbands, it does with covid, and time will tell about AI.
No, because as you rightly point out we know exponential growth is very possible with Covid but we don't know if that will happen with AI.
In fact, the only evidence we have for exponential growth in AI is the word of the people selling it to us.
Transformers have been around for 7 years, ChatGPT for 2. This isn't the first few samples of what could be exponential growth. These are several quarters of overpromise and underdelivery. The chatbot is cool and it outperforms what awful product search has become. But is it enough to support a 3.5 trillion dollar sized parts supplier?
Unfortunately, I see this as only a small temporary setback in the unstoppable quest to replace humans with AI to cut corporate costs.
The aspect of AI that isn't discussed enough is not what are those formerly employed going to do next, but rather that it will potentially represent the largest transfer of wealth in history as money which was going into employees' salaries is instead going to shareholders via those large companies who have the ability to produce the AIs (even if there are a host of small companies who act as an intermediate layer, such as the Israeli firm in this case).
I do think there is going to be a backlash because of our need for and desire for human connections, which AI won't provide. But that will become more expensive, and not the norm. Just like we can still buy farmer-fresh food but only a minority segment of the population can afford it because it costs 3x what is pushed as "food" at Walmart.
hmm no one has discussed this "carpenter group", who appear to be on a spree in 2024 of snatching up smaller news orgs, and cutting 50% of the staff. seems like a pretty big gamble on AI.
> Carpenter Media Group announced earlier this month it had acquired another group of newspapers, Pamplin Media Group, in Oregon. The company now owns and manages 180 newspapers in the United States and Canada.
> “We are committed to Everett, The Herald and all who have a stake in its success,” Chairman Todd Carpenter said. “We have deep sympathy for those affected by these changes and will work hard with each of them to see they are well-compensated through a transition period that helps them move forward in a positive way.
> “Our responsibility to the community and our readers requires us to make difficult business decisions, and then invest in and organize our team to move forward to produce a product that continues to improve and serve. Our track record in this process is good.”
https://www.heraldnet.com/news/this-breaks-my-heart-roughly-...
would be very interested to see todd carpenter's data on that last point.
Give it time y'all. This is the first inning, and I'm already terrified.
I encourage you to have a conversation with ChatGPT's advanced audio for a taste of what's to come. If you can, have someone talk to it in a relatively unpopular language like Afrikaans or even Icelandic--they will shit their pants.
I'm not so convinced. A lot of people have been noting the rapid development of ML systems in the past few years and projecting continued exponential improvements based on previous growth rates, but unbounded exponential improvement doesn't happen. This is an S curve and I think we're already well into the diminishing returns part of the curve. I think future growth is going to require increasingly impractical amounts of hardware for ever smaller levels of improvement.
People saying that this version is flawed but still amazing, so the next version is going to be perfect and mind blowing are going to be disappointed I think. The next version will be slightly better but still flawed. The version after that will be a touch better but annoyingly still not quite there. Constantly teasing you that full success is just around the corner while never quite getting there.
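To make the S-curve point concrete, here is a minimal Python sketch comparing an exponential with a logistic curve that starts at the same value. The ceiling of 1000 and the growth rate of 1 are arbitrary assumptions picked for illustration, not measurements of any AI system, and the function names are made up for this example.

```python
import math

def exponential(t, rate=1.0):
    """Unbounded exponential growth starting at 1."""
    return math.exp(rate * t)

def logistic(t, ceiling=1000.0, rate=1.0):
    """S-curve that also starts at 1 but saturates at `ceiling`."""
    return ceiling / (1.0 + (ceiling - 1.0) * math.exp(-rate * t))

def relative_gap(t, ceiling=1000.0):
    """Fraction by which the logistic curve falls short of the exponential."""
    e = exponential(t)
    return (e - logistic(t, ceiling)) / e

# Early on the two curves are nearly indistinguishable; later they diverge sharply.
for t in range(0, 11, 2):
    print(f"t={t:2d}  exp={exponential(t):10.1f}  "
          f"logistic={logistic(t):7.1f}  gap={relative_gap(t):.3f}")
```

Under these assumptions the two curves differ by less than 1% at t=2 and by more than 90% at t=10, which is why early data points alone cannot settle whether growth is exponential or already flattening.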
I'd be terrified if anyone considers this reporting watchable. I felt depressed after 30 seconds. It's like an episode of Black Mirror.
> This is the first inning
I think this is the key part.
But this could be a strange form of baseball that only has two innings. No one knows for sure either way.
How is a very off putting start a good thing?
The videos in question would’ve been considered science fiction a mere 5 years ago, so I definitely don’t consider them a very off putting start.
That doesn’t mean that I think they’re perfect or that the technology will keep improving at a fast pace (nobody knows for sure).
Terrified of what? And first inning of what? When did that inning start?
If you're wondering just how much to cringe, here's a sample. https://youtu.be/48Y-pCffh10
I _had_ to hear the pumpkin clip in context (at 2m50s): https://www.facebook.com/TheGardenIsland/videos/112002117287...
Any ten seconds of that clip will do though, really.
Thanks, this is really important context, I expected a link to an example video as the first comment.
And yes, it's really bad. No idea how someone could think this could replace a human.
The article might give the impression of some kind of LLM SOTA model being slotted in, but visiting the website of the company they used, Caledo, it looks like they are using 2014 level technology.
Poorly animated CGI newscaster reading an "AI" written script. Really their tech looks awful and dated.
Small discussion (10 points, 23 hours ago, 5 comments) https://news.ycombinator.com/item?id=42203016
A way to lay off with justification, then hire back cheaper talent? If I'm being cynical.
This is an extreme example. Also, this person would never lose their job that easily if there were laws protecting workers.
But what I have seen here in Sweden is AI voiceover used in news reports where people have hidden identities. So clearly this has taken a job from someone who used to do this voiceover. But it's working, we have to adapt to these small changes that AI is bringing.
The Coca-Cola commercial that everyone "hated" and that "destroyed the brand equity" according to numerous news outlets solidified for me that we are passing a mark, and that this is the last desperate gasp of multiple industries coping with the rapid advancements AI is bringing.
Sure the Coke ad was a bit cringe. But the reality is in 5 years most ads you see will be AI and nobody will care.
That, along with the Ben Affleck rant that AI won't replace Hollywood. Watching Ben Affleck talk about how you can't replace the chemistry of actors working on set made me imagine what the conversations of famous stage actors were like when film came along. You can argue film didn't replace the stage; after all, Broadway is still there, but it is not comparable to the economic influence of TV and film.
In 10-20 years, human-made film and TV will still exist, but it will most likely account for a small share of economic activity compared with AI-produced media.
It doesn't matter if you care how terrible or how cringe AI is, or that AI quality is worse than what existed before. What matters is what 90% of a new generation cares about in 5-20 years, and whether AI has more utility than not using AI. In the author's case, in the short term and for that particular medium, AI was not more useful. But in the long term AI will dominate every facet of our media.
counter-point: You're in the hackernews comment section. Why aren't you just asking ChatGPT to generate comments for you on this article?
Pontificate on this, then return.
For one, this isn't an economically driven platform that is directly revenue driven the way ads, movies, news articles, podcasts, etc. are.
go to reddit or even linkedin which is revenue driven and it is a sea of bots parroting the same tired points with comments that are 1 or 2 words different from other comments directly above them.
My entire point is that human production is not going to zero, but is moving to a minority.
I still go to the theatre and listen to live orchestras but I would be a fool to say that other mass media types didn't replace the cultural role of entertainment that those forms once had.
Yeah but why are you consuming human-generated garbage comments on HN instead of just getting infinite comments from ChatGPT?
Pontificate on this, then return.
That's irrelevant to what I am saying. You want all or nothing. I am saying AI will dominate in the majority and human generated content will still survive in the minority.
It doesn't matter what luddites like myself think. It matters what is economically viable to the masses.
Perhaps with such excellent examples of human thought such as your comment eventually ChatGPT comments won't seem so bad.
You are choosing to consume human-generated content even though right now you could go to ChatGPT and get infinite AI-generated content, for free!
Why is that?
I felt conflicted as I listened to the AI generated audio.
Is it just me or does everyone feel this way? It's an instant put-off for me when I hear that an article was generated by AI. I would lose trust in that publication and move on.
People think AI is the magic answer for shitty content. In some cases AI is only adding speed not quality.
"Those responsible for sacking the people who have just been sacked, have been sacked."
Clickbait title for a decent write-up. The author left Hawaii, and as indicated in the article the local press resorted to AI presenters because it was hard to keep talent on board. This is not a "job replacement" story so much as a "backup plan" story in my opinion.
> local press resorted to AI presenters because it was hard to keep talent on board
I see a phrase like that and my first thought is "the pay is crap" and/or "the bosses are awful" and nobody in their right mind would work there.
The disruptive technology adoption process is at least somewhat predictable, including:
* Most people react negatively to the disruption because of the risk, fear of the unknown, and also because people don't like change.
* Also, early in the development and adoption of a new, immature technology, there is a lot of trial and error regarding applications, mostly error. Sometimes those failures are because the application isn't a good match for the technology; often they are because the technology isn't mature and will still improve and add major features, or because the details of the interface between technology and application are still being worked out.
* The people reacting negatively will point out those errors as signs that the technology is hopeless. Often they are wrong: the tech will mature and improve, and those people will be eclipsed.
The good news is, they won't remember it that way: First they laugh at you, then they tell you it isn't in the Bible (i.e., it violates the orthodoxy, the way things always have been done), then they say they knew it all along. AI is in stage 2.
At least this is funny in the way it looks robotic and dorky. When it gets better it will become increasingly scary to go on this road. Why would anyone want to watch anything gen AI other than a one off curiosity?