Every time a big company screws up, there are two highly informed sets of people who are guaranteed to be lurking, but rarely post, in a thread like this:
1) those directly involved with the incident, or employees of the same company. They have too much to lose by circumventing the PR machine.
2) people at similar companies who operate similar systems with similar scale and risks. Those people know how hard this is and aren’t likely to publicly flog someone doing their same job based on uninformed speculation. They know their own systems are Byzantine and don’t look like what random onlookers think it would look like.
So that leaves the rest, who offer insights based on how stuff works at a small scale, or better yet, pronouncements rooted in “first principles.”
I've noticed this amongst the newer "careerist" sort of software developer who is stumbling into the field for money, as opposed to the obsessive computer geek of yesteryear, who practiced it as a hobby. This character archetype is a transplant, say, less than five years ago from another, often non-technical discipline, and was taught or learned from overly simplistic materials that decry systems programming, or networking, or computer science concepts as unnecessary, impractical skills, reducing everything to writing JavaScript glue code between random NPM packages found on google.
Especially in a time where the gates have come crashing down to pronouncements of, "now anybody can learn to code by just using LLMs," there is a shocking tendency to overly simplify and then pontificate upon what are actually bewilderingly complicated systems wrapped up in interfaces, packages, and layers of abstraction that hide away that underlying complexity.
It reminds me of those quantum woo people, or movies like What the Bleep Do We Know!? where a bunch of quacks with no actual background in quantum physics or science reason forth from drastically oversimplified, mathematics-free models of those theories and into utterly absurd conclusions.
Completely agreed. There are also former employees who have very educated opinions about what is likely going on, but between NDAs and whatnot there is only so much they are willing to say. It is frustrating for those in the know, but there are lines they can't or won't cross.
Whenever an HN thread covers subjects where I have direct professional experience I have to bite my tongue while people who have no clue can be as assertive and confidently incorrect as their ego allows them to be.
Right? A common complaint by outsiders is that Netflix uses microservices. I'd love to hear exactly how a monolith application is guaranteed to perform better, with details. What is the magic difference that would have ensured the live stream would have been successful?
The only time I worked on a project that had a live television launch, it absolutely tipped over within like 2 minutes, and people on HN and Reddit were making fun of it. And I know how hard everyone worked, and how competent they were, so I sympathize with the people in these cases. While the internet was teeing off with easy jokes, engineers were swarming on a problem that was just not resolving, PMs were pacing up and down the hallway, people were getting yelled at by leadership, etc. It's like taking all the stress and complexity of a product launch and multiplying it by 100. And the thing I'm talking about was just a website, not even a live video stream.
3) the people supplying 1) and 2) with tools (hard- or software)
We (yep) don't know the exact details, but we do get sent snapshots of full configs and deployments to debug things... we might not see exact load patterns, but it's enough to know. And of course we can't tell due to NDAs.
I'm sure 2) can post. But it won't be popular, so you'll need to dig to find it.
Most people are consumers and at the end of the day, their ability to consume a (boring) match was disrupted. If this was PPV (I don't think it is), they paid extra and did not get the quality of product they expected. I'm not surprised they dominate the conversation.
For an event like this, there already exists an architecture that can handle boundless scale: torrents.
If you code it to utilize high-bandwidth users' upload, the service becomes more available as more users are watching -- not less available.
It becomes less expensive with scale, more available, more stable.
To be more specific, if you encode the video in blocks with each new block hash being broadcast across the network, just managing the overhead of the block order, it should be pretty easy to stream video with boundless scale using a DHT.
Could even give high-bandwidth users a credit based upon how much bandwidth they share.
With a network like what Netflix already has, the seed-boxes would guarantee stability. There would be very little delay for realtime streams, I'd imagine 5 seconds tops. This sort of architecture would handle planet-scale streams for breakfast on top of the already existing mechanism.
But then again, I don't get paid $500k+ at a large corp to serve planet scale content, so what do I know.
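For anyone wondering what the block-hash idea above might look like, here is a minimal Python sketch. It only covers verification: an origin publishes an ordered list of SHA-256 block hashes, and a viewer accepts a block from an untrusted peer only if it matches. The function names and the tiny block size are made up for illustration. Peer discovery, the DHT itself, and real-time delivery are not modelled, and none of this reflects how Netflix actually works.

```python
import hashlib


def split_blocks(stream: bytes, block_size: int) -> list[bytes]:
    """Cut a byte stream into fixed-size blocks (the last may be shorter)."""
    return [stream[i:i + block_size] for i in range(0, len(stream), block_size)]


def publish_manifest(blocks: list[bytes]) -> list[str]:
    """Origin side: the ordered list of SHA-256 hashes that would be broadcast."""
    return [hashlib.sha256(block).hexdigest() for block in blocks]


def verify_block(manifest: list[str], index: int, data: bytes) -> bool:
    """Viewer side: accept a block from any peer only if its hash matches."""
    if not 0 <= index < len(manifest):
        return False
    return hashlib.sha256(data).hexdigest() == manifest[index]


if __name__ == "__main__":
    # Hypothetical payload and block size, chosen only to keep the demo small.
    blocks = split_blocks(b"live video bytes go here", block_size=4)
    manifest = publish_manifest(blocks)
    print(verify_block(manifest, 0, blocks[0]))  # a genuine block
    print(verify_block(manifest, 0, b"fake"))    # a tampered block
```

Checking against origin-published hashes is what would let viewers pull blocks from arbitrary peers. It says nothing about whether enough peers have spare upload to keep a live stream on time, which is the harder part.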
The way to deal with this is to constantly do live events, and actually build organizational muscle. Not these massive one off events in an area the tech team has no experience in.
We should always be doing (the thing we want to do)
Some examples that always get me in trouble (or at least big heated conversations):
1. Always be building: It does not matter if code was not changed, or there have been no PRs or whatever, build it. Something in your org or infra has likely changed. My argument is "I would rather have a build failure on software that is already released, than software I need to release".
2. Always be releasing: As before it does not matter if nothing changed, push out a release. Stress the system and make it go through the motions. I can't tell you how many times I have seen things fail to deploy simply because they have not attempted to do so in some long period of time.
There are more, I just don't have time to go into them. The point is, "if you did it, and need to do it again ever in the future, then you need to continuously do it."
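The "always be building" rule above can be sketched as a scheduler check that ignores whether any code changed and only asks how long ago the last build ran. This is a hypothetical helper, not any CI system's API. The function name and the `max_age` default are assumptions.

```python
from datetime import datetime, timedelta


def needs_rebuild(last_build: datetime, now: datetime,
                  max_age: timedelta = timedelta(days=1)) -> bool:
    """Return True when the last build is at least max_age old.

    Deliberately ignores whether any commits landed: the point is to
    exercise the build pipeline even when the code is unchanged.
    """
    return now - last_build >= max_age


if __name__ == "__main__":
    # Hypothetical timestamps for illustration.
    now = datetime(2024, 11, 16, 12, 0)
    print(needs_rebuild(datetime(2024, 11, 10, 12, 0), now))
```

The same check works for "always be releasing" by feeding it the last deploy time instead of the last build time.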
They've been doing live events since 2023. But it's hard to be prepared for something that's never been done by anyone before: a Super Bowl scale event, entirely viewed over the internet. The Super Bowl gets to offload to cable and over the air. Interestingly, I didn't have any problems with my stream. So it sounds like the bandwidth problems might be localized, perhaps by data center or ISP.
Unless Netflix eng decides to release a public postmortem, we can only speculate. In my time organizing small-time live streams, we always had up to 3 parallel "backup" streams (Vimeo, Cloudflare, Livestream). At Netflix's scale, I doubt they could simply summon any of these providers in, but I guess Akamai / Cloudflare would have been up for it.
Sometimes this just isn't feasible for cost reasons.
A company I used to work for ran a few Super Bowl ads. The level of traffic you get during a Super Bowl ad is immense, and it all comes at you in 30 seconds, before going back to a steady-state value just as quickly. The scale pattern is like nothing else I've ever seen.
Super Bowl ads famously cost seven million dollars. These are things we simply can't repeat year over year, even if we believed it'd generate the same bump in recognition each time.
I think Netflix have a fair bit of organisational muscle, perhaps the fight was considered not as large of an event as the NFL streams would be in the future.
Also, "No experience in" really? You have no idea if that's really the case
Everyone here is talking like this is something unique Netflix had to deal with. Hotstar live streamed the India vs Pakistan cricket match with zero issues, with the all-time highest live viewership in the history of live telecast. Why would viewers paying $20 a month want to think about their technical issues? They dropped the ball, pure and simple. Tech already exists for this, it’s been done before even by ESPN, nothing new here.
But that's exactly the point: Netflix didn't do this in a vacuum, they did it within Netflix.
It might just have been easier to start from scratch, maybe using an external partner experienced in live streaming, but the chances of that decision happening in a tech-heavy company such as Netflix that seems to pride itself on being an industry leader are close to zero.
Depending on whom you ask, the bitrate used by the stream is significantly lower than what is considered acceptable from free livestreaming services, which admittedly stream to much, much smaller audiences.
Without splitting hairs, livestreaming was never their forte, and going live with degradation elsewhere is not a great look for our distributed computing champ.
Netflix is only good at streaming ready-made content, not live streaming, but:
1. Netflix is a 300B company, this isn't a resources issue.
2. This isn't the first time they have done live streaming at this scale either. They already have prior failure experience, you expect the 2nd time to be better, if not perfect.
3. There was plenty of time between the first massive live stream and the second. Meaning plenty of time to learn and iterate.
The problem is that provisioning vast capacity for peak viewership is expensive and requires long-term commitment. Some providers won't give you more connectivity to their network unless you sign a 12 month deal where you prepay that.
Peak traffic is very expensive to run, because you're building capacity that will be empty/unused when the event ends. Who'd pay for that? That's why it's tricky and that's why Akamai charges these insane prices for live streaming.
A "public" secret is that the network layer is usually not redundant in your datacenter even if it's promised. To have a redundant network you'd need to double your investment, and it'll sit idle or at 50% max capacity. For 2hr downtime per year when you restart the high-capacity routers, it's not cost efficient for most clients.
They have the NFL next month on Christmas day. So that'll be a big streaming session but I think it'll be nothing compared to this. Even Twitter was having problems handling the live pirate streams there.
Apple was clearly larger than Google when they came out with Apple Maps, and it was issue-laden for a long time. It is not a resource-issue, but a tech development maturity issue.
You can't solve your way out of a complex problem that you have created and which wasn't needed in the first place. The entire microservices thing was overly complex with zero benefits
I spoke to multiple Netflix senior technicians about this.
People just do not appreciate how many gotchas can pop up doing anything live. Sure, Netflix might have a great CDN that works great for their canned content and I could see how they might have assumed that's the hardest part.
Live has changed over the years from large satellite dishes beaming to a geosat and back down to the broadcast center($$$$$), to microwave to a more local broadcast center($$$$), to running dedicated fiber long haul back to a broadcast center($$$), to having a kit with multiple cell providers pushing a signal back to a broadcast center($$), to having a direct internet connection to a server accepting a live http stream($).
I'd be curious to know what their live plan was and what their redundant plan was.
This is the whole point of chaos engineering that was invented at Netflix, which tests the resiliency of these systems.
I guess we now know the limits of what "at scale" is for Netflix's live-streaming solution. They shouldn't be failing at scale on a huge stage like this.
I look forward to reading the post mortem about this.
> People just do not appreciate how many gotchas can pop up doing anything live.
Sure thing, but also, how much resources do you think Netflix threw on this event? If organizations like FOSSDEM and CCC can do live events (although with way smaller viewership) across the globe without major hiccups on (relatively) tiny budgets and smaller infrastructure overall, how could Netflix not?
Cable TV (or even OTA antenna in the right service area) is simply a superior live product compared to anything streaming.
The Masters app is the only thing that comes close imo.
Cable TV + DVR + high speed internet for torrenting is still an unmatched entertainment setup. Streaming landscape is a mess.
It's too bad the cable companies abused their position and lost any market goodwill. Copper connection direct to every home in America is a huge advantage to have fumbled.
The interesting thing is that a lot of TV infrastructure is now running over IP networks. If I were to order a TV connection for my home I'd get an IPTV box to connect to my broadband router via Ethernet, and it'd simply tell the upstream router to send a copy of a multicast stream my way.
Reliable and redundant multicast streaming is pretty much a solved problem, but it does require everyone along the way to participate. Not a problem if you're an ISP offering TV, definitely a problem if you're Netflix trying to convince every single provider to set it up for some one-off boxing match.
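For the curious, the "tell the upstream router to send a copy" step above is what a receiver does when it joins a multicast group. Here is a minimal Python sketch using the standard `socket` module. The group address and port in the usage note are made-up examples, and this shows only the receiver-side join, not how any particular IPTV box is built.

```python
import socket


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq value: group address + local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def join_group(group: str, port: int) -> socket.socket:
    """Open a UDP socket and ask the kernel to join the multicast group.

    The kernel then sends an IGMP membership report, which is the
    "send me a copy" signal to the upstream router.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock
```

Something like `join_group("239.1.1.1", 5004)` would subscribe to a hypothetical channel. It only delivers packets if every router on the path forwards multicast for that group, which is exactly the participation problem described above.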
This. I'm honestly going to cancel my streaming shit. They remove and mess with it so much. Like right now HBO max or whatever removes my recent watches after 90 days. Why?
It wasn't even just buffering issues, the feed would just stop and never start again until I paused it and then clicked "watch live" with the remote.
It was really bad. My Dad has always been a fan of boxing so I came over to watch the whole thing with him.
He has his giant inflatable screen and a projector that we hooked up in the front lawn to watch it, But everything kept buffering. We figured it was the Wi-Fi so he packed everything up and went inside only to find the same thing happening on ethernet.
He was really looking forward to watching it on the projector and Netflix disappointed him.
On a few forum sites I'm on, people are just giving up. Looking forward to the post-mortem on how they weren't ready for this (with just a tiny bit of schadenfreude because they've interviewed and rejected me twice).
AB84 streamed it live from a box at the arena to ~5M viewers on Twitter. I was watching it on Netflix, I didn't have any problems, but I also put his live stream up for the hell of it. He didn't have any issues that I saw.
It’s not everyone. Works fine for me though I did have to reload the page when I skipped past the woman match to the Barrios Ramos fight and it was stuck buffering at 99%.
I wonder if there will be any long term reputational repercussions for Netflix because of this. Amongst SWEs, Netflix is known for hiring the best people and their streaming service normally seems very solid. Other streaming services have definitely caught up a bit and are much more reliable than in the early days, but my impression still has always been that Netflix is a step above the rest technically.
This sure doesn't help with that impression, and it hasn't just been a momentary glitch but hours of instability. And the Netflix status page saying "Netflix is up! We are not currently experiencing an interruption to our streaming service." doesn't help either...
Not the same demographic but their last large attempt at live was through a Love is blind reunion. It was the same thing, millions of people logging in, epic failure, nothing worked.
They never tried to do a live reunion again. I suppose they should have to get the experience. Because they are hitting the same problems with a much bigger stake event.
From what I've heard, Netflix has really diluted the culture that people know of from the Patty McCord days.
In particular, they have been revising their compensation structure to issue RSUs, add in a bunch of annoying review process, add in a bunch of leveling and titles, begin hiring down market (e.g. non-sr employees), etc.
In addition to doing this, shuffling headcount, budgets, and title quotas around has in general made the company a lot more bureaucratic.
I think, as streaming matured as a solution space, this (what is equivalent to cost-cutting) was inevitable.
If Netflix was running the same team/culture as it was 10 years ago, I'd like to say that they would have been able to pull off streaming.
So the issue is that Netflix gets its performance from colocating caches of movies in ISP datacenters, and a live broadcast doesn't work with that. It's not just about the sheer numbers of viewers, it's that a live model totally undermines their entire infrastructure advantage.
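A rough back-of-the-envelope sketch of why a single ISP-hosted cache box can run out of headroom during a live event. The 40 Gbps uplink and the per-stream bitrates are assumptions picked for illustration, and protocol overhead is ignored.

```python
def max_concurrent_streams(link_gbps: float, stream_mbps: float) -> int:
    """How many streams of a given bitrate fit in a link, ignoring overhead."""
    return int(link_gbps * 1000 // stream_mbps)


if __name__ == "__main__":
    # Assumed uplink (Gbps) and per-stream bitrates (Mbps); illustrative only.
    for mbps in (2, 5, 8):
        viewers = max_concurrent_streams(40, mbps)
        print(f"{mbps} Mbps per stream: {viewers} viewers")
```

At an assumed 5 Mbps per stream, a 40 Gbps box tops out at 8,000 simultaneous viewers. With on-demand content those viewers are spread over the evening. With a live event they all show up at once.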
If Netflix still interviews on hacker rank puzzles I think this should be a wake up call. Interviewing on irrelevant logic puzzles is no match for systems engineering.
Has Netflix ever live streamed something before? People on reddit are reporting that if you back up the play marker by about 3 minutes the lag goes away. They've got a handle on streaming things when they have a day in advance to encode it into different formats and push it to regional CDNs. But I can't recall them ever live streaming something. Definitely nothing this hyped.
I don't spend much time streaming, but I got a glimpse of the Amazon Prime catalog yesterday, and was surprised at how many titles on the front page were movies I'd actually watch. Reminded me of Netflix a dozen years ago.
> but my impression still has always been that Netflix is a step above the rest technically.
I always assumed youtube was top dog for performance and stability. I can’t remember the last time I had issues with them and don’t they handle basically more traffic than any other video service?
I think Netflix will have even more sw engineers looking to work there once they notice even for average quality of work they can get paid 3 times more than their current pay.
Most people pay Netflix to watch movies and tv shows, not sports. If I hadn't checked Hacker News today, I wouldn't even know they streamed sports, let alone that they had issues with it. Even now that I do, it doesn't affect how I see their core offering, which is their library of on-demand content.
Netflix's infrastructure is clearly built for static content, not live events, so it's no shock they aren't as polished in this area. Streaming anything live over the internet is a tough technical challenge compared to traditional cable.
I think what I will remember about this fight is not the (small) streaming issue I encountered as much as the poor quality of the fight itself. For me that was the reputational loss. Netflix was touting “NFL is coming to Netflix”. This fight did not really make me want to watch that.
I don't think it'll be long-term. Most people will forget about this really quickly. It's not like there will be many people saying "Oh, you don't want to sign up for Netflix, the Tyson fight wasn't well streamed" in even 6 months nevermind 10 years.
Based on this I'm wondering whether it was straight up they did not expect it to be this popular?
> Some Cricket graphs of our #Netflix cache for the #PaulVsTyson fight. It has a 40 Gbps connection and it held steady almost 100% saturated the entire time.
I don't think Netflix is even designed to handle very extreme multi-region live-streaming at scale as evidenced in this event with hundreds of millions simultaneously watching.
YouTube, Twitch, Amazon Prime, Hulu, etc have all demonstrated that they can stream simultaneously to hundreds of millions live without any issues. This was Netflix's chance to do this and they have largely failed at it.
There are no excuses or juniors to blame this time. Quite the inexperience from the 'senior' engineers at Netflix, not being able to handle the scale of live-streaming. They may lose contracts over this, given the downtime across the world during this high-impact event.
Very embarrassing for a multi-billion dollar publicly traded company.
Yea, it’s a bad look. But I switched to watching some other Netflix video and it seemed fine. Just this event had some early issues. Looks fine now though.
Streamed glitch free for me both on my phone and Xbox. The fight wasn’t so great though, but still a fun event. Jake Paul is a money machine right now.
Yeah, the funny part is that Hulu, Amazon Prime, and Peacock have all demonstrated the ability to handle an event of this caliber with no issue. Netflix now may never get another opportunity like this again.
To me the difference is that in 2012, you had companies focusing on delivering a quality product, whether it made money or not. Today, the economic environment has shifted a lot and companies are trying to increase profits while cutting costs. The result is inevitably a decline in quality. I'm sure that Netflix could deliver a flawless live stream to millions of viewers, but the question is can they do it while making a profit that Wall Street is happy with. Apparently not.
The funny thing is I was just reading something on HN like three days ago about how light years ahead Netflix tech was compared to other streaming providers. This is the first thing I thought of when I saw the reports that the fight was messing up.
But is there a way that Netflix might have learned from all of Youtube's past mistakes?
The only reasonable way to scale something like this up is probably to... scale it up.
Sure, there are probably some generic lessons, but I bet that the pain points in Netflix's architecture (historically grown over more than a decade and optimized towards highly cacheable content) are very different from Youtube, which has ramped up live content gradually over as many years.
The average quality of talent has gone way down compared to 2012 though.
E.g. the median engineer, excluding entry level/interns, at YouTube in 2012 was a literal genius at their niche or quite close to it.
Netflix simply can’t hire literal geniuses with mid six figure compensation packages in 2024 dollars anymore… though that may change with a more severe contraction.
It's incomprehensible to me that Netflix, one of the most highly skilled engineering teams in the world - completely sh*t the bed last night and provided a nearly unwatchable experience that was not even in the same league as pre-internet live broadcast from 30 years ago.
My bet is that a technical manager told his executive (multiple times) that he needed more resources and engineering time to make live work properly, and they just told him to make do because they didn't want to spend the money.
It could come down to something as stupid as:
Executive: "we handled [on demand show ABCD] on day one, that was XX million"
Engineering: "live is really different"
Executive: (arguing about why it shouldn't be that different and should not need a lot of new infrastructure)
Engineering: (can't really argue with his boss about this anymore after having repeated the same conversation 3 or 4 times)
-- tells the team: "we are not getting new servers or time for a new project. We have to just make do with what we have. You guys are brilliant, I know you can do it!"
I had buffering issues but then backed off and let a bit of it buffer up (maybe 1 or 2 minutes?) and then it was fine for the entire Tyson Paul match. There was no reason I needed it to be live vs. a 1 or 2 minute delay.
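A toy model of why sitting a minute or two behind live can hide a rough patch: the player drains one second of video per second, and the network delivers a varying amount. The download rates below are invented for illustration, and real players adapt bitrate too, which this ignores.

```python
def count_stalls(download_rates: list[float], initial_buffer_s: float) -> int:
    """Count seconds where playback stalls.

    download_rates[t] is how many seconds of video arrive during second t
    (1.0 means the network exactly keeps up with playback).
    """
    buffer_s = initial_buffer_s
    stalls = 0
    for rate in download_rates:
        buffer_s += rate
        if buffer_s >= 1.0:
            buffer_s -= 1.0  # play one second of video
        else:
            stalls += 1      # not enough buffered: the player freezes
    return stalls


if __name__ == "__main__":
    # Hypothetical network: fine for 10 s, half speed for 20 s, fine again.
    dip = [1.0] * 10 + [0.5] * 20 + [1.0] * 10
    print("at the live edge:", count_stalls(dip, 0.0), "stalled seconds")
    print("60 s behind live:", count_stalls(dip, 60.0), "stalled seconds")
```

In this model the viewer at the live edge freezes every other second during the dip, while the one with a minute in the buffer never notices it.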
This topic is really just fun for me to read based on where I work and my role.
Live is a lot harder than on demand especially when you can't estimate demand (which I'm sure this was hard to do). People are definitely not understanding that. Then there is that Netflix is well regarded for their engineering not quite to the point of snobbery.
What is actually interesting to me is that they went for an event like this which is very hard to predict as one of their first major forays into live, instead of something that's a lot easier to predict like a baseball game / NFL game.
I have to wonder if part of the NFL allowing Netflix to do the Christmas games was them proving out they could handle live streams at least a month before. The NFL seems to be quite particular (in a good way) about the quality of the delivery of their content so I wouldn't put it past them.
Netflix’s snobbery of engineering is so exhausting. Then seeing them be unable to fix this problem after several previous streaming failures is a bit rich.
To me it speaks to how most of the top tech companies of the 2010s have degraded as of late. I see it all the time with Google hiring some of the lower performing engineers on my teams because they crushed Leetcode.
> The NFL seems to be quite particular (in a good way) about the quality of the delivery of their content
Alas, my experience with the NFL in the UK does not reflect that. DAZN have the rights to stream NFL games here, and there are aspects of their service that are very poor. My major, long-standing issue has been the editing of their full game “ad-free” replays - it is common for chunks of play to be cut out, including touchdowns and field goals. Repeated complaints to DAZN haven’t resulted in any improvements. I can’t help but think that if the NFL was serious about the quality of their offering, they’d be knocking heads together at DAZN to fix this.
Aside from latency (which isn't much of a problem unless you are competing with TV or some other distribution system), it seems easier than on-demand, since you send the same data to everyone and don't need to handle having a potentially huge library in all datacenters (you have to distribute the data, but that's just like having an extra few users per server).
My guess is that the problem was simply that the number of people viewing Netflix at once in the US was much larger than usual and higher than what they could scale to, or alternatively a software bug was triggered.
I know nothing about boxing and this fight was just ridiculously impressive. I kept tuning out of the earlier fights. They felt like some sort of filler. I didn’t get the allure. But Taylor v Serrano was just obvious talent that even I could appreciate it.
What do you think were the dynamics of the engineering team working on this?
I'd think this isn't too crazy to stress test. If you have 300 million users signed up then your stress test should be 300 million simultaneous streams in HD for 4 hours. I just don't see how Netflix screws this up.
Maybe it was a management incompetence thing? Manager says something like "We only need to support 20 million simultaneous streams" and engineers implement to that spec even if the 20 million number is wildly incorrect.
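To get a feel for the size of the stress test proposed above, here is the arithmetic as a small Python sketch. The 5 Mbps HD bitrate is an assumption, not a figure from the thread. The 300 million streams and 4 hours come from the comment.

```python
def aggregate_bandwidth_tbps(streams: int, stream_mbps: float) -> float:
    """Total bandwidth in terabits per second for N concurrent streams."""
    return streams * stream_mbps / 1_000_000


def total_data_petabytes(streams: int, stream_mbps: float, hours: float) -> float:
    """Total data transferred in petabytes (decimal units)."""
    bits = streams * stream_mbps * 1e6 * hours * 3600
    return bits / 8 / 1e15


if __name__ == "__main__":
    # 5 Mbps is an assumed HD bitrate, used only for scale.
    print(aggregate_bandwidth_tbps(300_000_000, 5), "Tbps sustained")
    print(total_data_petabytes(300_000_000, 5, 4), "PB over 4 hours")
```

Under those assumptions the test needs about 1,500 Tbps sustained and moves roughly 2,700 PB, so a literal full-scale rehearsal is a serious undertaking in its own right.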
Reading the comments here, I think one thing that's overlooked is that Netflix, which has been on the vanguard of web-tech and has solved many complicated problems in-house, may not have had the culture to internally admit that they needed outside help to tackle this problem.
Main event hasn’t even started yet. Traffic will probably 10x for that. They’re screwed. Should have picked something lower profile to get started with live streaming.
When you step back and look at the situation, it's not hard to see why Netflix dropped the ball here. Here's how I see it (not affiliated with Netflix, pure speculation):
- Months ago, the "higher ups" at Netflix struck a deal to stream the fight on Netflix. The exec that signed the deal was probably over the moon because it would get Netflix into a brand new space and bring in large audience numbers. Along the way the individuals were probably told that Netflix doesn't do livestreaming but they ignored it and assumed their talented Engineers could pull it off.
- Once the deal was signed then it became the Engineer's problem. They now had to figure out how to shift their infrastructure to a whole new set of assumptions around live events that you don't really have to think about when streaming static content.
- Engineering probably did their absolute best to pull this off but they had two main disadvantages, first off they don't have any of the institutional knowledge about live streaming and they don't really know how to predict demand for something like this. In the end they probably beefed up livestreaming as much as they could but still didn't go far enough because again, no one there really knows how something like this will pan out.
- Evening started off fine but crap hit the fan later in the show as more people tuned in for the main card. Engineering probably did their best to mitigate this but again, since they don't have the institutional knowledge of live events, they were shooting in the dark hoping their fixes would stick.
Yes Netflix as a whole screwed this one up but I'm tempted to give them more grace than usual here. First off the deal that they struck was probably one they couldn't ignore and as for Engineering, I think those guys did the freaking best they could given their situation and lack of institutional knowledge. This is just a classic case of biting off more than one can chew, even if you're an SV heavyweight.
This isn't Netflix's first foray into livestreaming. They tried a livestream last year for a reunion episode of one of their reality TV shows which encountered similar issues [0]. Netflix already has a contract to livestream a football event on Christmas, so it'll be interesting to see if their engineers are able to get anything done in a little over a month.
These failures reflect very poorly on Netflix leadership. But we all know that leadership is never held accountable for their failures. Whoever is responsible for this should at least come forward and put out an apology while owning up to their mistakes.
It’s insane the excuses being made here for Netflix’s apparently unique circumstances.
They failed. Full stop. There is no valid technical reason they couldn’t have had a smooth experience. There are numerous people with experience building these systems they could have hired and listened to. It isn’t a novel problem.
Here are the other companies that are peers that livestream just fine, ignoring traditional broadcasters:
- Google (YouTube live), millions of concurrent viewers
- Amazon (Thursday Night Football, Twitch), millions of concurrent viewers
- Apple (MLS)
NBC live streamed the Olympics in the US for tens of millions.
As a cofounder of a CDN company that pushed a lot of traffic, the problem with live streaming is that you need to propagate peak viewership through a loooot of different providers. The peering/connectivity deals are usually not structured for peak capacity that is many times over the normal 95th percentile. You can provision more connectivity, but you don't know how many will want to see the event.
Also, live events can be trickier than stored files, because you can't offload to the edges beforehand to warm up the caches.
So Netflix had 2 factors outside of their control
- unknown viewership
- unknown peak capacities outside their own networks
Both are solvable, but if you serve "saved" content you optimize for different use case than live streaming.
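The gap between "normal 95th percentile" and event peak mentioned above can be shown with a small Python sketch. The traffic samples are invented, and the nearest-rank percentile is just one common way to compute it, not necessarily what any given provider bills on.

```python
def percentile_95(samples: list[float]) -> float:
    """95th percentile using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Integer ceiling of 0.95 * n, avoiding float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical traffic samples in Gbps: 97 ordinary intervals,
    # then a short live-event spike.
    samples = [10.0] * 97 + [80.0, 90.0, 100.0]
    p95 = percentile_95(samples)
    peak = max(samples)
    print(f"p95 = {p95} Gbps, peak = {peak} Gbps, ratio = {peak / p95:.0f}x")
```

A short spike barely moves the 95th percentile, so a link sized around that number can look comfortably provisioned right up until the event starts.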
I don't disagree that Netflix could have / should have done better. But everybody screws these things up. Even broadcast TV screws these things up.
Live events are difficult.
I'll also add on, that the other things you've listed are generally multiple simultaneous events; when 100M people are watching the same thing at the same time, they all need a lot more bitrate at the same time when there's a smoke effect as Tyson is walking into the ring; so it gets mushy for everyone. IMHO, someone on the event production staff should have an eye for what effects won't compress well and try to steer away from those, but that might not be realistic.
I did get an audio dropout at that point that didn't self correct, which is definitely a should have done better.
I also had a couple of frames of block color content here and there in the penultimate bout. I've seen this kind of stuff on lots of hockey broadcasts (streams or ota), and I wish it wouldn't happen... I didn't notice anything like that in the main event though.
Experience would likely be worse if there were significant bandwidth constraints between Netflix and your player, of course. I'd love to see a report from Netflix about what they noticed / what they did to try to avoid those, but there's a lot outside Netflix's control there.
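The point above about everyone needing more bitrate at the same moment can be shown with a toy model. The bitrate pattern and viewer count are invented, and on-demand viewers are assumed to be spread evenly across the content, which is a simplification.

```python
def peak_demand(pattern: list[float], viewers: int, live: bool) -> float:
    """Peak aggregate bitrate (Mbps) across one pass over the pattern.

    pattern[t] is the bitrate one viewer needs at content position t.
    Live viewers all sit at the same position; on-demand viewers are
    modelled as spread evenly across positions.
    """
    n = len(pattern)
    totals = []
    for t in range(n):
        if live:
            totals.append(viewers * pattern[t])
        else:
            totals.append(sum(pattern[(t + v) % n] for v in range(viewers)))
    return max(totals)


if __name__ == "__main__":
    # Hypothetical: three easy seconds at 4 Mbps, one smoke-filled one at 12.
    pattern = [4.0, 4.0, 4.0, 12.0]
    print("live peak:     ", peak_demand(pattern, 100, live=True), "Mbps")
    print("on-demand peak:", peak_demand(pattern, 100, live=False), "Mbps")
```

With these made-up numbers the live peak is double the on-demand peak for the same audience, because the expensive second hits every viewer at once instead of a quarter of them.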
Amazon had their fair share of livestream failures, and for notably fewer viewers. I don't think they deserve a spot on that list. I briefly worked in streaming media for sports and while it's not a novel problem, there are so many moving parts and points of failure that it can easily all go badly.
It's not full stop. There are reasons why they failed, and for many it's useful and entertaining to dissect them. This is not "making excuses" and does not get in the way of you, apparently, prioritizing making a moral judgment.
Live streaming is hard. Most companies that do live streaming at 2024 scale got there by learning from their mistakes. This is true for Hotstar, Amazon and even YouTube. Netflix's stack is made to stream optimised, compressed, cached videos with a manageable number of concurrent viewers for the same video. Here we had ~65m concurrent viewers in their first live event. The compression and distribution they use have not scaled up well. I'll judge them based on how they handle their next live event.
I don't think this is their first live event. They have hosted a pro golf promotional match and they had a live pro tennis match between Nadal and Alcaraz off the top of my head.
It will never not annoy and amuse me that illegal options (presumably run by randoms in their spare time) are so much better than the offerings of big companies and their tech ‘talent’.
I have Netflix purchased legally with hard earned money. But because I had issues I looked for illegal streams, and they were bad, crashes, buffering.. you name it. So I went back to Netflix and watched it at 140p quality.
> But the real indicator of how much Sunday’s screw-up ends up hurting Netflix will be the success or failure of its next live program—and the next one, and the one after that, and so on. There’s no longer any room for error. Because, like the newly minted spouses of Love Is Blind, a streaming service can never stop working to justify its subscribers’ love. Now, Netflix has a lot of broken trust to rebuild.
Weird that an organization like Netflix is having problems with this considering their depth of both experience and pockets. I wonder if they didn't expect the number of people who were interested in finding out what the pay-per-view experience is like without spending any extra money. Still, I suppose we can all be thankful Netflix is getting to cut their live event teeth on "alleged rapist vs convicted rapist" instead of something more important.
From my experience, it works if you're not watching it 'live'. But the moment I put my devices to 'live' it perma-breaks. 504 gateway timed out in web developer tools hitting my local CDN. Probably works on some CDNs, doesn't on others. Probably works if you're not 'live'.
edit: literally a nginx gateway timed out screen if you view the response from the cdn... wow
I've been re-watching Silicon Valley the last few weeks and just watched the Nucleus live stream episode 2 days ago, pretty funny seeing it in real life.
This is probably a naive question but very relevant to what we have here.
In a protocol where an oft-repeated request goes through multiple intermediaries, usually every intermediary will be able to cache the response for common queries (e.g. DNS).
In theory, ISPs would be able to do the same with HTTP, although I am not aware of anyone doing so (since it would rightfully raise concerns about privacy and tampering).
Now TLS (or other encryption) will break this abstraction. Every user, even if they request a live stream, receives a differently encrypted response.
But a live stream of a popular boxing match doesn't need the "confidentiality" of an encryption protocol, only the integrity.
Do we have a protocol which allows downstream intermediaries, e.g. ISPs, to cache the content of the stream based on demand, while a digital signature or other attestation is still cryptographically verified by the client?
There's Named Data Networking (it went by Content-Centric Networking earlier). You request data, not a URL, and the pipe/path becomes the CDN. If any of your nearest routers have the bytes, your request will go no further.
I don't see it much mentioned the last few years, but the research groups have ongoing publications. There's an old 2006 Van Jacobson video that is a nice intro.
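As a rough illustration of the "integrity without confidentiality" idea in this subthread (not how NDN or any deployed protocol actually does it): the origin could publish a small manifest of segment digests, and the client could then accept segment bytes from any untrusted cache as long as the digest matches. The function names are hypothetical, and the sketch assumes the manifest itself reaches the client authentically (signed, or fetched over TLS), which is not shown.

```python
import hashlib


def build_manifest(segments):
    """Origin side: list the SHA-256 digest of each segment, in order."""
    return [hashlib.sha256(seg).hexdigest() for seg in segments]


def verify_segment(manifest, index, data):
    """Client side: accept bytes from any cache only if the digest matches."""
    return hashlib.sha256(data).hexdigest() == manifest[index]


# Hypothetical segments; in practice these would be a few seconds of video each
segments = [b"segment-0-bytes", b"segment-1-bytes"]
manifest = build_manifest(segments)

print(verify_segment(manifest, 0, b"segment-0-bytes"))  # bytes served by an ISP cache
print(verify_segment(manifest, 1, b"tampered-bytes"))   # bytes altered in transit
```

The segments can then be identical for every viewer and cached anywhere on the path; only the tiny manifest needs a trusted channel.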
I guarantee this is a management issue. Somebody needed to bear down at some point and put the resources into load testing. The engineers told them it probably won't be sufficient.
I assume this came down to some technical manager saying they didn't have the human and server resources for the project to work smoothly and a VP or something saying "well, just do the best you can.. surely it will be at least a little better than last time we tried something live, right?"
I think there should be a $20 million class action lawsuit, which should be settled as automatic refunds for everyone who streamed the fight. And two executives should get fired.
At least.. that's how it would be if there was any justice in the world. But we now know there isn't -- as evidenced by the fact that Jake Paul's head is still firmly attached to his body.
I am curious about their live streaming infrastructure.
I have done live streaming for around 100k concurrent users. I didn't set up the infrastructure myself because it was the CloudFront CDN.
Why is it hard for Netflix? They have already figured out the CDN part, so it should not be a problem even at 1M or 100M, because their CDN infrastructure is already handling the load.
I have only worked with HLS live streaming, where the playlist is constantly changing compared to VOD. Live video chunks work the same as VOD. CloudFront also has a request collapsing feature that greatly helps live streaming.
So, my question is: if Netflix has already figured out the CDN, why is their live infrastructure failing?
Note: I am not saying my 100k is same scaling as their 100M. I am curious about which part is the bottleneck.
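For anyone who hasn't looked at HLS, here is a minimal sketch of what "the playlist is constantly changing" means. A live media playlist is a sliding window of recent segments with no end marker, so every player keeps re-fetching it. The tags are standard HLS; the segment names, window size and 6-second duration are assumptions for illustration.

```python
def live_playlist(first_sequence, segment_names, target_duration=6):
    """Render a sliding-window HLS media playlist for a live stream."""
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{target_duration}",
        # Sequence number of the first segment still listed in the window
        f"#EXT-X-MEDIA-SEQUENCE:{first_sequence}",
    ]
    for name in segment_names:
        lines.append(f"#EXTINF:{target_duration:.1f},")
        lines.append(name)
    # No end-of-list tag is written: its absence tells players to keep reloading.
    return "\n".join(lines) + "\n"


# Two consecutive refreshes: the window slides forward by one segment
print(live_playlist(100, ["seg100.ts", "seg101.ts", "seg102.ts"]))
print(live_playlist(101, ["seg101.ts", "seg102.ts", "seg103.ts"]))
```

The segments themselves are cacheable like VOD chunks, but the playlist has to stay fresh for every viewer at once, which is one place where live load looks different from VOD load.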
> Why is it hard for Netflix? They have already figured out the CDN part, so it should not be a problem even at 1M or 100M, because their CDN infrastructure is already handling the load ... Note: I am not saying my 100k is same scaling as their 100M. I am curious about which part is the bottleneck.
100k concurrents is a completely different game compared to 10 million or 100 million. 100k concurrents might translate to 200Gbps globally for 1080p, whereas for that same quality, you might be talking 20Tbps for 10 million streams. 100k concurrents is also a size such that you could theoretically handle it on a small single-digit number of servers, if not for latency.
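The arithmetic behind those figures, as a quick sketch. The 2 Mbps per-stream rate is not stated anywhere; it is just what 200Gbps divided by 100k viewers implies, and real per-stream bitrates vary with codec, resolution and content.

```python
def aggregate_tbps(concurrent_viewers, per_stream_mbps):
    """Total egress in Tbps if every viewer pulls one stream at the given bitrate."""
    return concurrent_viewers * per_stream_mbps / 1_000_000


# Assumption: implied by the 200 Gbps / 100k viewers figure, not a measured bitrate
PER_STREAM_MBPS = 2.0

for viewers in (100_000, 10_000_000, 100_000_000):
    print(viewers, aggregate_tbps(viewers, PER_STREAM_MBPS), "Tbps")
```

Same per-viewer cost at every scale; it is the total that moves from "fits in existing headroom" to "needs new hardware and contracts".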
> CloudFront also has a request collapsing feature that greatly helps live streaming.
I don't know how much request coalescing Netflix does in practice (or how good their implementation is). They haven't needed it historically, since for SVOD, they could rely on cache preplacement off-peak. But for live, you essentially need a pull-through cache for the sake of origin offload. If you're not careful, your origin can be quickly overwhelmed. Or your backbone if you've historically relied too heavily on your caches' effectiveness, or likewise your peering for that same reason.
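For readers unfamiliar with the term, here is a toy sketch of request coalescing in a pull-through cache: when many viewers ask for the same not-yet-cached segment at once, only one request goes to the origin and everyone else waits for that result. This is a single-process illustration with hypothetical names, not a description of how Netflix or CloudFront implement it.

```python
import threading


class CoalescingCache:
    """Pull-through cache that lets only one caller fetch a missing key from origin."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._lock = threading.Lock()
        self._cache = {}
        self._inflight = {}  # key -> Event that is set when the origin fetch finishes

    def get(self, key):
        with self._lock:
            if key in self._cache:
                return self._cache[key]
            event = self._inflight.get(key)
            leader = event is None
            if leader:
                # First caller for this key becomes the leader and will hit origin
                event = threading.Event()
                self._inflight[key] = event
        if leader:
            try:
                value = self._fetch(key)
                with self._lock:
                    self._cache[key] = value
            finally:
                with self._lock:
                    del self._inflight[key]
                event.set()  # wake followers whether the fetch succeeded or failed
            return value
        # Followers wait for the leader instead of sending their own origin request
        event.wait()
        with self._lock:
            if key in self._cache:
                return self._cache[key]
        return self.get(key)  # the leader's fetch failed, so try again
```

Without the in-flight table, a thousand simultaneous cache misses on a brand-new live segment become a thousand origin requests, which is the "origin can be quickly overwhelmed" case.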
200Gbps is a small enough volume that you don't really need to provision for that explicitly; 20Tbps or 200Tbps may need months if not years of lead time to land the physical hardware augments, sign additional contracts for space and power, work with partners, etc.
The first round of the fight just finished, and the issues seem to be resolved, hopefully for good. All this to say what others have noted already, this experience does not evoke a lot of confidence in Netflix's live-streaming infrastructure.
Yes, it was utterly boring, but they made their money. I don't like either Paul brother, so I only watched in hopes shorter, much-older Tyson would make Jake look as foolish as he is.
A friend and I, in separate states, found that it wouldn’t stream from TVs, Roku, etc. but would stream from mobile. And for me, using a mobile hotspot to a laptop; though that implies checking IP address range instead of just user-agent, so that seems unlikely.
Anyway, I wouldn’t be surprised if they were prioritizing mobile traffic because it’s more forgiving of shitty bitrate.
On a tangential note, the match totally looked fixed to me - Tyson was barely throwing any punches. I understand age is not on his side, but he looked plenty spry when he was ducking, weaving and dodging. It seemed to me he could have done better in terms of attacking as well.
I would argue Tyson has a shorter reach, Jake was whiffing a lot of superman punches, and all that does is waste energy. Jake might be able to throw punches, but he clearly wasn't interested in taking them. If they stood closer and slugged it out, the fight could have gone either way.
Yeah the biggest thing to me, and the commentators mentioned this as well, his legs looked REALLY wobbly.
All your attacking power comes from your legs and hips, so if his legs weren’t stable he didn’t have much attacking power.
I think he gave it everything he had in rounds 1 and 2. Unfortunately, I just don’t think it was ever going to be enough against a moderately trained 27 year old.
It probably depends more on the ISP than on Netflix. Engineers over in my ISP’s subreddit are talking about how flows from Netflix jumped by over 450Gb/s and it was no big deal because it wasn’t enough to cause any congestion on any of their backbone links.
I think they must be noticing the issues, because I've noticed they've been dropping the stream quality quite substantially... It's a clever trick, but kind of cheap to do so, because who wants to watch pixelated things?
To be brutally honest if it’s a choice between pixelated and constantly buffering, pixelated is way less bad. Constantly buffering is incredibly annoying during live sports. (but this doesn’t negate your main point which is that if people paid to watch they expect decent resolution)
I wrote an analysis on doing this kind of unicast streaming in cable networks a decade ago. For edge networks with reasonable 100gig distribution as their standard, these would see some of the minor buffering issues.
There is a reason that cable doesn’t stream unicast and uses multicast and QAM on a wire. We’ve just about hit the point where this kind of scale unicast streaming is feasible for a live event without introducing a lot of latency. Some edge networks (especially without local cache nodes) just simply would not have enough capacity, whether in the core or peering edge, to do the trick.
Saw an Arista presentation about the increase in SFP capacity; it's Moore's law style stuff. Arm based kit gets a shockingly efficient number of streams per watt too.
I can't see traditional DVB/ATSC surviving much beyond 2040 even accounting for the long tail.
You're right that large scale parallel live streaming has only become feasible in the last few years. The BBC has shared some insights into how it had to change its approach to scale to 10 million in 2021, having had technical issues in the 3 million range in 2018.
Personally I don't think the latency is solved yet -- TV is slow enough (about 10 seconds from camera to TV), but IP streaming tends to add another 20-40 seconds on top of that.
That's no good when you're watching the penalties. Not only will your neighbours be cheering before you as they watch on normal TV, but even if you're both on the same IPTV you may well see 5 seconds of difference.
The total end-to-end time is important too: with 30 seconds, the news push notifications, tweets, etc. on your phone will come in before you see the result.
Yes, as I have said again and again on hacker news in different comments Netflix went overboard with their microservices and tried to position itself as a technological company when it's not. It has made everything more complex and that's why any Netflix tech blog is useless because it is not the way to build things correctly.
To understand how to do things correctly look at something like pornhub who handle more scale than Netflix without crying about it.
The other day I was having this discussion with somebody who was saying distributed counter logic is hard and I was telling them that you don't even need it if Netflix didn't go completely mental on the microservices and complexity.
You would think, but technology always finds a way to screw things up. Cox Communications has had ongoing issues with their video for weeks because of Juniper router upgrades and even the vendor can't fix it. They found this out AFTER they put it in production. Shit happens.
Does anyone have any thoughts besides "bad engineering" on what could've gone wrong? It seems like taking on a new endeavor like streaming an event that would possibly draw many hundreds of millions of viewers doesn't make sense. Is there any obvious way that this would just work, or is there obviously a huge mistake deeply rooted in the whole thing. Also, are there any educated guesses on some fine details in the codebase and patterns that could result in this?
I don't understand why the media is pushing this Jake Paul vs Mike Tyson stuff so hard and why people care about it. Boxing is crude entertainment for low intelligence people.
I'm tired of all this junk entertainment which only serves to give people second-hand emotions that they can't feel for themselves in real life. It's like, some people can't get sex so they watch porn. People can't fight so they watch boxing. People can't win in real life so they play video games or watch superhero movies.
Many people these days have to live vicariously through random people/entities; watch others live the life they wished they had and then they idolize these people who get to have everything... As if these people were an intimate projection of themselves... When, in fact, they couldn't be more different. It's like rooting for your opponent and thinking you're on the same team; when, in fact, they don't even know that you exist and they couldn't be more different from you.
You're no Marvel superhero no matter how many comic books you own. The heroes you follow have nothing to do with you. Choose different heroes who are more like you. Or better; do something about your life and give yourself a reason to idolize yourself.
Mine is glitchy, but if I refresh I get a good stream for a bit, then it gets low res, then it freezes. If I wait for auto-reconnect it takes forever. Hard refresh and I'm good. Like, new streams go to a new server, then it gets overloaded, then it acts as if their cluster is crashing and healing in rapid cycles. Sawtooth patterns on their charts.
And then all these sessions lag, or orphan taking up space, so many reconnections at various points in the stream.
System getting hammered. Can't wait for this writeup.
Hopefully they fix it because they are hosting two Christmas NFL games this year and if you want to really piss people off you have buffering issues during NFL games lol.
I believe HN's algorithm tends to relatively downrank stories with a high comment-to-upvote ratio, because they are more often flamewars on divisive topics.
The arrogant Netflix! They always brag about how technologically superior they are, and they can't handle a simple technological challenge! I didn't have a buffering issue, I had an error page - for hours! Yet, they kept advertising the boxing match to me! What a joke! If you can't stream it, don't advertise it to save face with people like me who don't care about boxing!
Every organization makes mistakes and every organization has outages. Netflix is no different. Instead of bashing them because they are imperfect, you might want to ask what you can learn from this incident. What would you do if your service received more traffic than expected? How would you test your service so you can be confident it will stay up?
Also, I have never seen any Netflix employees who are arrogant or who think they are superior to other people. What I have seen is Netflix's engineering organization frequently describes the technical challenges they face and discusses how they solve them.
I think you’re oversimplifying it. Live event streaming is very different from movie streaming. All those edge cache servers become kinda useless and you start hitting peering bottlenecks.
After a few buffering timeouts during the first match, the rest of the event had no technical difficulties (in SoCal, so close to one of Netflix's HQs).
Unfortunately, except for the women's match, the fights were pretty lame...4 of the 6 male boxers were out of shape. Paul and Tyson were struggling to stay awake and if you were to tell me that Paul was just as old as Tyson I would have believed it.
Assuming Netflix used its extensive edge cache network to distribute the streams to the ISPs. The software on the caching servers would have been updated to be capable of dealing with receiving and distributing live streamed content, even if maybe the hardware was not optimal for that (throughput vs latency is a classic networking tradeoff).
Now inside the ISP's network, again everything would probably be optimized for the 99.99% use case of the Netflix infra: delivering large bulk data that is not time sensitive. This means very large buffers to shift big gobs of packets in bulk.
As everything along the path is trying to fill up those buffers before shipping to the next router on the path, some endpoints aware this is a live stream start cancelling and asking for more recent frames ...
Why do they want to get into the live business? It doesn't seem to synergize with their infrastructure. Sending the same stream in real time to numerous people just isn't the same task as letting people stream optimized artifacts that are prepositioned at the edge of the network.
Most PPV is what, $50-$70? So subscribing to Netflix for $20 or whatever per month sounds like a bargain for anyone who is interested and not already a customer. Then assume some large percentage doesn’t cancel either because they forgot, or because they started watching a show and then decided to keep paying.
Live is the only thing that won’t be commodified entirely. “Anyone” can pump out stream-when-you-want TV shows. Live events are generally exclusive, unpredictable, and cultural moments.
It was so bad. So so bad. Like don’t use your customers as guinea pigs for live streaming. So lame. They need a new head of content delivery. You can’t charge customers like that and market a massive event and your tech is worse than what we had from live broadcast tv.
I watched on an AppleTV and the stream was rock solid.
I don’t know if it’s still the case, but in the past some devices worked better than others during peak times because they used different bandwidth providers. This was the battle between Comcast and Cogent and Netflix.
> I watched on an AppleTV and the stream was rock solid.
For me it was buffering and low resolution, on the current AppleTV model, hardwired, with a 1Gbps connection from AT&T. Some streaming devices may have handled whatever issues Netflix was having better than others, but this was clearly a bigger problem than just the streaming device.
I thought Netflix’s biggest advantage was the quality/salary of its engineers.
I think that every time I wait for Paramount+ to restart after it's gone black in picture-in-picture, and yet, I’m still on Paramount+ and not Netflix, so maybe that advantage isn’t real.
Sigh, none of the competitors are much better. Disney, who has more than enough cash to throw at streaming, is a near constant hassle for us ( after 3 or more episodes it throws an inscrutable error on Playstation ). I would drop it, but this is the only remaining streaming service and wife is not willing to drop it ( I guess until 1 it is one error per one episode ).
I think this was true at some point, but I’ve been disappointed in the quality of the OSS Netflix tools recently. I think before k8s and a plethora of other tools matured, they were way ahead of the curve.
I specifically found the Netflix suite for Spring very lacking, and found message oriented architectures on something like NATS a lot easier to work with.
I thought they did DSA interviews at Netflix, what happened? I had to watch the fight on someone streaming to X from their phone at the event and it was better than watching on Netflix... if you could watch at all. Extremely embarrassing!
My theory is they've so heavily optimized for static content and distributing content on edge nodes that they were probably poorly setup for live-streaming.
One similar crash I remember very well was CNN on 9/11 - I tried to connect from France but it was down the whole day.
Since then I am very used to it because our institutional web sites traditionally crash when there is a deadline (typically the taxes or school inscriptions).
As for that one, my son is studying in Europe (I am also in Europe), he called me desperate at 5 am or so to check if he is the only one with the problem (I am the 24/7 family support for anything plugged in). After having liberally insulted Netflix he realized he confirmed with his grandparents that he will be helping them at 10 :)
I never hear about it anymore. Is that because everyone wants to watch something different at their own time? Or is it actually working just fine now in the background? I see under the "Deployment" section it mentions IPTV in hotel rooms.
They should have partnered with every major CDN and load balanced across all of them. It’s ironic how we used to be better at broadcasting live events way back in the day versus today.
I watched the event last night and didn't get any buffering issues, but I did notice frequent drop in video quality when watching the live feed. If I backed the video up a bit, the video quality suddenly went back up to 4k.
I had some technical experience with live video streaming over 15 years ago. It was a nightmare back then. I guess live video is still difficult in 2024. But congrats to Jake Paul and boxing fans. It was a great event. And breaking the internet just adds more hype for the next one.
If you're going to be having intense algorithm interviews, paying top dollar for only hiring senior engineers, building high intensity and extreme distributed systems and having SRE engineers, we best see insanely good results and a high ROI out of it.
All of the conditions were perfect for Netflix, and it seems that the platform entirely flopped.
Is this what chaos engineering is all about that Netflix was marketing heavily to engineers? Was the livestream supposed to go down as Netflix removed servers randomly?
It seemed to be some capacity issue with the CDNs. When I stopped and restarted the stream it worked again. Perhaps they do not use real time multi-cdn switching.
What a massive blow to NFLX. They have been in the streaming game for years (survived COVID-19) and this silly exhibition match is what does them in?
I didn’t watch it live (boxing has lost its allure for me) but vicariously lived through it via social feed on Bluesky/Mastodon.
Billions of dollars at their disposal and they just can’t get it right. Probably laid off the highly paid engineers and teams that made their shit work.
Honestly you didn't miss much, every (real) boxing fan thought of this as a disgrace and a shame when announced. putting a 58 year old Tyson against a crackhead filled with steroids (Jake Paul) ? Either case it would have been a shame on Jake Paul for even getting in the ring with such an old boxer.
In boxing you are old by 32 or maybe 35 year old for heavy weight, and everything goes down very very fast.
I see in the comments multiple people talking about how "cable" companies who have migrated to IPTV have solved this problem.
I'd disagree.
I'm on IPTV and any major sporting event (World Series, Super Bowl, etc) is horrible buffering when I try to watch on my 4K IPTV (streaming) channel. I always have to downgrade to the HD channel and I still occasionally experience buffering.
Technical issues happen, but I wish they would've put up a YouTube stream or something (or at least asked YouTube to stop taking down the indie streams that were popping up). It seems like basically their duty to the boxers and the fans to do everything in their power to let the match be seen live, even if it means eating crow and using another platform.
On X.com someone had a stream that was stable to at least 5 million simultaneous viewers, but then (as I expected) someone at Netflix got them to pull the plug on it. So I would expect this fight to have say, 50 million + watching? Maybe as many as 150-250 million worldwide, given this is Tyson's last fight.
We all know Netflix was built for static content, but it's still hilarious that they have thousands of engineers making 500k-1M in total comp and they couldn't live stream a basic broadcast. You probably could have just run this on AWS with a CDK configuration and a quota increase from Amazon.
I'm sure the architecture and scale of NetFlix's operations is truly impressive, but stories like this make me further appreciate the elegant simplicity of scalability of analogue terrestrial TV, and to a similar extent, digital terrestrial TV and satellite.
All these engineering blog posts, distributed systems and these complex micro-services clearly didn't help with this issue.
Netflix is clearly not designed nor prepared for scalable multi-region live-streaming, no matter the amount of 'senior' engineers they throw at the problem.
Woke up at 4am (EU here), to tune for the main event. Bought Netflix just for this. The women fight went good, no buffering, 4K.
As it approached the time for Paul vs Tyson, it started to first drop to 140p, and then constantly buffer. Restarted my chromecast a few times, tried from laptop, and finally caught a stream on my mobile phone via mobile network rather than my wifi.
The TV Netflix kept blaming my internet which kept coming back as “fast”.
Ended up watching the utterly disappointing, senior abuse, live stream on my mobile phone with 360p quality.
Gonna cancel Netflix and never pay for it again, nor watch hyped up boxing matches.
I thought it's only the best of the best of the best working at Netflix ... or maybe we can just put this myth to sleep that Netflix even knows what it's doing. The suggestions are shit, the UX is shit, apparently even the back end sucks.
I ended up turning my TV off and watching from my phone because of the buffering/freezing. The audio would continue to play and the screen would be frozen with a loading percentage that never changed.
I have Spectrum (600 Mbps) for ISP and Verizon for mobile.
Did anyone else see different behaviour with different clients? My TV failed on 25% loaded, my laptop loaded but played for a minute or two before getting stuck buffering, and my iphone played the whole fight fine. All on the same wifi network.
Internet live streaming is harder than cable or satellite TV broadcasting to "dumb" TV boxes. They should not have used the internet for this, honestly. A TV signal can go to millions live.
I would have just made it simple: delay the live stream a few seconds and encode it into the same bucket where users are already playing static movies. Just have the player only allow starting at the time everyone is at.
From my limited understanding, Netflix heavily depends on its Open Connect platform to push media to edge locations, which is different from live streaming. Probably, they over-pushed the HD content.
I'm watching the event as I'm writing this. I've been needing to exit the player and resume it constantly. Pretty surprising that Netflix hasn't weeded out these bugs.
All your corporate culture, comp structure, interview process etc etc is all so much meta if you can’t deliver. They showed they can’t deliver. Huge let down.
This livestream broke the internet, no joke. youtube was barely loading and a bunch of other sites too. 130M is a conservative number given all the pirate streams.
I did some VPN hopping and connecting to an endpoint in Dallas has allowed me to start watching again. Not live though, that throws me back into buffering hell.
I think this is a result of most software "engineering" having become a self-licking ice cream cone. Besides mere scaling, the techniques and infrastructure should be mostly squared away.
Yes, it's all complicated, but I don't think we should excuse ourselves when we objectively fail at what we do. I'm not saying that Netflix developers are bad people, but that it doesn't matter how hard of a job it is; it was their job and what they did was inadequate to say the least.
Every time a big company screws up, there are two highly informed sets of people who are guaranteed to be lurking, but rarely post, in a thread like this:
1) those directly involved with the incident, or employees of the same company. They have too much to lose by circumventing the PR machine.
2) people at similar companies who operate similar systems with similar scale and risks. Those people know how hard this is and aren’t likely to publicly flog someone doing their same job based on uninformed speculation. They know their own systems are Byzantine and don’t look like what random onlookers think it would look like.
So that leaves the rest, who offer insights based on how stuff works at a small scale, or better yet, pronouncements rooted in “first principles.”
I've noticed this amongst the newer "careerist" sort of software developer who is stumbling into the field for money, as opposed to the obsessive computer geek of yesteryear, who practiced it as a hobby. This character archetype is a transplant, say, less than five years ago from another, often non-technical discipline, and was taught or learned from overly simplistic materials that decry systems programming, or networking, or computer science concepts as unnecessary, impractical skills, reducing everything to writing JavaScript glue code between random NPM packages found on google.
Especially in a time where the gates have come crashing down to pronouncements of, "now anybody can learn to code by just using LLMs," there is a shocking tendency to overly simplify and then pontificate upon what are actually bewilderingly complicated systems wrapped up in interfaces, packages, and layers of abstraction that hide away that underlying complexity.
It reminds me of those quantum woo people, or movies like What the Bleep Do We Know!? where a bunch of quacks with no actual background in quantum physics or science reason forth from drastically oversimplified, mathematics-free models of those theories and into utterly absurd conclusions.
Completely agreed. There are also former employees who have very educated opinions about what is likely going on, but between NDAs and whatnot there is only so much they are willing to say. It is frustrating for those in the know, but there are lines they can't or won't cross.
Whenever an HN thread covers subjects where I have direct professional experience I have to bite my tongue while people who have no clue can be as assertive and confidently incorrect as their ego allows them to be.
Right? A common complaint by outsiders is that Netflix uses microservices. I'd love to hear exactly how a monolith application is guaranteed to perform better, with details. What is the magic difference that would have ensured the live stream would have been successful?
The only time I worked on a project that had a live television launch, it absolutely tipped over within like 2 minutes, and people on HN and Reddit were making fun of it. And I know how hard everyone worked, and how competent they were, so I sympathize with the people in these cases. While the internet was teeing off with easy jokes, engineers were swarming on a problem that was just not resolving, PMs were pacing up and down the hallway, people were getting yelled at by leadership, etc. It's like taking all the stress and complexity of a product launch and multiplying it by 100. And the thing I'm talking about was just a website, not even a live video stream.
You are basically saying, everybody who criticizes Netflix now has no clue.
That’s a bold claim given that people with inside knowledge could post here without disclosing they are insiders.
Is that some kind of No True Scotsman?
3) the people supplying 1) and 2) with tools (hard- or software)
We (yep) don't know the exact details, but we do get sent snapshots of full configs and deployments to debug things... we might not see exact load patterns, but it's enough to know. And of course we can't tell due to NDAs.
you are so right about that. tho I'm sure that many of the netflix folks are still doing their after action analysis in prep for Dec 25 NFL.
now take this realization and apply it to any news article or forum post you read and think about how uninformed they actually are.
I'm sure 2) can post. But it won't be popular, so you'll need to dig to find it.
Most people are consumers and at the end of the day, their ability to consume a (boring) match was disrupted. If this was PPV (I don't think it is) they paid extra to not get the quality of product they expected. I'm not surprised they dominate the conversation.
And nonetheless, it freezes up.
You don’t belong to either group. What does this make you?
> who offer insights based on how stuff works at a small scale, or better yet, pronouncements rooted in “first principles.”
And looking through the comments, this is just wrong.
[flagged]
For an event like this, there already exists an architecture that can handle boundless scale: torrents.
If you code it to utilize high-bandwidth users upload, the service becomes more available as more users are watching -- not less available.
It becomes less expensive with scale, more available, more stable.
To be more specific, if you encode the video in blocks with each new block hash being broadcast across the network, just managing the overhead of the block order, it should be pretty easy to stream video with boundless scale using a DHT.
Could even give high-bandwidth users a credit based upon how much bandwidth they share.
With a network like what Netflix already has, the seed-boxes would guarantee stability. There would be very little delay for realtime streams, I'd imagine 5 seconds top. This sort of architecture would handle planet-scale streams for breakfast on top of the already existing mechanism.
But then again, I don't get paid $500k+ at a large corp to serve planet scale content, so what do I know.
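To make the block-hash idea in the comment above concrete, here is a minimal Python sketch. It is not a torrent client or a DHT, and it says nothing about whether this would hold up at Netflix's scale. It only shows how a broadcaster could announce an ordered chain of block hashes so that viewers can check and reorder blocks fetched from untrusted peers. The function names `announce_blocks` and `verify_and_order` and the chain-hash format are assumptions made up for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def announce_blocks(chunks):
    """Build one (index, payload_hash, chain_hash) announcement per video chunk.

    chain_hash commits to the block index, the previous chain hash, and the
    payload hash, so the order of blocks cannot be changed without detection.
    """
    announcements = []
    prev_chain = ""
    for index, chunk in enumerate(chunks):
        payload_hash = sha256_hex(chunk)
        chain_hash = sha256_hex(f"{index}:{prev_chain}:{payload_hash}".encode())
        announcements.append((index, payload_hash, chain_hash))
        prev_chain = chain_hash
    return announcements


def verify_and_order(announcements, received):
    """Return chunks in stream order, or raise ValueError on any mismatch.

    `received` maps payload_hash -> bytes and may have been filled from any
    peer in any order (hypothetical peer layer, not shown here).
    """
    ordered = []
    prev_chain = ""
    for index, payload_hash, chain_hash in sorted(announcements):
        expected = sha256_hex(f"{index}:{prev_chain}:{payload_hash}".encode())
        if expected != chain_hash:
            raise ValueError(f"broken chain at block {index}")
        chunk = received.get(payload_hash)
        if chunk is None or sha256_hex(chunk) != payload_hash:
            raise ValueError(f"missing or corrupt block {index}")
        ordered.append(chunk)
        prev_chain = chain_hash
    return ordered


if __name__ == "__main__":
    chunks = [b"frame-group-0", b"frame-group-1", b"frame-group-2"]
    announcements = announce_blocks(chunks)
    # Simulate blocks arriving out of order from different peers.
    received = {sha256_hex(c): c for c in reversed(chunks)}
    print(verify_and_order(announcements, received) == chunks)
```

The sketch covers only integrity and ordering. Peer discovery, upload incentives, and latency, which the replies argue are the hard parts, are left out.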
The way to deal with this is to constantly do live events, and actually build organizational muscle. Not these massive one off events in an area the tech team has no experience in.
I have this argument a lot in tech.
We should always be doing (the thing we want to do)
Some examples that always get me in trouble (or at least big heated conversations)
1. Always be building: It does not matter if code was not changed, or there have been no PRs or whatever, build it. Something in your org or infra has likely changed. My argument is "I would rather have a build failure on software that is already released, than software I need to release".
2. Always be releasing: As before it does not matter if nothing changed, push out a release. Stress the system and make it go through the motions. I can't tell you how many times I have seen things fail to deploy simply because they have not attempted to do so in some long period of time.
There are more; I just don't have time to go into them. The point is, "if you did it, and will ever need to do it again in the future, then you need to do it continuously."
They've been doing live events since 2023. But it's hard to be prepared for something that's never been done by anyone before — a superbowl scale event, entirely viewed over the internet. The superbowl gets to offload to cable and over the air. Interestingly, I didn't have any problems with my stream. So it sounds like the bandwidth problems might be localized, perhaps by data center or ISP.
Agreed. This is a management failure, full stop. Unbelievable that they'd expect engineering to handle a single Livestream event of this magnitude.
> ...the tech team has no experience in
Unless Netflix eng decides to release a public postmortem, we can only speculate. In my time organizing small-time live streams, we always had up to 3 parallel "backup" streams (Vimeo, Cloudflare, Livestream). At Netflix's scale, I doubt they could simply summon any of these providers in, but I guess Akamai / Cloudflare would have been up for it.
The WWE is moving their programming to Netflix next year. If I were them, I'd be horrified at what I saw.
Sometimes this just isn't feasible for cost reasons.
A company I used to work for ran a few Super Bowl ads. The level of traffic you get during a Super Bowl ad is immense, and it all comes at you in 30 seconds, before going back to a steady-state value just as quickly. The scale pattern is like nothing else I've ever seen.
Super Bowl ads famously cost seven million dollars. These are things we simply can't repeat year over year, even if we believed it'd generate the same bump in recognition each time.
that’s difficult to reproduce at scale; there are only so many “super bowl” events in a calendar year
I think Netflix have a fair bit of organisational muscle, perhaps the fight was considered not as large of an event as the NFL streams would be in the future.
Also, "No experience in" really? You have no idea if that's really the case
Wow, building talent from within? I thought that went out of fashion. I think companies are too impatient to develop their employees.
Everyone here is talking like this is something unique Netflix had to deal with. Hotstar live streamed the India vs Pakistan cricket match with zero issues, with the all-time highest live viewership in the history of live telecasts. Why would viewers paying $20 a month want to think about their technical issues? They dropped the ball, pure and simple. The tech already exists for this; it's been done before, even by ESPN. Nothing new here.
The Independent reports 35m viewers of that cricket match [0].
Rolling Stone reported 120m for Tyson and Paul on Netflix [1].
These are very different numbers. 120m is Super Bowl territory. Could Hotstar handle 3-4 of those cricket matches at the same time without issue?
[0] https://www.the-independent.com/sport/cricket/india-pakistan...
[1] https://www.rollingstone.com/culture/culture-news/jake-paul-...
But that's exactly the point: Netflix didn't do this in a vacuum, they did it within Netflix.
It might just have been easier to start from scratch, maybe using an external partner experienced in live streaming, but the chances of that decision happening in a tech-heavy company such as Netflix that seems to pride itself on being an industry leader are close to zero.
> with zero issues
depending on whom you ask, the bitrate used by the stream is significantly lower than what is considered acceptable from free livestreaming services, that albeit stream to much, much smaller audience.
without splitting hairs, livestreaming was never their forte, and going live with degradation elsewhere is not a great look for our distributed computing champ.
Netflix is good only at streaming ready-made content, not live streaming, but:
1. Netflix is a 300B company, this isn't a resources issue.
2. This isn't the first time they have done live streaming at this scale either. They already have prior failure experience, you expect the 2nd time to be better, if not perfect.
3. There was plenty of time between the first massive live stream and the second, meaning plenty of time to learn and iterate.
The problem is that provisioning vast capacity for peak viewership is expensive and requires long-term commitment. Some providers won't give you more connectivity to their network unless you sign a 12 month deal where you prepay that.
Peak traffic is very expensive to run, because you're building capacity that will be empty/unused when the event ends. Who'd pay for that? That's why it's tricky and that's why Akamai charges these insane prices for live streaming.
A "public" secret is that the network layer is usually not redundant in your datacenter even if it's promised. To have a redundant network you'd need to double your investment, and it'll sit idle or at 50% of max capacity. For 2 hours of downtime per year when you restart the high-capacity routers, it's not cost efficient for most clients.
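The cost argument in this comment can be put into rough numbers. The sketch below computes how much of a link provisioned for peak is actually used over a year. Every figure in it (100 Gbps baseline, 400 Gbps peak, 4 peak hours per year) is a made-up assumption for illustration, not data from Netflix or any provider.

```python
HOURS_PER_YEAR = 365 * 24


def average_utilization(baseline_gbps, peak_gbps, peak_hours_per_year):
    """Fraction of peak-provisioned capacity actually used, averaged over a year."""
    if not 0 < baseline_gbps <= peak_gbps:
        raise ValueError("need 0 < baseline <= peak")
    if not 0 <= peak_hours_per_year <= HOURS_PER_YEAR:
        raise ValueError("peak hours must fit in one year")
    used = (baseline_gbps * (HOURS_PER_YEAR - peak_hours_per_year)
            + peak_gbps * peak_hours_per_year)
    provisioned = peak_gbps * HOURS_PER_YEAR
    return used / provisioned


if __name__ == "__main__":
    # Hypothetical: normal traffic is 100 Gbps, one 4-hour event needs 400 Gbps.
    share = average_utilization(100, 400, 4)
    print(f"average utilization: {share:.1%}")  # about 25%
```

Under these assumed numbers roughly three quarters of the paid-for capacity sits idle all year, which is the "who'd pay for that?" problem described above.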
Yea, the issue here isn't just that they're having issues, it's that they're having the same issues they've had before.
They have the NFL next month on Christmas day. So that'll be a big streaming session but I think it'll be nothing compared to this. Even Twitter was having problems handling the live pirate streams there.
Apple was clearly larger than Google when they came out with Apple Maps, and it was issue-laden for a long time. It is not a resource-issue, but a tech development maturity issue.
>They already have prior failure experience
What was the previous fail?
Yeah didn't they crash on love is blind or one of their reality shows recently-ish?
You can't solve your way out of a complex problem that you have created and which wasn't needed in the first place. The entire microservices thing was overly complex with zero benefits
I spoke to multiple Netflix senior technicians about this.
They said that's the whole shtick.
People just do not appreciate how many gotchas can pop up doing anything live. Sure, Netflix might have a great CDN that works great for their canned content and I could see how they might have assumed that's the hardest part.
Live has changed over the years from large satellite dishes beaming to a geosat and back down to the broadcast center($$$$$), to microwave to a more local broadcast center($$$$), to running dedicated fiber long haul back to a broadcast center($$$), to having a kit with multiple cell providers pushing a signal back to a broadcast center($$), to having a direct internet connection to a server accepting a live http stream($).
I'd be curious to know what their live plan was and what their redundant plan was.
You are making excuses for a multibillion dollar company that has been in this game for many years. Maybe the first to market in streaming.
This isn’t NFLX’s first rodeo in live streaming. Have seen a handful of events pop up in their apps.
There is no excuse. All of the resources and talent at their disposal, and they looked absolutely amateurish. Poor optics.
I would be amazed if they are able to secure another exclusive contract like this in the future.
This is the whole point of chaos engineering that was invented at Netflix, which tests the resiliency of these systems.
I guess we now know the limits of what "at scale" is for Netflix's live-streaming solution. They shouldn't be failing at scale on a huge stage like this.
I look forward to reading the post mortem about this.
Is multicast a thing on the commercial internet? Seems like that could help.
It is weird because this was a solved problem.
Every major network can broadcast the Super Bowl without issue.
And while Netflix claims it streamed to 280 million, that’s if every single subscriber viewed it.
Actual numbers put it in the 120 million range. Which is in line with the Super Bowl.
Maybe Netflix needs to ask CBS or ABC how to broadcast
You’re talking about the contribution from the venue to the broadcast centre, increasingly not a full program but being mixed remotely.
That’s a very different area to transmission of live to end users.
> People just do not appreciate how many gotchas can pop up doing anything live.
Sure thing, but also, how much resources do you think Netflix threw on this event? If organizations like FOSSDEM and CCC can do live events (although with way smaller viewership) across the globe without major hiccups on (relatively) tiny budgets and smaller infrastructure overall, how could Netflix not?
Cable TV (or even OTA antenna in the right service area) is simply a superior live product compared to anything streaming.
The Masters app is the only thing that comes close imo.
Cable TV + DVR + high speed internet for torrenting is still an unmatched entertainment setup. Streaming landscape is a mess.
It's too bad the cable companies abused their position and lost any market goodwill. Copper connection direct to every home in America is a huge advantage to have fumbled.
The interesting thing is that a lot of TV infrastructure is now running over IP networks. If I were to order a TV connection for my home I'd get an IPTV box to connect to my broadband router via Ethernet, and it'd simply tell the upstream router to send a copy of a multicast stream my way.
Reliable and redundant multicast streaming is pretty much a solved problem, but it does require everyone along the way to participate. Not a problem if you're an ISP offering TV, definitely a problem if you're Netflix trying to convince every single provider to set it up for some one-off boxing match.
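A rough way to see why multicast is attractive for live TV: with unicast the source sends one copy per viewer, while with multicast it sends one copy and the routers along the way replicate it. The sketch below only compares traffic leaving the source. The 5 Mbps bitrate and the one million viewers are assumptions, and the function name is invented for this example.

```python
def origin_egress_gbps(viewers, bitrate_mbps, multicast):
    """Traffic leaving the source for one live channel, in Gbps.

    Unicast: one copy of the stream per viewer.
    Multicast: a single copy, replicated by routers further downstream.
    """
    copies = 1 if multicast else viewers
    return copies * bitrate_mbps / 1000


if __name__ == "__main__":
    viewers = 1_000_000   # assumed audience behind one source
    bitrate_mbps = 5      # assumed HD bitrate
    print("unicast  :", origin_egress_gbps(viewers, bitrate_mbps, multicast=False), "Gbps")
    print("multicast:", origin_egress_gbps(viewers, bitrate_mbps, multicast=True), "Gbps")
```

As the comment says, the saving only materializes if every network between source and viewer carries the multicast group, which is why it works inside one ISP's IPTV service and not across the public internet.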
The Masters app is truly incredible, I don't know if it gets enough praise.
This. I'm honestly going to cancel my streaming shit. They remove and mess with it so much. Like right now HBO max or whatever removes my recent watches after 90 days. why?
Apple TV MLB games look incredible compared to live cable tv.
It wasn't even just buffering issues, the feed would just stop and never start again until I paused it and then clicked "watch live" with the remote.
It was really bad. My Dad has always been a fan of boxing so I came over to watch the whole thing with him.
He has his giant inflatable screen and a projector that we hooked up in the front lawn to watch it, but everything kept buffering. We figured it was the Wi-Fi so he packed everything up and went inside only to find the same thing happening on ethernet.
He was really looking forward to watching it on the projector and Netflix disappointed him.
> My Dad has always been a fan of boxing
What did your Dad think about the 'boxing'?
On a few forum sites I'm on, people are just giving up. Looking forward to the post-mortem on how they weren't ready for this (with just a tiny bit of schadenfreude because they've interviewed and rejected me twice).
They're sabotaging OP just for a reverse schadenfreude play
AB84 streamed it live from a box at the arena to ~5M viewers on Twitter. I was watching it on Netflix, I didn't have any problems, but I also put his live stream up for the hell of it. He didn't have any issues that I saw.
It’s not everyone. Works fine for me though I did have to reload the page when I skipped past the woman match to the Barrios Ramos fight and it was stuck buffering at 99%.
Can you share which forums
The post-mortem will be interesting indeed.
I wonder if there will be any long term reputational repercussions for Netflix because of this. Amongst SWEs, Netflix is known for hiring the best people and their streaming service normally seems very solid. Other streaming services have definitely caught up a bit and are much more reliable than in the early days, but my impression still has always been that Netflix is a step above the rest technically.
This sure doesn't help with that impression, and it hasn't just been a momentary glitch but hours of instability. And the Netflix status page saying "Netflix is up! We are not currently experiencing an interruption to our streaming service." doesn't help either...
Not the same demographic but their last large attempt at live was through a Love is blind reunion. It was the same thing, millions of people logging in, epic failure, nothing worked.
They never tried to do a live reunion again. I suppose they should have to get the experience. Because they are hitting the same problems with a much bigger stake event.
From what I've heard, Netflix has really diluted the culture that people know of from the Patty McCord days.
In particular, they have been revising their compensation structure to issue RSUs, add in a bunch of annoying review process, add in a bunch of leveling and titles, begin hiring down market (e.g. non-sr employees), etc.
In addition to doing this, shuffling headcount, budgets, and title quotas around has in general made the company a lot more bureaucratic.
I think, as streaming matured as a solution space, this (what is equivalent to cost-cutting) was inevitable.
If Netflix was running the same team/culture as it was 10 years ago, I'd like to say that they would have been able to pull off streaming.
So the issue is that Netflix gets its performance from colocating caches of movies in ISP datacenters, and a live broadcast doesn't work with that. It's not just about the sheer numbers of viewers, it's that a live model totally undermines their entire infrastructure advantage.
See: https://openconnect.netflix.com/en/
If Netflix still interviews on hacker rank puzzles I think this should be a wake up call. Interviewing on irrelevant logic puzzles is no match for systems engineering.
Was live streaming much of a use case for them before this?
They stream plenty of pre recorded video, often collocated. Live streaming seems like something they aren’t yet good at.
Has Netflix ever live streamed something before? People on reddit are reporting that if you back up the play marker by about 3 minutes the lag goes away. They've got a handle on streaming things when they have a day in advance to encode it into different formats and push it to regional CDNs. But I can't recall them ever live streaming something. Definitely nothing this hyped.
I don't spend much time streaming, but I got a glimpse of the Amazon Prime catalog yesterday, and was surprised at how many titles on the front page were movies I'd actually watch. Reminded me of Netflix a dozen years ago.
> but my impression still has always been that Netflix is a step above the rest technically.
I always assumed youtube was top dog for performance and stability. I can’t remember the last time I had issues with them and don’t they handle basically more traffic than any other video service?
I think Netflix will have even more sw engineers looking to work there once they notice even for average quality of work they can get paid 3 times more than their current pay.
I think they have to refund the fees for a month to anyone who streamed this fight. That's the only thing that seems fair.
It has been pretty useless. At the moment seems to be working only when running in non-live mode several minutes behind.
So if there are 1 million trying to stream it, that means they would lose $15 million. So.. they might only give a partial refund.
But people should push for an automatic refund instead of a class action.
Netflix won't take a hit here.
Most people pay Netflix to watch movies and tv shows, not sports. If I hadn't checked Hacker News today, I wouldn't even know they streamed sports, let alone that they had issues with it. Even now that I do, it doesn't affect how I see their core offering, which is their library of on-demand content.
Netflix's infrastructure is clearly built for static content, not live events, so it's no shock they aren't as polished in this area. Streaming anything live over the internet is a tough technical challenge compared to traditional cable.
I think what I will remember about this fight is not the (small) streaming issue I encountered as much as the poor quality of the fight itself. For me that was the reputational loss. Netflix was touting “NFL is coming to Netflix”. This fight did not really make me want to watch that.
I used to work for a live streaming platform once. We always joked that VOD (Netflix) was "easy" compared to live.
I don't think it'll be long-term. Most people will forget about this really quickly. It's not like there will be many people saying "Oh, you don't want to sign up for Netflix, the Tyson fight wasn't well streamed" in even 6 months nevermind 10 years.
>but my impression still has always been that Netflix is a step above the rest technically.
Maybe if we're not counting Youtube as 'streaming', but in my mind no one holds a candle to YT quality in (live)streaming.
Is cable and broadcast better for live TV? No scaling issues. Doesn't matter how many people tune in.
Based on this I'm wondering whether it was straight up they did not expect it to be this popular?
> Some Cricket graphs of our #Netflix cache for the #PaulVsTyson fight. It has a 40 Gbps connection and it held steady almost 100% saturated the entire time.
https://fosstodon.org/@atoponce/113491103342509883
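For a sense of what a saturated 40 Gbps cache means in viewers, here is a back-of-the-envelope calculation. The 40 Gbps figure comes from the quoted post. The per-stream bitrates (5 Mbps for HD, 15 Mbps for 4K) are assumptions, since the thread does not say what Netflix actually served.

```python
def max_streams(link_gbps, bitrate_mbps):
    """Upper bound on concurrent streams a fully saturated link can carry."""
    return int(link_gbps * 1000 // bitrate_mbps)


if __name__ == "__main__":
    for label, mbps in (("HD @ 5 Mbps", 5), ("4K @ 15 Mbps", 15)):
        print(f"{label}: at most {max_streams(40, mbps)} concurrent streams")
```

Under these assumptions one such cache tops out at a few thousand simultaneous viewers, so a link pinned at 100% suggests demand behind it was higher than it could serve.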
I don't think Netflix is even designed to handle very extreme multi-region live-streaming at scale as evidenced in this event with hundreds of millions simultaneously watching.
YouTube, Twitch, Amazon Prime, Hulu, etc have all demonstrated they can stream simultaneously to hundreds of millions live without any issues. This was Netflix's chance to do this and they have largely failed at this.
There are no excuses or juniors to blame this time. It shows quite the inexperience from the 'senior' engineers at Netflix that they could not handle live-streaming at this scale, and they may lose contracts over it given the worldwide downtime during such a high-impact event.
Very embarrassing for a multi-billion dollar publicly traded company.
There's a difference between live broadcasts and serving up content that's sitting on a server I guess?
In my country every time there's a big football match the people who try to watch it on the internet face issues.
Yea, it’s a bad look. But I switched to watching some other Netflix video and it seemed fine. Just this event had some early issues. Looks fine now though.
Streamed glitch free for me both on my phone and Xbox. The fight wasn’t so great though, but still a fun event. Jake Paul is a money machine right now.
> their streaming service normally seems very solid
Not trying to downplay their complexity, but last I heard Netflix is splitting the shows in small data chunks and just serves them as static files.
Live streaming is a different beast
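The "small chunks served as static files" point can be illustrated with HLS, a widely used segmented streaming format. The thread does not say what Netflix uses internally, so treat HLS here as a stand-in, and the segment file names as hypothetical. The sketch builds two media playlists: an on-demand one that lists every segment and ends with `#EXT-X-ENDLIST`, and a live one that is a sliding window over segments that only came into existence seconds ago.

```python
def media_playlist(first_seq, count, seg_seconds, live):
    """Build an HLS media playlist as text.

    On demand: all segments are known up front and the list is closed.
    Live: only a sliding window of recent segments is listed, and players
    must keep re-fetching the playlist to discover new ones.
    """
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{seg_seconds}",
        f"#EXT-X-MEDIA-SEQUENCE:{first_seq}",
    ]
    for seq in range(first_seq, first_seq + count):
        lines.append(f"#EXTINF:{seg_seconds:.1f},")
        lines.append(f"segment{seq}.ts")  # hypothetical file name
    if not live:
        lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(media_playlist(first_seq=0, count=3, seg_seconds=6, live=False))
    print(media_playlist(first_seq=100, count=3, seg_seconds=6, live=True))
```

In the on-demand case every file can be pushed to caches days ahead. In the live case every viewer asks for the same brand-new segment at nearly the same moment, which is one plausible reason why people reported that watching a few minutes behind live was smoother.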
There's an upcoming NFL game on Netflix next month. They need to get their shit together.
For me, netflix constantly forgets the last episode/spot I was in a TV show. Beyond frustrating
Yeah, the funny part is that Hulu, Amazon Prime, and Peacock have all demonstrated the ability to handle an event of this caliber with no issue. Netflix now may never get another opportunity like this again.
It may vary by ISP. It’s been fine for me.
In 2012 Youtube did the Red Bull stratos live stream with 8m concurrent users. We're 12 years later, Netflix fucked up.
To me the difference is that in 2012, you had companies focusing on delivering a quality product, whether it made money or not. Today, the economic environment has shifted a lot and companies are trying to increase profits while cutting costs. The result is inevitably a decline in quality. I'm sure that Netflix could deliver a flawless live stream to millions of viewers, but the question is can they do it while making a profit that Wall Street is happy with. Apparently not.
The funny thing is I was just reading something on HN like three days ago about how light years ahead Netflix tech was compared to other streaming providers. This is the first thing I thought of when I saw the reports that the fight was messing up.
> In 2012 Youtube did the Red Bull stratos live stream with 8m concurrent users
8m vs 60m. And not in 4K. Not a great choice for comparison.
But is there a way that Netflix might have learned from all of Youtube's past mistakes?
The only reasonable way to scale something like this up is probably to... scale it up.
Sure, there are probably some generic lessons, but I bet that the pain points in Netflix's architecture (historically grown over more than a decade and optimized towards highly cacheable content) are very different from Youtube, which has ramped up live content gradually over as many years.
2012 live video was what, 480p?
The average quality of talent has gone way down compared to 2012 though.
E.g. the median engineer, excluding entry level/interns, at YouTube in 2012 was a literal genius at their niche or quite close to it.
Netflix simply can’t hire literal geniuses with mid six figure compensation packages in 2024 dollars anymore… though that may change with a more severe contraction.
It's incomprehensible to me that Netflix, one of the most highly skilled engineering teams in the world - completely sh*t the bed last night and provided a nearly unwatchable experience that was not even in the same league as pre-internet live broadcast from 30 years ago.
My bet is that a technical manager told his executive (multiple times) that he needed more resources and engineering time to make live work properly, and they just told him to make do because they didn't want to spend the money.
It could come down to something as stupid as:
Executive: "we handled [on demand show ABCD] on day one, that was XX million"
Engineering: "live is really different"
Executive: (arguing about why it shouldn't be that different and should not need a lot of new infrastructure)
Engineering: (can't really argue with his boss about this anymore after having repeated the same conversation 3 or 4 times) -- tells the team: "we are not getting new servers or time for a new project. We have to just make do with what we have. You guys are brilliant, I know you can do it!"
I had buffering issues but then backed off and let a bit of it buffer up (maybe 1 or 2 minutes?) and then it was fine for the entire Tyson Paul match. There was no reason I needed it to be live vs. a 1 or 2 minute delay.
What’s amazing is that they have had several streaming flops before and are unable to fix it
This topic is really just fun for me to read based on where I work and my role.
Live is a lot harder than on demand especially when you can't estimate demand (which I'm sure this was hard to do). People are definitely not understanding that. Then there is that Netflix is well regarded for their engineering not quite to the point of snobbery.
What is actually interesting to me is that they went for an event like this which is very hard to predict as one of their first major forays into live, instead of something that's a lot easier to predict like a baseball game / NFL game.
I have to wonder if part of the NFL allowing Netflix to do the Christmas games was them proving out they could handle live streams at least a month before. The NFL seems to be quite particular (in a good way) about the quality of the delivery of their content so I wouldn't put it past them.
Netflix’s snobbery of engineering is so exhausting. Then seeing them be unable to fix this problem after several previous streaming failures is a bit rich.
To me it speaks to how most of the top tech companies of the 2010s have degraded as of late. I see it all the time with Google hiring some of the lower performing engineers on my teams because they crushed Leetcode.
> The NFL seems to be quite particular (in a good way) about the quality of the delivery of their content
Alas, my experience with the NFL in the UK does not reflect that. DAZN have the rights to stream NFL games here, and there are aspects of their service that are very poor. My major, long-standing issue has been the editing of their full game “ad-free” replays - it is common for chunks of play to be cut out, including touchdowns and field goals. Repeated complaints to DAZN haven’t resulted in any improvements. I can’t help but think that if the NFL was serious about the quality of their offering, they’d be knocking heads together at DAZN to fix this.
Why is live a lot harder?
Aside from latency (which isn't much of a problem unless you are competing with TV or some other distribution system), it seems easier than on-demand, since you send the same data to everyone and don't need to handle having a potentially huge library in all datacenters (you have to distribute the data, but that's just like having an extra few users per server).
My guess is that the problem was simply that the number of people viewing Netflix at once in the US was much larger than usual and higher than what they could scale to, or alternatively a software bug was triggered.
This Serrano fight is just an insane display of excellence.
If anyone was waiting for the main card to tune in, I recommend tuning in now.
Absolutely excellent fight. 10 full rounds with full effort until the end. Fantastic.
Also, no buffering issues on my end. Have to wonder if it's a regional issue.
What was an amazing fight - that Serrano won. I have no idea how Taylor was scored the winner.
I know nothing about boxing and this fight was just ridiculously impressive. I kept tuning out of the earlier fights. They felt like some sort of filler. I didn’t get the allure. But Taylor v Serrano was just obvious talent that even I could appreciate it.
That was a savage fight!
naw, taylor head butting the whole fight was dirty and really took the wind out of it
Serrano should have won.
What do you think were the dynamics of the engineering team working on this?
I'd think this isn't too crazy to stress test. If you have 300 million users signed up then your stress test should be 300 million simultaneous streams in HD for 4 hours. I just don't see how Netflix screws this up.
Maybe it was a management incompetence thing? Manager says something like "We only need to support 20 million simultaneous streams" and engineers implement to that spec even if the 20 million number is wildly incorrect.
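It helps to work out what the proposed stress test would actually require. The sketch below assumes 5 Mbps per HD stream (an assumption, not a Netflix figure) and, for comparison, uses the 40 Gbps cache link size mentioned elsewhere in this thread.

```python
import math


def aggregate_tbps(streams, bitrate_mbps):
    """Total bandwidth for `streams` concurrent streams, in Tbps."""
    return streams * bitrate_mbps / 1_000_000


def links_needed(total_tbps, link_gbps):
    """Number of fully saturated links of the given size needed to carry it."""
    return math.ceil(total_tbps * 1000 / link_gbps)


if __name__ == "__main__":
    total = aggregate_tbps(300_000_000, 5)   # 300 million HD streams
    print(f"aggregate: {total:.0f} Tbps")
    print(f"40 Gbps links at 100% load: {links_needed(total, 40)}")
```

Under these assumptions the test needs about 1.5 Pbps of delivered traffic, or tens of thousands of saturated 40 Gbps links, plus something to generate and absorb that load for four hours. That is a large part of why nobody stress tests at full subscriber count.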
Has there ever been a 300m concurrent live stream? I thought Disney+ had the record at something like 60m.
I love how I can come to HN to instantly find out if it’s Netflix or my WiFi.
Wifi wifi or wifi as in your ISP Internet connection? So many people now call an Internet connection "wifi".
Anyway, network cable is the only way to go!
This! I was checking my WiFi and then I instinctively checked HN and what do you know!
metoo!
Right?!
Reading the comments here, I think one thing that's overlooked is that Netflix, which has been on the vanguard of web-tech and has solved many complicated problems in-house, may not have had the culture to internally admit that they needed outside help to tackle this problem.
A combination of hubris and groupthink.
Not invented here syndrome works at first but as time progresses the internally built tools become a liability
Main event hasn’t even started yet. Traffic will probably 10x for that. They’re screwed. Should have picked something lower profile to get started with live streaming.
I don’t work in tech. Is this something that engineers could respond to and reallocate resources to fix mid stream?
They've done quite a bit of lower profile live streams... various events, and the Everybody's in LA chat show series.
When you step back and look at the situation, it's not hard to see why Netflix dropped the ball here. Here's how I see it (not affiliated with Netflix, pure speculation):
- Months ago, the "higher ups" at Netflix struck a deal to stream the fight on Netflix. The exec that signed the deal was probably over the moon because it would get Netflix into a brand new space and bring in large audience numbers. Along the way the individuals were probably told that Netflix doesn't do livestreaming but they ignored it and assumed their talented Engineers could pull it off.
- Once the deal was signed then it became the Engineer's problem. They now had to figure out how to shift their infrastructure to a whole new set of assumptions around live events that you don't really have to think about when streaming static content.
- Engineering probably did their absolute best to pull this off but they had two main disadvantages, first off they don't have any of the institutional knowledge about live streaming and they don't really know how to predict demand for something like this. In the end they probably beefed up livestreaming as much as they could but still didn't go far enough because again, no one there really knows how something like this will pan out.
- Evening started off fine but crap hit the fan later in the show as more people tuned in for the main card. Engineering probably did their best to mitigate this but again, since they don't have the institutional knowledge of live events, they were shooting in the dark hoping their fixes would stick.
Yes Netflix as a whole screwed this one up but I'm tempted to give them more grace than usual here. First off the deal that they struck was probably one they couldn't ignore and as for Engineering, I think those guys did the freaking best they could given their situation and lack of institutional knowledge. This is just a classic case of biting off more than one can chew, even if you're an SV heavyweight.
This isn't Netflix's first foray into livestreaming. They tried a livestream last year for a reunion episode of one of their reality TV shows which encountered similar issues [0]. Netflix already has a contract to livestream a football event on Christmas, so it'll be interesting to see if their engineers are able to get anything done in a little over a month.
These failures reflect very poorly on Netflix leadership. But we all know that leadership is never held accountable for their failures. Whoever is responsible for this should at least come forward and put out an apology while owning up to their mistakes.
[0] https://time.com/6272470/love-is-blind-live-reunion-netflix/
> They now had to figure out how to shift their infrastructure to a whole new set of assumptions around live events
It wasn't their first live event. A previous live event had similar issues.
Livestreaming is a solved problem. This sounds like NIH [1]. (At the very least, hire them as a back-up.)
[1] https://en.wikipedia.org/wiki/Not_invented_here
>First off the deal that they struck was probably one they couldn't ignore
If you can't provide the service you shouldn't sell it?
Not sure why Netflix is held in high regard - this proves they're just as much clowns as the other 'big players' in the circus.
I mean, maybe? You just made all this up.
It’s insane the excuses being made here for Netflix’s apparently unique circumstances.
They failed. Full stop. There is no valid technical reason they couldn’t have had a smooth experience. There are numerous people with experience building these systems they could have hired and listened to. It isn’t a novel problem.
Here are the other companies that are peers that livestream just fine, ignoring traditional broadcasters:
- Google (YouTube live), millions of concurrent viewers
- Amazon (Thursday Night Football, Twitch), millions of concurrent viewers
- Apple (MLS)
NBC live streamed the Olympics in the US for tens of millions.
As a cofounder of a CDN company that pushed a lot of traffic, the problem with live streaming is that you need to propagate peak viewership through a lot of different providers. The peering/connectivity deals are usually not structured for peak capacity that is many times over the normal 95th percentile. You can provision more connectivity, but you don't know how many will want to see the event. Also, live events can be trickier than stored files, because you can't offload to the edges beforehand to warm up the caches.
So Netflix had 2 factors outside of their control
- unknown viewership
- unknown peak capacities outside their own networks
Both are solvable, but if you serve "saved" content you optimize for different use case than live streaming.
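To make the 95th-percentile point concrete, here is a minimal sketch. The traffic samples are invented for illustration, and the percentile rule (sort the samples, discard the top 5%, take the highest remaining one) is the common burstable-billing convention, assumed here rather than taken from any actual peering contract.

```python
def percentile_95(samples_gbps):
    """Highest sample left after discarding the top 5% (burstable-billing style)."""
    if not samples_gbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_gbps)
    # ceil(0.95 * n) - 1, done in integer arithmetic to avoid float rounding
    index = (95 * len(ordered) + 99) // 100 - 1
    return ordered[index]


def peak_to_p95_ratio(samples_gbps):
    """How many times larger the single worst sample is than the 95th percentile."""
    return max(samples_gbps) / percentile_95(samples_gbps)


# Hypothetical measurement window: 97 ordinary samples plus a short live-event spike.
samples = [100.0] * 97 + [400.0] * 3

print("95th percentile (Gbps):", percentile_95(samples))
print("peak / 95th percentile:", peak_to_p95_ratio(samples))
```

A link sized around the 95th percentile looks fine on paper in this example, yet the short spike is several times larger, which is the gap the comment describes.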
I don't disagree that Netflix could have / should have done better. But everybody screws these things up. Even broadcast TV screws these things up.
Live events are difficult.
I'll also add on, that the other things you've listed are generally multiple simultaneous events; when 100M people are watching the same thing at the same time, they all need a lot more bitrate at the same time when there's a smoke effect as Tyson is walking into the ring; so it gets mushy for everyone. IMHO, someone on the event production staff should have an eye for what effects won't compress well and try to steer away from those, but that might not be realistic.
I did get an audio dropout at that point that didn't self correct, which is definitely a should have done better.
I also had a couple of frames of block color content here and there in the penultimate bout. I've seen this kind of stuff on lots of hockey broadcasts (streams or ota), and I wish it wouldn't happen... I didn't notice anything like that in the main event though.
Experience would likely be worse if there were significant bandwidth constraints between Netflix and your player, of course. I'd love to see a report from Netflix about what they noticed / what they did to try to avoid those, but there's a lot outside Netflix's control there.
The examples given here are not on the same scale. The numbers known so far:
- 120m viewers [1]
- Entire Netflix CDN Traffic grew 4x when the live stream started [2]
[1] https://www.rollingstone.com/culture/culture-news/jake-paul-...
[2] https://x.com/DougMadory/status/1857634875257294866
Amazon had their fair share of livestream failures and for notably fewer viewers. I don't think they deserve a spot on that list. I briefly worked in streaming media for sports and while it's not a novel problem, there are so many moving parts and points of failure that it can easily all go badly.
> They failed. Full stop.
It's not full stop. There are reasons why they failed, and for many it's useful and entertaining to dissect them. This is not "making excuses" and does not get in the way of you, apparently, prioritizing making a moral judgment.
The big difference of all the examples you’ve mentioned is dedicated full-time crews on the ground where the events are produced.
I’m pretty confident that when the post mortem is done the issues are going to be way closer to the broadcast truck than the user.
it could be that they made use of the same advice X followed :)
Live streaming is hard. Most companies that do live streaming at 2024 scale got there by learning from their mistakes. This is true for Hotstar, Amazon and even YouTube. Netflix's stack is built to stream optimised, compressed, cached videos with a manageable number of concurrent viewers for the same video. Here we had ~65m concurrent viewers in their first live event. The compression they use, distribution, etc. have not scaled up well. I'll judge them based on how they handle their next live event.
I don't think this is their first live event. They have hosted a pro golf promotional match and they had a live pro tennis match between Nadal and Alcaraz off the top of my head.
It will never not annoy and amuse me that illegal options (presumably run by randoms in their spare time) are so much better than the offerings of big companies and their tech ‘talent’.
Illegal options would have a lot fewer active users, so it is not a fair comparison.
I have Netflix purchased legally with hard earned money. But because I had issues I looked for illegal streams, and they were bad, crashes, buffering.. you name it. So I went back to Netflix and watched it at 140p quality.
Utter incompetence from senior leadership at Netflix. They had so much time to prepare for this.
I want to index everyone sneering at this situation and never work with any of them.
yep, especially knowing this isn't their first rodeo... 18 months since https://time.com/6272470/love-is-blind-live-reunion-netflix/
> But the real indicator of how much Sunday’s screw-up ends up hurting Netflix will be the success or failure of its next live program—and the next one, and the one after that, and so on. There’s no longer any room for error. Because, like the newly minted spouses of Love Is Blind, a streaming service can never stop working to justify its subscribers’ love. Now, Netflix has a lot of broken trust to rebuild.
Weird that an organization like Netflix is having problems with this considering their depth of both experience and pockets. I wonder if they didn't expect the number of people who were interested in finding out what the pay-per-view experience is like without spending any extra money. Still, I suppose we can all be thankful Netflix is getting to cut their live event teeth on "alleged rapist vs convicted rapist" instead of something more important.
> alleged rapist vs convicted rapist
And you’ll never guess which Presidential candidate they both support!
From my experience, it works if you're not watching it 'live'. But the moment I put my devices to 'live' it perma-breaks. 504 gateway timed out in web developer tools hitting my local CDN. Probably works on some CDNs, doesn't on others. Probably works if you're not 'live'.
edit: literally an nginx gateway timed out screen if you view the response from the CDN... wow
It's down permanently for me in India. We have Hotstar, which has a record of 58 million viewers during the cricket World Cup final. Way ahead.
Probably less about the level of advancement and more about their ability to stream vs play VOD. Two different kinds of infrastructure optimisation.
Wasn't that the biggest concurrent stream ever?
Can Mike Judge please stop predicting everything?
I've been re-watching Silicon Valley the last few weeks and just watched the Nucleus live stream episode 2 days ago, pretty funny seeing it in real life.
This is probably a naive question but very relevant to what we have here.
In a protocol where an oft-repeated request goes through multiple intermediaries, usually every intermediary will be able to cache the response for common queries (e.g. DNS).
In theory, ISPs would be able to do the same with HTTP, although I am not aware of anyone doing so (since it would rightfully raise concerns about privacy and tampering).
Now TLS (or other encryption) will break this abstraction. Every user, even if they request the same live stream, receives a differently encrypted response.
But a live stream of a popular boxing match doesn't need the "confidentiality" part of an encryption protocol, only integrity.
Do we have a protocol which allows downstream intermediaries, e.g. ISPs, to cache the content of the stream based on demand, while a digital signature or other attestation is still cryptographically verified by the client?
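One way to picture "integrity without confidentiality" is a hash manifest, sketched below. This is not an existing streaming protocol: the function names and segment names are hypothetical, and the sketch assumes the client obtains the (small) manifest over an authenticated channel while the (large) segment bytes may come from any untrusted cache. For a live stream the manifest would also have to be republished as new segments appear, which this sketch ignores.

```python
import hashlib
import hmac


def build_manifest(segments):
    """Origin side: map each segment name to the SHA-256 digest of its bytes."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in segments.items()}


def verify_segment(manifest, name, data):
    """Client side: accept bytes from an untrusted cache only if the digest matches."""
    expected = manifest.get(name)
    if expected is None:
        return False  # segment is not listed in the trusted manifest
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(expected, actual)


# Hypothetical segments published by the origin.
origin_segments = {"seg-001.ts": b"round one video bytes", "seg-002.ts": b"round two video bytes"}
manifest = build_manifest(origin_segments)

# An honest cache returns the original bytes; a tampering one does not.
print(verify_segment(manifest, "seg-001.ts", b"round one video bytes"))
print(verify_segment(manifest, "seg-001.ts", b"injected ad bytes"))
```

The cache never needs a key, so any intermediary can store and serve the segments, while the client still detects modification.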
There's Named Data Networking (it went by Content-Centric Networking earlier). You request data, not a URL; the pipe/path becomes the CDN. If any of your nearest routers have the bytes, your request will go no further.
I don't see it much mentioned the last few years, but the research groups have ongoing publications. There's an old 2006 Van Jacobson video that is a nice intro.
What you describe is called a CDN and has been widely used for 20 years.
I guarantee this is a management issue. Somebody needed to bear down at some point and put the resources into load testing. The engineers told them it probably won't be sufficient.
I assume this came down to some technical manager saying they didn't have the human and server resources for the project to work smoothly and a VP or something saying "well, just do the best you can.. surely it will be at least a little better than last time we tried something live, right?"
I think there should be a $20 million class action lawsuit, which should be settled as automatic refunds for everyone who streamed the fight. And two executives should get fired.
At least.. that's how it would be if there was any justice in the world. But we now know there isn't -- as evidenced by the fact that Jake Paul's head is still firmly attached to his body.
I am curious about their live streaming infrastructure.
I have done live streaming for around 100k concurrent users. I didn't set up infrastructure because it was the CloudFront CDN.
Why is it hard for Netflix? They have already figured out the CDN part, so it should not be a problem even if it is 1M or 100M, because their CDN infrastructure is already handling the load.
I have only worked with HLS live streaming, where the playlist is constantly changing compared to VOD. Live video chunks work the same as VOD. CloudFront also has a feature, request collapsing, that greatly helps live streaming.
So, my question is: if Netflix has already figured out CDN, why is their live infrastructure failing?
Note: I am not saying my 100k is the same scale as their 100M. I am curious about which part is the bottleneck.
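For readers who haven't worked with HLS, the "playlist is constantly changing" point can be sketched as follows. The tags are standard HLS media-playlist tags; the segment file names, the 6-second duration and the 3-segment window are assumptions chosen for illustration.

```python
def media_playlist(segment_numbers, target_duration=6, live=True):
    """Render a minimal HLS media playlist for the given segment numbers."""
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{target_duration}",
        f"#EXT-X-MEDIA-SEQUENCE:{segment_numbers[0]}",
    ]
    for number in segment_numbers:
        lines.append(f"#EXTINF:{target_duration:.1f},")
        lines.append(f"segment{number}.ts")
    if not live:
        # A finished (VOD) playlist is complete and never changes again.
        lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


def live_window(latest_segment, window_size=3):
    """Segment numbers a live playlist advertises once `latest_segment` exists."""
    first = max(0, latest_segment - window_size + 1)
    return list(range(first, latest_segment + 1))


print(media_playlist(live_window(10)))             # live: segments 8..10, no end marker
print(media_playlist(live_window(11)))             # a few seconds later: segments 9..11
print(media_playlist([0, 1, 2], live=False))       # VOD: fixed list ending in ENDLIST
```

The segments themselves cache like VOD files, but every player has to keep re-fetching the short-lived playlist, so that one small object is requested by the whole audience every few seconds.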
> Why is it hard for Netflix? They have already figured out the CDN part, so it should not be a problem even if it is 1M or 100M, because their CDN infrastructure is already handling the load ... Note: I am not saying my 100k is the same scale as their 100M. I am curious about which part is the bottleneck.
100k concurrents is a completely different game compared to 10 million or 100 million. 100k concurrents might translate to 200Gbps globally for 1080p, whereas for that same quality, you might be talking 20Tbps for 10 million streams. 100k concurrents is also a size such that you could theoretically handle it on a small single-digit number of servers, if not for latency.
> CloudFront also has a feature, request collapsing, that greatly helps live streaming.
I don't know how much request coalescing Netflix does in practice (or how good their implementation is). They haven't needed it historically, since for SVOD, they could rely on cache preplacement off-peak. But for live, you essentially need a pull-through cache for the sake of origin offload. If you're not careful, your origin can be quickly overwhelmed. Or your backbone if you've historically relied too heavily on your caches' effectiveness, or likewise your peering for that same reason.
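A minimal sketch of request coalescing in a pull-through cache, to show why it matters for origin offload. This is not CloudFront's or Netflix's implementation; the class name, the `fetch_origin` callback and the demo numbers are all hypothetical.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor


class CoalescingCache:
    """Pull-through cache that sends at most one origin request per key at a time."""

    def __init__(self, fetch_origin):
        self._fetch_origin = fetch_origin
        self._lock = threading.Lock()
        self._cache = {}
        self._inflight = {}

    def get(self, key):
        with self._lock:
            if key in self._cache:
                return self._cache[key]
            future = self._inflight.get(key)
            leader = future is None
            if leader:
                # First requester for this key: it will go to the origin.
                future = Future()
                self._inflight[key] = future
        if not leader:
            # Someone else is already fetching: wait for their result.
            return future.result()
        try:
            value = self._fetch_origin(key)
        except Exception as exc:
            with self._lock:
                del self._inflight[key]
            future.set_exception(exc)
            raise
        with self._lock:
            self._cache[key] = value
            del self._inflight[key]
        future.set_result(value)
        return value


def demo(requests=20):
    """Fire `requests` simultaneous requests for one segment; count origin hits."""
    origin_calls = []

    def slow_origin(key):
        origin_calls.append(key)
        time.sleep(0.05)  # pretend the origin is slow
        return f"bytes-for-{key}"

    cache = CoalescingCache(slow_origin)
    with ThreadPoolExecutor(max_workers=requests) as pool:
        results = list(pool.map(cache.get, ["segment-42"] * requests))
    return results, len(origin_calls)
```

Without the in-flight map, every viewer who asks for a brand-new live segment before it is cached would trigger a separate origin fetch, which is exactly the overload scenario described above.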
200Gbps is a small enough volume that you don't really need to provision for that explicitly; 20Tbps or 200Tbps may need months if not years of lead time to land the physical hardware augments, sign additional contracts for space and power, work with partners, etc.
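The figures in this reply work out to roughly 2 Mbps per stream (200Gbps divided by 100k viewers). Taking that implied per-stream rate as an assumption, the scaling is plain multiplication:

```python
def aggregate_gbps(concurrent_viewers, mbps_per_stream=2.0):
    """Total egress in Gbps if every viewer pulls one stream at the given bitrate."""
    return concurrent_viewers * mbps_per_stream / 1_000


for viewers in (100_000, 10_000_000, 100_000_000):
    gbps = aggregate_gbps(viewers)
    print(f"{viewers:>11,} viewers -> {gbps:>9,.0f} Gbps ({gbps / 1_000:g} Tbps)")
```

Real bitrates vary with resolution and codec, so treat the output as an order-of-magnitude estimate rather than a measurement.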
Live streaming and streaming prerecorded movies are completely different ballgames.
In fact, optimizing for the latter can hurt the former.
Would be interesting to read any postmortems on this failure. Maybe someone will be kind enough to share the technical details for the curious crowd.
Amazon had issues last year too when they started broadcasting TNF, but it's fine these days.
I'm sure they will get it figured out.
> envoy overloaded
That's the plain-text message I saw when I tried to refresh the stream.
Follow-up:
My location: East SF Bay.
Now even the Netflix frontpage (post login, https://www.netflix.com/browse ) shows the same message.
The same message even in a private window when trying to visit https://www.netflix.com/browse
The first round of the fight just finished, and the issues seem to be resolved, hopefully for good. All this to say what others have noted already, this experience does not evoke a lot of confidence in Netflix's live-streaming infrastructure.
Ah, envoy. Now that is a name I have not missed.
I thought Netflix engineers were the best and could even do mythical leetcode hards. What happened? Why are they paid half a million dollars a year?
Isn't this more of a management problem, trying to turn a not-livestream system into a livestreaming one?
Hell, I’d complain about Jake Paul vs Mike Tyson as well if I was a boxing fan. Even without buffering issues.
Yes, it was utterly boring, but they made their money. I don't like either Paul brother, so I only watched in hopes shorter, much-older Tyson would make Jake look as foolish as he is.
They'd better get some better judges and refs too. The co-headline title fight was a joke.
Reminds me of Nucleus stuttering during UFC
I could hear Gavin Belson screaming during the broadcast when my stream was freezing as they were each making their entrance. Mike Judge is a prophet.
That show ages better every single day.
Netflix has some NFL games on Christmas Day. Wonder how those will go for them.
I remember when ESPN started streaming years back, it was awful. Now I almost never have problems with their live events, primarily their NHL streams.
A friend and I, in separate states, found that it wouldn’t stream from TVs, Roku, etc. but would stream from mobile. And for me, using a mobile hotspot to a laptop; though that implies checking IP address range instead of just user-agent, so that seems unlikely.
Anyway, I wouldn’t be surprised if they were prioritizing mobile traffic because it’s more forgiving of shitty bitrate.
I wonder if this points to network peering and edge nodes. Mobile network vs cabled network likely being routed to different places.
I just left a bar streaming it on a smart TV and back in my home it's streaming on the Roku just fine.
FWIW, works fine for me.
Please don't make these types of comments, they mean nothing and they serve no purpose.
Been working great for me as well. Starlink in Oregon.
The stream never buffered on my side, but quality was pretty basic for the whole duration of the stream. I doubt it was even 720p.
Us too
On a tangential note, the match totally looked fixed to me - Tyson was barely throwing any punches. I understand age is not on his side, but he looked plenty spry when he was ducking, weaving and dodging. It seemed to me he could have done better in terms of attacking as well.
I would argue Tyson has a shorter reach, Jake was whiffing a lot of superman punches, and all that does is waste energy. Jake might be able to throw punches, but he clearly wasn't interested in taking them. If they stood closer and slugged it out, the fight could have gone either way.
Yeah the biggest thing to me, and the commentators mentioned this as well, his legs looked REALLY wobbly.
All your attacking power comes from your legs and hips, so if his legs weren’t stable he didn’t have much attacking power.
I think he gave it everything he had in rounds 1 and 2. Unfortunately, I just don’t think it was ever going to be enough against a moderately trained 27 year old.
Bet they wish they'd gone with middle out compression
When they come up with that idea it's the most 18-rated and accurate way an engineer would think about it.
It probably depends more on the ISP than on Netflix. Engineers over in my ISP’s subreddit are talking about how flows from Netflix jumped by over 450Gb/s and it was no big deal because it wasn’t enough to cause any congestion on any of their backbone links.
My kid woke me up complaining internet is not working. Turns out he is trying to watch the fight and it's not working at all here in India.
I think they must be noticing the issues, because I've noticed they've been dropping the stream quality quite substantially... It's a clever trick, but kind of cheap to do so, because who wants to watch pixelated things?
To be brutally honest if it’s a choice between pixelated and constantly buffering, pixelated is way less bad. Constantly buffering is incredibly annoying during live sports. (but this doesn’t negate your main point which is that if people paid to watch they expect decent resolution)
Looks like I’m playing Tysons Punchout right now
Glass Jake?
I wrote an analysis on doing this kind of unicast streaming in cable networks a decade ago. Edge networks with reasonable 100-gig distribution as their standard would see some of the minor buffering issues.
There is a reason that cable doesn’t stream unicast and uses multicast and QAM on a wire. We’ve just about hit the point where this kind of scale unicast streaming is feasible for a live event without introducing a lot of latency. Some edge networks (especially without local cache nodes) just simply would not have enough capacity, whether in the core or peering edge, to do the trick.
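The unicast-versus-multicast difference is easy to see with back-of-the-envelope numbers. The 5 Mbps per-stream bitrate below is an assumption for illustration, and the sketch ignores protocol overhead and any other traffic on the link.

```python
def unicast_link_gbps(viewers, mbps_per_stream):
    """Unicast: every viewer behind the link gets their own copy of the stream."""
    return viewers * mbps_per_stream / 1_000


def multicast_link_gbps(channels, mbps_per_stream):
    """Multicast: one copy per channel crosses the link, however many viewers there are."""
    return channels * mbps_per_stream / 1_000


def max_unicast_viewers(link_gbps, mbps_per_stream):
    """How many unicast viewers fit on a link before it is full."""
    return int(link_gbps * 1_000 // mbps_per_stream)


ASSUMED_MBPS = 5  # hypothetical bitrate of the live stream

print("viewers that fill a 100 Gbps link:", max_unicast_viewers(100, ASSUMED_MBPS))
print("unicast load for 50,000 viewers (Gbps):", unicast_link_gbps(50_000, ASSUMED_MBPS))
print("multicast load for one channel (Gbps):", multicast_link_gbps(1, ASSUMED_MBPS))
```

Under these assumptions a single popular event can outgrow a 100-gig link on viewer count alone, while the multicast load stays flat.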
Saw an Arista presentation about the increase in SFP capacity; it's Moore's-law-style stuff. Arm-based kit has a shockingly efficient number of streams per watt, too.
I can't see traditional DVB/ATSC surviving much beyond 2040, even accounting for the long tail.
You're right that large-scale parallel live streaming has only become feasible in the last few years. The BBC has some insights into how it had to change its approach to scale to 10 million in 2021, having had technical issues in the 3 million range in 2018:
https://www.bbc.co.uk/webarchive/https%3A%2F%2Fwww.bbc.co.uk...
Personally I don't think the latency is solved yet -- TV is slow enough (about 10 seconds from camera to TV), but IP streaming tends to add another 20-40 seconds on top of that.
That's no good when you're watching the penalties. Not only will your neighbours be cheering before you as they watch on normal TV, but even if you're both on the same IPTV you may well see 5 seconds of difference.
The total end-to-end time is important too: with 30 seconds of delay, the news push notifications, tweets, etc. on your phone will come in before you see the result.
Dumb question
Isn't live streaming at scale a problem already solved by cable companies? I've never seen ESPN go down during a critical event.
Yes, as I have said again and again on hacker news in different comments Netflix went overboard with their microservices and tried to position itself as a technological company when it's not. It has made everything more complex and that's why any Netflix tech blog is useless because it is not the way to build things correctly.
To understand how to do things correctly look at something like pornhub who handle more scale than Netflix without crying about it.
The other day I was having this discussion with somebody who was saying distributed counter logic is hard and I was telling them that you don't even need it if Netflix didn't go completely mental on the microservices and complexity.
This is not the same streaming - netflix is doing that over HTTP. Totally different tech and scaling issues
You would think, but technology always finds a way to screw things up. Cox Communications has had ongoing issues with their video for weeks because of Juniper router upgrades and even the vendor can't fix it. They found this out AFTER they put it in production. Shit happens.
Does anyone have any thoughts besides "bad engineering" on what could've gone wrong? It seems like taking on a new endeavor like streaming an event that would possibly draw many hundreds of millions of viewers doesn't make sense. Is there any obvious way that this would just work, or is there obviously a huge mistake deeply rooted in the whole thing. Also, are there any educated guesses on some fine details in the codebase and patterns that could result in this?
I don't understand why the media is pushing this Jake Paul vs Mike Tyson stuff so hard and why people care about it. Boxing is crude entertainment for low intelligence people.
I'm tired of all this junk entertainment which only serves to give people second-hand emotions that they can't feel for themselves in real life. It's like, some people can't get sex so they watch porn. People can't fight so they watch boxing. People can't win in real life so they play video games or watch superhero movies.
Many people these days have to live vicariously through random people/entities; watch others live the life they wished they had and then they idolize these people who get to have everything... As if these people were an intimate projection of themselves... When, in fact, they couldn't be more different. It's like rooting for your opponent and thinking you're on the same team; when, in fact, they don't even know that you exist and they couldn't be more different from you.
You're no Marvel superhero no matter how many comic books you own. The heroes you follow have nothing to do with you. Choose different heroes who are more like you. Or better; do something about your life and give yourself a reason to idolize yourself.
Mine is glitchy, but if I refresh I get a good stream for a bit, then it gets low res, then freezes. If I wait for auto-reconnect it takes forever. Hard refresh and I'm good. Like: new stream to a new server, then it's overloaded, then it behaves as if their cluster is crashing and healing in rapid cycles. Sawtooth patterns on their charts.
And then all these sessions lag, or orphan taking up space, so many reconnections at various points in the stream.
System getting hammered. Can't wait for this writeup.
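The crash-and-heal sawtooth described here resembles a reconnect storm, where every dropped player retries at the same moment and knocks the recovering servers over again. A common client-side mitigation is exponential backoff with jitter, sketched below. Whether Netflix's players do anything like this is not known from the thread; the function name and the default values are hypothetical.

```python
import random


def reconnect_delay(attempt, base=1.0, cap=60.0, rng=random):
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2**attempt)] seconds."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


# Example: the delays one client might wait before its first six reconnect attempts.
rng = random.Random(2024)  # seeded only so the demo is repeatable
for attempt in range(6):
    print(f"attempt {attempt}: wait {reconnect_delay(attempt, rng=rng):.2f}s")
```

The randomness matters as much as the growth: it spreads millions of retries over a window instead of letting them arrive as synchronized waves.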
Hopefully they fix it because they are hosting two Christmas NFL games this year and if you want to really piss people off you have buffering issues during NFL games lol.
Maybe this was a stress test for the NFL games?
I'd expect the NFL games to have a largely American audience, but today's boxing event attracted a global audience.
How is this story not on the front page anymore? 375 comments. Seems like a big story to me.
I believe HN's algorithm tends to relatively downrank stories with a high comment-to-upvote ratio, because they are more often flamewars on divisive topics.
I can feel the pressure on the network engineers from here XD
The arrogant Netflix! They always brag about how technologically superior they are, and they can't handle a simple technological challenge! I didn't have a buffering issue, I had an error page - for hours! Yet, they kept advertising the boxing match to me! What a joke! If you can't stream it, don't advertise it to save face with people like me who don't care about boxing!
Every organization makes mistakes and every organization has outages. Netflix is no different. Instead of bashing them because they are imperfect, you might want to ask what you can learn from this incident. What would you do if your service received more traffic than expected? How would you test your service so you can be confident it will stay up?
Also, I have never seen any Netflix employees who are arrogant or who think they are superior to other people. What I have seen is Netflix's engineering organization frequently describes the technical challenges they face and discusses how they solve them.
I think you’re oversimplifying it. Live event streaming is very different from movie streaming. All those edge cache servers become kinda useless and you start hitting peering bottlenecks.
After a few buffering timeouts during the first match, the rest of the event had no technical difficulties (in SoCal, so close to one of Netflix's HQs).
Unfortunately, except for the women's match, the fights were pretty lame...4 of the 6 male boxers were out of shape. Paul and Tyson were struggling to stay awake and if you were to tell me that Paul was just as old as Tyson I would have believed it.
Seems like the magic number was 60 million concurrent streams
https://www.theverge.com/2024/11/16/24298338/netflix-mike-tt...
That's a suspicious number, given the previous world record is 59 million.
Pure speculation as I have 0 knowledge.
Assuming Netflix used its extensive edge cache network to distribute the streams to the ISPs. The software on the caching servers would have been updated to be capable of dealing with receiving and distributing live streamed content, even if maybe the hardware was not optimal for that (throughput vs latency is a classic networking tradeoff).
Now inside the ISP's network, again, everything would probably be optimized for the 99.99% use case of the Netflix infra: delivering large bulk data that is not time sensitive. This means very large buffers to shift big gobs of packets in bulk.
As everything along the path is trying to fill up those buffers before shipping to the next router on the path, some endpoints, aware that this is a live stream, start cancelling and asking for more recent frames ...
Hilarity ensues
Why do they want to get into the live business? It doesn't seem to synergize with their infrastructure. Sending the same stream in real time to numerous people just isn't the same task as letting people stream optimized artifacts that are prepositioned at the edge of the network.
Most PPV is what, $50-$70? So subscribing to Netflix for $20 or whatever per month sounds like a bargain for anyone who is interested and not already a customer. Then assume some large percentage doesn’t cancel either because they forgot, or because they started watching a show and then decided to keep paying.
They want to break into sports because it’s such a big business and if you do sports you need to be able to stream live.
Live is the only thing that won’t be commodified entirely. “Anyone” can pump out stream-when-you-want TV shows. Live events are generally exclusive, unpredictable, and cultural moments .
Not sure why this is being downvoted. I can see your point - it’s much harder to do this live, but a lot of their CDN infra can be reused.
It was so bad. So so bad. Like don’t use your customers as guinea pigs for live streaming. So lame. They need a new head of content delivery. You can’t charge customers like that and market a massive event and your tech is worse than what we had from live broadcast tv.
I watched on an AppleTV and the stream was rock solid.
I don’t know if it’s still the case, but in the past some devices worked better than others during peak times because they used different bandwidth providers. This was the battle between Comcast and Cogent and Netflix.
Your device type has no influence on your provider and its bandwidth characteristics. If you're on Comcast, Apple can't magically make it not suck.
> I watched on an AppleTV and the stream was rock solid.
For me it was buffering and low resolution, on the current AppleTV model, hardwired, with a 1Gbps connection from AT&T. Some streaming devices may have handled whatever issues Netflix was having better than others, but this was clearly a bigger problem than just the streaming device.
I thought Netflix’s biggest advantage was the quality/salary of its engineers.
I think that every time I wait for Paramount+ to restart after it’s gone black in picture-in-picture, and yet I’m still on Paramount+ and not Netflix, so maybe that advantage isn’t real.
Sigh, none of the competitors are much better. Disney, who has more than enough cash to throw at streaming, is a near-constant hassle for us (after 3 or more episodes it throws an inscrutable error on PlayStation). I would drop it, but this is the only remaining streaming service and my wife is not willing to drop it (I guess until it gets to one error per episode).
I think this was true at some point, but I’ve been disappointed in the quality of the OSS Netflix tools recently. I think before k8s and a plethora of other tools matured, they were way ahead of the curve.
I specifically found the Netflix suite for Spring very lacking, and found message oriented architectures on something like NATS a lot easier to work with.
I thought they did DSA interviews at Netflix. What happened? I had to watch the fight on someone streaming to X from their phone at the event, and it was better than watching on Netflix... if you could watch at all. Extremely embarrassing!
My theory is they've so heavily optimized for static content and distributing content on edge nodes that they were probably poorly set up for live-streaming.
One similar crash I remember very well was CNN on 9/11 - I tried to connect from France but it was down the whole day.
Since then I am very used to it because our institutional web sites traditionally crash when there is a deadline (typically the taxes or school inscriptions).
As for this one, my son is studying in Europe (I am also in Europe); he called me, desperate, at 5 am or so to check whether he was the only one with the problem (I am the 24/7 family support for anything plugged in). After having liberally insulted Netflix, he realized he had confirmed with his grandparents that he would be helping them at 10 :)
Does anyone remember IP multicast?
I remember a lot of trade magazines in the late 1990's during the dot com boom talked about how important it would be.
https://en.wikipedia.org/wiki/IP_multicast
I never hear about it anymore. Is that because everyone wants to watch something different at their own time? Or is it actually working just fine now in the background? I see under the "Deployment" section it mentions IPTV in hotel rooms.
It’s a learning experience! I remember Conor and Floyd broke HBO and the UFC. It’s a hard problem for sure!
Some buffering issues for us, but I bet views are off the charts. Huge for Netflix, bad for espn, paramount, etc etc
They should have partnered with every major CDN and load balanced across all of them. It’s ironic how we used to be better at broadcasting live events way back in the day versus today.
I watched the event last night and didn't get any buffering issues, but I did notice frequent drop in video quality when watching the live feed. If I backed the video up a bit, the video quality suddenly went back up to 4k.
I had some technical experience with live video streaming over 15 years ago. It was a nightmare back then. I guess live video is still difficult in 2024. But congrats to Jake Paul and boxing fans. It was a great event. And breaking the internet just adds more hype for the next one.
If you're going to run intense algorithm interviews, pay top dollar to hire only senior engineers, build high-intensity, extreme distributed systems, and staff SREs, we had best see insanely good results and a high ROI out of it.
All of the conditions were perfect for Netflix, and it seems that the platform entirely flopped.
Is this what chaos engineering is all about that Netflix was marketing heavily to engineers? Was the livestream supposed to go down as Netflix removed servers randomly?
It seemed to be some capacity issue with the CDNs. When I stopped and restarted the stream it worked again. Perhaps they do not use real-time multi-CDN switching.
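To be concrete about what multi-CDN switching could mean, here is a minimal sketch, assuming a client or steering service that keeps a health flag per CDN and picks among the healthy ones by weight. The provider names and weights are hypothetical, and this says nothing about how Netflix actually routes traffic:

```python
import random
from dataclasses import dataclass


@dataclass
class Cdn:
    name: str            # hypothetical provider label
    weight: float        # share of traffic we would like to send there
    healthy: bool = True


def pick_cdn(cdns, rng=random):
    """Weighted random choice among the CDNs currently marked healthy."""
    candidates = [c for c in cdns if c.healthy]
    if not candidates:
        raise RuntimeError("no healthy CDN available")
    return rng.choices(candidates, weights=[c.weight for c in candidates], k=1)[0]


def report_failure(cdns, name):
    """Mark a CDN unhealthy so later picks avoid it (a real system would also re-probe it)."""
    for c in cdns:
        if c.name == name:
            c.healthy = False
```

Usage would be something like calling `report_failure(cdns, "cdn-a")` after a few stalled segment fetches, then `pick_cdn(cdns)` for the next request, which is roughly what a manual stop-and-restart does by accident.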
It's far from perfect here in Canada, I keep having to pause it or go back and then load it again.
Oddly having watched PPV events via the high seas for years, it feels normal...
Wow I feel scammed. I paid for a Netflix subscription specifically for this but it's not loading so I'm watching on an illegal streaming website
What a massive blow to NFLX. They have been in the streaming game for years (survived COVID-19) and this silly exhibition match is what does them in?
I didn’t watch it live (boxing has lost its allure for me) but vicariously lived through it via social feeds on Bluesky/Mastodon.
Billions of dollars at their disposal and they just can’t get it right. Probably laid off the highly paid engineers and teams that made their shit work.
Honestly you didn't miss much. Every (real) boxing fan thought of this as a disgrace and a shame when it was announced. Putting a 58-year-old Tyson against a crackhead filled with steroids (Jake Paul)? In either case it would have been a shame on Jake Paul for even getting in the ring with such an old boxer.
In boxing you are old by 32, or maybe 35 for heavyweights, and everything goes downhill very, very fast.
End of rant.
Just adding a data point, here in Canada on my nVidia Shield it went down to 360p a dozen times or so, but never paused at all. I guess I got lucky.
IPTV
I see in the comments multiple people talking about how "cable" companies that have migrated to IPTV have solved this problem.
I'd disagree.
I'm on IPTV and any major sporting event (World Series, Super Bowl, etc.) buffers horribly when I try to watch on my 4K IPTV (streaming) channel. I always have to downgrade to the HD channel and I still occasionally experience buffering.
So Netflix isn't alone in this matter.
Technical issues happen, but I wish they would've put up a YouTube stream or something (or at least asked YouTube to stop taking down the indie streams that were popping up). It seems like basically their duty to the boxers and the fans to do everything in their power to let the match be seen live, even if it means eating crow and using another platform.
Was this their first time doing live content? I figured something would go wrong. I'm sure lots of people were watching.
I’m not sure buffering was the biggest issue with this event. How was a 58-year-old Tyson fighting a man in his 20s?
On X.com someone had a stream that was stable to at least 5 million simultaneous viewers, but then (as I expected) someone at Netflix got them to pull the plug on it. So I would expect this fight to have, say, 50 million+ watching? Maybe as many as 150-250 million worldwide, given this is Tyson's last fight.
We all know Netflix was built for static content, but it's still hilarious that they have thousands of engineers making 500K-1M in total comp and they couldn't live stream a basic broadcast. You probably could have just run this on AWS with a CDK configuration and a quota increase from Amazon.
Streaming live can be a very different thing than on-demand.
Hopefully Netflix can share more about what they learned, I love learning about this stuff.
I had joked I would probably cancel Netflix after the fight.. since I realized other platforms seemed to have more content both old and new.
Then the video started stuttering.
I'm sure the architecture and scale of Netflix's operations are truly impressive, but stories like this make me further appreciate the elegant simplicity and scalability of analogue terrestrial TV, and to a similar extent, digital terrestrial TV and satellite.
Every time it buffers for me, Netflix does an internet test only for it to come back and say it's fast...
All these engineering blog posts, distributed systems and these complex micro-services clearly didn't help with this issue.
Netflix is clearly not designed nor prepared for scalable multi-region live-streaming, no matter the amount of 'senior' engineers they throw at the problem.
I’m very disappointed.
Woke up at 4am (EU here) to tune in for the main event. Bought Netflix just for this. The women's fight went well, no buffering, 4K.
As it approached the time for Paul vs Tyson, it first dropped to 140p and then started buffering constantly. Restarted my Chromecast a few times, tried from my laptop, and finally caught a stream on my mobile phone via the mobile network rather than my wifi.
The TV Netflix kept blaming my internet which kept coming back as “fast”.
Ended up watching the utterly disappointing, senior abuse, live stream on my mobile phone with 360p quality.
Gonna cancel Netflix and never pay for it again, nor watch hyped up boxing matches.
Not enough chaos monkey engineering.
I'm watching on a 'pirate' stream because my Netflix stream is absolutely frozen.
I thought it's only the best of the best of the best working at Netflix ... or maybe we can just put this myth to sleep that Netflix even knows what it's doing. The suggestions are shit, the UX is shit, apparently even the back end sucks.
I ended up turning my TV off and watching from my phone because of the buffering/freezing. The audio would continue to play and the screen would be frozen with a loading percentage that never changed.
I have Spectrum (600 Mbps) for ISP and Verizon for mobile.
Did anyone else see different behaviour with different clients? My TV got stuck at 25% loaded, my laptop loaded but played for a minute or two before getting stuck buffering, and my iPhone played the whole fight fine. All on the same wifi network.
Even people on Hacker News do not understand.
Internet live streaming is harder than cable or satellite TV broadcasting to "dumb" TV boxes. They should not have used the internet for this, honestly. A TV signal can go to millions live.
I would have just made it simple: delay the live stream a few seconds and encode it into the same bucket where users are already playing static movies. Just have the player only allow starting at the time everyone is at.
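A rough sketch of that idea, assuming the encoder writes fixed-length segments as ordinary static objects and the player works out from the clock which one it may fetch. The segment length, delay, and object naming are all hypothetical; this is close in spirit to how segmented live protocols work, not a claim about Netflix's pipeline:

```python
import math

SEGMENT_SECONDS = 4.0   # assumed segment length
DELAY_SECONDS = 12.0    # assumed safety delay behind the camera


def latest_available_segment(now: float, stream_start: float,
                             segment_seconds: float = SEGMENT_SECONDS,
                             delay_seconds: float = DELAY_SECONDS) -> int:
    """Index of the newest segment a player may request, or -1 if none is ready yet."""
    elapsed = now - stream_start - delay_seconds
    if elapsed < segment_seconds:
        return -1
    # A segment only counts once its full duration has been recorded and uploaded.
    return math.floor(elapsed / segment_seconds) - 1


def segment_key(index: int) -> str:
    """Hypothetical object name in the static bucket."""
    return f"live/segment_{index:08d}.ts"
```

The catch is that every viewer asks for the same brand-new object at the same moment, so the caches in front of the bucket have not had time to fill the way they do for a movie.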
TreeDN: Tree-Based CDNs for Mass Audience Live Streaming https://www.youtube.com/watch?v=wRUwsvept-8
This is Netflix's statement on the fiasco
https://x.com/netflix/status/1857906492235723244
From my limited understanding, the NFL heavily depends on the Netflix Open Connect platform to stream media to edge locations, which is different from live streaming. They probably over-pushed the HD content.
I'm watching the event as I'm writing this. I've been needing to exit the player and resume it constantly. Pretty surprising that Netflix hasn't weeded out these bugs.
All your corporate culture, comp structure, interview process, etc. are all so much meta if you can’t deliver. They showed they can’t deliver. Huge letdown.
This livestream broke the internet, no joke. YouTube was barely loading and a bunch of other sites too. 130M is a conservative number given all the pirate streams.
Illegal streams are working but Netflix is not. That is crazy.
I did some VPN hopping and connecting to an endpoint in Dallas has allowed me to start watching again. Not live though, that throws me back into buffering hell.
They're not used to live. I imagine that's it. All their caching infrastructure is there assuming the content isn't currently being generated.
Serves Netflix right for killing my beloved DVD rentals.
Guess they should have livestreamed it on X to be safe!
I’m a little amused at folks tuning in for meme / low quality personalities doing things … and getting the equivalent production values.
It’s like watching a Minecraft cosplay of the event.
It's been fine since 11:00 EST, I wonder if they started using the CDN more effectively and pushed everyone back a few minutes?
Currently trying to watch it and it's not loading at all for me. Re-subscribed specifically for the fight.
I watched the whole fight with a 2 minute delay. That was frustrating and it didn't help that Tyson lost.
It's not lagging for me. It crashed and isn't coming back.
Update: Switched to the app on my phone and so far so good.
Maybe if jedberg and Brendan Gregg were still a part of Netflix, this wouldn’t have happened.
Nucleix needs to focus on fixing middle-out compression instead of kicking cameras.
Amazon prime streams the Thursday night NFL game and they seem to have no problem.
Sounds like a scene from: Silicon Valley - Nucleus fails
https://www.youtube.com/watch?v=9IGvzb-KCpY
Why has no one mentioned the term “vaporware”? Isn’t this a classic example of one?
I hope they do a postmortem
I thought Hooli was Google, but maybe it was Netflix after all.
Works in Australia. Maybe their CDN is under a lot of stress?
So much for Netflix engineering talent aura
Over promised and under delivered. That’s a bad look
This is why we need IPv6. If IPv6 was fully rolled out this livestream could have been an efficient multicast stream like what happens with IPTV.
This reminds me of that scene in Silicon Valley
Shoulda used middle out compression.
Chaos testing, nothing to see here.
Sounds like a job for Pied Piper
They didn’t miss anything.
OTA broadcasts are clearer.
Silicon Valley predicted this: https://youtu.be/ddTbNKWw7Zs
Working okay for me
Streaming is hard.
Everyone pointing out that their illegal streams, X streams, etc. work fine are kind of missing the point.
These secondary streams might be serving a couple thousand users at best.
Initial estimates are in the hundreds of millions for Netflix. Kind of a couple of orders of magnitude difference there.
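Taking those figures at face value, the gap is closer to four or five orders of magnitude than a couple. A quick check, where both numbers are assumptions read off the estimates above rather than measured audience figures:

```python
import math

pirate_viewers = 2_000          # "a couple thousand", as stated above
netflix_viewers = 100_000_000   # low end of "hundreds of millions"

ratio = netflix_viewers / pirate_viewers
orders_of_magnitude = math.log10(ratio)
print(f"ratio: {ratio:,.0f}x, about {orders_of_magnitude:.1f} orders of magnitude")
# prints: ratio: 50,000x, about 4.7 orders of magnitude
```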
They have absolutely shit the bed here, and of course their socials are completely ignoring it.
People still pay real-world money to Netflix, after how and why they cancelled Warrior Nun, just to see grandpa being beaten up.
I guess in the year when Trump is being reelected this is hardly a surprise.
Off topic, but
I thought Tyson was in eldercare.
I can't see the fight right now.
Is this potentially an aws issue?
Looks like shit for me. Buffered a bit as well.
Was this the plot of a Silicon Valley episode?
Yeah, I'm using IPTV, which is just a rip of NF, and it's stuck buffering.
I blame RTO and AI
I'm an engineering manager at a Fortune 500 company. The dumbest engineer on our team left for Netflix. He got a pay raise too.
Our engineers are fucking morons. And this guy was the dumbest of the bunch. If you think Netflix hires top tier talent, you don't know Netflix.
Why didn’t they use Netflix AI to solve the problems?
Dupe: https://news.ycombinator.com/item?id=42153906
How is this not a solved problem by now?
I think this is a result of most software "engineering" having become a self-licking ice cream cone. Besides mere scaling, the techniques and infrastructure should be mostly squared away.
Yes, it's all complicated, but I don't think we should excuse ourselves when we objectively fail at what we do. I'm not saying that Netflix developers are bad people, but that it doesn't matter how hard of a job it is; it was their job and what they did was inadequate to say the least.
Jonathan Blow is right.