I don't get how it works.
> Encoding: Files are chunked, encoded with fountain codes, and embedded into video frames
Wouldn't YouTube just compress/re-encode your video and ruin your data (assuming you want bit-by-bit accurate recovery)?
If you have some redundancy to counter this, wouldn't it be super inefficient?
(Admittedly, I've never heard of "fountain codes", which is probably crucial to understanding how it works.)
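For anyone else who hadn't heard of them: a fountain code splits a file into N chunks and then emits an effectively unlimited stream of encoded symbols, each an XOR of a random subset of chunks; the decoder can rebuild the file from roughly any N-plus-a-few symbols, in any order, which is presumably what makes lost or mangled frames survivable. A toy Python sketch (uniform degree draw instead of a proper soliton distribution; illustrative only, not the project's actual scheme):

    import random

    def lt_encode_symbol(chunks, seed):
        # XOR a pseudo-random subset of chunks; transmitting the seed
        # tells the decoder which chunks were mixed into this symbol.
        rng = random.Random(seed)
        degree = rng.randint(1, len(chunks))
        picked = rng.sample(range(len(chunks)), degree)
        symbol = bytes(len(chunks[0]))
        for i in picked:
            symbol = bytes(a ^ b for a, b in zip(symbol, chunks[i]))
        return seed, symbol

    # Chunk the payload, then emit more symbols than there are chunks;
    # any sufficiently large subset of symbols lets the decoder rebuild
    # the file, regardless of which ones were lost along the way.
    data = b"example payload" * 100
    size = 64
    chunks = [data[i:i + size].ljust(size, b"\x00")
              for i in range(0, len(data), size)]
    symbols = [lt_encode_symbol(chunks, seed) for seed in range(40)]

The practical upshot: the uploader doesn't care which particular frames survive, only that enough symbols do.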
Yes, it is inefficient. But YouTube pays for the storage ;-). (There is probably a limit on free accounts, and it is probably not allowed by the ToS.)
Right, you just pay daily in worrying about when, not if, YouTube will terminate your account and delete your "videos".
I think it's just meant to be a fun experiment, not your next enterprise backup site
Steganographic backup with crappy AI-transmogrified reaction videos. Free backup for openclaw agents so they can take over the internet lol
I once asked one of the original YouTube infra engineers, “Will you ever need to delete the long tail of videos no one watches?”
They said it didn’t matter, because the sheer volume of new data flowing in was growing so fast that the old data was just a drop in the bucket
Videos do disappear, though. https://www.reddit.com/r/DataHoarder/comments/1ioz4x1/is_it_...
Searching hn.algolia.com will yield numerous examples.
https://news.ycombinator.com/item?id=23758547
https://bsky.app/profile/sinevibes.bsky.social/post/3lhazuyn...
Of course videos disappear for copyright, ToS violations, or when the uploaders remove them. They do not disappear just because nobody watched them.
I wonder if that still holds true? The volume of videos is increasing exponentially, especially with AI slop, so I wonder if at some point they will have to limit storage per user, with a paid tier once you surpass that limit. Many people who upload lots of videos presumably make some form of income off YouTube, so it wouldn't be that big of a deal.
What they said only holds true because the growth continues: the old volume of videos doesn't matter much, since there are so many more new ones each year than the year before. So the question is whether it will hold true in the long term, not whether it holds today.
I assume it's an economics issue. As long as they make more money off the uploads than the storage costs them, it works out for them.
Do they make a profit nowadays?
I wonder if anyone has ever compiled a list of channels with abnormally large numbers of videos? For example, this guy has over 14,000:
https://www.youtube.com/@lylehsaxon
There is a channel with 2 million videos: https://www.youtube.com/@RoelVandePaar/videos One with 4 million videos: https://www.youtube.com/@NameLook
NameLook gives a whole new meaning to "low effort videos"
The first one has transcribed Stack Overflow to YT, by the look of it
This is really cool, but it also feels like a potential burden on the commons.
That great commons that are the multi-trillion-dollar corporations that could buy multiple countries? They sure worry about the commons when launching another datacenter to optimize ads.
> That great commons that are the multi-trillion-dollar corporations that could buy multiple countries?
Exactly which countries could they buy?
Let me guess: you haven't actually asked Gemini
Have you? Assuming Google wouldn't want to put all their chips on that one number and invest all available capital in the purchase of a nation, and assuming that nation were open to being purchased in the first place (a big assumption; see Greenland), Google is absolutely still in a position to buy multiple smaller countries, or one larger one.
Greenland already has a wealthy benefactor; I'd be surprised if poorer countries weren't interested.
https://en.wikipedia.org/wiki/Hyperbole
You don’t have to go ballistic!
The USA.
You are right, but YouTube is also a massive repository of human cultural expression, whose true value is much more than the economic value it brings to Google.
Yes, but it's a classic story of what actually happened to the commons - they were fenced and sold to land "owners."
Honestly, if you aren't taking full advantage of workarounds like this within the constraints of the law, you're basically losing money. Like not spending your entire per diem budget when on a business trip.
Also, how to get your Google account banned for abuse.
Just make sure you have a bot network storing the information across multiple accounts, with enough parity bits (e.g. PAR2) to recover broken vids or removed accounts.
par2 is very limited.
It only supports 32k parts in total (which in practice means 16k source parts and 16k parity parts).
Let's take 100GB of data (relatively large, but within the realm of what someone might want to protect); that means each part will be ~6MB in size. Now you're thinking that since you also created 100GB of parity data (6MB*16384 parity parts), you're well protected. You're wrong.
Now let's say there are 20000 random single-byte errors over that 100GB. Not a lot of errors, but guess what: par will not be able to protect you (assuming those 20000 errors are spread over more than 16384 of the source blocks it precalculated). So at the simplest level, ~20KB of errors can be unrecoverable.
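A quick back-of-the-envelope on those numbers (a sketch using the figures from this comment, not output from a real par2 run):

    total_bytes   = 100 * 1024**3   # 100GB of source data
    source_blocks = 16384           # par2's practical cap on source blocks
    parity_blocks = 16384           # i.e. 100% parity overhead

    print(total_bytes / source_blocks / 1024**2)  # ~6.25 (MiB per block)

    # par2 repairs whole blocks, so a single flipped byte dirties its
    # entire ~6MB block; 20000 errors landing in distinct blocks would
    # need 20000 parity blocks, but only 16384 exist.
    errors = 20000
    print(errors <= parity_blocks)  # False -> unrecoverable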
par2 was created for usenet back when a) the binaries being posted weren't so large, b) the article parts being posted weren't so large, and c) the error model it was protecting against was whole articles not coming through (or, equivalently, arriving with errors). In the olden days of usenet binary posting you would see many "part repost requests"; those basically disappeared with the introduction of par (then quickly par2). It fails badly under many other error models.
What other tool do you recommend?
Or... Backblaze B2
Love this project, although I would never personally trust YT as storage, since they can delete your channel/files whenever they want
The explainer video on the page [0] is a pretty nice introduction for people who don't really know what video compression is about.
[0] https://www.youtube.com/watch?v=l03Os5uwWmk
Wot no steganography? Come on, pretty please with an invisible cherry on top! :-) Here's something to get you started: https://link.springer.com/article/10.1007/s11042-023-14844-w
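For the curious, the textbook starting point is LSB embedding: hide each payload bit in the least significant bit of a cover byte. A toy Python sketch (hypothetical helpers; plain LSB like this would not survive YouTube's lossy re-encode, which is what the paper above is about):

    def embed(pixels, payload):
        # Overwrite the low bit of each cover byte with one payload bit.
        bits = [(b >> i) & 1 for b in payload for i in range(8)]
        out = bytearray(pixels)
        for i, bit in enumerate(bits):
            out[i] = (out[i] & 0xFE) | bit
        return bytes(out)

    def extract(pixels, n_bytes):
        # Read the low bits back out and reassemble them into bytes.
        bits = [p & 1 for p in pixels[:n_bytes * 8]]
        return bytes(sum(bit << i for i, bit in enumerate(bits[j:j + 8]))
                     for j in range(0, len(bits), 8))

    cover = bytes(range(256)) * 10  # stand-in for raw frame data
    assert extract(embed(cover, b"hi"), 2) == b"hi"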
Has anyone got an example of what such a video looks like? Really curious. Reminds me of the Soviet ArVid card that could store 2 GB on an E-180 VHS tape.
https://en.wikipedia.org/wiki/ArVid
Reminds me of GmailFS: https://en.wikipedia.org/wiki/GMail_Drive. There's a very interesting project explanation video on YouTube.
Other examples of so-called "parasitic storage": https://dpaste.com/DREQLAJ2V.txt
How does it survive YouTube transcoding?
After compression, all the data is lost.
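Tools in this genre usually get around that (an assumption about this project based on similar file-to-video tools, not its confirmed design) by rendering each bit as a large high-contrast block and having the decoder threshold the block's average, so compression can blur pixels without flipping bits; the fountain coding quoted above then presumably mops up the blocks that still decode wrong:

    BLOCK = 8  # pixels per side for one bit

    def decode_bit(block_pixels):
        # Threshold the mean grayscale value of one block: compression
        # noise has to drag a near-0 or near-255 mean past the midpoint
        # before a bit actually flips.
        pixels = list(block_pixels)
        return 1 if sum(pixels) / len(pixels) >= 128 else 0

    # Even fairly heavy compression artifacts rarely move the mean that far:
    noisy_white = [230, 255, 201, 244] * (BLOCK * BLOCK // 4)
    assert decode_bit(noisy_white) == 1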
Something at this link crashes both MobileSafari and iOS Firefox on my device.
The GitHub link? Works fine in Safari on my M4 iPad Pro.