We've loved working with the incredible team at Cash App. If anyone has questions, I or someone else on the PlanetScale team will answer.
The article says they used a forked version of Vitess with customizations. What were they and how did you address that when migrating?
Answering on behalf of PlanetScale (I'm the CTO!). I don't remember exactly what was different from upstream, but it wasn't a whole lot of changes.
Fortunately, PlanetScale runs a well-maintained fork ourselves, so we're very used to taking custom changes and getting them deployed. In this case, we asked the Block team for all of their changes and went through them one by one to see what was needed to pull into our fork.
By the time we did the migration, we had made sure behavior wouldn't differ anywhere it mattered.
Mostly the diffs were related to running against the on-prem MySQL instances smoothly: stuff like changes to split tooling or how you boot up the pieces. We have had unique vindexes or query planning changes in the past, but we either deprecated or upstreamed them prior to the migration.
It would be interesting to know why a Bitcoin application requires 400 TB of disk space.
It's peer-to-peer payments and banking, which have been around for much longer than the stocks/bitcoin aspect of the app.
The article says,
> At peak times, Cash App's database handles approximately 3–4 million queries per second (QPS) across 400 shards, totaling around 400TiB of data.
400TiB isn't a lot of data at that query rate. Even if each query stored only 1 byte, it would take roughly 4 years to accumulate that much.
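A quick sanity check on that figure (a rough sketch; it assumes the article's ~3.5M QPS midpoint and, purely for illustration, 1 byte persisted per query):

    # Back-of-envelope: time to accumulate 400 TiB at 1 byte per query.
    # The 1 byte/query figure is an illustrative assumption, not a real workload.
    qps = 3.5e6                              # midpoint of the article's 3-4M QPS
    bytes_per_year = qps * 365 * 24 * 3600   # ~1.1e14 bytes/year
    target_bytes = 400 * 2**40               # 400 TiB
    print(f"~{target_bytes / bytes_per_year:.1f} years")   # ~4.0 years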
If duplicated, or processed and the results stored, that would add up, too.
Why would a query store data? Are they logging individual queries?
It's not much data.
With current 22TB magnetic disks that's about 20 drives, which would fit into a single machine (4U, likely).
The Storinator XL60 from 45drives (https://www.45drives.com/products/storinator-xl60-configurat...) can hold 60 disks, for (advertised) ~1.4PB of data.
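Raw-capacity sketch behind those two numbers (single copy of the data, ignoring redundancy and filesystem overhead):

    import math

    # Raw capacity only - one copy of the dataset, no RAID/replication.
    dataset_tb = 400 * 2**40 / 1e12         # 400 TiB is roughly 440 TB
    hdd_tb = 22
    print(math.ceil(dataset_tb / hdd_tb))   # 20 x 22TB drives for one copy
    print(60 * hdd_tb / 1000)               # 1.32 PB raw in a 60-bay chassis
                                            # (the advertised ~1.4PB presumably
                                            # assumes larger drives)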
(btw i've learned about 45drives through Linus Tech Tips channel so I think it's obligatory to say LTTSTORE.COM - for the meme)
Just to address the core of your comment: 20 magnetic disks would combine for roughly 2,000 IOPS of capacity, provide no redundancy, and leave only one machine to process the entirety of the queries coming in to power the application.
Even a full 60 disk server filled with magnetic disks would provide less I/O capacity for running a relational database than a single EBS volume.
It might not look like a lot of data if you're talking about storing media files, but it's quite a bit of relational data to be queried in single-digit milliseconds at scale.
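Rough numbers behind that comparison (the per-drive figure is an assumed ballpark for 7,200 RPM drives; 16,000 is the current max provisioned IOPS on a single gp3 EBS volume):

    # Ballpark random-I/O comparison; per-drive IOPS is an assumption, not a benchmark.
    HDD_RANDOM_IOPS = 100          # typical for a 7,200 RPM nearline drive
    EBS_GP3_MAX_IOPS = 16_000      # max provisioned IOPS on one gp3 volume
                                   # (io2 Block Express goes up to 256,000)

    for drives in (20, 60):
        total = drives * HDD_RANDOM_IOPS
        print(f"{drives} HDDs: ~{total:,} IOPS, "
              f"{total / EBS_GP3_MAX_IOPS:.0%} of a single gp3 volume")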
I assumed people did not need to be explicitly reminded that you have to provision additional capacity for redundancy, and that you can use different layers of caching (SSD caches, RAM caches, etc.).
And by the way, it was just posted today that you can get 60TB PCIe Gen5 SSDs from Micron: https://news.ycombinator.com/item?id=42122434 - you could still fit that whole dataset in a single machine and provide all the IOPS you need. You'd need just seven or eight of those.
So yeah, 400TB of data is not much data nowadays.
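Drive-count sketch for that option (assuming 61.44 TB nominal capacity for the "60TB-class" part, and a single raw copy with no redundancy):

    import math

    # Single raw copy, no redundancy; 61.44 TB per drive is an assumption
    # about the linked "60TB" part's exact capacity.
    dataset_tb = 400 * 2**40 / 1e12          # ~440 TB
    print(math.ceil(dataset_tb / 61.44))     # 8 drives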
45drives is... bad. I cannot understand why people would use them, particularly for a homelab. Three SC846 24x3.5" 4U chassis can be had for no more than $400/ea, and there are 36-bay versions.
60 drives in 6U is a crap ton of weight.
> Three SC846 24x3.5" 4U chassis can be had for no more than $400/ea, and there are 36-bay versions.
Because then you need three times the rack space. What you don't spend up front on the hardware, you'll spend every month on rack space, connectivity, cooling, power, etc.
You might use that budget for redundancy, for example.
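The break-even is easy to parameterize; the numbers below are entirely hypothetical (real colo pricing, chassis prices, and power draw vary a lot), just to show the shape of the tradeoff:

    # All prices are hypothetical placeholders, purely to illustrate the tradeoff.
    # Power, cooling and weight are ignored to keep it simple.
    used_boxes_cost = 3 * 400           # three used SC846-class 4U chassis
    dense_box_cost = 8_000              # hypothetical price for one 60-bay 6U box
    extra_rack_units = 3 * 4 - 6        # 12U of used boxes vs one 6U box
    cost_per_u_per_month = 25           # hypothetical colo price per U per month

    extra_monthly = extra_rack_units * cost_per_u_per_month
    months = (dense_box_cost - used_boxes_cost) / extra_monthly
    print(f"break-even after ~{months:.0f} months")   # ~45 months with these numbers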
Hard to evaluate without knowing the cost.