Something seemed off about this list, since it suggests that the “rocket science” in this field is MapReduce from Google. It turns out the list is from 2014 [1]. Reader beware: things have changed since then.
[1] https://news.ycombinator.com/from?site=dancres.github.io
This definitely puts it into context — it has been ten years since this list came out, and at that time, it had been ten years since the MapReduce paper came out.
This list looks a bit dated. I'd recommend Heidi Howard's Distributed Consensus Reading List: https://github.com/heidihoward/distributed-consensus-reading...
See also https://ferd.ca/a-distributed-systems-reading-list.html, which mentions the OP list.
I think it's important to clarify who these lists are really for. They're not meant for people simply looking to "learn distributed systems," in my opinion. These might help those pushing the envelope or looking for new approaches.
For the rest of us, imagine asking how to solve quadratic equations and getting 100 papers on category theory.
> See also https://ferd.ca/a-distributed-systems-reading-list.html, which mentions the OP list.
Fred Hebert’s list, more up-to-date than OP’s, isn't even complete according to him. He mentions "Designing Data-Intensive Applications" as essential but suggests something like "you need to read a bunch of papers first to _really_ get it" (my paraphrase :-). If that’s not gatekeeping, what is?
Thanks to decades of others' work, we don’t need to read 100 nanokernel papers to be effective Linux users. And while building a good production-ready OS from scratch is still hard, 99% of us don’t need to; we just need to get proficient with the tools that already exist. The same goes for distributed systems: it does not have to be that hard unless you're trying to push the envelope.
In the spirit of learning by doing, here are some better ways (imho) of "learning distributed systems" for the working SWE:
* Build something with NATS [1] or YugaByte [2].
* Try a hands-on tutorial like [3].
* Some books get better with each re-read, and Designing Data-Intensive Applications is one of them. So go ahead and read it, even if you haven't read 100 papers yet. When you hit sections you don’t understand, ask questions and get help. Feel free to skip the giant distributed systems reading lists :-).
--
[1] https://nats.io/
[2] https://www.yugabyte.com/
[3] https://pragprog.com/titles/tjgo/distributed-services-with-g...
great stuff!!
And yet no mention of CRDT technology?