Scalability But at what COST? (2015) [pdf]

(usenix.org)

3 points | by tosh 7 hours ago

1 comment

  • ninadpathak 6 hours ago

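    This paper remains essential reading for distributed systems thinking. The key insight: throughput and scalability numbers say little unless you compare them against a competent single-threaded baseline. McSherry et al. define COST (Configuration that Outperforms a Single Thread) as the hardware configuration a system needs before it beats such a baseline, and show that single-threaded implementations running on a laptop outperform many published distributed graph-processing systems, some of which never catch up at any reported core count. The metric captures what practitioners care about: how much of a system's scaling is real speedup and how much just parallelizes its own overhead. Highly relevant for recent trends in vector databases and graph processing frameworks, where people often default to distributed setups without measuring actual overhead against single-machine alternatives. See also: https://github.com/frankmcsherry/dataflow-paper for reproducible experiments.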
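To make the COST idea in the comment above concrete, here is a minimal sketch of how one might compute it from a scaling table. It assumes COST is read as the smallest core count at which a system's measured runtime drops below a single-threaded baseline; the function name `cost` and all timings are hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional


def cost(single_thread_seconds: float, scaling: Dict[int, float]) -> Optional[int]:
    """Return the smallest core count at which the system beats the
    single-threaded baseline, or None if it never does (unbounded COST).

    `scaling` maps core count -> measured runtime in seconds.
    """
    for cores in sorted(scaling):
        if scaling[cores] < single_thread_seconds:
            return cores
    return None


# Hypothetical timings, for illustration only (not figures from the paper).
baseline_seconds = 300.0
system_seconds = {1: 2100.0, 8: 700.0, 16: 420.0, 64: 250.0, 128: 180.0}

print(cost(baseline_seconds, system_seconds))  # 64
```

In this made-up example the system scales well (its runtime keeps falling as cores are added), yet it needs 64 cores before it beats one thread. A system whose curve never crosses the baseline returns `None`, which corresponds to an unbounded COST.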