Consistent Hash Ring

(gallery.selfboot.cn)

41 points | by jcartw 3 days ago ago

17 comments

  • pyfon a few seconds ago

    I had a practical use to learn this: design interview prep

  • meling an hour ago
  • rad_gruchalski 4 hours ago
  • __turbobrew__ 5 hours ago

    Do virtual nodes actually help?

    From what I can see in the UI, nodes are placed semi randomly on the ring (probably a hash itself determines the node placement) so don’t you still have the same problem that the hashing of virtual nodes onto the ring can still cause imbalances?

    To me it seems like you should be trying to put new nodes on the ring within the largest gaps on the ring.

    • jauntywundrkind an hour ago

      Vnodes are amazing for so many reasons. The model here is pretty simple, but even still, it means you can rejuggle work without having to re-hash when adding nodes: just have the new node claim some vnodes. That's just the basics.

      In Cassandra's consistent hashing & many others, you can also juggle vnodes around between nodes as you please, which, if you have hotspot vnodes, gives you some chance to add some anti- affinity for the hot vnodes.

      https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/arc...

    • swinglock 4 hours ago

      Add enough random nodes and eventually it will be even, so it helps.

      • __turbobrew__ 4 hours ago

        Why not just make the nodes even from the start? Place new nodes in the largest existing gap between any pair of nodes.

        • lsecondario 4 hours ago

          In a distributed system all clients would need to agree on the largest existing gap under various messaging anomalies. If you have a mechanism to reach that agreement then you can probably use a simpler deterministic load balancing algo. Consistent hashing gives you eventually consistent routing without (synchronous) agreement.

          • __turbobrew__ 44 minutes ago

            You don’t need agreement. A newly added node can try best effort to find the optimal placement based upon the known state of the cluster and then advertise that it is going to a specific place in the hash ring. When clients learn about the new node they will also learn where that node has decided to put itself into the ring. No coordination is needed. There are probably moments that a client learns about the node they also atomically learn about where that nodes is placed.

            There are probably other issues where simultaneously added nodes may try to insert themself into the same position in the ring, you could add some jitter in the placement to compensate for this, but then I guess you are now introducing randomness which is one step closer to just having vnodes again.

        • sfilipov 3 hours ago

          Worth mentioning that virtual nodes also ensure that the order of servers is random. Which helps when a server is removed - the keys that need to be moved will be spread across all other servers. If we were to evenly chop up the hash ring, server B will always be after server A. And when we remove server A, all keys residing on it will need to be moved exclusively to server B.

        • remram 3 hours ago

          Whatever balanced configuration you make, it will become imbalanced if you add or remove a node.

  • eulenteufel 5 hours ago

    Curious, I just learned about hash rings last week when setting up the ironic openstack service [0].

    [0] https://docs.openstack.org/ironic/latest/_modules/ironic/com...

  • packetlost 5 hours ago

    If you're looking for an application of these, the DynamoDB paper is a really great read: https://www.allthingsdistributed.com/files/amazon-dynamo-sos...

  • samwho 5 hours ago

    Love this.

  • jcartw 3 days ago

    Interactive Consistent HashRing Visualization