One thing this doesn't take into account (and the paper acknowledges this) is that the characters are assigned by picking cards from a deck. So the two players cannot have the same character.
Taking this into account would make the game much more complicated, because it can introduce an element of bluff.
For a simple example, imagine that there are only 5 characters. On your first turn you know the opponent doesn't have the same card as you, so you've got 4 options remaining. You'd like to ask a question that splits them into 2+2, but if you do this then the card you're holding will make one of the groups into a 3. Your opponent will know that your card is one of the 3, so you've effectively given them a head start. Instead you might sometimes want to split the options 3+2 with your card in the 2, as a bluff.
How often you want to do this must be described by some Nash equilibrium probabilities. It would be interesting to set up a linear programming solver to find these exactly, but so far I haven't had time to set this up. I don't know if it would be practical to solve the full version of the game with 24 characters.
The idea is that if you're winning you can just do a binary search, but if you're losing, it's better to take some risks by making narrower guesses.
For example, let's say it's the last turn and your opponent is about to win. Say you may have 2 options but your opponent has 4 options. Instead of whittling it down to 2 options, it's better to guess one of the four. How outrageous should your guesses be is the content of the result and paper.
> For example, let's say it's the last turn and your opponent is about to win.
Or lose. Last month I played Guess Who with my Indian wife who hadn’t encountered it before, and in a couple of rounds she made mistakes in eliminating tiles, so that my wild guess saved her from losing to her own incorrect final guess.
I find it somewhat surprising that the optimal play when you're ahead is still just binary search. Is there an intuitive reason why it's not productive to make riskier guesses? Why not use my lead to have some chance of sealing my victory immediately, while still maintaining my lead if I'm wrong?
An entropy argument for optimal strategy when winning? Finite size/bounds arguments for losing?
If you have 100 options available to you, the maximum information gain is if you eliminate half. So, if you can, you always want to employ that strategy.
The problem comes with when you're losing, you might get maximum entropy gain by eliminating half but, because of finite effects of the game ending, that might not matter.
Take the example I gave: the next move you lose and you have 4 options to choose from. Eliminating half (2 in this case) will give you maximum entropy gain but guarantee a loss, since you're not whittling down the remaining list to 1 option. Better to take the hit on entropy in order to at least have a chance at winning.
I don't claim to have deep knowledge but this seems like finite size scaling effects. There's a kind of "continuum limit" of these processes but when you get to actual real-world/finite instances, there are issues "at the edges", where the continuum becomes finite. The finite size of the problems creates a kind of non-linear issue at the edges. All this is very hand-waivy, so don't take it too seriously but that's the intuition I have, at least.
You can see an edge effect in bidding-based card games when someone is close to victory.
Say you're in a game to 500 points and you're losing 460 to 480. There are 13 tricks and a trick is worth 10 points if you bid it.
The other team bids 5 tricks. Assuming they can make this (very safe) bid, they will have 530 points. You are collectively good for about 6 tricks. What should you bid?
If you bid reflecting your hand, you'll score 60 points and lose the game 520 to 530. You could go higher; you can take 8 tricks without even needing to set the other team. That would convert your loss into a win. But it's extremely unlikely that you'll be able to make those 8 tricks.
If you're playing duplicates and getting scored based on how good your result was compared to other teams playing the same hand, you should bid 6. If you're playing this as a one-off and getting scored based on whether you win or lose, you should bid 8 despite the fact that you can't make 8.
This becomes a manners issue in some games where your bid is an important input into later players' bids.
> This becomes a manners issue in some games where your bid is an important input into later players' bids.
Yes, I learned bridge playing duplicate where preemptive bids[1] are totally fine, but at some rubber bridge tables you won't get invited back if you have a habit of bidding them.
Binary search minimizes the number of expected moves until you find the target. If you are already ahead, this is a natural thing to want to do. The reason why this doesn't work when you're behind is that your opponent can also do that and probabilistically maintain their lead.
I know that it minimizes the expected number of moves. But, the goal is to maximize the probability that you win in fewer moves than your opponent, not minimize the expected number of moves. Given that your opponent is playing some riskier strategy, it's not intuitively obvious to me that your optimal moves for those two objectives are the same.
If it helps your intuition: Even at 3-4 remaining, you'll still win at the next turn. Above that your chances of getting it right are too low compared to the reduction (assuming there is an option eliminating enough).
This could be made more complicated/interesting if you play a series of games and are awarded points based on either how many rounds it took to win or how many remaining cards you still had.
In addition to other answers, one way to think about it is that the options are symmetric around the midpoint: a guess that partitions the space into (1/4 of options, 3/4 of options) is the same as one that does (3/4, 1/4). So (1/2, 1/2) is special in some way – it has to be either a local minimum or local maximum. And if the function is convex (or close enough), then (1/2, 1/2) is a global minimum/maximum.
But (1/2, 1/2) is clearly a better choice than just guessing a specific individual. So it must be the best choice.
If you're behind and you're doing the same strategy as your opponent, you'll never catch up. If you're behind doing the risky bet strategy, most times you will never catch up either because your risky bets don't pay off, but a few times they will pay off.
This is largely how all complex competitive games work. At some point there is a shared valuation of which player is ahead and the behind player must take steps that are outside of optimal play to attempt to leapfrog ahead. The GWENT card game was particularly well designed for this. ie How many extra cards to play/sacrifice for round control if you drew badly, based on meta-matchups.
I have always asserted that some games (like Heroes of the Storm) suffer from not having catch up mechanics beyond player skill. This is problematic, when player skill can be quantized to an average value that has led to the losing state. This makes it much less likely to ever be a useful catchup mechanic, in comparison to some intrinsic gamble mechanics.
The lack of catch up mechanics also means the games are less interesting because risks are only worth taking after the known state, not casually during as a chaotic factor that might be capitalized on.
Sure, I think it makes intuitive sense to me that you should play riskier when you're behind. The surprising part to me is that when you're ahead, even if you know that your opponent will play "sub-optimally", that doesn't change your own optimal move.
In the video, in the continuous version the game never end and highlight the "loser" strategy.
When you are behind the optimal play is to make a gamble, which most likely will make you even worse. From the naive winning side it seems the loser is just doing a stupid strategy of not following the optimal dichotomy strategy, and therefore that's why they are losing. But in fact they are a "player" doing not only their best, but the best that can be done.
The infinite sum of ever smaller probabilities like in Zeno's paradox, converge towards a finite value. The inevitable is a large fraction of the time, you are playing catch-up and will never escape.
You are losing, playing optimally, but slowly realising the probabilities that you are a loser as evidence by the score which will most likely go down even more next round. Most likely the entire future is an endless sequence of more and more desperate looking losing bets, just hoping to strike it once that will most likely never comes.
In economics such things are called "traps", for example the poverty trap exhibits similar mechanics. Where even though you display incredible ingenuity by playing the optimal game strategy, most of the time you will never escape, and you will need to take even more desperate measures in the future. That's separating the wheat from the chaff from the chaff's perspective or how you make good villains : because like Bane in Batman there are some times (the probability is slim but finite) where the gamble pays off once and you escape the hell hole you were born in and become legend.
If you don't play this optimal strategy you will lose slower but even more surely. The optimal strategy is to bet just enough to go from your current situation to the winning side. It's also important not to overshoot : this is not always taking moonshots, but betting just enough to escape the hole, because once out, the probabilities plays in your favor.
One thing this doesn't take into account (and the paper acknowledges this) is that the characters are assigned by picking cards from a deck. So the two players cannot have the same character.
Taking this into account would make the game much more complicated, because it can introduce an element of bluff.
For a simple example, imagine that there are only 5 characters. On your first turn you know the opponent doesn't have the same card as you, so you've got 4 options remaining. You'd like to ask a question that splits them into 2+2, but if you do this then the card you're holding will make one of the groups into a 3. Your opponent will know that your card is one of the 3, so you've effectively given them a head start. Instead you might sometimes want to split the options 3+2 with your card in the 2, as a bluff.
How often you want to do this must be described by some Nash equilibrium probabilities. It would be interesting to set up a linear programming solver to find these exactly, but so far I haven't had time to set this up. I don't know if it would be practical to solve the full version of the game with 24 characters.
The idea is that if you're winning you can just do a binary search, but if you're losing, it's better to take some risks by making narrower guesses.
For example, let's say it's the last turn and your opponent is about to win. Say you may have 2 options but your opponent has 4 options. Instead of whittling it down to 2 options, it's better to guess one of the four. How outrageous should your guesses be is the content of the result and paper.
Paper is on archive (and linked from the video):
https://arxiv.org/abs/1509.03327
> For example, let's say it's the last turn and your opponent is about to win.
Or lose. Last month I played Guess Who with my Indian wife who hadn’t encountered it before, and in a couple of rounds she made mistakes in eliminating tiles, so that my wild guess saved her from losing to her own incorrect final guess.
I find it somewhat surprising that the optimal play when you're ahead is still just binary search. Is there an intuitive reason why it's not productive to make riskier guesses? Why not use my lead to have some chance of sealing my victory immediately, while still maintaining my lead if I'm wrong?
An entropy argument for optimal strategy when winning? Finite size/bounds arguments for losing?
If you have 100 options available to you, the maximum information gain is if you eliminate half. So, if you can, you always want to employ that strategy.
The problem comes with when you're losing, you might get maximum entropy gain by eliminating half but, because of finite effects of the game ending, that might not matter.
Take the example I gave: the next move you lose and you have 4 options to choose from. Eliminating half (2 in this case) will give you maximum entropy gain but guarantee a loss, since you're not whittling down the remaining list to 1 option. Better to take the hit on entropy in order to at least have a chance at winning.
I don't claim to have deep knowledge but this seems like finite size scaling effects. There's a kind of "continuum limit" of these processes but when you get to actual real-world/finite instances, there are issues "at the edges", where the continuum becomes finite. The finite size of the problems creates a kind of non-linear issue at the edges. All this is very hand-waivy, so don't take it too seriously but that's the intuition I have, at least.
You can see an edge effect in bidding-based card games when someone is close to victory.
Say you're in a game to 500 points and you're losing 460 to 480. There are 13 tricks and a trick is worth 10 points if you bid it.
The other team bids 5 tricks. Assuming they can make this (very safe) bid, they will have 530 points. You are collectively good for about 6 tricks. What should you bid?
If you bid reflecting your hand, you'll score 60 points and lose the game 520 to 530. You could go higher; you can take 8 tricks without even needing to set the other team. That would convert your loss into a win. But it's extremely unlikely that you'll be able to make those 8 tricks.
If you're playing duplicates and getting scored based on how good your result was compared to other teams playing the same hand, you should bid 6. If you're playing this as a one-off and getting scored based on whether you win or lose, you should bid 8 despite the fact that you can't make 8.
This becomes a manners issue in some games where your bid is an important input into later players' bids.
> This becomes a manners issue in some games where your bid is an important input into later players' bids.
Yes, I learned bridge playing duplicate where preemptive bids[1] are totally fine, but at some rubber bridge tables you won't get invited back if you have a habit of bidding them.
1: https://en.wikipedia.org/wiki/Preempt
Binary search minimizes the number of expected moves until you find the target. If you are already ahead, this is a natural thing to want to do. The reason why this doesn't work when you're behind is that your opponent can also do that and probabilistically maintain their lead.
I know that it minimizes the expected number of moves. But, the goal is to maximize the probability that you win in fewer moves than your opponent, not minimize the expected number of moves. Given that your opponent is playing some riskier strategy, it's not intuitively obvious to me that your optimal moves for those two objectives are the same.
If it helps your intuition: Even at 3-4 remaining, you'll still win at the next turn. Above that your chances of getting it right are too low compared to the reduction (assuming there is an option eliminating enough).
This could be made more complicated/interesting if you play a series of games and are awarded points based on either how many rounds it took to win or how many remaining cards you still had.
In addition to other answers, one way to think about it is that the options are symmetric around the midpoint: a guess that partitions the space into (1/4 of options, 3/4 of options) is the same as one that does (3/4, 1/4). So (1/2, 1/2) is special in some way – it has to be either a local minimum or local maximum. And if the function is convex (or close enough), then (1/2, 1/2) is a global minimum/maximum.
But (1/2, 1/2) is clearly a better choice than just guessing a specific individual. So it must be the best choice.
If you're behind and you're doing the same strategy as your opponent, you'll never catch up. If you're behind doing the risky bet strategy, most times you will never catch up either because your risky bets don't pay off, but a few times they will pay off.
This is largely how all complex competitive games work. At some point there is a shared valuation of which player is ahead and the behind player must take steps that are outside of optimal play to attempt to leapfrog ahead. The GWENT card game was particularly well designed for this. ie How many extra cards to play/sacrifice for round control if you drew badly, based on meta-matchups.
I have always asserted that some games (like Heroes of the Storm) suffer from not having catch up mechanics beyond player skill. This is problematic, when player skill can be quantized to an average value that has led to the losing state. This makes it much less likely to ever be a useful catchup mechanic, in comparison to some intrinsic gamble mechanics.
The lack of catch up mechanics also means the games are less interesting because risks are only worth taking after the known state, not casually during as a chaotic factor that might be capitalized on.
Sure, I think it makes intuitive sense to me that you should play riskier when you're behind. The surprising part to me is that when you're ahead, even if you know that your opponent will play "sub-optimally", that doesn't change your own optimal move.
Hockey fans will recognize this strategy as “pulling the goalie.”
> Why not use my lead to have some chance of sealing my victory immediately, while still maintaining my lead if I'm wrong?
There's no advantage to winning immediately. You aren't scored on time taken.
So, it's better to use the better strategy.
In the video, in the continuous version the game never end and highlight the "loser" strategy.
When you are behind the optimal play is to make a gamble, which most likely will make you even worse. From the naive winning side it seems the loser is just doing a stupid strategy of not following the optimal dichotomy strategy, and therefore that's why they are losing. But in fact they are a "player" doing not only their best, but the best that can be done.
The infinite sum of ever smaller probabilities like in Zeno's paradox, converge towards a finite value. The inevitable is a large fraction of the time, you are playing catch-up and will never escape.
You are losing, playing optimally, but slowly realising the probabilities that you are a loser as evidence by the score which will most likely go down even more next round. Most likely the entire future is an endless sequence of more and more desperate looking losing bets, just hoping to strike it once that will most likely never comes.
In economics such things are called "traps", for example the poverty trap exhibits similar mechanics. Where even though you display incredible ingenuity by playing the optimal game strategy, most of the time you will never escape, and you will need to take even more desperate measures in the future. That's separating the wheat from the chaff from the chaff's perspective or how you make good villains : because like Bane in Batman there are some times (the probability is slim but finite) where the gamble pays off once and you escape the hell hole you were born in and become legend.
If you don't play this optimal strategy you will lose slower but even more surely. The optimal strategy is to bet just enough to go from your current situation to the winning side. It's also important not to overshoot : this is not always taking moonshots, but betting just enough to escape the hole, because once out, the probabilities plays in your favor.