Cepheus - Unbeatable Poker program

  • 9 replies
    • Calsaj
      Joined: 28.01.2011 Posts: 22
      This team, as well as a few others (off the top of my head, Hyperborean, Slumbot, Sonia, Polaris), have been edging closer to solving limit holdem for some time. It wouldn't surprise me if they had achieved GTO in this format.

      Here is a really interesting video of Hyperborean and Slumbot playing each other, attempting to solve HU NLHE:

      Here is another of Hyperborean employing a min-betting strategy postflop:

      What I find really interesting is how infrequently they fold preflop, as well as their disregard for pot odds postflop: frequent min-betting into 20BB+ pots and massive overbets on the river and preflop. It makes me speculate that there could be a flaw in the programming where the bot is predisposed to reaching the river as often as possible, since river bet-call-fold decisions are far more solvable than those on any other street. Perhaps the bots are looking to get cheaply into spots where they can confidently play optimally (on the river) rather than attempting to discern or approximate the optimal play in all scenarios.

      Unfounded speculation from me, given the amazing minds that program these bots, but it does look suspect. Then again, in another discussion on this topic Ike Haxton did say that he expected a truly GTO playstyle to look like a massive fish to an ABC player or professional, and that you'd likely believe you were exploiting it until you realized, thousands of hands later, how handily it was owning you.
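
To put a number on the pot-odds point above, here is a rough sketch. It assumes a 20BB pot as in the post and ignores rake and any betting on later streets; the function names are made up for illustration. It compares how much equity a caller needs against a min-bet, a pot-sized bet and an overbet.

```python
def required_equity(pot: float, bet: float) -> float:
    """Minimum equity a caller needs to break even when facing a bet.

    The caller risks `bet` to win `pot + bet`, so the break-even share
    of the final pot is bet / (pot + 2 * bet).
    """
    if pot < 0 or bet <= 0:
        raise ValueError("pot must be >= 0 and bet must be > 0")
    return bet / (pot + 2 * bet)


def bluff_breakeven_fold_freq(pot: float, bet: float) -> float:
    """How often a zero-equity bluff must get a fold to break even."""
    if pot < 0 or bet <= 0:
        raise ValueError("pot must be >= 0 and bet must be > 0")
    return bet / (pot + bet)


if __name__ == "__main__":
    # Bet sizes are illustrative, all into a 20BB pot.
    for label, bet in [("min-bet", 1), ("pot-sized bet", 20), ("3x overbet", 60)]:
        print(
            f"{label:>14}: caller needs {required_equity(20, bet):.1%} equity, "
            f"bluff needs {bluff_breakeven_fold_freq(20, bet):.1%} folds"
        )
```

Against a 1BB bet into 20BB the caller needs only 1/22 of the final pot, about 4.5% equity, so on pot odds alone almost any hand can continue. Whether that makes the min-bet a flaw or a feature is exactly the open question in the post.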
    • luitzen
      Joined: 03.04.2009 Posts: 664
      Teaching a computer to play poker is not too difficult and is not new either. As long as it's able to beat the limits it's playing on, it has succeeded. Even though it's illegal, I suspect many high school kids have accomplished this.

      I suspect that the computer mostly learned from playing against other computers, not against humans. Computers play more or less perfectly and predictably; humans don't. Every human plays differently, and I wonder whether it takes hand histories into account.

      The computer probably found a Nash strategy for limit heads-up hold'em, and since limit heads-up hold'em is a fairly simple game, it should not be too complicated to teach it to humans (there's a website where you can find its ranges: http://poker.srv.ualberta.ca ). If you play the same strategy, you'll be as successful as the computer and the only winner is the house. Since humans usually won't play the perfect Nash strategy, the computer might become more successful by playing a different strategy.

      Something else that comes to mind: what game is the computer playing? Are varying percentages of rake taken into account? Is a changing blind structure taken into account? Does the computer take tournament and pay-out structures into account? A game where you can play indefinitely at the same blind level without any form of rake is not really realistic.

      Since heads-up limit hold'em isn't the most popular game, this robot is also of limited use.
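
On the idea above of a computer finding a Nash strategy by playing against other computers, here is a toy sketch. It is not the program's actual algorithm: it is plain regret matching in self-play on rock-paper-scissors, used here only because the game is small enough to read in one go. The lopsided starting regrets and all function names are assumptions for illustration.

```python
# Row player's payoff in rock-paper-scissors; the column player gets the negative.
PAYOFF = [
    [0, -1, 1],   # rock     vs (rock, paper, scissors)
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]
# The same game seen from the column player's side.
COL_PAYOFF = [[-PAYOFF[i][j] for i in range(3)] for j in range(3)]


def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def action_values(payoff, opponent_strategy):
    """Expected payoff of each pure action against a mixed opponent strategy."""
    return [sum(x * q for x, q in zip(row, opponent_strategy)) for row in payoff]


def self_play(iterations=10000):
    """Return both players' average strategies after regret-matching self-play."""
    # Hypothetical lopsided start: the row player begins by favouring rock.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        row = strategy_from_regrets(regrets[0])
        col = strategy_from_regrets(regrets[1])
        updates = [(row, action_values(PAYOFF, col)),
                   (col, action_values(COL_PAYOFF, row))]
        for player, (own, values) in enumerate(updates):
            expected = sum(p * v for p, v in zip(own, values))
            for a in range(3):
                # Regret: how much better action `a` would have done.
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += own[a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


def exploitability(row_strategy):
    """Most a perfect opponent can win per round against this row strategy."""
    return max(action_values(COL_PAYOFF, row_strategy))


if __name__ == "__main__":
    row_avg, col_avg = self_play()
    print("row average strategy:", [round(p, 3) for p in row_avg])
    print("exploitable for at most:", round(exploitability(row_avg), 4))
```

The strategy played on any single iteration keeps swinging around; it is the average over all iterations that settles near the equilibrium mix. Note that neither player ever sees a human hand history, which fits the suspicion in the post.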
    • Boomer2k10
      Joined: 22.09.2010 Posts: 2,551
      When dealing with this bot you have to realise that there's no such thing as "adapting" to it or "adjusting its play based on its opponent".

      It is attempting to play a Nash Equilibrium strategy, and it is now so close to one that there's no real point in continuing. The predecessor to this bot beat the world's best players 8 years ago. This current bot is far superior. In theory Polaris (the 2008 bot) was beatable for over 11BB/100; this one is, as near as makes no difference, unbeatable.

      Unbeatable doesn't mean "Optimal" in terms of extracting the most money from exploitable opponents. It means that there is no way to adapt to, and therefore beat, its play. The best you can do is draw, and it doesn't matter "how you play" against it: you can't win long term.

      The next step for these bots is to build on the NE strategy and work in an exploitative game (Polaris had something similar where they programmed personalities into it by essentially making it see the pot size incorrectly) on top of it. That would be something truly terrifying and it's the next step now to work on.

      Also, Limit Holdem may be "simple" but there are 10^18 decisions which can be made in just HUHU. This is a VERY significant accomplishment and it lays the groundwork for a great deal of future progress.
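
The distinction above between "unbeatable" and "optimal at extracting money" can be shown with a toy sketch, again in rock-paper-scissors rather than poker. The leaky opponent's frequencies are invented for the example, and the function names are hypothetical.

```python
# Row player's payoff in rock-paper-scissors; the column player gets the negative.
PAYOFF = [
    [0, -1, 1],   # rock     vs (rock, paper, scissors)
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def expected_payoff(row_strategy, col_strategy):
    """Row player's average win per round for two mixed strategies."""
    return sum(
        p * q * PAYOFF[i][j]
        for i, p in enumerate(row_strategy)
        for j, q in enumerate(col_strategy)
    )


def worst_case(row_strategy):
    """Row player's result against the single reply that hurts it most."""
    pure_replies = [[1.0 if k == j else 0.0 for k in range(3)] for j in range(3)]
    return min(expected_payoff(row_strategy, reply) for reply in pure_replies)


EQUILIBRIUM = [1 / 3, 1 / 3, 1 / 3]
LEAKY_OPPONENT = [0.5, 0.25, 0.25]   # hypothetical player who throws too much rock
ALWAYS_PAPER = [0.0, 1.0, 0.0]       # the maximally exploitative reply to that leak

if __name__ == "__main__":
    for name, strategy in [("equilibrium", EQUILIBRIUM), ("always paper", ALWAYS_PAPER)]:
        print(
            f"{name:>12}: wins {expected_payoff(strategy, LEAKY_OPPONENT):+.2f} "
            f"per round vs the leak, worst case {worst_case(strategy):+.2f}"
        )
```

The equilibrium mix breaks even against the leaky player and against everyone else. Always-paper wins 0.25 per round from the leak but loses a full unit per round to anyone who switches to scissors, which is the risk an exploitative layer on top of the NE strategy has to manage.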
    • YohanN7
      Joined: 15.06.2009 Posts: 3,907
      While Nash equilibria exist, I'm not aware of a proof of existence of an optimal exploitative strategy. The number of possible decisions, now based on the history, N hands previously played, will make the 10^18 decisions look puny. Even if such a strategy exists, there may be other strategies that are more profitable in the long run. To exemplify what I mean, suppose that it is optimal to open 100% on the button versus a certain player (because he has folded preflop too much in the past). It may be best to open a tad less than 100% because we want him to continue folding too much. Opening 100% will sooner or later make the opponent play better. Things like these are hard to quantify.

      (There is a micro FL player that is the easiest opponent in the world. When he bets, he has the goods. I snap-fold stuff like overpairs when he fires. But still, I do sometimes call with hands that I know for 99% sure are losing (say two small pair) only to keep him perfectly honest.)
    • Ramble
      Joined: 17.11.2008 Posts: 1,421
      As luck would have it I stumbled across this video of Phil Lack playing against Polaris in the PokerStrategy video section.
    • kavboj84
      Joined: 16.06.2011 Posts: 1,978
      Has anyone checked its preflop strategy? I think it's very interesting: it open-limps some hands a very small % of the time (all below 1%, some below 0.1%!) and there seems to be no logic in it (for example, it limps AA but doesn't limp KK, limps AK and A5 but never AQ or A4).
      It seems to 3bet around 40%(!) (batshit insane IMO), and it never calls with most pairs, Ax-s and SC-s vs the SB raise. I wonder how this isn't exploitable, because it's unable to rep these postflop (for example, it almost can't have top pair on an A board, or a set, when it only calls pre).

      Although the chart seems a little bit ambiguous: for example, I don't know how it can call when the SB has limped. But perhaps they were just too lazy to replace "call" with "check".
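
One way to read the tiny limp percentages above is as frequencies in a mixed strategy: the same hand takes different actions at random with fixed probabilities. A minimal sketch of what that looks like at the table follows. The numbers are invented and NOT taken from the published ranges, and whether the real sub-1% entries are deliberate or just leftover noise from an approximate solution is not something this example can answer.

```python
import random
from collections import Counter

# Hypothetical frequencies for one hand on the button; not Cepheus's numbers.
MIXED_STRATEGY = {"raise": 0.995, "limp": 0.005, "fold": 0.0}


def sample_action(strategy, rng):
    """Pick one action at random according to the strategy's probabilities."""
    actions = list(strategy)
    weights = [strategy[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


def simulate(strategy, hands, seed=0):
    """Count how often each action is taken over `hands` independent deals."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same counts
    return Counter(sample_action(strategy, rng) for _ in range(hands))


if __name__ == "__main__":
    counts = simulate(MIXED_STRATEGY, hands=100_000)
    for action in MIXED_STRATEGY:
        print(f"{action:>5}: {counts[action]:>6} times")
```

At a 0.5% limp frequency an opponent would see the limp only a few hundred times in 100,000 deals of that exact hand, so it is very hard to detect at the table, let alone exploit.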
    • Boomer2k10
      Joined: 22.09.2010 Posts: 2,551
      It has enough Ax in its range that you can't make that assumption on an A-high board. Additionally, it'll probably dial back its bluffing frequency postflop in order to compensate. Just because it takes an action pre-flop doesn't mean it can't play it well postflop.

      Sets are so rare and overpowered as part of a non-3-betting range that it doesn't really matter. Two powerful hands clashing is rare in HUHU, and that's the only situation where knowing your opponent doesn't have a set is a big thing. The program probably worked out that it's better off 3-betting pairs pre-flop and constructing its postflop game around that, since pairs have a big equity edge preflop, rather than trying to get tricky and potentially losing a lot of implied equity.

      One thing I did notice is that, except for a minute, insignificant %-age of the time, this bot does NOT 4-bet. That's fairly consistent with standard GTO thinking, in that that is where the "range" considerations begin to take over from "equity" considerations, so the bot obviously considers 4-betting to be impractical for playing optimally.

      I definitely am going to play some hands vs it though. I've always wanted to play against Polaris but never got the chance; now I get the chance to get my ass handed to me by Polaris's big brother and I'm actually excited.

      Also, I 3-bet 30-35% of the time and can't play anywhere near as well as this thing postflop in a HUHU situation, so that's definitely not out of the realms of possibility.
    • Boomer2k10
      Joined: 22.09.2010 Posts: 2,551
      Originally posted by Ramble
      As luck would have it I stumbled across this video of Phil Lack playing against Polaris in the PokerStrategy video section.
      It's So Sick!

    • Lausbub7
      Joined: 28.11.2008 Posts: 2,397
      Hello! I am thinking about writing my bachelor thesis about Cepheus but I am not sure if I will be able to. Just some quick questions: Is it possible to view Cepheus's strategy as the SB? So far I can only see the strategy from the BB perspective, but I am sure you can "change view".

      I want to break down some rundowns and check its strategy for flop, turn and river to see if it plays correctly from a game theory point of view (which it very likely does). Is this possible using the strategy tool, and does it make sense to you?