Show HN: Play poker with LLMs, or watch them play against each other (llmholdem.com)

163 points by projectyang a month ago | 33 comments

sciolista month ago
This is very cool, one piece of feedback: watching the table as the AI plays while seeing the reasoning is difficult as they're on other sides of the screen. It could be nice to have the reasoning show up next to the players as they make their moves.
- stevagea month ago
  Yep, exactly. It's very difficult currently.
nivekkevina month ago
Idea: can the agents make faces? 1. Programmatically--agents see each other's faces, and they can make their own. They can choose to ignore them, but at least make that an input to the decision making. 2. Display them in the UI--I just want to see their faces next to their model code names :)
sejjea month ago
I used to play professionally, and I still play in the casinos.
These LLMs are playing better than most human players I encounter (low limits).
They're kinda bad, but not as criminally bad as the humans.
- gerdesja month ago
  OK so you know how it goes in poker and I should probably read the literature ...
  How much of a session is based on "reading players" vs "playing the odds"?
  What I am getting at is: how different is poker from, say, roulette or blackjack? My initial thoughts are that poker such as TX hold 'em is not a game offered in a casino, so it must be mostly indeterminate. I imagine that the casino versions of poker are not TXHT.
  By contrast, roulette is simply a game where the casino wins eventually with a fixed profit (thanks to 0 and a possible 00). That is all well documented.
  I have only ever visited a casino once, 25 years ago, Plymouth, Devon as it turns out and I was advised to only take £50 in readies and bail out when it was gone. I came out £90 up, which was nice and my "advisor" came out £95 up (eventually, after being £200 down at one point). Sadly my "advisor" ended up bankrupt a year later.
  So, how do you play a LLM? I would imagine that conversation is not allowed ...
  - sejjea month ago
    They offer (real) poker at some casinos. It's standard NLHE usually 100-200bb max buyin, sometimes match the stack etc.
    Most common game spread is 9-handed $200 max $1/$2 NLHE. It's exactly like the game on the link, except more players and lower stakes.
    In the game, you try to win the money of the other 8 players, not of the casino. The casino takes a rake each hand, and a player with a large enough edge can overcome it. The edge might be you're excellent, or it might be they're terrible (or drunk). But the house gets paid to deal each hand.
    In the long term, poker outcomes are determined by skill. In the short term, they're luck. In the medium term, both. Most people never reach the long term, it's a lot of hands.
    There's also table games, similar to blackjack, that they call "three card poker" etc. These can't be beat, they favor the house. Standard table game, with a poker flavor. I've never played one of these.
  - MichaelApproveda month ago
    I used to play A LOT at low and high levels.
    At low levels, playing is ABC simple and mostly about following basic strategy for starting hands and pot odds for chasing. Don’t get fancy and keep your temperament steady and you’ll win.
    To a slight degree, you can do better with reading players and identifying them in broad ways (wild, conservative, confused, etc.) but don’t let that allow you to get fancy. Stick to the basic fundamental strategy for hands, position, and pot odds to crush lower level games.
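    The pot-odds arithmetic mentioned above can be sketched roughly as follows. This is a minimal illustration, not anyone's actual tool: the function names and the example numbers are assumptions, and it ignores implied odds and future betting.

```python
def pot_odds(pot: float, to_call: float) -> float:
    """Break-even equity: the share of the final pot that your call pays for.

    `pot` is the pot before your call, including the bet you are facing.
    """
    return to_call / (pot + to_call)


def call_ev(pot: float, to_call: float, equity: float) -> float:
    """Expected value of calling, in chips, if no further betting happens."""
    return equity * pot - (1.0 - equity) * to_call


# Hypothetical spot: 100 in the pot, 25 to call, so you need 20% equity.
needed = pot_odds(100, 25)

# A flush draw on the turn has 9 outs among 46 unseen cards.
flush_draw_equity = 9 / 46
ev = call_ev(100, 25, flush_draw_equity)
```

    In this hypothetical spot the draw's equity (about 19.6%) is just under the 20% needed, so a pure pot-odds call is marginally losing unless more chips can be won on the river.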
  - firefaxa month ago
    >How much of a session is based on "reading players" vs "playing the odds"?
    I think the key is you need to watch for a person's play style.
    There's a two-axis system: tight/loose and passive/aggressive.
    A loose player sees more flops, and an aggressive player will bet and raise more than a passive player.
    So a tight, aggressive player sees few flops but bets strongly when they have a good hand -- this is considered "good" strategy.
    Others might play a "tight-passive" strategy -- they'll play few hands but fold easily. They won't lose large amounts of money but they'll slowly bleed chips.
    A loose, aggressive player is the type you want at the table -- they're making a lot of bets, and often bluffing, and you can sit and wait to catch them.
    Now, this is "reading" someone, but it's not the Rounders style "oh he just ate an oreo so he's bluffing" level reading of a player that movies show.
    For context, I'm an OK player. I can make a few hundred playing 1/3 per session -- I'm not in Vegas so I can't move to the next tier without sinking a lot of money on a flight and hotel.
    If your goal is a bit of beer money, it can be a fun hobby, but I wouldn't go into it expecting it to become a full time career.
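    One way to make the play-style labels above concrete is to derive them from preflop statistics such as VPIP (how often a player voluntarily puts money in) and PFR (how often they raise preflop). The sketch below is only an illustration: the `HandRecord` type, the function name, and both thresholds are assumptions chosen for the example, not established cutoffs.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class HandRecord:
    put_money_in: bool     # called or raised preflop (posting a blind does not count)
    raised_preflop: bool   # raised preflop


def classify_style(hands: Sequence[HandRecord],
                   loose_vpip: float = 0.30,        # assumed threshold
                   aggressive_ratio: float = 0.50   # assumed threshold
                   ) -> str:
    """Label a player on the tight/loose and passive/aggressive axes."""
    if not hands:
        raise ValueError("need at least one hand")
    n = len(hands)
    vpip = sum(h.put_money_in for h in hands) / n
    pfr = sum(h.raised_preflop for h in hands) / n
    looseness = "loose" if vpip >= loose_vpip else "tight"
    # Aggression here means: of the hands played, how many were raised.
    is_aggressive = vpip > 0 and (pfr / vpip) >= aggressive_ratio
    return f"{looseness}-{'aggressive' if is_aggressive else 'passive'}"


# Plays 2 of 10 hands and raises both of them.
tag_player = [HandRecord(True, True)] * 2 + [HandRecord(False, False)] * 8
# Plays 6 of 10 hands but raises only one.
calling_station = ([HandRecord(True, True)] + [HandRecord(True, False)] * 5
                   + [HandRecord(False, False)] * 4)
```

    With these assumed thresholds, `classify_style(tag_player)` gives "tight-aggressive" and `classify_style(calling_station)` gives "loose-passive". A real sample would need far more than ten hands to be meaningful.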
  - furyofantaresa month ago
    Hold'em is offered in casinos routinely, I'm not sure where else one even goes to play it aside from private games, but it is not against the casino. It's against other players, and the casino takes a percentage of the pot.
    Others may differ and I am biased because 99% of my play has been online, but I'd say it's almost entirely playing the odds. Or at least, the popular romantic conception of looking for tells or whatever, is, I would expect, a really minimal edge compared to simply playing better.
    You do learn the other players' tendencies and adapt accordingly, and table selection is very important, so in that sense it is very much about reading players.
    A large part of my play was heads up where it's very much about understanding the other player's play as deeply as possible, and so if I wanted to be technically accurate about reading players vs playing the odds, I'd say both are very important. But if I'm answering someone who has the popular conception of what those phrases mean, I think saying "it's about playing the odds" would give them the more accurate picture.
    You really want to be good at playing the odds, and you don't want to stray too far from fundamentally good play. If someone is learning how to play and I'm advising them, I'm teaching them all about playing the odds, and trying to get them to read players less. Only once they have a solid fundamental understanding of the odds would I teach them how to adjust.
    stevagea month ago
    Around here (Melbourne) the other place is in pubs - there are organised poker tournaments. They can't legally charge you an entry fee, but they can give you a lot of extra chips if you buy a meal at the pub. Some modest prizes if you win.
    They're kind of a ridiculous format - you typically start with about 20 BB but the blinds go up pretty quickly so you don't see a lot of post-flop play.
    Somewhat entertaining.
  - csaa month ago
    > How much of a session is based on "reading players" vs "playing the odds"?
    Several good answers here, but I will add my own take.
    “Playing the odds” is basically playing good, fundamental poker. This is a baseline that most players will use when sitting down at a table of unknown players. This is often called a “balanced” strategy (note that some people erroneously call this “GTO” strategy, assuming the balance, which is not actually what GTO is).
    “Reading players” is a thing, but it can be broken up into (at least) two categories: 1) physical tells, and 2) player habits.
    Physical tells are not a big thing. Some people give off a lot of tells, but some folks are also decent (not good) at giving reverse tells. Honestly, you can be a wildly successful poker player while knowing nothing about physical tells.
    (Side note: One of the most reliable tells is bet timing tells. This can sometimes even be a tell online, especially when people are taking shots at higher stakes or are deep in a tournament. It can also be faked, but some folks are super reliable with timing tells, and they don’t realize it.)
    The second kind of tell is player tendencies — things like when players play too many hands or when they play too few (e.g., fold too often).
    One very reliable way for good players to smooth out their earnings curve is to figure out which players fold too often and in which spots. Once they’ve figured that out, they try to set up that spot and basically print low-risk (sometimes even no-risk) chips.
    Taking advantage of these tendencies is called an “exploitative” strategy (as opposed to the “balanced” strategy mentioned above).
    Really good players can take rec players on a journey through a series of emotions (and accompanying predictable gameplay) such that the good player can read the rec player like a book. The odds tip heavily in favor of the good player at this point.
    Pro players and strong amateur players are so far ahead of recs in ways that the recs don’t even realize.
  - raincolea month ago
    > My initial thoughts are that poker such as TX hold 'em is not a game offered in a casino
    Why not? Because you think it's a game where the casino can lose?
    If so it's not an issue, as casinos that provide poker take "fees" from the stakes. Like how stock exchanges work: there are people making or losing money from stock market, but exchanges are always making profit.
    ryandrakea month ago
    Around where I live, about half the casinos offer poker and about half don't. Poker can be a pretty high cost to a casino. Compared to something like slot machines, it's financially mostly downside: You need a lot more physical space per player. You need more staff. In order for players to come and actually enjoy it, the poker room needs to be located in a relatively quiet corner, ideally enclosed so the buzz of the rest of the casino can't be heard, which is also expensive. And the game is slow, and rakes happen once per hand, so you're making money pretty slowly. And that's just for cash games. Tournaments are worse. If I had to guess, tournaments probably lose money for the casino, and they only exist to get players in so they play at the cash tables. Probably many other things that I'm forgetting because I don't run a casino.
  - conceptiona month ago
    Just an FYI: they make apps like https://apps.apple.com/app/id1530767783 to train on betting based on expected value. Not the whole picture, but it trains the math side.
- bionsystema month ago
  I just watched for 5 min and no they don't play very well. Deepseek squeezed with K4o against CO open and BTN call with full stacks. Grok 3b AI with 25bb in the button with Q4s. Those are very far from optimal play which is well known since solvers. I wonder how they've been trained.
  - sejjea month ago
    Considering a squeeze puts deepseek ahead of most human players. Maybe not an optimal squeeze, but most human players flat if they're going to play.
    Grok's play is obviously bad, no solver needed. I wonder what he said in the "log" where you can see their thoughts. I guess he can hopelessly be trying to rep AA--again, I've seen worse thought processes every time I've ever played in a casino.
    I really think GPT, at least, would win at a live casino. Possibly the other bots as well. The humans are that bad. Poker is complex.
  - ryandrakea month ago
    You and OP are agreeing: "Better than most human players" is quite possibly the lowest skill bar in (at least live) poker.
- projectyanga month ago
  I'm actually surprised at how well they play pre-flop (mostly). Did some initial analysis on VPIP/PFR across positions, and somewhat decent.
  Post-flop on the other hand is all over the place...
  - algo_tradera month ago
    there are plenty of published preflop charts and GTO ranges
    in fact, a fun project would be take a non-reasoning model, play on a lesser known game format, and see if it learns an "a ha" moment or explicitly simulate moves ahead
- hydr0smok3a month ago
  lol what? I just watched Grok fold pocket jacks preflop, no raise/limps ahead.
  - agentifysha month ago
    On Pokerstars it is the right move because you are going to get beat by someone going all in with 72o or something
    but seriously at lower stakes there is just no respect for the art its just a shock and awe strategy: throw shit up, break the game and use that demoralization to bully others.
  - sschnei8a month ago
    There’s no good way to play pocket jiggities @bradowen
  - sejjea month ago
    That's, in fact, a much smaller mistake than I see humans make every hour.
    p1ddaa month ago
    [flagged]
    sejjea month ago
    How many BBs do you think JJ is worth being dealt preflop? I suggest you confirm your guess.
    How many times do you see an obvious nut hand get called on the river by someone to whom it isn't obvious, for hundreds of BBs? (Every single session, that's the answer)
    One of these is such a huge mistake compared to the other.
  - chewsa month ago
    Grok knows that pocket Jacks are a fast way to go broke.
nindalfa month ago
I just saw GPT 5.2 do something absurd. It has a crazy amount of money ($26k) but folded with a 4-pair before the flop. That's insanely conservative, when it would have cost just $20 to see the flop. But even worse, on the very next hand it decided to place $20 down with a 5 and 4 of different suits.
In fact, all of them love folding before the flop. Most of the hands I'm seeing go like - $10 (small blind), $20 (big blind), fold, $70 bet, everyone folds. The site says "won $100", but in most of these cases that one LLM is picking up the blinds alone - $30. Chump change.
This is illuminating, but not a resource for learning poker.
- indigodaddya month ago
  Modern poker (which tbf not sure if these LLMs are acting according to modern GTO or not) is highly dependent on position. Things change a lot too when/if you are in SB/BB.
  - projectyanga month ago
    Yes, the prompt tries to get them to play GTO. I do think their preflop play is the closest to mirroring this compared to postflop behavior.
    indigodaddya month ago
    Is this tuned to tournament or cash GTO? To the OP's shock about pocket 4's (I think this is what they meant by 4-pair(?)), folding 4's pre flop in early position to no raise would be fairly standard in tournament GTO (although the stage of the tournament and # BBs can change things up significantly), but less standard for sure in cash (almost never probably).
jz67a month ago
Honest question, but this seems like an expensive project to host given the number of tokens per second. How is this being paid for?
- projectyanga month ago
  Good question! The player rooms have a rate limit per day. And as for the main table, it's actually a replay of hands I recorded the LLMs playing against each other over an extended time which eventually loops.
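  A rough sketch of the cost-control design described above: a per-day cap on private rooms plus a main table that loops over pre-recorded hands. The class and function names are hypothetical, and nothing here is taken from the site's actual code.

```python
import datetime


class DailyRoomLimiter:
    """Allows at most `max_per_day` room creations per calendar day."""

    def __init__(self, max_per_day: int) -> None:
        self.max_per_day = max_per_day
        self._day = None
        self._count = 0

    def try_create_room(self, today: datetime.date) -> bool:
        # Reset the counter when the calendar day changes.
        if today != self._day:
            self._day = today
            self._count = 0
        if self._count >= self.max_per_day:
            return False
        self._count += 1
        return True


def replay_index(tick: int, recorded_hands: int) -> int:
    """Index of the recorded hand to show at a given tick, looping forever."""
    return tick % recorded_hands


limiter = DailyRoomLimiter(max_per_day=2)  # tiny cap, for illustration only
day_one = datetime.date(2025, 1, 1)
day_two = datetime.date(2025, 1, 2)
results = [limiter.try_create_room(day_one) for _ in range(3)]
```

  Because the main table only replays stored hands, its token cost is paid once at recording time, no matter how many visitors watch.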
- psawayaa month ago
  Looks like this was cleverly designed to prevent costs blowing up. There's one game shared for everyone on the main page, and up to 100 private games per day.
sblawriea month ago
Do the players (LLMs) have memory of how prior hands were played by their opponents, or know their VPIP and PFR percentages? Or is each hand stateless?
- zahlmana month ago
  I suspect this would only matter much if they also remembered (and cared about) their own prior play.
  - sejjea month ago
    Not really. Only as far as their table image mattered--in this case, zero. Otherwise, you can and should ignore your own past play.
    What I'm curious about is if their innate training is enough to give them biases. Like maybe they think Grok is full of shit.
    zahlmana month ago
    > Not really. Only as far as their table image mattered--in this case, zero.
    Right; there's feedback to it. When humans play poker, they do so with common knowledge of the fact that humans have object permanence and can recognize and remember their opponents. The same thing that motivates "profiling" a villain, motivates attempting to project a table image, which in turn motivates being aware of the table image one is projecting.
- projectyanga month ago
  Each hand is stateless
SweetSoftPillowa month ago
Placing full GPT 5.2 versus the fast/flash models of its main competitors is unfair; would love to see a more balanced table.
shukantpal23 days ago
This is really funny to watch and see what the LLMs are thinking. This makes me think how they would perform against a custom ML model trained with RL, e.g. https://ai.meta.com/blog/rebel-a-general-game-playing-ai-bot...
neko_rangera month ago
Thank you, I'll try to grab a table when it resets :) ! I've been getting into poker (always wanted to) since I found a lecture series from Johns Hopkins, and I'm severely disappointed by my options to play online in NY (real or fake money). I just want to get reps in.
- erikcwa month ago
  Link to the lectures?
  - maxbonda month ago
    Presumably it's this course:
    https://youtube.com/@jhupoker4850
    https://hopkinspokercourse.com
sneaka month ago
Needs a four color deck, and the colors on the cards of the waiting players should not be monochrome - makes it hard to evaluate what's happening in the hand. Also, a dealer button on the table would help in visually following the action.
jplataa month ago
Thanks for building and sharing, looks cool and is very entertaining.
I had a similar idea for people to code poker-playing bots and enter tournaments versus each other; this was pre-LLM, however.
It would be fun if you hosted a 'tournament' every month and had each of the latest releases from the major models participate and see who comes out on top.
Or perhaps open it up for others to enter and participate versus each other - where they can choose the model they want to build with and also enter custom prompt instructions to mold the play as they wish.
If you walk this path, would love to chat more.
mashlola month ago
I'm not an expert, but as I understand it there are existing solvers for poker/holdem? Perhaps one of the players could be a traditional solver to see how the LLMs fare against those?
- projectyanga month ago
  While others have commented about solvers, I'd also like to bring up AI poker bots such as Pluribus (https://en.wikipedia.org/wiki/Pluribus_(poker_bot)).
  This also wouldn't even be a close contest, I think Pluribus demonstrated a solid win rate against professional players in a test.
  As I was developing this project, a main thought came to mind as to the comparison between cost and performance between a "purpose" built AI such as Pluribus versus a general LLM model. I think Pluribus training costs ~$144 in cloud computing credits.
  - darepublica month ago
    Should be noted that this bot is heads up only? I believe a form of heads up poker is effectively solved as well-- limit hold'em heads up
- lowbatta month ago
  the LLMs would get crushed
  - cowthulhua month ago
    To expand on this - an LLM will try to play (and reason) like a person would, while a solver simply crunches the possibility space for the mathematically optimal move.
    It’s similar to how an LLM can sometimes play chess on a reasonably high (but not world-class) level, while Stockfish (the chess solver) can easily crush even the best human player in the world.
    postpriorxa month ago
    How does a poker solver select bet size? Doesn't this depend on posteriors on the opponent's 'policy' + hand estimation?
    Reason077a month ago
    GTO (“game theory optimal”) poker solvers are based around a decision tree with pre-set bet sizes (eg: check, bet small, bet large, all in), which are adjusted/optimized for stack depth and position. This simplifies the problem space: including arbitrary bet sizes would make the tree vastly larger and increase computational cost exponentially.
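    To get a feel for why solvers restrict bet sizes, the toy count below enumerates the distinct betting lines on a single heads-up street when each bet or raise may use one of `k` sizes. It is only a sketch under stated assumptions: one street, two players, a hypothetical cap on the number of bets and raises, and no card information at all. Real solver trees also branch over streets and board cards.

```python
def count_lines(k: int, raises_left: int,
                facing_bet: bool = False, prior_check: bool = False) -> int:
    """Count terminal betting sequences on one heads-up street.

    k           -- number of allowed bet/raise sizes at each decision
    raises_left -- how many more bets or raises are permitted (assumed cap)
    """
    if facing_bet:
        total = 2  # fold or call both end the street
        if raises_left > 0:
            total += k * count_lines(k, raises_left - 1, facing_bet=True)
        return total
    # Not facing a bet: check, or open with one of k sizes.
    if prior_check:
        total = 1  # check behind ends the street
    else:
        total = count_lines(k, raises_left, prior_check=True)
    if raises_left > 0:
        total += k * count_lines(k, raises_left - 1, facing_bet=True)
    return total


# Lines on one street with a cap of 3 bets/raises, for 1, 2 and 4 sizes.
sizes_to_lines = {k: count_lines(k, 3) for k in (1, 2, 4)}
```

    With this cap the count is 13 lines for one size, 57 for two and 337 for four, and that is before multiplying across four streets and every possible board.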
    boscillatora month ago
    No, I'm not super certain, but I believe most solvers are trained to be game theory optimal (GTO), which means they assume every other player is also playing GTO. This means there is no strategy which beats them in the long run, but they may not be playing the absolute best strategy.
    sejjea month ago
    Typically when you run a simulation on a hand, you give it some bet size options.
    To limit the scope of what it has to simulate.
    It's unlikely they're perfect, but there's very small differences in EV betting 100% vs 101.6% or whatever.
    meep_morpa month ago
    Not only to limit the scope of what it has to simulate, but only a certain number of bet sizes is practical for a human to implement in their strategy.
    iberatora month ago
    Nash equilibrium. Optimal strategy for online poker has been known for like literally 20 years right now
    bogzza month ago
    How would an LLM play like a human would? I kind of doubt that there is enough recounting of poker hands or transcription of filmed poker games in the training data to imbue a human-like decision pattern.
    meep_morpa month ago
    I don't have an answer, but there's over a decade of hand history discussions online from various poker forums like 2p2 and more recently Reddit.
    Terr_a month ago
    Also, if you set the bar for human players low enough, pretty much any set of actions is human-like. :p
    FergusArgylla month ago
    You are of course correct but to be pedantic:
    Stockfish isn't really a solver it's a neural net based engine
    DiscourseFana month ago
    Unlike Chess, in poker you don’t have perfect information, so there’s no real way to optimize it.
    tim-kta month ago
    You can still optimize for the expectation value, which is also essentially poker strategy.
    DiscourseFana month ago
    Anybody who plays poker “optimally” is bound to lose money when they come up against anyone with skill. Once you know the strategy your opponent is employing you can play like you have anything. I believe I’ve won with 7,2 offsuit more than any other hand, because I played like I had the nuts.
    cowthulhua month ago
    This is completely wrong - the entire point of the Nash equilibrium solution (in the context of poker, at least) is that it is, at worst, EV-neutral even when your opponent has perfect knowledge of your strategy.
    Your 72o comment indicates you are either playing with very weak players, or have gotten lucky, as in reasonably competitive games playing (and then full bluffing) 72o will be significantly negative EV. Try grinding that strategy at a public 10/20 table and you will be quickly butchered and sent back to the ATM.
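    The "EV-neutral even when your opponent knows your strategy" point can be illustrated with a standard toy river model. It is a simplification, not full hold'em: the bettor holds either a value hand that always wins or a bluff that always loses when called, and the opponent can only call or fold. The function names are hypothetical.

```python
def indifferent_bluff_fraction(pot: float, bet: float) -> float:
    """Share of the betting range that should be bluffs so that the
    caller's EV of calling is exactly zero (same as folding)."""
    return bet / (pot + 2 * bet)


def caller_ev(pot: float, bet: float, bluff_fraction: float) -> float:
    """Caller's EV of calling: wins pot + bet against a bluff,
    loses the call amount against a value hand."""
    return bluff_fraction * (pot + bet) - (1 - bluff_fraction) * bet


# Pot-sized bet: 100 into 100.
balanced = indifferent_bluff_fraction(100, 100)
ev_vs_balanced = caller_ev(100, 100, balanced)
ev_vs_overbluffer = caller_ev(100, 100, 0.5)    # bluffs too often
ev_vs_underbluffer = caller_ev(100, 100, 0.2)   # bluffs too rarely
```

    At the balanced frequency (one bluff for every two value bets on a pot-sized bet) the caller gains nothing from knowing the strategy. Calling only becomes profitable against someone who bluffs more often than that, and folding becomes correct against someone who bluffs less.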
    DiscourseFana month ago
    There are numerous videos of high level professional poker players winning large hands with incredible bluffs, this whole "Nash equilibrium solution" is nothing more than a conjecture with some symbols thrown in. I will re-iterate, there is no such thing as perfect knowledge when you have imperfect information. If you play "optimally," you will get bluffed out of all your money the moment everyone else at the table figures out what you're doing.
- sejjea month ago
  The solvers don't typically work in real time, I don't think. They take a while to crunch a hand.
  - dmurraya month ago
    "Solvers" normally means algorithms which aim to produce some mathematically optimal (given certain assumptions) behaviour.
    There are other poker playing programs [0] - what we called AI before large language models were a thing - which achieve superhuman performance in real time in this format. They would crush the LLMs here. I don't know what's publicly available though.
    [0] e.g. https://en.wikipedia.org/wiki/Pluribus_(poker_bot)
    sejjea month ago
    Solvers, in a poker context, are a category of programs. They run a simulation after you enter the known information.
    Like piosolver, as an example.
    The best poker-playing AI is not beatable by anyone, so yes, it would crush the LLMs.
gabriel666smitha month ago
This is fun!
Given online is now bot-riddled, I half-finished something similar a while back, where the game was adopting and 'coaching' (a <500 character prompt was allowed every time the dealer chip passed, outside of play) an LLM player, as a kind of gambling-on-how-good-at-prompting-you-are game. Feature request! The rake could pay for the tokens, at least.
TZubiria month ago
If you are interested in this space, you can check out NovaSolver.com
It's mostly a ChatGPT conversational interface over a classic Solver (Monte-Carlo simulation based), but that ease of use makes it very convenient for quick post-game analysis of hands.
I'm sure if you hook a Solver to a hud, it might be even simpler, but it's quite burdensome for amateurs, and it might be too close to cheating.
aaurelionsa month ago
I also started working on a similar project, but I think that the LLM should know and be able to keep internal statistics about players. In poker, the best hand does not always win. Often, you can win by using emotions/words. The LLM should be given the ability to communicate, mislead, etc.
lowbatta month ago
I like it!
I was interested in this idea too and made a video where some of the previous top LLMs play against each other https://www.youtube.com/watch?v=XsvcoUxGFmQ&t=2s
hnrich25 days ago
Saw Grok (4 Fast) "bluff open with a suited gapper." It was Nine/Deuce of Clubs. I guess I need to expand my definition of gapper!
cmxcha month ago
Would be amusing if the LLMs could achieve a steady state where nobody definitively wins or loses between each other.
That is, good enough to compete amongst each other but not good enough for one to win.
nerdsnipera month ago
I'm fairly sure there was a bug where I won a hand that I should not have. game code was 'lNW4RF'
PLAYER shows A♠ 6♣ (Pair)
GPT (5.2) shows Q♠ Q♥ (Pair)
I had paired with a 6 and no aces on the board.
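For reference, a showdown between two one-pair hands is normally decided by the rank of the pair first and the kickers second, so a pair of queens should beat a pair of sixes. The sketch below shows that comparison. The board in the example is hypothetical (the report only says it held a six and no aces), and the helper handles only hands that are exactly one pair. It does not detect straights or flushes and is not the site's code.

```python
from collections import Counter

RANKS = "23456789TJQKA"


def one_pair_key(cards):
    """Sort key for a 7-card hand that is exactly one pair.

    cards -- strings like "As" or "6c" (rank character, then suit).
    Returns (pair_rank, kicker1, kicker2, kicker3); higher tuples win.
    """
    values = [RANKS.index(card[0]) for card in cards]
    counts = Counter(values)
    pairs = [value for value, n in counts.items() if n == 2]
    if len(pairs) != 1 or any(n > 2 for n in counts.values()):
        raise ValueError("this sketch only handles exactly one pair")
    pair = pairs[0]
    kickers = sorted((v for v in values if v != pair), reverse=True)[:3]
    return (pair, *kickers)


# Hypothetical board: contains a six, no aces, no other pair.
board = ["6d", "9c", "2h", "Kd", "3s"]
player_key = one_pair_key(["As", "6c"] + board)
gpt_key = one_pair_key(["Qs", "Qh"] + board)
```

On this assumed board `gpt_key` compares greater than `player_key`, which fits the report that the player should not have won.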
casey2a month ago
These bots are regularly going down 20%+ on high cards duels
csomara month ago
Was this vibe-coded: https://imgur.com/a/GvxA3mD ?
- projectyanga month ago
  Yep, I used claude code to help build this.
indigodaddya month ago
Are the LLMs "watching" the action, or are they only apprised of previous action once it gets to them?
- j_buma month ago
  How are these different in your mind? The history is the history.
  Or do you mean - each agent has a chance to think after every turn?
  - indigodaddya month ago
    Well they can be watching all the action and thinking the whole time as the action leads up to them, just like we do in poker. To me it's different, subtly perhaps.
    projectyanga month ago
    For my implementation, I'm passing in the current hand's action history (e.g. Player 1 raises to $X preflop, Player 2 calls, Player 3 calls. Flop is A B C, Player 2 checks, etc) whenever the action is on the player.
    Your idea of having it being passed in real time and having the LLM create a chain of thoughts even if action is not on them is interesting. I'd be curious to see if it would result in improved play.
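    A minimal sketch of how the current hand's action history might be flattened into text for the model each time the action reaches it. The `Action` type, the function name, and the exact wording of the output are all assumptions made for illustration. The real project runs on nodejs and its prompt format is not shown in this thread.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

STREETS = ("preflop", "flop", "turn", "river")


@dataclass
class Action:
    street: str                   # one of STREETS
    player: str                   # e.g. "Player 1"
    verb: str                     # "raises to", "calls", "checks", "folds"
    amount: Optional[int] = None  # chips, when the verb needs an amount


def format_history(actions: List[Action],
                   board_by_street: Dict[str, List[str]]) -> str:
    """Render one line per street that has seen action so far."""
    lines = []
    for street in STREETS:
        on_street = [a for a in actions if a.street == street]
        if not on_street:
            continue
        parts = []
        for a in on_street:
            text = f"{a.player} {a.verb}"
            if a.amount is not None:
                text += f" ${a.amount}"
            parts.append(text)
        cards = board_by_street.get(street)
        label = street.capitalize() + (f" ({' '.join(cards)})" if cards else "")
        lines.append(f"{label}: {', '.join(parts)}.")
    return "\n".join(lines)


history = format_history(
    [Action("preflop", "Player 1", "raises to", 60),
     Action("preflop", "Player 2", "calls"),
     Action("flop", "Player 2", "checks")],
    {"flop": ["Ah", "Kd", "7c"]},
)
```

    Since each hand is stateless, a string like this would be rebuilt from scratch on every decision and carries nothing over from earlier hands.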
koolbaa month ago
How long till one of the LLMs makes calls out to the other LLMs to evaluate how to play the hand?
Dinuxa month ago
This is amazing, I just wish I could pause the game and have them play step by step
indigodaddya month ago
Curious if you used pokerkit for this, or some other engine or custom engine?
- projectyanga month ago
  Nope, no external poker libraries. Just a basic nodejs and socket.io server with game logic.
  - indigodaddya month ago
    Cool
thinkloopa month ago
Cool idea. I tried to create a room but it says limit reached for today.
Descona month ago
Why not texasholdllm.com?!
fumblebeea month ago
this could make for an interesting new benchmark
hrimfaxia month ago
Would you consider open sourcing this project?
TheDudeMana month ago
So strange that people are into this, but were not into the much stronger non-LLM poker agents.
ionwakea month ago
Why are there 2 Claude Players ?
- projectyanga month ago
  On mobile I had to squeeze the names, but on a wider view you'll see that it's Claude (Opus 4.5) and Claude (Sonnet 4.5).
hahahahhaaha month ago
Can we chuck a nash equilibrium player in too?