RS-Poker V5: The one with self learning Agents
v5.0.0 is a huge breaking release. We destroy all backwards compat this one time, in the hopes it was worth it.
In breaking so many things we get a highly tuned RL arena for poker and a counterfactual regret minimization agent that’s configuratble and budget driven exploration is the most complete and state of the art in the open source world. No need to pay for closed source GTO calculators. This is all open source now.
If you want to research poker this release gives you the fastest tools
with compact and performant implementations of all the most recent
research into poker bots in the world. Write your own async Agent
or use our pre-built strategies. Compare how they do against each
other.
If you’re a poker enthusiast then this release is for you. You get a TUI to browse your hand history in Open Hand History format. Or anonymize your logs before sending them off to others.
Highlights
rspCLI — a single unified poker toolkit binary (holdem, arena, omaha, ICM, OHH) with interactive TUIs, a global--budgetflag, and a jemalloc allocator tuned for CFR’s allocation profile.- Async arena —
AgentandHistorianare nowSendwithasync fnmethods; the simulation round state machine awaits on a tokio runtime. - Perfect-hash hand evaluator — 2.2–2.4× faster
rank(),Rankshrunk from 8 bytes to 2. - CFR overhaul — single shared tree, compact action space, regret-based pruning, full board enumeration, PCFR+ regret matching, and a composable budget tree replacing the old tangle of depth/recursion/cancellation knobs.
- Pluggable opponent-hand estimation — a new
HandDistributionEstimatortrait and per-wave world sampling, the base for ML-driven range inference.
Getting rsp
Everything lives in one crate: rs-poker
on crates.io, source on GitHub.
The rsp binary is gated behind a feature flag, so install it with:
cargo install rs_poker --features rsp
That drops a single rsp binary on your PATH. A quick tour:
# Rank a five-card hand
rsp holdem rank AcKdQhJsTs
# Monte Carlo equity, AdKh vs 8c8s
rsp holdem simulate AdKh 8c8s
# Outs on a given board
rsp holdem outs AcAd KsKh --board QhJhTh --detailed
# Pit a directory of agent configs against each other
rsp arena compare ./examples/configs -n 5000 -p 3
# Chip stacks to tournament $EV
rsp icm simulate --chip-stacks 1500 1250 600 500 850 800 --payments 50 30 20 --iterations 50000
The global --budget flag bounds how much work the solver does, so you can
trade accuracy for time on any subcommand. Hand-history browsing and
anonymization live under the same binary too.
If you just want the library — the evaluator, arena, and CFR types — to build your own tools on top of, skip the feature flag and add it as a dependency:
cargo add rs_poker
I’ll write more about the different parts of the rsp, poker CFR, and the reinforcement learning arena in future articles. I think there’s some very interesing research here, and solving these types of games is possible.