Variable-Ratio Reinforcement
Regulating opponents’ behavior
by Jeff Hwang | Published: Mar 19, 2010
Note: What follows is a special preview from Jeff’s upcoming book, Advanced Pot-Limit Omaha Volume II: LAG Play and THE Workbook, slated for a fall 2010 release.
Once you reach a certain point in your development as a poker player, having learned hand valuations and acquired the technical skills needed to play the game, the next big step in opening up your game is figuring out how to regulate your opponents’ behavior so that they become easier to play against. That step is founded in large part on psychology.
Enter variable-ratio reinforcement.
Variable-ratio reinforcement is generally defined as delivering reinforcement after a target behavior is exhibited a random number of times. Let’s take a slot machine, for example. A gambler sits down at a slot machine and bets $1 a pull. As you would expect, most of the time the gambler will bet $1 and lose, which of course is great for the casino. But if all the gambler does is bet $1 and lose every time, eventually he will quit or go broke, and never want to play again. So, every few spins, the slot machine will reward the gambler with a payoff: $1 here, $1 there; $5 here, $1 there.
Then, every once in a long while, the machine will reward the gambler with a big payoff in the form of a jackpot.
Now, the payouts never quite add up to the money wagered, which is how the house wins in the long run. But the promise of the big payoff, along with the intermittent rewards, is generally enough to reinforce the target behavior, which is for the gambler to keep betting $1 a pull.
That brings us to our next topic, which is the reinforcement schedule.
Reinforcement Schedules: Variable vs. Fixed
There are two basic types of reinforcement schedules: variable-ratio reinforcement schedules, and fixed-ratio reinforcement schedules.
Let’s start with the latter, which is the more basic of the two. A fixed-ratio reinforcement schedule is one in which reinforcement is delivered after a fixed number of responses. Let’s say, for example, that you are casino management, and you want the slot machine to pay out 20 percent of the time, or on every fifth spin. The gambler will lose $1 four times in a row and get a payout on the fifth spin, every time.
The reinforcement schedule would look something like this:
Slot Machine: Fixed-Ratio Reinforcement Schedule
Lose Lose Lose Lose Win
Lose Lose Lose Lose Win
Lose Lose Lose Lose Win
Lose Lose Lose Lose Win
Lose Lose Lose Lose Win
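To make the pattern concrete, here is a minimal Python sketch that generates a schedule like the one above. The function name fixed_ratio_schedule and its parameters are illustrative assumptions, not a description of how any real slot machine is programmed.

```python
def fixed_ratio_schedule(num_spins, ratio=5):
    """Return "Win" on every `ratio`-th spin and "Lose" otherwise.

    Hypothetical helper, for illustration only.
    """
    return [
        "Win" if spin % ratio == 0 else "Lose"
        for spin in range(1, num_spins + 1)
    ]


schedule = fixed_ratio_schedule(25)

# Print the 25 spins in rows of five, matching the table above.
for start in range(0, len(schedule), 5):
    print(" ".join(schedule[start:start + 5]))
```

Every row comes out identical, with the win always in the fifth position.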
Adjusted for payouts, the schedule might look more like this:
Slot Machine: Fixed-Ratio Reinforcement Schedule With Payouts
-$1 -$1 -$1 -$1 +$2
-$1 -$1 -$1 -$1 +$10
-$1 -$1 -$1 -$1 +$1
-$1 -$1 -$1 -$1 +$4
-$1 -$1 -$1 -$1 +$1
In this scenario, for every 25 spins, the gambler would win $18 on the five winning spins and lose $20 on the rest, for a net loss of $2. For the house, this represents a payout of 92 percent and a house edge of 8 percent.
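The arithmetic can be checked with a short Python sketch. It assumes, as the table implies, a $1 bet on every spin and that each positive entry is the gambler's net win on that spin; the variable names are illustrative.

```python
# Net result of each spin, copied from the fixed-ratio payout table above.
spins = [
    -1, -1, -1, -1, 2,
    -1, -1, -1, -1, 10,
    -1, -1, -1, -1, 1,
    -1, -1, -1, -1, 4,
    -1, -1, -1, -1, 1,
]

wagered = len(spins) * 1                  # assumed $1 bet per spin
won = sum(s for s in spins if s > 0)      # total won on winning spins
lost = -sum(s for s in spins if s < 0)    # total lost on losing spins
net = won - lost                          # gambler's net result

house_edge = -net / wagered               # share of money wagered kept by the house
payout_rate = 1 - house_edge              # share returned to the gambler

print(f"won ${won}, lost ${lost}, net ${net}")
print(f"payout {payout_rate:.0%}, house edge {house_edge:.0%}")
```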
Now, all of this sounds great, but there is a major problem: Nobody would ever play a game with a payout (reinforcement) schedule like this one!
OK, “nobody” and “ever” might be a little strong, but the point remains: It wouldn’t take long for the gambler to figure out that this slot machine pays out every fifth spin, and only every fifth spin. As a result, he would quit playing.
Using a variable-ratio reinforcement schedule is the fix for this problem.
Variable-Ratio Reinforcement Schedule
A variable-ratio reinforcement schedule uses a predetermined ratio while delivering the reinforcement randomly. Going back to the slot machine, let’s say that you once again are casino management and want the slot machine to pay out 20 percent of the time, or every fifth time on average.
Now, your reinforcement schedule may look something like this:
Slot Machine: Variable-Ratio Reinforcement Schedule
Lose Lose Lose Lose Win
Lose Win Lose Lose Lose
Lose Lose Win Lose Lose
Win Lose Lose Lose Lose
Lose Lose Lose Win Lose
And adjusted for payouts, the schedule would look like this:
Slot Machine: Variable-Ratio Reinforcement Schedule With Payouts
-$1 -$1 -$1 -$1 +$2
-$1 +$10 -$1 -$1 -$1
-$1 -$1 +$1 -$1 -$1
+$4 -$1 -$1 -$1 -$1
-$1 -$1 -$1 +$1 -$1
In aggregate, the expectation is the same: Over 25 spins, the gambler will still realize a net $2 loss, for a 92 percent payout and an 8 percent house advantage for the casino. But in reality, this scenario is far more likely to achieve the desired result, which is to have the gambler keep playing. In contrast to the fixed-ratio reinforcement schedule, a variable-ratio reinforcement schedule with a 20 percent reinforcement ratio can produce clusters of payouts (for example, back-to-back wins), and it leaves no spin (or block of spins) on which the gambler can say for certain that he will lose and quit playing as a result.
This is because the variable-ratio reinforcement schedule does not specify when the payouts occur, but only how often they occur on average.
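One way to see the difference is to count how many spins it takes to reach each win. The Python sketch below does that for the 25-spin variable-ratio example above, then simulates a much longer session. The simulation assumes each spin wins independently with probability 0.2, which is only one simple way to model a variable-ratio schedule; the function names and the seed are hypothetical choices for illustration.

```python
import random


def variable_ratio_schedule(num_spins, win_probability=0.2, seed=None):
    """Each spin wins independently with `win_probability` (an assumed model)."""
    rng = random.Random(seed)
    return [
        "Win" if rng.random() < win_probability else "Lose"
        for _ in range(num_spins)
    ]


def spins_per_win(schedule):
    """Count how many spins it took to reach each win."""
    counts, since_last_win = [], 0
    for outcome in schedule:
        since_last_win += 1
        if outcome == "Win":
            counts.append(since_last_win)
            since_last_win = 0
    return counts


# The 25-spin variable-ratio example from the table above.
article_example = (
    "Lose Lose Lose Lose Win "
    "Lose Win Lose Lose Lose "
    "Lose Lose Win Lose Lose "
    "Win Lose Lose Lose Lose "
    "Lose Lose Lose Win Lose"
).split()

print(spins_per_win(article_example))  # [5, 2, 6, 3, 8]

# A long simulated session: the win rate settles near 20 percent,
# but the number of spins between wins keeps changing.
simulated = variable_ratio_schedule(100_000, seed=2010)
win_rate = simulated.count("Win") / len(simulated)
print(f"win rate: {win_rate:.3f}")
print(f"first ten waits: {spins_per_win(simulated)[:10]}")
```

Under a fixed-ratio schedule, every entry in that list would be 5. Here the waits vary from win to win, so the gambler cannot tell in advance which spin will pay.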
That said, in regard to pot-limit Omaha, there is one major application for variable-ratio reinforcement that I will discuss another time. That application is the continuation-bet (c-bet).
Jeff Hwang is a semiprofessional player and author of Pot-Limit Omaha Poker: The Big Play Strategy and Advanced Pot-Limit Omaha: Small Ball and Short-Handed Play. He is also a longtime contributor to the Motley Fool. You can check out his website at jeffhwang.com.