Practical Probability - Part IV
Correlation
by Steve Zolotow | Published: Mar 20, 2009
This column will discuss the important topic of correlation. While correlation is really more of a statistical measure than a topic within probability theory, it is extremely valuable to be comfortable with the concept. Correlation allows us to discuss the relationship between two things. Positive correlation occurs when an increase in one thing (often called a variable) tends to go along with an increase in another thing. Temperature is positively correlated with ice cream sales. If the temperature rises, ice cream sales tend to rise, too. Negative correlation occurs when an increase in one variable tends to go along with a decrease in another variable. Temperature is negatively correlated with winter-coat sales. If the temperature rises, sales of winter coats tend to decline.
The coefficient of correlation is a numerical index that shows how strong the relationship between the two variables is. The coefficient can range from 1.0, perfect positive correlation, to -1.0, perfect negative correlation. A correlation close to 1.0 or -1.0 is very strong. A coefficient close to zero means that the two variables have little or no relationship. The following table summarizes this:
Correlation | Positive | Negative |
None | 0 | 0 |
Slight | .1 to .3 | -.1 to -.3 |
Moderate | .3 to .6 | -.3 to -.6 |
Strong | .6 to 1.0 | -.6 to -1.0 |
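For readers who would like to see where the number comes from, here is a minimal sketch in Python. It assumes the coefficient meant here is the standard Pearson coefficient, which the column does not name explicitly. The sales figures are made up for illustration, and the function names `correlation` and `strength` are my own. The table leaves a gap between 0 and .1 and overlaps at .3 and .6, so the cutoffs in `strength` are one reasonable reading, not the only one.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How far each pair sits from its means, multiplied and summed.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def strength(r):
    """Label a coefficient using the bands in the table (cutoffs are an assumption)."""
    size = abs(r)
    if size < 0.1:
        return "none"      # treats anything under .1 as little or no relationship
    if size < 0.3:
        return "slight"
    if size < 0.6:
        return "moderate"
    return "strong"

# Made-up numbers for illustration only.
temperature = [60, 65, 70, 75, 80, 85, 90]
ice_cream_sales = [20, 24, 23, 30, 34, 33, 40]
winter_coat_sales = [30, 26, 27, 20, 16, 17, 10]

for label, sales in [("ice cream", ice_cream_sales), ("winter coats", winter_coat_sales)]:
    r = correlation(temperature, sales)
    sign = "positive" if r > 0 else "negative"
    print(f"temperature vs. {label}: {r:+.2f} ({strength(r)} {sign})")
```

With these invented figures, the ice cream series comes out strongly positive and the winter-coat series strongly negative, matching the examples in the text.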
Correlation is very useful in making predictions. If two variables show a high degree of correlation, knowing how one has changed allows a fairly accurate prediction of the change in the other.
The most common misconception about correlation is that it implies causality. Variables can be strongly correlated even though the change in one did not cause the change in the other. Looking at the first example above, it is likely that an increase in temperature caused an increase in ice cream sales. It is extremely unlikely that an increase in ice cream sales could cause the temperature to rise. There also may be one or more other variables that cause the two variables to increase in tandem. The number of priests in a city correlates strongly with the amount of alcohol sold. It is absurd to say that increasing the number of priests causes a rise in alcohol sales, or that an increase in alcohol sales causes more people to join the clergy. In reality, the key variable is population. As the population rises, so do the number of priests and the sales of alcohol. It is also possible, especially when the sample is small, that a correlation arises purely by chance.
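The priests-and-alcohol example can be imitated with a small simulation. The sketch below is only an illustration under invented assumptions: 200 hypothetical cities, a made-up rate of one priest per thousand residents, and a made-up alcohol spend per resident, each with random variation. Nothing in it links priests to alcohol directly; both depend only on population.

```python
import random
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

random.seed(2009)  # fixed seed so the run is repeatable

# Hypothetical cities: population drives both counts; neither drives the other.
populations = [random.uniform(10_000, 1_000_000) for _ in range(200)]
priests = [p * 0.001 * random.uniform(0.8, 1.2) for p in populations]
alcohol = [p * 50.0 * random.uniform(0.8, 1.2) for p in populations]

raw = correlation(priests, alcohol)

# Dividing by population removes the shared driver.
per_capita = correlation(
    [x / p for x, p in zip(priests, populations)],
    [a / p for a, p in zip(alcohol, populations)],
)

print(f"priests vs. alcohol sales:    {raw:.2f}")
print(f"same, per head of population: {per_capita:.2f}")
```

The raw totals should come out strongly correlated, while the per-person figures should show little or no correlation. That is the signature of a third variable doing the work.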
Now, how does this apply to poker or gambling? The use of correlation, even without measuring it scientifically by using the correlation coefficient, can be thought of as looking for patterns. A few years ago, I spent some time talking with a friend who loves poker; he has managed to play a few hours almost every night at a local casino for the last few years, even though he has a day job. During his lunch hour, he records everything he can remember about his recent sessions. I asked him what he did with all of that information, and he replied, smiling sheepishly, "Not much; what would you suggest?" I told him that he should copy the data from his five best and his five worst sessions, and I'd review it over the dinner with wine that he was going to buy for us. I studied his stuff and looked for positive and negative correlations. Was there something he did, or something that happened, when he won but not when he lost? On the other hand, was there something that occurred during the losing sessions that didn't occur during the winning ones?
I noticed the following patterns in the five winning sessions:
1. They all occurred on the first Saturday of the month.
2. These sessions lasted about five hours on average.
3. In four of the five sessions, someone he called Peter Piper (his name was Pete and he chewed on an unlit pipe) was playing, and usually, several of the other players were tourists.
4. The last three big winning nights had the notation RSS.
The five big losing sessions also had some things in common:
1. They occurred on Tuesday or Wednesday night.
2. These sessions lasted nearly nine hours on average.
3. The table consisted mostly of tight locals.
4. None had the notation RSS.
Remember from our discussion of correlation that sometimes something occurs because of some unknown variable, and sometimes even randomly. I had uncovered some things that correlated with winning and losing, but I needed to try to find a plausible reason why. I also was curious about what RSS meant. He told me to deal with the other stuff first.
The explanation for the days on which he won and lost is pretty simple. The Saturday games drew a lot of tourists and inexperienced players. It is fairly common for people to get paychecks, Social Security checks, and other income at the end of the month, so on the first Saturday of the month many of them have fresh money in hand. Some of those people take their money to the poker table for a little entertainment and a few drinks. These were the most profitable games. The games on Tuesday or Wednesday, filled mostly with tight locals, were tougher. It was easier to have a big loss, and harder to have a big win.
His losing sessions were nearly twice as long as his winning ones. I was sure that he fell into the frequent trap of quitting when the game was good and he was playing well, but playing longer on days when he was stuck and the game was tough. In his case, this was probably made even worse by the fact that he worked on weekdays. This meant that he was playing his longest sessions on days when he wasn't rested. It isn't hard to imagine how badly he might have been playing toward the end of these sessions.
My advice was that he take the middle of the week off. If he couldn't bring himself to do that, he should at least limit both his hours of play and his buy-ins. He also should play a little longer on weekends, especially when the games were good and he was winning. Now I was ready to find out what RSS stood for. RSS seemed to correlate moderately well with winning, and no RSS was noted when he lost. Perhaps it was something I could use myself.
At first, he refused to tell me, but I insisted that we finish with a little grappa. Not too amazingly, he opened up midway through the second glass, and admitted that it stood for the fact that he wore red socks and a red shirt during those sessions. My initial reaction was that it was random. He claimed to have had several other good results when wearing them, so they must be lucky. I wouldn't accept the concept of lucky outfits, but I did admit that it was possible that they led him to have a positive outlook, which influenced his play for the better. You be the judge – correlation, causality, or randomness?
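For anyone who wants to put rough numbers on that last question, here is a minimal sketch in Python using only the counts given above: three of the five winning sessions carried the RSS notation, and none of the five losing sessions did. It computes the phi coefficient, which is the correlation coefficient for two yes/no variables, and then the chance that three randomly placed RSS nights would all land among the winners. The variable names are my own, and the random-placement model is an assumption. Since the ten sessions were hand-picked as the best and worst rather than drawn at random, treat the result as a rough guide at most.

```python
from math import comb, sqrt

# Counts taken from the session notes described above.
rss_wins, rss_losses = 3, 0        # sessions marked RSS
plain_wins, plain_losses = 2, 5    # sessions without the notation

# Phi coefficient: the correlation coefficient for two yes/no variables.
numerator = rss_wins * plain_losses - rss_losses * plain_wins
denominator = sqrt(
    (rss_wins + rss_losses) * (plain_wins + plain_losses)
    * (rss_wins + plain_wins) * (rss_losses + plain_losses)
)
phi = numerator / denominator

# If the three RSS nights were scattered at random over the ten sessions,
# how often would all three land among the five winners?
total = rss_wins + rss_losses + plain_wins + plain_losses
winners = rss_wins + plain_wins
chance = comb(winners, 3) / comb(total, 3)

print(f"phi = {phi:.2f}")
print(f"chance of all three RSS nights being winners = {chance:.3f}")
```

The coefficient works out to about 0.65, right around the border between the moderate and strong bands in the table. The same pattern would turn up by luck alone about one time in twelve, which is far too often to rule out randomness on ten sessions.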
Steve "Zee" Zolotow, aka The Bald Eagle, is a successful games player. He currently devotes most of his time to poker. He can be found at many major tournaments and playing on Full Tilt, as one of its pros. When escaping from poker, he hangs out in his bars on Avenue A – Nice Guy Eddie's on Houston and Doc Holliday's on 9th Street – in New York City.