Why you should use Bayesian Statistics
Nov 8, 2016 16:23:33 GMT
Post by RationalPi#1895 on Nov 8, 2016 16:23:33 GMT
We’ve all seen them: the outrageous claims of incredible win rates for decks that are “guaranteed” to take even the lowliest player to legend. Every time you look into one, the player has only played a small number of games, so the result has high variance and is unreliable. Of course, getting the variance down with raw win rates takes tons and tons of games. Don’t you wish there was a way to get better statistics faster?
Enter Bayesian statistics. Bayesian statistics is an alternative formulation of statistics that uses both observed data and prior beliefs to give estimates that are better than either would be alone. This results in measurements of winrate that are less susceptible to aberrant win streaks and give meaningful results with fewer games.
The Binomial Distribution and the Beta Prior
A Bayesian model starts with an initial distribution called a “prior distribution.” The prior describes the expected range of results before any data has been gathered, and it should contain the best knowledge available on how the true values are distributed. For example, if you know that most true win rates fall between 40% and 60%, you can select a prior that places most of its weight in that range. This doesn’t mean that values can’t fall outside of that range, just that you need a lot more samples to push a Bayesian estimate away from the center of the prior. In other words, extreme claims require extreme evidence.
Statisticians have already worked out convenient priors for many common distributions. Here, we are usually interested in the winrate of a deck, which is the chance of winning a game with a given deck or in a given matchup. In statistical terms, the number of wins in a set of games follows a binomial distribution, since each game is either a win (1) or a loss (0) and the chance of a win is set by some unknown parameter (*p*). The standard choice of prior for a binomial distribution is a beta prior, which says that *p* is distributed according to a beta distribution. The beta distribution is defined by two parameters, *a* and *b*, and the Bayesian estimate of *p* (the mean of the updated distribution) is given by:
p=(a+x)/(a+b+n)
where x is the number of successes in n trials.
If you look closely at that statistic you’ll realize that we’re basically just adding in a group of extra games with a win rate given by *a/(a+b)*.
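To make the formula concrete, here is a minimal Python sketch. The function name `bayes_winrate` and the prior of a = b = 10 are made up for illustration; they are not values taken from any data.

```python
def bayes_winrate(wins, games, a, b):
    """Bayesian winrate estimate (a + x) / (a + b + n) with a beta prior."""
    if games < 0 or not 0 <= wins <= games:
        raise ValueError("need 0 <= wins <= games")
    return (a + wins) / (a + b + games)


# Illustrative prior only: 20 "extra" games at a 50% winrate.
A, B = 10, 10

raw = 9 / 10                          # raw winrate after a 9-1 streak
shrunk = bayes_winrate(9, 10, A, B)   # (10 + 9) / (10 + 10 + 10) = 19/30

print(f"raw: {raw:.3f}, Bayesian: {shrunk:.3f}")
```

With this made-up prior, a 9-1 streak is pulled back from 90% to roughly 63%, and with no games played the estimate is simply the prior mean of 50%.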
Picking Parameters
Now that we know what statistic we’re using, we need to pick the right parameters. In essence, the beta prior is like adding in a batch of *a+b* games at a winrate given by *a/(a+b)*. The larger *a* and *b* are, the more games it will take to significantly move the estimated winrate, and the ratio of *a* to *b* determines the ratio of wins to losses in that batch.
Picking the right *a* and *b* is all about using prior information, so I dug into some existing stats to come up with my numbers. By looking at the raw data from the vS Data Reaper Report, I was able to come up with parameters appropriate for a few different scenarios: estimating the winrate in a given matchup, estimating the overall ladder winrate of a deck, and estimating your average winrate as a player. Each of these is distributed differently: matchup winrates are more polarized than winrates against the field on the ladder, and player winrates fall somewhere in between. I chose *a* and *b* to be equal to each other, assuming that competitive decks are distributed around a 50% winrate.
| Estimate | a | b |
| --- | --- | --- |
| Matchup Winrate | 8.6 | 8.6 |
| Deck Winrate | 105 | 105 |
| Player Winrate | 49.5 | 49.5 |
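As a rough sketch of how these parameters behave, the Python snippet below applies each prior from the table to the same record. The `PRIORS` dictionary, the function names, and the 7-3 record are just for illustration; only the parameter values come from the table.

```python
# Prior parameters (a, b) copied from the table above.
PRIORS = {
    "matchup": (8.6, 8.6),
    "deck": (105, 105),
    "player": (49.5, 49.5),
}


def bayes_winrate(wins, games, a, b):
    """Bayesian estimate p = (a + x) / (a + b + n)."""
    return (a + wins) / (a + b + games)


def estimates_for(wins, games):
    """Apply every prior to the same record and return the estimates."""
    return {name: bayes_winrate(wins, games, a, b)
            for name, (a, b) in PRIORS.items()}


# Hypothetical record: 7 wins in 10 games (raw winrate 70%).
for name, p in estimates_for(7, 10).items():
    print(f"{name:8s} {p:.1%}")
```

The same ten games move the matchup estimate the most and the deck estimate the least, because the deck prior carries far more extra games.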
Initially, I recommend choosing *a* and *b* equal to each other, but there can be value in other choices. For example, it may be worth using your personal winrate as the center of the prior when estimating deck winrates on the ladder, to account for the skill difference between yourself and your opponents. That said, it’s probably better to find evenly matched competition to test your decks against, since skill varies so widely on the ladder.
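One way to build such an off-center prior is to pick a prior mean and a total number of extra games, then split them accordingly. This is only a sketch: the helper name `prior_from_mean`, the 55% personal winrate, and the choice to reuse the deck prior’s total of 210 games are assumptions for illustration, not numbers derived from data.

```python
def prior_from_mean(mean, strength):
    """Return (a, b) with a / (a + b) == mean and a + b == strength."""
    if not 0 < mean < 1 or strength <= 0:
        raise ValueError("need 0 < mean < 1 and strength > 0")
    return mean * strength, (1 - mean) * strength


# Assumed values: a 55% personal winrate, and the same total weight
# as the 105-105 deck prior (210 extra games).
a, b = prior_from_mean(0.55, 210)

# With no games played, the estimate equals the prior mean.
print(a, b, a / (a + b))
```

Keeping the total fixed means the prior is just as hard to move as before; only its center shifts.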
Tradeoffs of Bayesian Statistics
There are advantages and disadvantages to using Bayesian estimates as opposed to standard frequentist statistics such as the raw win rate. The biggest advantage is that your estimate doesn’t vary wildly at small sample sizes, which are common in deck testing. The main disadvantage is that the estimate takes longer to converge on the correct value if that value is far from the mean of your prior. Ultimately, though, I think the advantages outweigh the disadvantages, and Bayesian statistics are much better suited to the tasks players most often perform.
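The convergence tradeoff can be made concrete by noting that the estimate is a weighted average of the prior mean and the raw winrate, with weights a+b and n. The sketch below assumes a hypothetical deck with a true 60% winrate that wins exactly its expected share of games; the function name and those numbers are illustrative only.

```python
def expected_estimate(true_rate, games, a, b):
    """Bayesian estimate if wins exactly equal true_rate * games."""
    return (a + true_rate * games) / (a + b + games)


TRUE_RATE = 0.60  # hypothetical deck, chosen for illustration

for label, (a, b) in {"matchup": (8.6, 8.6), "deck": (105, 105)}.items():
    for games in (10, 50, 210, 1000):
        est = expected_estimate(TRUE_RATE, games, a, b)
        print(f"{label:8s} n={games:5d} estimate={est:.3f}")
```

After exactly a+b games the estimate sits halfway between the prior mean and the true rate, so the 105-105 prior needs 210 games to reach 55% for this deck, while the 8.6-8.6 prior gets there in about 17.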
TL;DR
You’ll get more reliable winrate statistics if you start off with a bunch of fake games at a 50/50 winrate. For individual deck matchups, start with an 8.6-8.6 record; for ladder winrates, start with a 105-105 record; and for personal winrates across many decks, start with a 49.5-49.5 record.