The Statistics of Lottery Selections
Selecting a number of objects from a set without replacement (that is, without returning an object to the set once it has been picked) is a well-studied statistical problem. For the sake of discussion, let's say we have a set of N lottery balls, labelled 1 to N, and a device that picks one ball completely at random, at the same time removing it from the draw.
Draw an X
What is the probability that the first ball will be ball number X? Well, there are N different balls, any one of which is equally likely to be picked, so we can say that on average one time in N we will get ball number X drawn from the machine. Another way to say this is that the probability of drawing ball X is 1/N. If there are ten balls in total, each has the same probability 1/10 of being drawn first.
After the first draw, there are N-1 balls left to choose from. The probability of drawing any particular one of the remaining balls, let's say ball Y, is now 1/(N-1). In our example above starting with ten balls, the probability of drawing a particular remaining number on the second draw is 1/9.
Multiple probabilities
When dealing with probabilities, if we want to work out the probability of two events both occurring, we can multiply the probability of the first event by the probability of the second event given that the first has already happened. Here the second draw does depend on the first (ball X is no longer in the machine), which is why we use 1/(N-1) rather than 1/N for ball Y. So the probability of drawing ball X followed by ball Y is 1/N times 1/(N-1), which is
1/(N×(N-1))
Similarly, the probability of drawing X then Y then Z is
1/(N×(N-1)×(N-2))
and in general, the probability of any sequence of M balls being drawn (where M is no greater than N) is
1/(N×(N-1)×(N-2)×...×(N-M+1))
where the ... means we keep multiplying by the next smaller number until we reach the final N-M+1. For example, the probability of drawing a particular sequence of 4 balls in our 10-ball example is 1/(10×9×8×7), or 1/5040.
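The running product above can be sketched in a few lines of Python. This is only an illustration: the function name sequence_probability is made up for this example, and exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

def sequence_probability(n, m):
    """Probability of drawing one particular ordered sequence of m balls from n."""
    if not 0 <= m <= n:
        raise ValueError("m must be between 0 and n")
    p = Fraction(1)
    # Multiply 1/n, 1/(n-1), ..., 1/(n-m+1): one factor per ball drawn.
    for remaining in range(n, n - m, -1):
        p *= Fraction(1, remaining)
    return p

print(sequence_probability(10, 4))  # 1/5040
```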
Factorials!
Another way to write this is to use the 'factorial' notation, where we put a '!' sign after a symbol to indicate that we want to multiply together all the whole numbers from 1 up to the number in question. So:
3! = 1×2×3 = 6
5! = 1×2×3×4×5 = 120
N! = 1×2×3×...×(N-1)×N (The numbers get large very quickly!)
We can now write our formula for the probability of drawing a particular sequence of M balls from a set of N as
(N-M)! ÷ N!
For our example of a sequence of 4 balls drawn from 10, this gives us
(6×5×4×3×2×1) ÷ (10×9×8×7×6×5×4×3×2×1)
and you can see that this is the same result as before once we've cancelled out the 6×5×4×3×2×1 terms.
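As a quick check that the factorial form agrees with the long product, here is a small Python sketch. The helper name sequence_probability_factorial is an assumption for illustration; math.perm (available from Python 3.8) counts ordered selections directly and is used as a cross-check.

```python
from fractions import Fraction
from math import factorial, perm

def sequence_probability_factorial(n, m):
    """(N-M)! / N!, the chance of one particular ordered sequence of m balls."""
    return Fraction(factorial(n - m), factorial(n))

print(sequence_probability_factorial(10, 4))  # 1/5040

# perm(10, 4) is 10*9*8*7, the number of ordered sequences of 4 balls from 10.
assert sequence_probability_factorial(10, 4) == Fraction(1, perm(10, 4))
```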
Arrangements
In a real lottery, however, we do not normally try to predict the exact sequence in which the balls will be drawn - the order doesn't matter, only the set of numbers drawn. How many different arrangements are there of M balls? Let's first ask how many arrangements of two balls there are. The answer is 2: we can arrange them as 1,2 or 2,1.
So what if we have three balls? Now, there are three different possibilities for the first ball, after each of which there will be two remaining balls which we have just seen can be arranged in two ways. 3×2=6 so there are six possible arrangements. Let's list them to be sure:
1,2,3; 1,3,2; 2,1,3; 2,3,1; 3,1,2; 3,2,1; that's six arrangements.
If we have four balls, there are four different possibilities for the first ball, after each of which there will be six ways of arranging the remaining three, so the total number of possibilities is 4×3×2=24. In fact, in general there are M! different ways to arrange M balls.
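The claim that M balls can be arranged in M! ways is easy to check by brute force for small M. The sketch below uses Python's standard itertools.permutations to list every arrangement and compares the count with the factorial.

```python
from itertools import permutations
from math import factorial

# List every ordering of three balls, as in the text.
arrangements = list(permutations([1, 2, 3]))
print(arrangements)
print(len(arrangements))  # 6

# Check that the number of orderings of m balls equals m! for small m.
for m in range(1, 7):
    count = sum(1 for _ in permutations(range(m)))
    assert count == factorial(m)
```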
So when a lottery machine draws M balls from a set of N, the total number of different outcomes is given by the number of ordered sequences of M balls, N! ÷ (N-M)!, divided by the number of ways to arrange M balls, M! - or
N! ÷ (M! × (N-M)!)
This type of expression for the number of combinations of M objects from a set of N occurs frequently in probability theory, and is commonly written NCM (read 'N choose M'), where N is the size of the set and M is the number of objects chosen.
The probability of our particular set of balls being picked is then 1/NCM, or
M! × (N-M)! ÷ N!
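Here is a minimal Python sketch of the combination count and the resulting probability. The function names count_combinations and jackpot_probability are invented for this illustration; the standard library's math.comb (Python 3.8+) computes the same quantity and is used as a cross-check, along with a brute-force enumeration for a small case.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, factorial

def count_combinations(n, m):
    """Number of unordered sets of m balls from n: N! / (M! * (N-M)!)."""
    return factorial(n) // (factorial(m) * factorial(n - m))

def jackpot_probability(n, m):
    """Chance that one particular set of m balls is the set drawn."""
    return Fraction(1, count_combinations(n, m))

print(count_combinations(5, 2))  # 10

# Cross-checks: brute-force enumeration and the built-in math.comb.
assert count_combinations(5, 2) == len(list(combinations(range(5), 2)))
assert count_combinations(49, 6) == comb(49, 6)
```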
Some examples
In our example of choosing 4 balls from a set of 10, this works out as:
(4×3×2×1) × (6×5×4×3×2×1) ÷ (10×9×8×7×6×5×4×3×2×1), or
(4×3×2×1) ÷ (10×9×8×7), or 1/210
So in this lottery we'd hit the jackpot on average once every 210 times we played. In a more realistic example, we might have to choose 6 numbers from 49. The probability of a jackpot in this lottery is:
6!×43! ÷ 49!, or
(6×5×4×3×2×1) ÷ (49×48×47×46×45×44), or
1/13983816
So we'd expect to hit the jackpot around once every fourteen million times we played.
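The 1/210 figure can also be checked empirically. The following is a rough simulation sketch, not a proof: the function name simulate_jackpot_rate, the fixed ticket, the number of trials and the seed are all arbitrary choices made for this example. It only tackles the 10-ball lottery, since a jackpot in the 49-ball game is far too rare to simulate quickly.

```python
import random

def simulate_jackpot_rate(n, m, trials, seed=0):
    """Fraction of simulated draws of m balls from n that match a fixed ticket."""
    rng = random.Random(seed)      # seeded so the run is repeatable
    ticket = set(range(1, m + 1))  # any fixed ticket is as good as another
    hits = 0
    for _ in range(trials):
        draw = rng.sample(range(1, n + 1), m)  # draw without replacement
        if set(draw) == ticket:
            hits += 1
    return hits / trials

rate = simulate_jackpot_rate(10, 4, trials=200_000)
print(rate, 1 / 210)  # the two values should be close
```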
Smaller prizes
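What about the probability of, say, matching 3 out of the 6 numbers chosen? Or in general, matching exactly S out of the M chosen? There are MCS ways to pick which S of our numbers come up, and (N-M)C(M-S) ways to pick the other M-S drawn balls from the N-M numbers we did not choose. Dividing the product by the NCM possible draws gives:
(MCS × (N-M)C(M-S)) ÷ NCM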
In our 10-ball example, if we need to match exactly 2 of our 4 chosen balls to win a smaller prize, the probability of success is:
(4C2 × 6C2) ÷ 10C4, or
(4!×6!×4!×6!) ÷ (2!×2!×2!×4!×10!), or
(3×6!) ÷ (10×9×8×7), or 3/7
In our more realistic 49-ball example, if we have to match exactly 3 balls to win, the probability is:
(6C3 × 43C3) ÷ 49C6, or
(6!×43!×6!×43!) ÷ (3!×3!×3!×40!×49!)
which boils down to about 0.01765, or around 1/57.
The probability of matching exactly 4 balls is:
(6C4 × 43C2) ÷ 49C6, or about 0.000969, around 1/1032.
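The smaller-prize formula translates directly into Python. In this sketch the function name match_probability is an assumption made for illustration, and math.comb stands in for the NCM notation used in the text.

```python
from fractions import Fraction
from math import comb

def match_probability(n, m, s):
    """Chance that exactly s of our m chosen numbers are among the m drawn from n."""
    # comb(m, s): which of our numbers come up.
    # comb(n - m, m - s): which unchosen numbers fill the rest of the draw.
    return Fraction(comb(m, s) * comb(n - m, m - s), comb(n, m))

print(match_probability(10, 4, 2))         # 3/7
print(float(match_probability(49, 6, 3)))  # about 0.01765
print(float(match_probability(49, 6, 4)))  # about 0.000969
```

A useful sanity check is that the probabilities for every possible number of matches, from 0 up to M, add up to exactly 1.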
The total probability of a prize is the sum of the probabilities for 3 balls, 4 balls, 5 balls and 6 balls. This comes to about 1/53.7 - so if you play this lottery once a week, you can expect to win something a bit less than once per year.
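The total can be reproduced by summing the match probabilities for 3, 4, 5 and 6 balls. This sketch assumes, as the text does, that prizes start at 3 matches and that one ticket is played each week; match_probability is a name invented for this example.

```python
from fractions import Fraction
from math import comb

def match_probability(n, m, s):
    """Chance that exactly s of our m chosen numbers are among the m drawn from n."""
    return Fraction(comb(m, s) * comb(n - m, m - s), comb(n, m))

# Probability of any prize: exactly 3, 4, 5 or 6 matches in a 6-from-49 lottery.
p_prize = sum(match_probability(49, 6, s) for s in range(3, 7))

print(float(p_prize))       # about 0.0186
print(float(1 / p_prize))   # about 53.7 plays per win, on average
print(float(52 * p_prize))  # expected wins in 52 weekly plays, about 0.97
```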