ichabod
Legned
NDS DROP D FORMULA
------------------
What is the formula for the probability of getting x if you roll n dice with s sides each, and drop the d lowest of them before summing them up? This question has popped up before on these boards, and was recently asked again. Since the last time it came up, I've been working on it in my spare time. I've learned some new techniques which should answer the problem, but I can't get them to. I thought I'd lay out what I've learned and what I've been thinking, and perhaps someone else here can figure out where my error was.
TECHNIQUES
I found this book, Counting: The Art of Enumerative Combinatorics, by George E. Martin. I've skimmed it and picked up some useful stuff. First off, there are many ways to look at counting problems. The one I've been using is putting balls into boxes. Think of each die as a box. The number that is rolled on the die is the number of balls that happen to be in the box. Alternatively, you can think of each pip on a die to be a ball in a box. So nds drop d is equivalent to how many ways we can put 1 to s balls in each of n boxes, get rid of the d boxes with the least balls in them, and have x balls in the remaining boxes.
NOTATION
Those of you who were in earlier dice formula threads (like this one) should be familiar with the n choose k notation:
Where n! is 1*2*3*...*n. When dealing with inline text, I'll notate this as (n k). Normally this is undefined for k > n. That's because (n k) equals the number of ways you can choose k objects from a set of n objects. You can't choose more objects than you've got. For the purposes of this thread, I'm setting (n k) = 0 if k > n. There are ways to get around doing this, but it makes things simpler. It also makes sense, as there are 0 ways to chose 5 objects from 4 objects.
There are some more notations that I got from Martin's book:
(a) is n choose k with replacement. That is, how many ways can you choose k objects from n objects, if after each choice you replace the object chosen, so it can be chosen again. My inline notation for n choose k with replacement will be <n k>.
(b) is the number of ways to partition r into n parts. That means how many ways can we get r by summing up n positive integers. Note 5 can be partitioned into two parts two ways: 4+1 and 3+2. 3+2 and 2+3 are not considered to be different partitions. My inline notation for r partitioned into n parts will be PI (r, n).
(c) is called a Stirling number of the second kind. It counts the number of ways r distinguishable objects can be put into n indistinguishable boxes, with no box empty. My inline notation for Stirling numbers of the second kind will be {r n}.
This distinguishable/indisguishable disctinction comes up quite a bit in enumerative combinatorics. Think about it this way: if you have one ball and three indistinguishable boxes, there is one way to put the ball in the boxes (one ball in one box). But if you can tell the boxes apart, there are three ways (ball in 1st box, ball in 2nd box, and ball in 3rd box).
I don't necessarily use all of these notations in this post, but I thought they'd be handy to have around for discussion.
HANDY TABLES
These tables are taken from the aforementioned Martin book, pages 13 and 38.
Table 1: Selecting r from n distinguishable objects.
Table 2: Put r balls into n boxes with no box empty
Table 3: Put r balls into n boxes.
MY THINKING
First, let's define some variables:
Now, it seems to me that the boxes should be distinguishable, and the balls should be indistinguishable. Each die can be a different color, and thus be distinguishable. But we don't care which pips are the dice, just how many.
Unfortunately, we cannot clearly distinguish the k kept boxes from the d dropped boxes. This is because you might have multiple instances of m, the smallest number kept. We need to think about three types of boxes: the kept boxes greater than m, the boxes equal to m (some kept, some maybe dropped), and the dropped boxes less than m.
So, how many kept boxes have more than m balls in them? It depends. You can get a 9 with (2, 2, 5), (4, 3, 2), and with (3, 3, 3). So, we'll sum i from 0 to k-1, because we could have all the boxes with m balls in them (and thus 0 with more than m), but we have to have at least one kept box with m balls in it (there must be a minimum).
What are we summing? We have k-i boxes with m balls in them. So we need to place (x-(k-i)*m) balls in the remaining i kept boxes. Note that in every case we are going to put at least m+1 balls into the remaining i kept boxes, because they each have to have more than m balls. There is only one way to put m+1 balls into each box. So this is equivalent to putting (x-(k-i)*m-i*(m+1)) balls into i boxes. This simplifies to (x-k*m-i) balls. Table 3 tells us this is <i x-k*m-i>.
Now we need to reconcile this with the n boxes that we have overall. What we have at this point is i boxes with greater than m balls in them, and n-i boxes with no balls in them (yet). Note that we've already distinguished the i boxes with balls in them. But since there are only i of them, this is not the same distinguishment as all n boxes. We need to slide the i distinguished boxes in to n slightly larger distinguished boxes.
For example, say i=2 and n=4. The i boxes are distinguished as A and B. The n boxes are distinguished as 1, 2, 3, and 4. How many ways can we match the letters A and B to the numbers 1 to 4? Tweleve (four possible assingments for A, and for each A assignment three possible assignments of B). In general, we're looking for the number of permutations without repetition. So we get n!/(n-i)! from Table 1.
http://www.xenomind.com/enworld/parform3.gif>
Okay, but that's just one of the three sets of dice we have to deal with. Now, how many boxes have m balls in them? At least k-i, but also potentially sum, all, or none of the d dropped boxes. So k-i to k+d-i boxes. Now, there's only one way to put m balls in all of those boxes. However, we will again have to assign those to the n distinguishable boxes (n-i at this point, since we already filled i of them with m+1 balls).
But what if we first deal with the boxes that have less than m balls in them? If we completely deal with those, all we will have to do is put exactly m balls in each of the boxes that are left empty. But there's only going to be one way to do that! So if we can figure out how to deal with the boxes with less than m balls in them, the formula is finished.
How many boxes have less than m balls in them? Maybe none: it may be that all the dropped boxes have the same number of balls as the lowest number in any kept box. On the other hand, the rest of the kept dice have to have m balls in them, but none of the other dropped ones have to. So we could have 0 to d boxes with less than m balls in them. Well, there's m-1 ways to put less than m balls in one box and have at least one ball in the box. There's (m-1)*(m-1) ways to do that for two distinguishable boxes, and (m-1)^j ways to do that for j boxes. Now slide those j boxes into the n-i empty distinguishable boxes to get:
[img]http://www.xenomind.com/enworld/parform4.gif
which simplifies to:
Almost there. Up until now, m has been just a variable, but we need to account for all of the possible values of m. What can m be? Any number that can be rolled (1 to s):
Now that pesky number 0 is going to cause some problems here. If m = 1, m-1 = 0, and we get that there are 0 ways to put 1 ball in each of j distinguishable boxes. Obviously wrong, there's one way to do that. So let's calculate m = 1 separately, making the summation over m go from 2 to s:
"Wow!" I can hear you saying. "Beautiful!" reads the headline in tommorow's paper. "Amazing!" shout the throngs of probabilists partying in the streets.
Except it doesn't work.
Anyone got any ideas? Help?
------------------
What is the formula for the probability of getting x if you roll n dice with s sides each, and drop the d lowest of them before summing them up? This question has popped up before on these boards, and was recently asked again. Since the last time it came up, I've been working on it in my spare time. I've learned some new techniques which should answer the problem, but I can't get them to. I thought I'd lay out what I've learned and what I've been thinking, and perhaps someone else here can figure out where my error was.
TECHNIQUES
I found this book, Counting: The Art of Enumerative Combinatorics, by George E. Martin. I've skimmed it and picked up some useful stuff. First off, there are many ways to look at counting problems. The one I've been using is putting balls into boxes. Think of each die as a box. The number that is rolled on the die is the number of balls that happen to be in the box. Alternatively, you can think of each pip on a die to be a ball in a box. So nds drop d is equivalent to how many ways we can put 1 to s balls in each of n boxes, get rid of the d boxes with the least balls in them, and have x balls in the remaining boxes.
NOTATION
Those of you who were in earlier dice formula threads (like this one) should be familiar with the n choose k notation:

Where n! is 1*2*3*...*n. When dealing with inline text, I'll notate this as (n k). Normally this is undefined for k > n. That's because (n k) equals the number of ways you can choose k objects from a set of n objects. You can't choose more objects than you've got. For the purposes of this thread, I'm setting (n k) = 0 if k > n. There are ways to get around doing this, but it makes things simpler. It also makes sense, as there are 0 ways to chose 5 objects from 4 objects.
There are some more notations that I got from Martin's book:

(a) is n choose k with replacement. That is, how many ways can you choose k objects from n objects, if after each choice you replace the object chosen, so it can be chosen again. My inline notation for n choose k with replacement will be <n k>.
(b) is the number of ways to partition r into n parts. That means how many ways can we get r by summing up n positive integers. Note 5 can be partitioned into two parts two ways: 4+1 and 3+2. 3+2 and 2+3 are not considered to be different partitions. My inline notation for r partitioned into n parts will be PI (r, n).
(c) is called a Stirling number of the second kind. It counts the number of ways r distinguishable objects can be put into n indistinguishable boxes, with no box empty. My inline notation for Stirling numbers of the second kind will be {r n}.
This distinguishable/indisguishable disctinction comes up quite a bit in enumerative combinatorics. Think about it this way: if you have one ball and three indistinguishable boxes, there is one way to put the ball in the boxes (one ball in one box). But if you can tell the boxes apart, there are three ways (ball in 1st box, ball in 2nd box, and ball in 3rd box).
I don't necessarily use all of these notations in this post, but I thought they'd be handy to have around for discussion.
HANDY TABLES
These tables are taken from the aforementioned Martin book, pages 13 and 38.

Table 1: Selecting r from n distinguishable objects.

Table 2: Put r balls into n boxes with no box empty

Table 3: Put r balls into n boxes.
MY THINKING
First, let's define some variables:
- n = number of dice (boxes)
- s = numbers of sides for each die (number of balls that can fit into each box).
- d = number of dice (boxes) dropped.
- k = number of dice (boxes) kept. k = n - d.
- x = value we are checking the number of combinations for (number of balls in kept boxes).
- m = the smallest number kept.
Now, it seems to me that the boxes should be distinguishable, and the balls should be indistinguishable. Each die can be a different color, and thus be distinguishable. But we don't care which pips are the dice, just how many.
Unfortunately, we cannot clearly distinguish the k kept boxes from the d dropped boxes. This is because you might have multiple instances of m, the smallest number kept. We need to think about three types of boxes: the kept boxes greater than m, the boxes equal to m (some kept, some maybe dropped), and the dropped boxes less than m.
So, how many kept boxes have more than m balls in them? It depends. You can get a 9 with (2, 2, 5), (4, 3, 2), and with (3, 3, 3). So, we'll sum i from 0 to k-1, because we could have all the boxes with m balls in them (and thus 0 with more than m), but we have to have at least one kept box with m balls in it (there must be a minimum).

What are we summing? We have k-i boxes with m balls in them. So we need to place (x-(k-i)*m) balls in the remaining i kept boxes. Note that in every case we are going to put at least m+1 balls into the remaining i kept boxes, because they each have to have more than m balls. There is only one way to put m+1 balls into each box. So this is equivalent to putting (x-(k-i)*m-i*(m+1)) balls into i boxes. This simplifies to (x-k*m-i) balls. Table 3 tells us this is <i x-k*m-i>.

Now we need to reconcile this with the n boxes that we have overall. What we have at this point is i boxes with greater than m balls in them, and n-i boxes with no balls in them (yet). Note that we've already distinguished the i boxes with balls in them. But since there are only i of them, this is not the same distinguishment as all n boxes. We need to slide the i distinguished boxes in to n slightly larger distinguished boxes.
For example, say i=2 and n=4. The i boxes are distinguished as A and B. The n boxes are distinguished as 1, 2, 3, and 4. How many ways can we match the letters A and B to the numbers 1 to 4? Tweleve (four possible assingments for A, and for each A assignment three possible assignments of B). In general, we're looking for the number of permutations without repetition. So we get n!/(n-i)! from Table 1.
http://www.xenomind.com/enworld/parform3.gif>
Okay, but that's just one of the three sets of dice we have to deal with. Now, how many boxes have m balls in them? At least k-i, but also potentially sum, all, or none of the d dropped boxes. So k-i to k+d-i boxes. Now, there's only one way to put m balls in all of those boxes. However, we will again have to assign those to the n distinguishable boxes (n-i at this point, since we already filled i of them with m+1 balls).
But what if we first deal with the boxes that have less than m balls in them? If we completely deal with those, all we will have to do is put exactly m balls in each of the boxes that are left empty. But there's only going to be one way to do that! So if we can figure out how to deal with the boxes with less than m balls in them, the formula is finished.
How many boxes have less than m balls in them? Maybe none: it may be that all the dropped boxes have the same number of balls as the lowest number in any kept box. On the other hand, the rest of the kept dice have to have m balls in them, but none of the other dropped ones have to. So we could have 0 to d boxes with less than m balls in them. Well, there's m-1 ways to put less than m balls in one box and have at least one ball in the box. There's (m-1)*(m-1) ways to do that for two distinguishable boxes, and (m-1)^j ways to do that for j boxes. Now slide those j boxes into the n-i empty distinguishable boxes to get:
[img]http://www.xenomind.com/enworld/parform4.gif
which simplifies to:

Almost there. Up until now, m has been just a variable, but we need to account for all of the possible values of m. What can m be? Any number that can be rolled (1 to s):

Now that pesky number 0 is going to cause some problems here. If m = 1, m-1 = 0, and we get that there are 0 ways to put 1 ball in each of j distinguishable boxes. Obviously wrong, there's one way to do that. So let's calculate m = 1 separately, making the summation over m go from 2 to s:

"Wow!" I can hear you saying. "Beautiful!" reads the headline in tommorow's paper. "Amazing!" shout the throngs of probabilists partying in the streets.
Except it doesn't work.
Anyone got any ideas? Help?
Last edited: