5D6 drop the lowest two (math ?)

Nathan

First Post
Analytical solution

An approach to an analytic solution:

Let us assume we have n dice with d sides each, numbered from 1...d. (In your case n = 5, d = 6). We want to drop the lowest k dice (k = 2 in your case).

How many ways are there to roll these dice? In total, there are n^d = n * ... * n (d times) outcomes

What we do now is to add up the totals (highest n - k) of all the n^d outcomes. Afterwards, we have to divide by n^d to get the average.

To be able to do this summation, let us first classify all the outcomes.

First of all, we classify them by a number l which shall be the highest number of the outcome that is to be dropped.

This way, we get n classes as l can run from 1 to d. In the example 5d6, drop lowest two, we have for example:

l(1 2 5 3 2) = 2, l(6 6 6 6 6) = 6, etc.

We now subdivide this classification further by the number r of dice that are lower than l. We see that r can run from 0 to n - 1. In our examples:

r(1 2 5 3 2) = 1, r(6 6 6 6 6) = 0.

Now, all the outcomes are classified by a pair (l, r). This classification is still too coarse. We introduce another number u, the number of dice that show l. u can run from 1 to n - r. Examples:

u(1 2 5 3 2) = 2, u(6 6 6 6 6) = 5.

Our outcomes are now classified by triples (l, r, u). This is fine enough to proceed.

Our summation would be now as follows: we sum over all possible classes given by a triple (l, r, u) the sum S(l, r, u) of all totals of all outcomes belonging to the class (l, r, u).

So let us fix (l, r, u), l = 1...d, r = 0...n-1, and u = 1...n-r.

What is the sum S(l, r, u)? To answer this, let us take a look at how an outcome belonging to the class (l, r, u) looks like.

It looks like:

A A A A B B B B B C C C C C C

where the A are dice lower than l, the B mark the dice that show l and the C are dice higher than l. We have r times A, u times B, and therefore n - u - r times C. However, we have been a little bit inexact here! The dice usually don't come sorted this way when they are rolled one after the other! But our row as given above is sorted: at the beginning all the dice lower than l, at the end all the dice higher. Anyway, the total doesn't depend on the sort order, so we can work with sorted outcomes like the one above. However, then, we have to take the number of outcomes that can be sorted into something like the above into account.

Let us define the binomial coefficient

(a over b)

by

(a over b) := 1*2*3*4*5*...*a/(1*2*3*...*b*1*2*...*(a-b))

Combinatorics tell us that the number of outcomes of class (l, r, u) that can be sorted as above is given by (n over r) * ((n - r) over u). Let us call this number M1(r, u), the M standing for multiplicity.

All the r dice below l don't count for the total so let us neglect them. However, this introduces another multiplicity in our formula: each set of all sorted outcomes of class (l, r, u) differing only in the lowest r dice consists of (l-1)^r members all leading the same total. Let us denote this number my M2(l, r).

So, now we have to deal only with rows like:

B B B B C C C C C C C,

where the number of B's is l, the number of C's being n - u - r.

What is the total T(l, u, r) of all these rows? The dice B contribute l * (u - (k - r)) to the total as (k - r) dice showing l shall be dropped. The dice C contribute anything from l + 1 to d to the total. Recall that there are (n - u - r) dice of type C.

By the Gaußian summation formula, we have

T(l, u, r) = l * (u - (k - r)) + (n - u - r) * ((d + 1) * d / 2 - (l + 1) * l / 2).

Now, let us throw everything together:

The average is

1/n^d * [the sum over l = 1...r of the sum over r = 0...k-1 of the sum over u = 1..n - r of (n over r) * ((n - r) over u) * (l - 1)^r * (l*(u - (k - r)) + (n - u - r) * ((d + 1) * d / 2 - (l + 1) * l / 2)].

I haven't checked the formula by explicit calculations but it should work if I haven't forgotten anything in the deduction and haven't made any typing errors.

Maybe someone has time to check the formula.

Nathan
 

log in or register to remove this ad

Re: Analytical solution

Nathan said:

Maybe someone has time to check the formula.

Nathan

You bastard, you almost made my head explose!

Seriously, I'll try to read and understand that this week-end. Did I notice Pascal's triangle in there?

TS
 

dcollins

Explorer
Re: Analytical solution

Nathan said:
The average is

1/n^d * [the sum over l = 1...r of the sum over r = 0...k-1 of the sum over u = 1..n - r of (n over r) * ((n - r) over u) * (l - 1)^r * (l*(u - (k - r)) + (n - u - r) * ((d + 1) * d / 2 - (l + 1) * l / 2)].

Err... if the answer invloves a "summation of a summation of a summation", it seems like you may have just re-created the need to individually count each possible result.

Moreover, the thing we really need isn't just an average, what would be ideal is a mass probability function so people can evaluate the likelihood of different results. (People have been asking for this for years and I've never seen such a formula to date.)
 

Nathan

First Post
Re: Re: Analytical solution

dcollins said:


Err... if the answer invloves a "summation of a summation of a summation", it seems like you may have just re-created the need to individually count each possible result.


No, I haven't. In my formula, there are always only three summations, independent of the numbers n, k, and d. And this isn't bad, and quite normal for an analytic formula calculating an average.

However, just counting each possible result would lead to a growing number of summations if n, k, and d grow.



Moreover, the thing we really need isn't just an average, what would be ideal is a mass probability function so people can evaluate the likelihood of different results. (People have been asking for this for years and I've never seen such a formula to date.)

So you are asking for the likelihood of a certain result? There should be no problem to get such a formula if you follow the deduction steps of my formula above.
 

MeanGenes

First Post
I'm going to attempt this the idiot way.


3d6 has an average of 10.5 (Assume a perfect distribution of 1, 3.5, and 6. 1+ 3.5 + 6 = 10.5)

4d6 has an average of 13 (Assume a perfect distribution of 1, 2.67, 4.33, and 6. Drop the 1. 2.67 + 4.33 + 6 = 13)

5d6: Assume a perfect distribution of 1, 2.25, 3.5, 4.75, and 6. Drop the 1 and 2.25 and you get 3.5 + 4.75 + 6 = 14.25

6d6: Assume a perfectly balanced roll 1,2,3,4,5,6. Drop the 1,2, and 3. 4 + 5+ 6 = 15.

I'm not going to bother with 7d6. :)
 

Nathan

First Post
MeanGenes said:
I'm going to attempt this the idiot way.


3d6 has an average of 10.5 (Assume a perfect distribution of 1, 3.5, and 6. 1+ 3.5 + 6 = 10.5)

4d6 has an average of 13 (Assume a perfect distribution of 1, 2.67, 4.33, and 6. Drop the 1. 2.67 + 4.33 + 6 = 13)

5d6: Assume a perfect distribution of 1, 2.25, 3.5, 4.75, and 6. Drop the 1 and 2.25 and you get 3.5 + 4.75 + 6 = 14.25

6d6: Assume a perfectly balanced roll 1,2,3,4,5,6. Drop the 1,2, and 3. 4 + 5+ 6 = 15.

I'm not going to bother with 7d6. :)

Your method won't lead to the correct results but should give a good approximation.

With respect to my last comment on how to get a formula for the likelihood of a certain total N:

First you have to find a formula for C(l, r, u), the number of ways to add up n - u - r numbers from 1...d to N - (n - k) * l.

Then the likelihood of an outcome N should be:

1/n^d * [the sum over l = 1...r of the sum over r = 0...k-1 of the sum over u = 1..n - r of (n over r) * ((n - r) over u) * (l - 1)^r * C(l, r, u)].
 

MeanGenes

First Post
Just curious...why won't my method work? I'm not a math major or anything, but it seems to me that if you assume that all your rolls will eventually average out to a "perfect distribution" (1, 2.25, 3.5, 4.25, and 6), then couldn't you just use that as a shortcut rather than going through all the math and adding up all the possibilities? My method works for 3d6 and 4d6, why not 5d6 and so on?

I'll be curious to see what results you get with your method and how close my method is.
 

Nathan

First Post
MeanGenes said:
Just curious...why won't my method work? I'm not a math major or anything, but it seems to me that if you assume that all your rolls will eventually average out to a "perfect distribution" (1, 2.25, 3.5, 4.25, and 6), then couldn't you just use that as a shortcut rather than going through all the math and adding up all the possibilities? My method works for 3d6 and 4d6, why not 5d6 and so on?

I'll be curious to see what results you get with your method and how close my method is.

I don't have a programme to hack my formula into at the moment, so I still hope that someone else will be doing this.

The initial poster mentioned that he calculated an average of 13.4 by brute force for 5d6, drop lowest 2. As your method gives a 13 here, we see that your formula can only be an approximation.

I can also give you an easy example: Let us assume we are throwing 2d2, dropping the lowest one.

There are for possible outcomes (given with totals):

1, 1 => 1
1, 2 => 2
2, 1 => 2
2, 2 => 2.

So the average will be:

7/4 = 1.75.

However, your method gives:

1, 2 => drop 1 => average 2, slightly higher than 1.75.
 

MeanGenes,

The exact mean for 5d6 drop 2 is 13.43.

Why did you use that precise distribution? Other distributions are as logical, and do not produce the same results, for example:

3d6: 1,8333 ; 3,5 ; 5,1666 (sum = 10,5)
4d6: 1,625 ; 2,875 ; 4,125 ; 5,375 (sum 3 best = 12,375)
5d6: 1,5 ; 2,5 ; 3,5 ; 4,5 ; 5,5 (sum 3 best = 13,5)

In fact, my 5d6 distribution seems to be closer to the truth that yours :)

The reason your distribution is "flawed", IMO, is that random distributions won't go into "extremes", and that's why a more centralized distribution will produce results that are closer to reality...

TS
 

Gizzard

First Post
...what would be ideal is a mass probability function so people can evaluate the likelihood of different results. (People have been asking for this for years and I've never seen such a formula to date.)

I'm not sure I understand what you are asking, but I use a Hypergeometric Distribution to calculate the probability of classes of important results.

As a practical matter, I use it for analyzing card distribtution in MtG; it answers questions like "What's the probabilty of getting a Workshop, any Mox and a Juggernaut in hand by the time I have drawn 8 cards?" The form of the distribution I use works for problems where there is no "replacement", ie, a drawn card is removed from future draw possibilities. I thought there was a related distribution that allowed replacement, but Hypergeometric Distribution might be a good place to start.
 

Remove ads

Top