Analytical solution
An approach to an analytic solution:
Let us assume we have n dice with d sides each, numbered from 1...d. (In your case n = 5, d = 6). We want to drop the lowest k dice (k = 2 in your case).
How many ways are there to roll these dice? In total, there are n^d = n * ... * n (d times) outcomes
What we do now is to add up the totals (highest n - k) of all the n^d outcomes. Afterwards, we have to divide by n^d to get the average.
To be able to do this summation, let us first classify all the outcomes.
First of all, we classify them by a number l which shall be the highest number of the outcome that is to be dropped.
This way, we get n classes as l can run from 1 to d. In the example 5d6, drop lowest two, we have for example:
l(1 2 5 3 2) = 2, l(6 6 6 6 6) = 6, etc.
We now subdivide this classification further by the number r of dice that are lower than l. We see that r can run from 0 to n - 1. In our examples:
r(1 2 5 3 2) = 1, r(6 6 6 6 6) = 0.
Now, all the outcomes are classified by a pair (l, r). This classification is still too coarse. We introduce another number u, the number of dice that show l. u can run from 1 to n - r. Examples:
u(1 2 5 3 2) = 2, u(6 6 6 6 6) = 5.
Our outcomes are now classified by triples (l, r, u). This is fine enough to proceed.
Our summation would be now as follows: we sum over all possible classes given by a triple (l, r, u) the sum S(l, r, u) of all totals of all outcomes belonging to the class (l, r, u).
So let us fix (l, r, u), l = 1...d, r = 0...n-1, and u = 1...n-r.
What is the sum S(l, r, u)? To answer this, let us take a look at how an outcome belonging to the class (l, r, u) looks like.
It looks like:
A A A A B B B B B C C C C C C
where the A are dice lower than l, the B mark the dice that show l and the C are dice higher than l. We have r times A, u times B, and therefore n - u - r times C. However, we have been a little bit inexact here! The dice usually don't come sorted this way when they are rolled one after the other! But our row as given above is sorted: at the beginning all the dice lower than l, at the end all the dice higher. Anyway, the total doesn't depend on the sort order, so we can work with sorted outcomes like the one above. However, then, we have to take the number of outcomes that can be sorted into something like the above into account.
Let us define the binomial coefficient
(a over b)
by
(a over b) := 1*2*3*4*5*...*a/(1*2*3*...*b*1*2*...*(a-b))
Combinatorics tell us that the number of outcomes of class (l, r, u) that can be sorted as above is given by (n over r) * ((n - r) over u). Let us call this number M1(r, u), the M standing for multiplicity.
All the r dice below l don't count for the total so let us neglect them. However, this introduces another multiplicity in our formula: each set of all sorted outcomes of class (l, r, u) differing only in the lowest r dice consists of (l-1)^r members all leading the same total. Let us denote this number my M2(l, r).
So, now we have to deal only with rows like:
B B B B C C C C C C C,
where the number of B's is l, the number of C's being n - u - r.
What is the total T(l, u, r) of all these rows? The dice B contribute l * (u - (k - r)) to the total as (k - r) dice showing l shall be dropped. The dice C contribute anything from l + 1 to d to the total. Recall that there are (n - u - r) dice of type C.
By the Gaußian summation formula, we have
T(l, u, r) = l * (u - (k - r)) + (n - u - r) * ((d + 1) * d / 2 - (l + 1) * l / 2).
Now, let us throw everything together:
The average is
1/n^d * [the sum over l = 1...r of the sum over r = 0...k-1 of the sum over u = 1..n - r of (n over r) * ((n - r) over u) * (l - 1)^r * (l*(u - (k - r)) + (n - u - r) * ((d + 1) * d / 2 - (l + 1) * l / 2)].
I haven't checked the formula by explicit calculations but it should work if I haven't forgotten anything in the deduction and haven't made any typing errors.
Maybe someone has time to check the formula.
Nathan
An approach to an analytic solution:
Let us assume we have n dice with d sides each, numbered from 1...d. (In your case n = 5, d = 6). We want to drop the lowest k dice (k = 2 in your case).
How many ways are there to roll these dice? In total, there are n^d = n * ... * n (d times) outcomes
What we do now is to add up the totals (highest n - k) of all the n^d outcomes. Afterwards, we have to divide by n^d to get the average.
To be able to do this summation, let us first classify all the outcomes.
First of all, we classify them by a number l which shall be the highest number of the outcome that is to be dropped.
This way, we get n classes as l can run from 1 to d. In the example 5d6, drop lowest two, we have for example:
l(1 2 5 3 2) = 2, l(6 6 6 6 6) = 6, etc.
We now subdivide this classification further by the number r of dice that are lower than l. We see that r can run from 0 to n - 1. In our examples:
r(1 2 5 3 2) = 1, r(6 6 6 6 6) = 0.
Now, all the outcomes are classified by a pair (l, r). This classification is still too coarse. We introduce another number u, the number of dice that show l. u can run from 1 to n - r. Examples:
u(1 2 5 3 2) = 2, u(6 6 6 6 6) = 5.
Our outcomes are now classified by triples (l, r, u). This is fine enough to proceed.
Our summation would be now as follows: we sum over all possible classes given by a triple (l, r, u) the sum S(l, r, u) of all totals of all outcomes belonging to the class (l, r, u).
So let us fix (l, r, u), l = 1...d, r = 0...n-1, and u = 1...n-r.
What is the sum S(l, r, u)? To answer this, let us take a look at how an outcome belonging to the class (l, r, u) looks like.
It looks like:
A A A A B B B B B C C C C C C
where the A are dice lower than l, the B mark the dice that show l and the C are dice higher than l. We have r times A, u times B, and therefore n - u - r times C. However, we have been a little bit inexact here! The dice usually don't come sorted this way when they are rolled one after the other! But our row as given above is sorted: at the beginning all the dice lower than l, at the end all the dice higher. Anyway, the total doesn't depend on the sort order, so we can work with sorted outcomes like the one above. However, then, we have to take the number of outcomes that can be sorted into something like the above into account.
Let us define the binomial coefficient
(a over b)
by
(a over b) := 1*2*3*4*5*...*a/(1*2*3*...*b*1*2*...*(a-b))
Combinatorics tell us that the number of outcomes of class (l, r, u) that can be sorted as above is given by (n over r) * ((n - r) over u). Let us call this number M1(r, u), the M standing for multiplicity.
All the r dice below l don't count for the total so let us neglect them. However, this introduces another multiplicity in our formula: each set of all sorted outcomes of class (l, r, u) differing only in the lowest r dice consists of (l-1)^r members all leading the same total. Let us denote this number my M2(l, r).
So, now we have to deal only with rows like:
B B B B C C C C C C C,
where the number of B's is l, the number of C's being n - u - r.
What is the total T(l, u, r) of all these rows? The dice B contribute l * (u - (k - r)) to the total as (k - r) dice showing l shall be dropped. The dice C contribute anything from l + 1 to d to the total. Recall that there are (n - u - r) dice of type C.
By the Gaußian summation formula, we have
T(l, u, r) = l * (u - (k - r)) + (n - u - r) * ((d + 1) * d / 2 - (l + 1) * l / 2).
Now, let us throw everything together:
The average is
1/n^d * [the sum over l = 1...r of the sum over r = 0...k-1 of the sum over u = 1..n - r of (n over r) * ((n - r) over u) * (l - 1)^r * (l*(u - (k - r)) + (n - u - r) * ((d + 1) * d / 2 - (l + 1) * l / 2)].
I haven't checked the formula by explicit calculations but it should work if I haven't forgotten anything in the deduction and haven't made any typing errors.
Maybe someone has time to check the formula.
Nathan