ichabod
Legned
A while back there was a thread about testing dice. IIRC, the best anyone really came up with is the chi-squared goodness of fit test. I find that test unsatisfactory because it has low power. That means while it may be good at saying a fair die is fair, it is poor at saying that a biased die is biased. It will frequently conclude that a biased die is fair. In order to counteract this low power, you have to make a hideous number of die rolls.
The thing about the chi-squared test is that it checks the entire distribution. It makes sure the right number of 1's are being rolled, and the right number of 2's, and the right number of 3's, and so on. But in our situation, as d20 gamers, do we really need to test that much? I say no. I think we are really concerned with whether the die rolls too high on average. Looking only at that criterion, we can get a higher power test with a reasonable number of die rolls. How many die rolls do we need?
If you are not interested in the math, skip the next paragraph.
Take s to be the number of sides on the die, so for a d20 s = 20. Let d be the difference in the mean that we want to detect. In other words, how much bias is acceptable. Finally, n will be the number of die rolls needed to get 95% confidence in the results of the test. Since a fair die is a discrete uniform distribution with parameter N = s, we know that it has E(X) = (s + 1) / 2, and Var(X) = (s^2 - 1) / 12. The central limit theorem tells us that the mean of n rolls will be distributed approximately normally, with mean equal to E(X) and variance equal to Var(X) / n. Since we are only worried about high results, we can use a one-sided confidence interval. We want the high end of the interval to be E(X) + d. The high end of a one-sided 95% confidence interval is E(X) + 1.645 * sqrt(Var(X) / n). So d = 1.645 * sqrt(Var(X) / n) = 1.645 * sqrt((s^2 - 1) / (12n)). Solving for n gives us n = (1.645^2 * (s^2 - 1)) / (12 * d^2). Now the variance of a die biased by d will be a bit higher, so the power will be lower than 95%, but not by much. The only problem would be if the variance of the biased die was really high, which should be obvious when recording a significant number of die rolls.
So, to test the die, determine how big a difference in the mean (average) roll you want to detect, and call it d. Look up d and the type of die in the table below to get the number of rolls:
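For anyone who would rather compute n than read it off a table, here is a minimal Python sketch of the formula above. The function name required_rolls and its parameter names are my own, not anything standard, and rounding up to a whole roll is my assumption about how the numbers were rounded.

```python
import math

def required_rolls(sides, d, z=1.645):
    """Rolls needed so that z standard errors of the mean equal d.

    z = 1.645 is the one-sided 95% value used in the derivation above.
    """
    variance = (sides**2 - 1) / 12  # variance of one roll of a fair die
    return math.ceil(z**2 * variance / d**2)  # round up to a whole roll

# d20, detecting a shift of 0.5 in the mean
print(required_rolls(20, 0.5))  # 360
```

Passing a different z covers other confidence levels without changing anything else.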
Code:
Die      d=1  d=0.5 d=0.25  d=0.1
---------------------------------
d4         4     14     55    339
d6         8     32    127    790
d8        15     57    228   1421
d10       23     90    358   2233
d12       33    129    516   3225
d20       90    360   1440   8998
Roll the die the number of times indicated on the table. Take the mean of all the numbers rolled. For a fair die it should be close to (s + 1) / 2, where s is the number of sides of the die. If the observed mean minus d is greater than (s + 1) / 2, there's a good chance the die is biased. Chuck it, preferably at the person who was using it. Note that this test will call a fair die biased about 5% of the time.
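The check itself is one line of arithmetic. Here is a rough Python sketch of it; the name rolls_too_high is made up for illustration, and the simulated rolls are only a stand-in for rolls you would record by hand from a real die.

```python
import random
from statistics import mean

def rolls_too_high(rolls, sides, d):
    """True if the observed mean minus d is greater than the fair mean."""
    fair_mean = (sides + 1) / 2
    return mean(rolls) - d > fair_mean

# Stand-in data: 360 simulated rolls of a fair d20 (the table entry for d = 0.5).
rng = random.Random(1)
rolls = [rng.randint(1, 20) for _ in range(360)]
print(mean(rolls), rolls_too_high(rolls, 20, 0.5))
```

With real data, replace the simulated list with the numbers you wrote down.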
So how big should d be? I think 0.5 is pretty good, but 0.25 will detect any die that is rolling its highest value twice as often as it should. Another way to look at it is that 0.5 will detect a d20 that gives an extra 2.5% chance to succeed at a to-hit roll, save, or skill check.
The original thread I referred to was trying to see if you had bad luck. You just do the test in reverse: take the observed mean and add d. If that is less than (s + 1) / 2, there's a good chance you have bad luck. Be sure to roll several different dice with the same number of sides to distinguish your luck from any bias in the individual dice.
If you are concerned about both low and high bias, you will need to make more die rolls to have the same level of confidence. You can use the above numbers, but the test will be wrong about 10% of the time. If you want to be wrong about 5% of the time, use this table for the number of rolls:
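To put a number on the "twice as often" case, here is a small Python sketch. It assumes the extra probability on the top face is taken evenly from the other faces, which is my assumption and only one way a die could be loaded. Under that assumption the mean shifts by exactly 0.5 whatever the number of sides, which clears a d of 0.25 comfortably.

```python
from fractions import Fraction

def mean_with_doubled_top_face(sides):
    """Mean roll when the top face comes up twice as often as it should
    and the remaining faces share what is left equally (an assumption)."""
    top = Fraction(2, sides)            # doubled chance of the top face
    other = (1 - top) / (sides - 1)     # chance of each remaining face
    return sum(face * other for face in range(1, sides)) + sides * top

for sides in (4, 6, 20):
    shift = mean_with_doubled_top_face(sides) - Fraction(sides + 1, 2)
    print(sides, shift)  # the shift is 1/2 for each die
```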
Code:
Die      d=1  d=0.5 d=0.25  d=0.1
---------------------------------
d4         5     20     77    481
d6        12     45    180   1121
d8        21     81    323   2017
d10       32    127    508   3170
d12       46    184    733   4578
d20      128    511   2044  12774
For the mathematically inclined, the formula for this table is n = (1.96^2 * (s^2 - 1)) / (12 * d^2), where 1.96 is the two-sided 95% value that replaces the one-sided 1.645.
So for this test, you again determine d, then roll the number of times indicated on the table for the type of die you are testing. If the difference between (s + 1) / 2 and the mean of all your die rolls is greater than d, the die is biased in the direction indicated by the sign of the difference.
And remember, these tests only test for dice that roll too high or too low on average. They will not detect dice with the same average as a fair die. For instance, dice that roll 2's and 5's much more frequently than normal are biased for craps, but these tests will not detect them.
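The two-sided check can be sketched in Python along these lines. The function name bias_direction and the 'high' / 'low' / None return values are my own choices for illustration.

```python
from statistics import mean

def bias_direction(rolls, sides, d):
    """Return 'high' or 'low' if the observed mean is more than d away
    from the fair mean (s + 1) / 2, otherwise None."""
    diff = mean(rolls) - (sides + 1) / 2
    if diff > d:
        return "high"
    if diff < -d:
        return "low"
    return None

# Tiny made-up samples, far too short for a real test, just to show the outputs.
print(bias_direction([6, 6, 5, 6], 6, 0.5))  # high
print(bias_direction([1, 2, 1, 2], 6, 0.5))  # low
print(bias_direction([3, 4], 6, 0.5))        # None
```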
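Here is a quick Python illustration of that blind spot. The probabilities are invented for the example: a hypothetical d6 that rolls 2 and 5 a third of the time each, with the rest split evenly over the other faces. Its mean comes out to exactly 3.5, the same as a fair d6, so a test on the mean has nothing to latch onto.

```python
from fractions import Fraction

# Hypothetical loaded d6: 2 and 5 come up 1/3 of the time each,
# the other four faces 1/12 of the time each.
probs = {
    1: Fraction(1, 12),
    2: Fraction(1, 3),
    3: Fraction(1, 12),
    4: Fraction(1, 12),
    5: Fraction(1, 3),
    6: Fraction(1, 12),
}
assert sum(probs.values()) == 1  # sanity check: probabilities add up

loaded_mean = sum(face * p for face, p in probs.items())
print(loaded_mean)  # 7/2, identical to the fair mean (6 + 1) / 2
```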
Oblikatori centance misspeled myphusmaze fur.