Monte Carlo versus "The Math"

Truename · Nov 1, 2009

There's been a lot of criticism of the 4e "math" lately. No biggie, really, but one thing has been bugging me. The analysis is incredibly shallow--nothing more than comparing one number (of dozens!) to another--and it's combined with this bellicose attitude and slapping down of anyone that dares disagree. Ptui.

The 4e math may be broken, or it may not be. The existing analyses don't prove anything. Play experiences back up the opinion of whoever's speaking. I'd like something more substantial.

Besides, I'm a programmer. This is my idea of fun.

Okay, so the purpose of this thread is to do a stochastic analysis of the 4e math, using something called a Monte-Carlo simulation. Basically, what we do is write a program that simulates a 4e fight, random dice rolls and all, and we run it a few hundred thousand times. From that, we get a complete view of everything that can happen. Not just the average case, but the full range of possibilities. Because it's a simulation, we can incorporate a lot more variables than the typical DPR calculation.

This is a big project, so I'll be taking it one piece at a time. I hope you'll join in with your comments. First up: a look at how many rounds it takes to kill a monster.

bonus · Nov 1, 2009

A tool for 4e fight simulations and statistical experimentation would be pretty cool, although take note that if you try to simulate everything that can happen in a fight, your program would have an exponential asymptotic time complexity. Also, you'd have to implement the whole 4e rule system, which would be very hard to say the least. But if you start small and build up, you could produce some very cool things, especially if this is undertaken as a collaborative effort. So, yay!

Truename · Nov 1, 2009

bonus said:
A tool for 4e fight simulations and statistical experimentation would be pretty cool, although take note that if you try to simulate everything that can happen in a fight, your program would have an exponential asymptotic time complexity. Also, you'd have to implement the whole 4e rule system, which would be very hard to say the least. But if you start small and build up, you could produce some very cool things, especially if this is undertaken as a collaborative effort. So, yay!

There's a lot to do, for sure! I think the model will have to make some simplifying assumptions, but I've made good progress so far. I'm definitely hoping this will attract a lot of collaboration--it's too much work to continue if nobody else is interested.

Elric · Nov 1, 2009

Truename said:
This is a big project, so I'll be taking it one piece at a time. I hope you'll join in with your comments. First up: a look at basic hit probabilities.

Unfortunately, there's no way to derive a "basic scenario" for D&D combat given the range of characters that could be played and tactics that could be employed. There's enough variation in characters that the choices here will drive the conclusion. Monsters have some variation as well.

The question "Suppose I have a party with hit bonuses H who will use XYZ abilities in a preset order against monsters that all have defense M and equal hit points (and sit there without attacking back). What will their overall chance of hitting be?" would be more reasonably answered.
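A rough sketch of the single-attack version of that question, in Ruby. It assumes one d20 roll against a fixed defense, with a natural 20 always hitting and a natural 1 always missing. The name `hit_chance` and the +7 vs. 17 numbers are only examples.

```ruby
# Chance that a d20 attack with the given bonus hits the given defense.
# Hypothetical helper; assumes natural 20 always hits, natural 1 always misses.
def hit_chance(bonus, defense)
  hits = (1..20).count do |roll|
    roll == 20 || (roll != 1 && roll + bonus >= defense)
  end
  hits / 20.0
end

puts hit_chance(7, 17)   # 11 of the 20 possible rolls hit
```

Anything past this (power order, multiple attackers) needs more assumptions than a one-liner can carry.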

Truename · Nov 1, 2009

Monte Carlo vs. DPR

Let's start by looking at the difference between a stochastic analysis and a DPR analysis. For this one, we'll start very simply. The question we're analyzing: How many rounds does it take for a PC to kill a monster?

The PC: a level 1 dwarf great-weapon fighter, using Character Builder's recommended stats, wielding a maul. No feats or powers yet--just basic attacks.

PC Attack bonus: 7 (4 stat + 2 maul + 1 greatweapon)
PC Dmg dice: 2d6 (maul) + 4 (stat)

[sblock=Character Builder summary]
====== Created Using Wizards of the Coast D&D Character Builder ======
Dwarf-Fighter, level 1
Dwarf, Fighter

FINAL ABILITY SCORES
Str 18, Con 12, Dex 14, Int 10, Wis 13, Cha 8.

STARTING ABILITY SCORES
Str 18, Con 10, Dex 14, Int 10, Wis 11, Cha 8.

AC: 17 Fort: 16 Reflex: 12 Will: 11
HP: 27 Surges: 10 Surge Value: 6

TRAINED SKILLS

UNTRAINED SKILLS
Acrobatics +2, Arcana, Bluff -1, Diplomacy -1, Dungeoneering +3, Endurance +3, Heal +1, History, Insight +1, Intimidate -1, Nature +1, Perception +1, Religion, Stealth +2, Streetwise -1, Thievery +2, Athletics +4

FEATS

POWERS

ITEMS
Scale Armor, Adventurer's Kit, Maul
====== Copy to Clipboard and Press the Import Button on the Summary Tab ======
[/sblock]

The Monster: a generic level 1 soldier, created using p184 of the DMG.

Monster AC: 17 (1 level + 16 soldier)
Monster HP: 29 (8 soldier + 13 con + (1 level * 8 soldier))

A DPR analysis of this fight says the monster lasts 4.6 rounds on average.

[sblock=DPR breakdown]
Average damage for maul: 2d6+4 = ((2 + 12) / 2) + 4 = 11
Crit damage for maul: 2d6+4 = 12 + 4 = 16
Roll to hit = 17 AC - 7 att bonus = 10

Chance to roll miss = (roll to hit - 1) / 20 = 45%
Chance to roll crit hit = 1 / 20 = 5%
Chance to roll normal hit = 1 - miss chance - crit chance = 1 - 45% - 5% = 50%

Average crit damage = crit chance * crit dmg = 5% * 16 = 0.8
Average hit damage = normal hit chance * normal dmg = 50% * 11 = 5.5

DPR = 0.8 + 5.5 = 6.3
Avg rounds = 29 HP / 6.3 DPR = 4.6 rounds
[/sblock]
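The same breakdown as a small Ruby sketch, in case anyone wants to plug in other numbers. `expected_dpr` is a made-up helper name, and it assumes a crit only on a natural 20 and a roll-to-hit somewhere between 2 and 20. With the numbers from this post it gives a DPR of 6.3, which is where 29 / 6.3 = 4.6 rounds comes from.

```ruby
# Hypothetical helper: expected damage per round for one basic attack.
# Assumes a crit only on a natural 20, dealing maximum dice damage.
def expected_dpr(att_bonus:, ac:, dice:, sides:, dmg_bonus:)
  roll_to_hit = ac - att_bonus                    # e.g. 17 - 7 = 10
  normal_hit  = (20 - roll_to_hit) / 20.0         # rolls from roll_to_hit to 19
  crit        = 1 / 20.0                          # natural 20
  avg_dmg     = dice * (sides + 1) / 2.0 + dmg_bonus
  crit_dmg    = dice * sides + dmg_bonus
  normal_hit * avg_dmg + crit * crit_dmg
end

dpr = expected_dpr(att_bonus: 7, ac: 17, dice: 2, sides: 6, dmg_bonus: 4)
puts format("DPR: %.2f", dpr)
puts format("Avg rounds: %.1f", 29 / dpr)
```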

The stochastic analysis shows us something completely different. It says the average fight lasts 5.5 rounds, not 4.6. That's a significant difference... big enough that, if correct, pretty much invalidates DPR as a trustworthy approach.

[sblock=Source code (in Ruby)]

Code:

REPS = 1000000
PRECISION = 1
DISPLAY_WIDTH = 75

MONSTER_LEVEL = 1
MONSTER_CON = 13

PC_ATT_BONUS = 7
PC_DMG_BONUS = 4

class Monster
    def initialize
        @ac = MONSTER_LEVEL + 16
        @hp = 8 + MONSTER_CON + (MONSTER_LEVEL * 8)
    end
    
    def dead?
        return @hp <= 0
    end
    
    def defend(att, dmg)
        @hp -= dmg if att >= @ac
    end
end

class Pc
    def attack(monster)
        att_roll = die(20)
        att = (att_roll + PC_ATT_BONUS)
        dmg = damage(att_roll)
        
        monster.defend(att, dmg)
    end
    
    def damage(att_roll)
        dmg_dice = die(6) + die(6)
        # A natural 20 is a crit: the 2d6 are maximized instead of rolled.
        dmg_dice = 12 if att_roll == 20
        return dmg_dice + PC_DMG_BONUS
    end
end

def die(size)
    return 1 + rand(size)
end

def fight
    monster = Monster.new
    pc = Pc.new
    round = 0
    until monster.dead?
        round += 1
        pc.attack(monster)
    end
  
    return round
end


def analyze
    results = {}
    total_rounds = 0
    max_value = 0
    REPS.times do
      rounds = fight
      results[rounds] = 0 unless results[rounds]
      new_value = results[rounds] += 1
      max_value = new_value if new_value > max_value
      total_rounds += rounds
    end
    
    results.keys.sort.each do |key|
      value = ""
      
      ticks = (results[key].to_f / max_value.to_f * DISPLAY_WIDTH).to_i
      ticks.times do
        value += "="
      end
      puts "#{key.to_s.rjust(2)}: #{value}" if value != ""
    end
    
    avg_rounds = total_rounds.to_f / REPS
    format = "%.#{PRECISION}f"
    puts "Fights simulated: #{REPS}"
    puts "Average # of rounds per fight: #{format % avg_rounds}"
end
    
analyze

[/sblock]

Why the difference? I'm not entirely sure. (Hopefully it's not a bug! That's why I included the source code.) I think it's partly because, in a real fight, some damage is "wasted" after the monster hits zero HP.
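One way to check the "wasted damage" idea without rolling any dice is to compute the expected number of rounds exactly, working up from 1 HP. This is only a sketch under the same assumptions as the simulation (+7 vs. AC 17, 2d6+4, max damage on a natural 20); the method names are made up. If overkill is the explanation, the exact figure should land close to the simulated 5.5 and above the plain HP / DPR figure.

```ruby
# Probability of each damage total for one attack: damage => probability.
# Numbers are the fighter and soldier from this post (+7 vs AC 17, 2d6+4).
def damage_distribution
  dist = Hash.new(0.0)
  dist[0] = 9 / 20.0                        # d20 rolls 1-9 miss
  dist[16] += 1 / 20.0                      # natural 20: 12 + 4
  (1..6).each do |a|
    (1..6).each do |b|
      dist[a + b + 4] += (10 / 20.0) / 36   # d20 rolls 10-19 hit
    end
  end
  dist
end

# Exact mean number of rounds to take `hp` down to zero or below.
def expected_rounds(hp, dist)
  p_miss = dist[0]
  expected = Array.new(hp + 1, 0.0)   # expected[h] = mean rounds left at h HP
  (1..hp).each do |h|
    after_hit = dist.sum do |dmg, prob|
      # Damage past zero HP is simply lost: clamp the remaining HP at 0.
      dmg.zero? ? 0.0 : prob * expected[[h - dmg, 0].max]
    end
    expected[h] = (1 + after_hit) / (1 - p_miss)
  end
  expected[hp]
end

dist = damage_distribution
puts format("Exact mean rounds: %.2f", expected_rounds(29, dist))
puts format("HP / DPR:          %.2f", 29 / dist.sum { |dmg, prob| dmg * prob })
```

The clamp at zero is the whole difference: HP / DPR credits every point of damage, while a real fight throws away whatever is left over on the killing blow.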

The Monte Carlo analysis also gives us a histogram that summarizes all of the fights. For this simulation, I ran a million fights. 'Cause I could.

Code:

LEVEL 1 SOLDIER VS. DWARF GREATWEAPON FIGHTER (no feats, no powers)
  2  (2.3%): =======
  3 (18.0%): ==================================================
  4 (38.7%): ===================================================================
  5 (57.8%): =============================================================
  6 (72.6%): ===============================================
  7 (83.0%): =================================
  8 (89.8%): =====================
  9 (94.0%): =============
 10 (96.6%): ========
 11 (98.1%): ====
 12 (99.0%): ==
 13 (99.4%): =
Fights simulated: 1000000
Average # of rounds per fight: 5.5

So, although the average # of rounds is 5.5, the single most common outcome is a 4-round fight. (The percentages are running totals: 38.7% of fights are over by the end of round 4.) Over 80% of fights take 7 rounds or fewer to complete, although a very small fraction of this simulation's fights dragged on and on, presumably due to lots of bad rolls.
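The posted source doesn't print the percentage column, so here is a guess at how it could be produced, assuming the percentages are running totals (they climb toward 100%). Both helper names are hypothetical and the counts are toy numbers, not simulation results.

```ruby
# counts maps rounds => number of fights that ended in that round.
# Returns rounds => percent of fights finished by that round (running total).
def cumulative_percentages(counts)
  total = counts.values.sum.to_f
  running = 0
  counts.keys.sort.each_with_object({}) do |rounds, out|
    running += counts[rounds]
    out[rounds] = (100.0 * running / total).round(1)
  end
end

# Most common fight length (the mode), which can differ from the mean.
def most_common(counts)
  counts.max_by { |_rounds, n| n }.first
end

toy = { 2 => 1, 3 => 3, 4 => 4, 5 => 2 }   # made-up counts, not real results
p cumulative_percentages(toy)
p most_common(toy)
```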

You can see how the Monte Carlo simulation gives us a much richer, more accurate analysis than the DPR approach. And we're just barely getting started. Next up: feats, powers, and monsters that actually fight back!

Elric · Nov 1, 2009

Truename said:
Let's start by looking at the difference between a stochastic analysis and a DPR analysis. For this one, we'll start very simply. The question we're analyzing: How many rounds does it take for a PC to kill a monster?

The stochastic analysis shows us something completely different. It says the average fight lasts 5.5 rounds, not 4.6. That's a significant difference... big enough that, if correct, pretty much invalidates DPR as a trustworthy approach.

As you say, that you can "overkill" the monster is the reason why HP/DPR doesn't equal average rounds to kill the monster. Average rounds to kill the monster must be weakly higher than this figure. You can see a lot more discussion of this here

bonus · Nov 1, 2009

That's pretty cool! Ruby's a good choice for this. Interested in seeing the rest of the simulations!

Truename · Nov 1, 2009

Elric said:
As you say, that you can "overkill" the monster is the reason why HP/DPR doesn't equal average rounds to kill the monster. Average rounds to kill the monster must be weakly higher than this figure. You can see a lot more discussion of this here

Interesting stuff. Most of it went over my head, I'm afraid--I'm a programmer, not a mathematician, and I don't know much about statistics. I hope you'll contribute your expertise to this thread, too.

Cadfan · Nov 2, 2009

Truename said:
The stochastic analysis shows us something completely different. It says the average fight lasts 5.5 rounds, not 4.6. That's a significant difference... big enough that, if correct, pretty much invalidates DPR as a trustworthy approach.

No, no. The DPR analysis works just fine because it's being compared to other DPR values. The relative relationship between two characters' DPR calculations and the relative relationship between their "rounds of combat" calculations should be so similar as to be indistinguishable in casual discussion.

Try an example, done mathematically. You've got character A, and character B, with DPRs of 9 and 7, respectively. DPR-wise, we expect character A to kill monsters approximately 28.6% faster than character B.

Now imagine they're each fighting a monster with 60 hp.

Using what you call the DPR approach, we expect Character A to kill the monster in 6.667 rounds. But if we were to make our analysis more specific, we'd determine that Character A expects to kill the monster in 7 rounds: 6 rounds to get to 54 damage, then one more to get to 63. The remainder is rounded up to the nearest whole number.

Using what you call the DPR approach, we expect Character B to kill the monster in 8.571 rounds. But if we were to make our analysis more specific, we'd determine that Character B expects to kill the monster in 9 rounds: 8 rounds to get to 56 damage, then one more to get to 63.

Well, let's check our ratios to see whether these forms of analysis generated significantly different outcomes.

DPR ratio, 6.667/8.571 = .77786
Round analysis, 7/9 = .77778

There you go.

For reference and full disclosure, lower hit points and higher damage (i.e., fewer rounds needed to kill the monster) will generate more variability in the comparison of the DPR and Round analysis. This is because the remainder makes up a larger portion of the division of HP/DPR.

Using a mathematical expression for the round analysis instead of relying on DPR as an average will also introduce more variability. But that variability averages out and shouldn't affect overall conclusions.
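Here is the comparison as a quick Ruby sketch, treating every round as dealing exactly the average damage. The 60 HP case is the example above; the 20 HP case is an extra, assumed number to show how the remainder matters more when the monster dies quickly. `whole_rounds` is a made-up name.

```ruby
# Rounds to kill if every round deals exactly the average damage,
# with the final partial round rounded up.
def whole_rounds(hp, dpr)
  (hp / dpr.to_f).ceil
end

[60, 20].each do |hp|
  dpr_ratio   = (hp / 9.0) / (hp / 7.0)
  round_ratio = whole_rounds(hp, 9) / whole_rounds(hp, 7).to_f
  puts format("%d HP: DPR ratio %.3f, round ratio %.3f", hp, dpr_ratio, round_ratio)
end
```

At 60 HP the two ratios agree to three decimals. At 20 HP both characters need 3 rounds, so the round ratio is 1.000 while the DPR ratio stays at 0.778.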

AbdulAlhazred · Nov 2, 2009

There are some other possible sources of variability. For instance, a 2d6 weapon versus a 1d12 weapon: Monte Carlo will show a slightly different population standard deviation in the number of rounds for each.
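A minimal sketch of that comparison, reusing the thread's numbers (+7 to hit, +4 damage, AC 17, 29 HP) as assumptions and swapping only the weapon dice. The method names, the seed, and the 20,000-fight sample size are arbitrary choices of mine.

```ruby
# Simulate one fight; returns the number of rounds needed.
# damage_dice is a lambda that rolls the weapon dice with the given RNG.
def rounds_to_kill(rng, damage_dice, max_dice, hp: 29, ac: 17, att: 7, bonus: 4)
  rounds = 0
  while hp > 0
    rounds += 1
    roll = rng.rand(1..20)
    next if roll + att < ac                       # miss: nothing happens
    # Natural 20 deals maximum dice damage instead of rolling.
    hp -= (roll == 20 ? max_dice : damage_dice.call(rng)) + bonus
  end
  rounds
end

# Mean and population standard deviation (divides by N, not N - 1).
def mean_and_sd(values)
  mean = values.sum.to_f / values.size
  variance = values.sum { |v| (v - mean)**2 } / values.size
  [mean, Math.sqrt(variance)]
end

weapons = {
  "2d6"  => ->(rng) { rng.rand(1..6) + rng.rand(1..6) },
  "1d12" => ->(rng) { rng.rand(1..12) }
}

weapons.each do |name, dice|
  rng = Random.new(42)   # fixed seed so reruns match; any seed works
  fights = Array.new(20_000) { rounds_to_kill(rng, dice, 12) }
  mean, sd = mean_and_sd(fights)
  puts format("%-4s mean %.2f rounds, sd %.2f", name, mean, sd)
end
```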

Personally I'm not sure that Monte Carlo REALLY matters a huge amount, though it is true that the distribution is added information. However the really interesting stuff is going to be sets of opponents and then you get into the murky waters of what are the effects of tactics. You can reduce the battle down to the blows struck and abstract tactics in terms of simply opportunities to attack, etc, but how do you account for the fact that the evolving nature of the battle itself feeds back into how it plays out?

Even taking a simple battle with say 3 opponents on each side where they are all roughly similar creatures with basic melee attacks. Now you can pretty well model that abstractly, everyone is going to get to swing at somebody and who wins is likely to be decided almost entirely on which side focuses its attacks better. I don't think you'll learn a LOT more with this kind of scenario than with one-on-one fights.

But what would be a step up from that? Let's say one side has a creature that can deploy an area attack. How many of the enemy can it hit on each attack? Is this going to affect how much the other side groups its units together (presumably they need to be close to each other to all concentrate on one target)? Exactly what is going to happen now is going to depend highly on exactly who moves where and when. So you can run that fight 1 million times and determine the effects of tactics randomly, but the veracity of the result depends on how true your "random effects of tactics generator" is to what happens in real games.

I kind of fear that by the time you get to something approaching the level of complexity of a level 1 party fighting a level 1 encounter with some typical monsters, the number of guesstimates required to do that "effects of tactics" is going to be large and nobody will ever know how accurate it is, except by actually collecting data from real combats.

And I think that is really the ultimate key. This is an area where nothing is going to beat real world data. Still, it could shed some light on certain very specific questions. I just think people would have to go back and do some sanity checking against real world data all the same.