Asisreo said:
"I've given what needs to be done in order to conclusively determine whether a monk is underpowered or not. I'll give my reasonings here:"

I agree it can be interesting, but it doesn't say much one way or the other about quantitative estimates. Monte Carlo simulation more generally would be an improvement over calculations on a spreadsheet. For me, the gold standard would be to write down a standardized set of encounters that make up an adventuring day, varied in difficulty, mixing a few powerful monsters with swarms of little ones, with a good cross-section of casters and melee brutes, plus some rules about how the monsters act; then simulate a few different party comps through those encounters and see how it turns out (and how many resources they have left at the end). I guess that's more or less what @Asisreo is suggesting, but actually putting that into code in a way that can be automated enough times for the d20 variance to wash out is... pretty intractable.
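To make concrete what "enough times for the d20 variance to wash out" means at even the smallest scale, here's a toy Monte Carlo sketch in Python. All numbers are hypothetical and it only covers a single attack matchup; a real adventuring-day simulator would have to layer encounter structure, resources, and monster behavior on top of this, which is where it gets intractable.

```python
import random

def attack_damage(attack_bonus, target_ac, num_dice, die_size, damage_bonus):
    """Resolve one attack: d20 + bonus vs. AC; natural 1 misses, natural 20 crits."""
    roll = random.randint(1, 20)
    if roll == 1:
        return 0
    if roll != 20 and roll + attack_bonus < target_ac:
        return 0
    dice = num_dice * (2 if roll == 20 else 1)  # crits double the dice
    return sum(random.randint(1, die_size) for _ in range(dice)) + damage_bonus

def mean_damage(trials=100_000):
    """Repeat the matchup until the d20 noise averages out of the estimate."""
    # Hypothetical matchup: +7 to hit vs. AC 15, dealing 1d8+4 on a hit.
    return sum(attack_damage(7, 15, 1, 8, 4) for _ in range(trials)) / trials

if __name__ == "__main__":
    print(f"average damage per attack: {mean_damage():.2f}")
```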
So, realistically, it seems to me a person has three choices:
1. They can settle for good-faith attempts to quantify effectiveness in simpler ways, which necessarily abstract away a lot of detail but, if guided by reasonable estimates and iterated on with input from people with a wide range of table experience, hopefully yield a decent approximation. Maybe that's computing some averages for benchmark encounters (see the sketch after this list)!
2. They can decide none of this matters: they don't actually care about characters' objective power levels and just want to play the game.
3. They can decide they do care about power levels, but reject any attempt to quantify them objectively, instead 'going with their gut' and expressing their opinions online as though they reflected a greater reality than the estimates from someone doing 1.
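For what "averages for benchmark encounters" can look like at its simplest, here's a minimal closed-form sketch (again with made-up numbers): expected damage per attack as hit chance times average damage, no simulation needed.

```python
def expected_damage(attack_bonus, target_ac, avg_dice_damage, damage_bonus):
    """Expected damage per attack vs. a benchmark AC: natural 1 always misses,
    natural 20 always hits and doubles the dice portion of the damage."""
    need = target_ac - attack_bonus
    noncrit_hits = max(0, 20 - max(need, 2))   # d20 faces that hit without critting
    p_hit, p_crit = noncrit_hits / 20, 1 / 20
    return (p_hit * (avg_dice_damage + damage_bonus)
            + p_crit * (2 * avg_dice_damage + damage_bonus))

# Hypothetical benchmark: +7 to hit vs. AC 15, 1d8+4 (average die roll 4.5).
print(f"expected damage per attack: {expected_damage(7, 15, 4.5, 4):.2f}")
```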
I personally favor 1, and I welcome constructive input from anyone who has a (realistic) way to refine the estimates. I also have no problem with someone who picks 2: D&D is fun in lots of different ways, and plenty of people enjoy it without ever getting into the quantitative side. I do take issue with 3, though, specifically the second half of 3, where they loudly complain and take it personally if people give quantitative measures that conflict with their subjective experience.

Note, again, that dismissing quantitative analysis wholesale is entirely different from constructively criticizing the specific methodology being used. It's also different from saying that 'mechanical effectiveness' isn't what's important to you. I'm all for improving the methodology, and I completely respect people with different priorities. But if you actually want to be part of a conversation about mechanics, you need to offer something constructive, not just say 'spreadsheet, spreadsheet, white room, white room' over and over.
This game was not balanced by way of spreadsheets, simulations, approximations, or estimations; it was balanced around playtesting. Stuff like UA and the D&D Next playtest was based on approximations and spreadsheets, but a lot of that material was scrapped because of feedback during actual play. It wasn't fun for the majority of the playtesters.
It's been said that recording action logs of an adventure would be too difficult and that getting a large enough sample size would be impossible, but it's not as hard as one might think. Think of it like this: computer TTRPGs already record dice rolls and anything a person says in the chat. It wouldn't be hard to extract the data from a no-mic, chat-only session and run a macro that organizes it into easier-to-parse information: who did what damage, with what action, at what time. Stuff like that would be easy to pull from a Roll20 log. As for sample size, if only 50 DMs from all over the internet participated 4-5 times each, we could get a decent gauge on everything.
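As a rough sketch of what that kind of macro could do (the log format below is made up for illustration; a real Roll20 export would need its own pattern), the idea is just to pattern-match chat lines into (time, actor, action, damage) records and tally them per player:

```python
import re
from collections import defaultdict

# Hypothetical chat-log format; a real export would need its own regex.
LOG_LINE = re.compile(
    r"^\[(?P<time>[\d:]+)\] (?P<actor>[^:]+): (?P<action>.+?) hits .* for (?P<damage>\d+) damage"
)

def tally_damage(log_text):
    """Turn raw chat lines into per-player damage totals and hit counts."""
    totals, hits = defaultdict(int), defaultdict(int)
    for line in log_text.splitlines():
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip table talk, misses, and anything else that doesn't match
        totals[m["actor"]] += int(m["damage"])
        hits[m["actor"]] += 1
    return totals, hits

sample = """\
[19:02:11] Monk: Flurry of Blows hits the ogre for 7 damage
[19:02:45] Wizard: Fire Bolt hits the ogre for 11 damage
[19:03:30] Monk: Unarmed Strike hits the ogre for 5 damage
"""

totals, hits = tally_damage(sample)
for actor, dmg in totals.items():
    print(f"{actor}: {dmg} damage across {hits[actor]} hits")
```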
The reason I'm upset with the whiteroom analysis is that it's misleading data. It claims to be purely objective, but there's no reason to take assumptions at face value without supporting evidence, especially when those assumptions directly shape what's supposed to be empirical data.