D&D General Data from a million DnDBeyond character sheets?

Can you see what skills they have proficiency in? If so, you could do some spot checking and compare that to how many they should have based on just their class. For example, a fighter would have only 2 skill proficiencies; if a fighter with a blank background has more, you know the extra proficiencies were likely added by a background.

Of course, some groups may have custom backgrounds that add benefits other than skills as well.
That’s not in the dataset. The data included is actually fairly limited.

Character id
Character name
Base hp
Stats (possibly just means starting stats)
Starting class
Starting class level
Other classes
Total level
Starting subclass
Other subclasses
Feats
Then there’s some notes and item type related fields.
 



If I were better at screen scraping I could get more data, since you can access most PC character sheets. The default is that PCs are public, and I doubt most people realize they can change it. Of course, there may be some checks in place to prevent this kind of activity.

I'd be curious how this data was actually derived.
 

Interesting. I'm assuming you were able to test to confirm that they are starting abilities some way? I'm interested in how. Also assuming that's true and if you can confirm that class features are not included then I think the best thing to do is to include all >=1 except for all 8's (starting unmodified point buy).
The data included is actually fairly limited.
Given the data included: the list must be beyond simply level 1 characters.
Then there’s some notes and item type related fields.
How extensive is that? Is it just stating a character count for player notes? Or does it give more?
 

Notes is just a length count.

Inventory is detailed, but quite a few characters have a blank inventory.

Oh, and there’s also a gold field, but there are a lot of 0s there as well.

There are characters of all levels and even some above max level.
 

Notes is just a length count.
That could be useful. We just need to figure out the minimum threshold for likely notes, though I am unsure at present what the criteria should be.
Inventory is detailed, but quite a few characters have a blank inventory.
I am guessing that a blank inventory means either that it is maintained elsewhere physically, or that it is a test character for stats/features. If the character was at least a first iteration, I would expect at least the starting gear to be present.
Oh, and there’s also a gold field, but there are a lot of 0s there as well.
Same as above: likely either a separate inventory or a test character.

Honestly, I would not be surprised if a significant majority of the character listings are nothing more than experiments.
 

I can’t speak for all, but if I go through the trouble of creating a character, it’s likely one I’m at least interested in playing, even if I never end up playing it due to real-life constraints, e.g. time, no campaign to play him in, etc.

So in my mind it’s probably just as important to see what characters people are interested in playing as what they actually played, even more so when we don’t have a clear delimiter between the two.
 

I can’t speak for all, but if I go through the trouble of creating a character, it’s likely one I’m at least interested in playing, even if I never end up playing it due to real-life constraints, e.g. time, no campaign to play him in, etc.
That is my point about the inventory. It is reasonable to conclude that a character intended for play, not just as an experiment, will have all the starting bases covered: race, background (even if custom), class, maybe subclass (depending on class and level), proficiencies, starting equipment, starting gold, bio stats (height, weight, etc.).

The question is: how many characters not meeting the above criteria are experiments, and how many are future frameworks for existing characters?
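One rough way to put a number on that question is to flag characters that have a blank inventory, zero gold, and no notes all at once. The sketch below is only an illustration: the field names `inventory`, `gold`, and `notes_len` are hypothetical stand-ins for whatever the dataset actually calls them, the sample rows are made up, and the rule itself is a guess rather than a tested criterion.

```python
def looks_like_experiment(row, min_notes_len=1):
    """Heuristic only: blank inventory, zero gold, and (almost) no notes.

    Field names are hypothetical, and min_notes_len is an arbitrary
    threshold that would need tuning against real data.
    """
    no_inventory = len(row["inventory"]) == 0
    no_gold = row["gold"] == 0
    no_notes = row["notes_len"] < min_notes_len
    return no_inventory and no_gold and no_notes


# Made-up sample rows, not taken from the real dataset.
sample = [
    {"char_id": 1, "inventory": [], "gold": 0, "notes_len": 0},
    {"char_id": 2, "inventory": ["Longsword", "Shield"], "gold": 15, "notes_len": 240},
    {"char_id": 3, "inventory": [], "gold": 0, "notes_len": 512},
    {"char_id": 4, "inventory": ["Backpack"], "gold": 0, "notes_len": 0},
]

flagged = [r["char_id"] for r in sample if looks_like_experiment(r)]
kept = [r["char_id"] for r in sample if not looks_like_experiment(r)]
print("flagged as likely experiments:", flagged)
print("kept:", kept)
```

Requiring all three signals together is deliberately conservative, so a character whose inventory is tracked on paper but who has notes filled in is not flagged.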
 

I actually think it's important to keep outliers and understand their significance, unless we are dang sure they do not represent an actual character, as opposed to a character played under some house rules.
I think it's good to look at house rule characters and see if they deviate significantly from the other data in other ways. One thing I am really concerned with is the 44.9% of characters who are level one. I think that's where your mass of unplayed characters is. Fifth edition is not lethal enough to kill half of all starting characters.

If we look at the full data set, but trimmed of duplicate character IDs (which I am going to call the UID data from now on, and I think would be a good baseline for discussions), 44.9% are level 1. If we look at the ones without backgrounds, 50.5% are level 1. That's suggestive, but not a huge difference. However, if we look at characters with the default name ("<username>'s Character"), 61.9% are level 1. I think that's enough of a variation to exclude those characters, at least if they are level 1.
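For anyone who wants to reproduce that kind of comparison, here is a minimal sketch of the steps: drop duplicate character IDs, then compute the level 1 share for each subset. The field names (`char_id`, `name`, `level`, `background`) and the default-name test are assumptions for illustration, and the sample rows are made up, so the printed percentages have nothing to do with the real figures.

```python
def dedupe_by_id(rows):
    """Keep the first row seen for each character ID."""
    seen = set()
    unique = []
    for row in rows:
        if row["char_id"] not in seen:
            seen.add(row["char_id"])
            unique.append(row)
    return unique


def level_one_share(rows):
    """Percentage of rows whose total level is 1 (0.0 for an empty list)."""
    if not rows:
        return 0.0
    return 100.0 * sum(1 for r in rows if r["level"] == 1) / len(rows)


def has_default_name(row):
    # Assumption: default names look like "<username>'s Character".
    return row["name"].endswith("'s Character")


# Made-up sample rows; field names are hypothetical.
sample = [
    {"char_id": 1, "name": "Bob's Character", "level": 1, "background": ""},
    {"char_id": 1, "name": "Bob's Character", "level": 1, "background": ""},  # duplicate ID
    {"char_id": 2, "name": "Thokk", "level": 5, "background": "Soldier"},
    {"char_id": 3, "name": "Ann's Character", "level": 3, "background": ""},
    {"char_id": 4, "name": "Mira", "level": 1, "background": "Sage"},
]

uid = dedupe_by_id(sample)
no_background = [r for r in uid if not r["background"]]
default_named = [r for r in uid if has_default_name(r)]

for label, subset in [("UID", uid), ("no background", no_background), ("default name", default_named)]:
    print(f"{label}: {level_one_share(subset):.1f}% level 1 (n={len(subset)})")
```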
Given the possibilities of houserules, or more likely custom backgrounds with no name being an option, I'd suggest those as the likely explanation for blank background names. *Note: custom backgrounds are an official option in the rules as well.
I would like to make it clear that you can give a name to a custom background. However, custom backgrounds are limited to three or four options based on how many skill, tool, and language proficiencies they give. So if you are doing something at all different with backgrounds, you would be leaving it blank. So I'm fine with leaving blank ones in.
 

Interesting. I'm assuming you were able to test to confirm that they are starting abilities some way? I'm interested in how. Also assuming that's true and if you can confirm that class features are not included then I think the best thing to do is to include all >=1 except for all 8's (starting unmodified point buy). I'm fine having some obviously houseruled characters in the data.

Thoughts?
Yes, I confirmed this, and mentioned it earlier in the thread. If you look at the abilities in the data, over half the characters have the standard array, with no modifications. But they all have races, and on D&D Beyond the racial ability bonuses are automatically entered. So I poked around and found character #70, who has the standard array in the data. If you look at that character on D&D Beyond, it has the racial bonuses and an ASI. And if you look at the JSON for that page, you can see that the racial bonuses and the abilities are stored differently.
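Checking for an unmodified standard array is simple enough to sketch. The version below assumes each character's six ability scores are available as a list under a hypothetical `stats` key; it also includes the "all 8s" check mentioned above for unmodified point buy. The sample rows are invented.

```python
STANDARD_ARRAY = [8, 10, 12, 13, 14, 15]  # the 5e standard array, sorted


def is_standard_array(stats):
    """True if the six scores are the standard array in any order."""
    return sorted(stats) == STANDARD_ARRAY


def is_all_eights(stats):
    """True if all six scores are 8 (untouched point buy)."""
    return len(stats) == 6 and all(s == 8 for s in stats)


def standard_array_share(rows):
    """Percentage of rows with an unmodified standard array."""
    if not rows:
        return 0.0
    hits = sum(1 for r in rows if is_standard_array(r["stats"]))
    return 100.0 * hits / len(rows)


# Made-up rows; "stats" is a hypothetical field name.
sample = [
    {"char_id": 70, "stats": [15, 14, 13, 12, 10, 8]},
    {"char_id": 71, "stats": [8, 8, 8, 8, 8, 8]},
    {"char_id": 72, "stats": [13, 15, 14, 8, 12, 10]},
    {"char_id": 73, "stats": [17, 14, 13, 12, 10, 8]},
]

print(f"{standard_array_share(sample):.1f}% have the unmodified standard array")
```

If racial bonuses were baked into the stored scores, very few characters would match exactly, which is why a high match rate points to starting scores.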

As for class features, just look at the 20th level barbarians in the data. I found 1,154 in the UID data. 35.2% of them have a 15 for strength. None of them have a 24 for strength, even though the level 20 barbarian feature (Primal Champion) adds 4 to Strength and raises its maximum to 24. I think that clearly shows that the class feature is not being applied.
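The same check can be written as a small tally of Strength values among 20th level barbarians. This is a sketch under assumptions: `starting_class`, `starting_class_level`, and `strength` are hypothetical field names, and the sample rows are made up rather than drawn from the 1,154 characters mentioned above.

```python
from collections import Counter


def strength_counts(rows, class_name="Barbarian", level=20):
    """Tally Strength scores for characters with the given class and level."""
    return Counter(
        r["strength"]
        for r in rows
        if r["starting_class"] == class_name and r["starting_class_level"] == level
    )


def share(counts, value):
    """Percentage of tallied characters with exactly this score."""
    total = sum(counts.values())
    return 100.0 * counts[value] / total if total else 0.0


# Made-up rows; field names are hypothetical.
sample = [
    {"starting_class": "Barbarian", "starting_class_level": 20, "strength": 15},
    {"starting_class": "Barbarian", "starting_class_level": 20, "strength": 15},
    {"starting_class": "Barbarian", "starting_class_level": 20, "strength": 20},
    {"starting_class": "Barbarian", "starting_class_level": 20, "strength": 16},
    {"starting_class": "Barbarian", "starting_class_level": 3, "strength": 15},
    {"starting_class": "Wizard", "starting_class_level": 20, "strength": 8},
]

counts = strength_counts(sample)
print(f"Strength 15: {share(counts, 15):.1f}%")
print(f"Strength 24: {share(counts, 24):.1f}%")
```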
 

