D&D General Data from a million DnDBeyond character sheets?

Are some of the duplicates because they are the same character at different levels? Can you tell?
Based on my initial evaluation, I think there are a few hundred at most that may be that way (if that).

I will say that a few fields, like items, were truncated when I imported the data into a database, so if there are differences I expect they come from some of the longer item and notes-type fields.
 


I imported the data into a database and am using a query to select distinct records. That takes me from about 1.2 million records down to about 500,000. There’s minimal difference (a few hundred) between doing this on full rows and doing it on character IDs alone, so the duplication isn’t caused by multiple level instances of a single character or anything like that.
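For anyone following along without a database, here is a minimal Python sketch of that comparison. The tuple layout (character ID, name, level) and the sample values are made up for illustration and are not the real column names from the data file.

```python
# Hypothetical records: (character_id, name, level).
# The layout and values are assumptions, not the real data file's columns.
records = [
    (1, "Aria", 3),
    (1, "Aria", 3),   # exact duplicate row
    (2, "Borin", 5),
    (2, "Borin", 6),  # same ID, different level
    (3, "Cale", 1),
]

# Roughly what SELECT DISTINCT over every column would keep.
distinct_rows = set(records)

# Roughly what SELECT DISTINCT over the ID column alone would keep.
distinct_ids = {rec[0] for rec in records}

# Rows that survive full-row dedup but share an ID differ in some other field.
extra = len(distinct_rows) - len(distinct_ids)
print(len(records), len(distinct_rows), len(distinct_ids), extra)
```

If the gap between the two counts is small, as described above, then almost all duplicates are exact copies rather than the same character saved at different levels.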
Fascinating, I hadn't thought to look for duplicate IDs.
I do think the rest can be trimmed down a little. I’m not sure I fully agree with your methodology, but no method will be perfect.
I'm not totally happy with my methodology either, but I'm happy to discuss it. And I agree that there isn't a perfect way to trim the data.
Also, I wonder if it wouldn’t be better to just talk about characters created in Beyond. Even if they were never played, I’ve created a lot of characters that I would play but don’t have a new campaign to use them in.
I think that is a valid way to analyze the data. However, I'm worried that when you start posting results, people are going to take it as characters meant for play whether that's what they are or not.
 


I only found two characters with duplicate entries at different levels: 370520 (Oona Durothil, a rogue-assassin, 11th and 12th) and 529223 (Ashthrodir, a paladin-ancients/warlock-archfey, 12th and 16th). I'm not totally sure of my search on this one, though. (I haven't made an actual database, I'm just messing around with list comprehensions and counters in Python).
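A search like that could be sketched roughly as follows. This is not the exact code used; it swaps in a `defaultdict` of sets, and the (ID, level) pair layout plus the ID 100001 are made up for illustration. The other two IDs and their levels are the ones mentioned above.

```python
from collections import defaultdict

def ids_with_multiple_levels(records):
    """Return {character_id: sorted levels} for IDs seen at more than one level."""
    levels_by_id = defaultdict(set)
    for char_id, level in records:
        levels_by_id[char_id].add(level)
    return {cid: sorted(lv) for cid, lv in levels_by_id.items() if len(lv) > 1}

# Hypothetical (character_id, level) pairs.
sample = [
    (370520, 11), (370520, 12),
    (529223, 12), (529223, 16),
    (100001, 4), (100001, 4),   # made-up ID: duplicate at the same level
]

print(ids_with_multiple_levels(sample))
```

An ID duplicated at a single level is ignored here, so only characters that really appear at two or more levels come back.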
 

I only found two characters with duplicate entries at different levels: 370520 (Oona Durothil, a rogue-assassin, 11th and 12th) and 529223 (Ashthrodir, a paladin-ancients/warlock-archfey, 12th and 16th). I'm not totally sure of my search on this one, though. (I haven't made an actual database, I'm just messing around with list comprehensions and counters in Python).
One thing I was having trouble with was the other class field returning not null when it appears to be null. Probably a special character present, but that may have more to do with my import than the data file itself.

Might just need to compare starting class level to total level instead.
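Both ideas can be sketched in a few lines of Python. The function and parameter names here are hypothetical, and the blank check assumes the stray content is ordinary whitespace, which may not be what is actually in the field.

```python
def is_blank(value):
    """Treat None, empty strings, and whitespace-only strings as blank."""
    return value is None or value.strip() == ""

def is_multiclassed(starting_class_level, total_level):
    """A character has levels in another class if the starting class
    accounts for fewer levels than the character's total."""
    return starting_class_level < total_level

# Hypothetical values for illustration.
print(is_blank(" \t"))         # whitespace-only field
print(is_multiclassed(1, 1))   # level 1 character, single class
print(is_multiclassed(3, 5))   # 3 levels in starting class out of 5
```

The level comparison sidesteps the other class field entirely, so it still works when that field holds something invisible that the blank check misses.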
 

Fascinating, I hadn't thought to look for duplicate IDs.
I didn’t either. I stumbled upon it when my results showed three level 1 characters as multiclassed. It turned out to be one character listed three times, and the other class field appeared blank.

I'm not totally happy with my methodology either, but I'm happy to discuss it. And I agree that there isn't a perfect way to trim the data.
I’ll have to think some more about how I would do it.
I think that is a valid way to analyze the data. However, I'm worried that when you start posting results, people are going to take it as characters meant for play whether that's what they are or not.
Yeah. But there’s no clear indicator of which are and which aren’t. We can try to eliminate some of the obvious ones not meant to be PCs, though there are going to be some error bands no matter which way we go. Either you’re removing some valid characters or including some invalid ones. Or both.
 

I didn’t either. I stumbled upon it when my results showed three level 1 characters as multiclassed. It turned out to be one character listed three times, and the other class field appeared blank.
One thing about multiclassing: I've seen multiple classes in the second class field. Like starting class is 'Fighter', and other class is 'Rogue/Wizard'. I see over 8k records like that in the full data.
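A rough sketch of how such a field could be split, assuming "/" is always the separator as in the 'Rogue/Wizard' example. The function name and arguments are made up for illustration.

```python
def split_classes(starting_class, other_class):
    """Return all class names for a character as a list.

    Assumes the other class field separates multiple classes with '/'.
    """
    classes = [starting_class]
    if other_class and other_class.strip():
        classes += [c.strip() for c in other_class.split("/") if c.strip()]
    return classes

print(split_classes("Fighter", "Rogue/Wizard"))
print(split_classes("Fighter", None))
```

This only recovers the class names. It says nothing about how many levels the character has in each class.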
 

One thing about multiclassing: I've seen multiple classes in the second class field. Like starting class is 'Fighter', and other class is 'Rogue/Wizard'. I see over 8k records like that in the full data.
Yeah. I’m not seeing how to get the individual levels for those classes either.
 


Wow, trimming out duplicate character IDs drops my trimmed data set from 404k to 167k.
What does the character ID signify? Could a player (or DM) use the same ID for different characters, the digital equivalent of erasing and reusing a character sheet? Or could a DM use a single ID for every character in a game so as to avoid the players having to pay for subscriptions? Etc.?
 
