The Pitfalls of D&D Beyond Data

ClaytonCross

Kinder reader Inflection wanted
A few thoughts.

1. "Subclass Distribution (Active Characters)" = "The distribution of subclasses of active characters". "No subclass" is a valid part of the "distribution of subclasses of active characters" and for such a title it needed to be included in the graph for the title to actually match the graph. So I really don't buy your claim that the title of the graph was ambiguous about whether all active characters or just those with subclasses were included in the graph. It was clear IMO.

It's not ambiguous and I never said it was. It is exact and concise. Any "ambiguity" was created by my desire for something they had no intent of providing. That's not actually ambiguity; it's me ignoring their intent and overriding it with my own. My mistake, not theirs.

2. The "Subclass Distribution (Active Characters)" graph was not shown in isolation. It was shown alongside a "Class Distribution (Active Characters)" graph. When 2 graphs are shown back to back with the same (Active Characters) or (whatever) designation that's supposed to mean that they are showing different breakdowns of the same population. As we now know, that's not the case for these 2 graphs.

That's an inference you're manufacturing. It is not a stated intent by them, and it is not possible given the simple fact that not all characters have subclasses. Measuring something with a delimiter means not needing to mention those without it. If you post "number of wood houses on the block", it is automatic that the brick houses are not listed, since they don't fit the parameter of the delimiter. The classes and races slides are total population measurements because they apply to all characters, but a "Rogue Subclass Distribution (Active Characters)" would only apply to rogues, as the delimiter was specified. You're not going to count paladins in that number any more than you would count characters without subclasses in a "Subclass Distribution", since subclass is the stated delimiter.

3. A better name would have been "Subclass Distribution (Active Characters with a subclass)". If that was deemed too long for the title, then near the bottom of the graphic I would have at least included a fine print disclaimer about only classes with a subclass being included.

That is redundant. Subclass is the delimiter, so characters without a subclass are automatically disqualified. If they had included characters without a subclass, it would make the numbers inaccurate, as you are protesting, and require further explanation because it falls outside their stated delimiter.
"Subclass Distribution (Active Characters including those without subclass)"

4. There are various subclass distributions they could have shown. I think they chose the least useful distribution to show. However, you are correct that this point doesn't make the graph inaccurate or misleading. That part of what you are trying to say I agree with.

I agree, they could have broken down each class and only shown players with access to all the classes, and it would be better information. They did disclaim that this was a high-level view, and they do so every week during those videos. So it is what we got, not what we wanted.

I've tried to understand this part most of the day and I don't really understand what you are saying enough to make a productive comment. Maybe you can dumb it down for me?


Part of your problem with multiclass representation was that you applied a scope outside the delimiter of characters with subclasses, looking for a personal goal of all characters being represented in a 1 to 1 ratio to determine what is the most preferred subclass. Multiclassing breaks this because multiclass characters are counted twice, which is why you posted this:

A quick example of the multiclass problem: 2 Characters, A Fighter 10 And a Cleric 1/Wizard 9. Making a class chart as they did would give the following percentages:

Fighter 33%
Cleric 33%
Wizard 33%

Hopefully that makes it apparent just how egregious treating multiclassing like that can be. Not one person is going to break down the classes of those 2 characters as shown above... Not one person, but that is how D&D Beyond breaks that down in their class circle graph.
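To make the arithmetic in that example concrete, here is a minimal Python sketch. It is only an illustration of the two counting styles being argued about: the `characters` data is the two-character sample from the quote, the function names are made up for this post, and the level-weighted version is just one possible alternative, not something D&D Beyond has said it uses.

```python
from collections import Counter

# Hypothetical sample from the quoted example: class name -> levels taken.
characters = [
    {"Fighter": 10},
    {"Cleric": 1, "Wizard": 9},
]

def share_by_class_entry(chars):
    """Count each class a character has once, ignoring how many levels it has."""
    counts = Counter(cls for c in chars for cls in c)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}

def share_by_level_weight(chars):
    """Give each character one vote, split across its classes by level."""
    shares = Counter()
    for c in chars:
        levels = sum(c.values())
        for cls, lvl in c.items():
            shares[cls] += lvl / levels
    return {cls: s / len(chars) for cls, s in shares.items()}

by_entry = share_by_class_entry(characters)
by_level = share_by_level_weight(characters)
```

Counting by class entry gives each of the three classes one third, which matches the 33/33/33 split in the quote. Weighting by level gives roughly Fighter 50%, Wizard 45%, Cleric 5%. Which one is "right" depends on the question the chart is meant to answer.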

But that only matters if you're trying to achieve a total popularity vote of a favorite subclass, which is what you are concerned about and/or looking for here:

This phenomenon of people quoting D&D Beyond stats to prove points happens the most right after they post some of their data. Those posts and threads are months old at this point and so I'm not going to dig them up. However, there have been comments about their data in more recent threads, but it's not important enough to go digging through pages and pages of comments looking for the proverbial needle in the haystack that I know is there (because I've recently read it) just to prove you wrong. Instead I'll give you one current example.

The example: @Morrus did. It's still on the front page of this forum and shows in the thread title: "90% of D&D Games Stop By Level 10; Wizards More Popular At Higher Levels"

By the way, why the heck does someone with 762 posts in the last 4 years think he has any grasp on the variety of points and justifications for those points that people actually post on this forum?

But if you're only looking to say that if a player picks Class A, the most commonly picked subclass is B, that is not relevant to the slide. So their intent, by title and the scope of the delimiter, is not the pool-of-all-characters comparison that you want. That doesn't make them wrong. It just means your desire/expectation is not what they are offering. They didn't mislead you. You just tried to make "Subclass Distribution (Active Characters)" into "Most Popular Class/Subclass combinations (Active single class characters including those without subclass, only from players with access to all classes and subclasses)", which is a different poll.

Your argument about what this poll sells itself as is not about what it sells itself as, but about what you want from it that it doesn't provide.
 

ClaytonCross

Kinder reader Inflection wanted
I think it can be both "showing player preference wasn't their intent" and "they still inaccurately portrayed their graph". That said, it's good to note that wasn't their intent, and I agree it also shows that they respect and understand what their data shows and doesn't show.

That said my biggest concern is not with their intent but rather how our community will use the data provided. I fear most are going to make the same mistake you did at first. Heck, I'll even admit I initially took the data intent that way, however I don't think any of my claims actually depend on the intent of the data and that's why I'm not backing away from them now.

I think this sums it up well. I don't think we stand in the exact same place on this, but we are much closer than we are far apart, and I am not trying to dictate how you feel or what you think. I am trying to find our common ground. I think our departure here is that you want D&D Beyond to do something to prevent people from misreading and misusing the data (which is what I brought up in the last post). While I believe you are correct that this is going to happen, I don't think there is any amount of editing the title that will prevent it. I don't see it as D&D Beyond not doing enough or not having labeled accurately enough. I think they did their part, but people will do as they have always done and twist the data, either accidentally or through misunderstanding, into a point they are trying to make. I don't think there is a way to fix that; it's just how things are, and we have to deal with it on a case by case basis. In this case, we have Badeye's responses to point to and clarify when that happens. Credit where credit is due: the D&D Beyond staff is pretty good at addressing these concerns, more so than many other platforms, like Roll20 for example.
 

FrogReaver

As long as i get to be the frog
It's not ambiguous and I never said it was. It is exact and concise. Any "ambiguity" was created by my desire for something they had no intent of providing. That's not actually ambiguity; it's me ignoring their intent and overriding it with my own. My mistake, not theirs.

Well, I see what's making this part of the conversation so difficult. You firmly believe the graph title's meaning is clear and unambiguous and that the only reason someone might disagree about it going with the graph is if they are having their judgment clouded by overriding the graph creator's intent with their own. Similarly, I believe the graph title's meaning is clear and unambiguous and that the only reason someone might disagree about it not going with the graph is if they aren't letting the words in the title speak for themselves. We fundamentally disagree there. Those beliefs make it hard to talk about this. But I'm going to try one last time. I'm going to present my evidence for why the graph is mislabeled. I hope you will do more than just tell me I'm wrong, and I also hope you will do more than just blame it on me overriding their intent with my own. My reasons for believing the graph is mislabeled have nothing to do with intent. If you care to try and change my mind about the graph being mislabeled, you are going to have to address the actual reasons I think it is mislabeled.

A little background first: I come from a math and computer science background. Graphs that break down any population into particular sectors ALWAYS need to identify the population they are breaking down. They also should identify what the graph is showing about that population. In our case, subclass distribution is what the graph is showing about the population of active characters. There's simply no other way to have both of those pieces of information identified for the graph, and make no mistake, both of those pieces of information need to be identified.

I think it's also important to note that, in regards to breaking down a population, a NULL result is a perfectly acceptable occurrence. So something labeled subclass distribution doesn't automatically rule out NULL values needing to be accounted for, provided that some members of that population can't be categorized by the given categories. If there are NULL values for a given population and you don't want to see them, then the mathematically correct thing to do is to define a new population such that all of its members will fit into the categories you have defined. However, when you do this you have to label the new population appropriately.

Additionally there's also the generally accepted practice that when 2 graphs are similarly named and appear next to each other that they are breaking down the same population in different ways.

So do you actually dispute any of this?
 

FrogReaver

As long as i get to be the frog
As an example of what I am saying above. Let's say I have a bag of marbles. 96 Clear. 3 Red. 1 Blue.

If I created the Chart/Graph below

"Color Distribution (FrogReaver's bag of marbles)"
3 Red (75%)
1 Blue (25%)

Do you find that to be a correct summarization? I don't. I purposefully mislabeled the population. A correct labeling is below:

"Color Distribution (FrogReaver's bag of marbles *colored marbles only*)"
3 Red (75%)
1 Blue (25%)
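For anyone who wants to check the numbers, here is a small Python sketch of the two charts. It only restates the marble example: `None` stands in for a colorless marble (the NULL case), and the `distribution` helper and its `include_null` flag are names made up for this illustration.

```python
from collections import Counter

# The bag from the example; None represents a colorless marble (NULL color).
bag = [None] * 96 + ["Red"] * 3 + ["Blue"] * 1

def distribution(items, include_null=True):
    """Return each category's share of the chosen population."""
    # Dropping the NULLs defines a new, smaller population.
    population = [x for x in items if include_null or x is not None]
    counts = Counter(population)
    return {k: n / len(population) for k, n in counts.items()}

whole_bag = distribution(bag)                          # population: every marble
colored_only = distribution(bag, include_null=False)   # population: colored marbles only
```

With the whole bag as the population, Red is 3% and Blue is 1%, with 96% colorless. The 75% and 25% figures only appear once the population has been narrowed to colored marbles, which is why I say the label has to name that narrower population.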
 

FrogReaver

As long as i get to be the frog
That's an inference you're manufacturing. It is not a stated intent by them, and it is not possible given the simple fact that not all characters have subclasses. Measuring something with a delimiter means not needing to mention those without it. If you post "number of wood houses on the block", it is automatic that the brick houses are not listed, since they don't fit the parameter of the delimiter.

Yep, delimiters exist. Subclass Distribution is not a delimiter. It's the kind of breakdown the graph is showing.

The classes and races slides are total population measurements because they apply to all characters, but a "Rogue Subclass Distribution (Active Characters)" would only apply to rogues, as the delimiter was specified. You're not going to count paladins in that number any more than you would count characters without subclasses in a "Subclass Distribution", since subclass is the stated delimiter.

Rogue is clearly a delimiter there. You would still need to show Rogues that don't have a subclass though, because "subclass distribution" is not a delimiter; it's the kind of breakdown you are showing. In this example, the options you would need to show would be none, thief, assassin, arcane trickster, etc. If you only want active character rogues with a subclass, you need to actually delimit them out somewhere.

That is redundant. Subclass is the delimiter, so characters without a subclass are automatically disqualified. If they had included characters without a subclass, it would make the numbers inaccurate, as you are protesting, and require further explanation because it falls outside their stated delimiter.
"Subclass Distribution (Active Characters including those without subclass)"

A subclass distribution by definition includes the null values if there are any. Thus "subclass distribution" cannot be a delimiter for non-null subclasses.

Maybe I should ask this question: How would you title a graph like their subclass graph but that also included a section for "no subclass"?
 

ClaytonCross

Kinder reader Inflection wanted
As an example of what I am saying above. Let's say I have a bag of marbles. 96 Clear. 3 Red. 1 Blue.

If I created the Chart/Graph below

"Color Distribution (FrogReaver's bag of marbles)"
3 Red (75%)
1 Blue (25%)

Do you find that to be a correct summarization? I don't. I purposefully mislabeled the population. A correct labeling is below:

"Color Distribution (FrogReaver's bag of marbles *colored marbles only*)"
3 Red (75%)
1 Blue (25%)

That's not an equal comparison as "Clear" is a visual presence and is not the same as "colorless".

If you had a bag of marbles.
96 Clear. 3 Opaque Red. 1 Opaque Blue.
With the Chart/Graph below

"Opaque Colored Marble Distribution (FrogReaver's bag of marbles)"
3 Red (75%)
1 Blue (25%)

Then you would have a more accurate depiction of the case of having or not having a subclass (which is finite and specific, in that a character has a subclass or it does not), and it would in fact be a correct depiction. Your use of a visual property, on the pretense that a clear marble does not get counted as a color, is false, because if you gave me a bag of colored marbles and asked me to sort by color, I would define clear as a color and separate those marbles into their own pile. If you asked me to count the colored marbles I would say 100 total, and if you asked me to count the marbles that have an opaque color I would answer 4 total, because that would be a finite answer.
 

FrogReaver

As long as i get to be the frog
That's not an equal comparison as "Clear" is a visual presence and is not the same as "colorless".

If it makes you feel better take my example and exchange every instance of "clear" with "colorless". Actually I'll do that for you.

As an example of what I am saying above. Let's say I have a bag of marbles. 96 Colorless. 3 Red. 1 Blue.

If I created the Chart/Graph below

"Color Distribution (FrogReaver's bag of marbles)"
3 Red (75%)
1 Blue (25%)

Do you find that to be a correct summarization? I don't. I purposefully mislabeled the population. A correct labeling is below:

"Color Distribution (FrogReaver's bag of marbles *colored marbles only*)"
3 Red (75%)
1 Blue (25%)
 


ClaytonCross

Kinder reader Inflection wanted
Yep, delimiters exist. Subclass Distribution is not a delimiter. It's the kind of breakdown the graph is showing.

Rogue is clearly a delimiter there. You would still need to show Rogues that don't have a subclass though, because "subclass distribution" is not a delimiter; it's the kind of breakdown you are showing. In this example, the options you would need to show would be none, thief, assassin, arcane trickster, etc. If you only want active character rogues with a subclass, you need to actually delimit them out somewhere.

A subclass distribution by definition includes the null values if there are any. Thus "subclass distribution" cannot be a delimiter for non-null subclasses.

Maybe I should ask this question: How would you title a graph like their subclass graph but that also included a section for "no subclass"?

The assumption here is that the breakdown component cannot also be a delimiter.

Random example:
If I say "here is the truck distribution for units on base" and you ask "how many cars are there?", my answer will be "I don't know, this is the truck distribution, not the vehicle distribution. We are working on the trucks right now, so why are you asking that?"

Sure, if you have an idea for a survey you can ask for it, but that doesn't mean a survey is wrong for presenting the information it was intended to present. Now that you have the "truck survey", you can add cars and buses for ground transportation, then add planes, helicopters, and sea vessels, but none of that will ever make the poll on trucks about anything other than trucks. The new polls will be new scopes.
 

ClaytonCross

Kinder reader Inflection wanted
Also to anyone that cares: Colorless marbles are referred to as Clear.

Clear = Transparent
Opaque (not able to be seen through; not transparent.)

Clear is not the opposite of color; it's the opposite of opaque.

Remember this:
So while you're one of the more interesting posters to read, you do have a tendency to get a bit dismissive of posters based on off-point arguments like posts.

Yep... you picked an argument that is vague and abstract to get me to agree with something I don't believe, then justified your argument on its irrelevance.

My point from the post stands. You have a subclass or you don't have a subclass. Arguing about sorting marbles on a color scale because you don't consider clear a color, when in fact a clear red marble would be red with transparency, is a distraction from the fact that if a class does not have a subclass, it does not count as a class with a subclass.
 
