ENnies - let's launch the voting booth!
Olgar Shiverstone said:
Sorry for using so much mathematical jargon. Let me see if I can translate some of this so that a person without a background in statistics can follow it.

This is just trying to define the basic theory that applies. A branch of decision-making analysis called utility theory examines the consequences of people's choices when they are forced to make decisions based on a deterministic system -- like when assigning scores to a ranking of products. Although rating scales are intended to be linear (each "step" in the score is worth the same amount), people's individual value systems are not: the difference in quality between a "5" and a "6" product may not feel the same as the difference between a "9" and a "10". This is what messes up strictly numerical rating systems, particularly at low numbers -- in ratio terms, a "2" product is exactly twice as good as a "1", but a "3" is only 1.5x as good as a "2". Value theory (and its extension, utility theory, which deals with uncertainty -- though that doesn't really apply here, since there aren't any "maybe" answers involving uncertainty) is designed to correct decision weighting -- the scores that people give -- based on their perceived value of the score. Each individual has a different value curve: for one person, a "5" might be twice as good as a "4", while for another it might be four times as good. Value theory enables all those scores to be compared on equal footing -- without the correction you're comparing apples to oranges, in essence, and it is possible for some people's votes to carry more weight than others.

But value theory is fairly complicated to apply, because it requires evaluating a set of tradeoffs for each person, so we can't apply it directly here. What we can do is come close, with the goal that each person's vote carries essentially equal weight, and that products are ranked by their quality, not just popularity (or why else have the 0-10 rating?).

Central to this concept is the fact that individual ratings of a product don't directly measure its overall quality -- you're actually estimating the quality of the product from a sample of the people who have used it. Done correctly, you'll estimate the real quality of the product within a certain margin of error (essentially what polls do when they sample X people and report an answer +/- a certain amount).

First, get rid of anyone trying to screw up the voting system, by screening for irregular voting patterns (which Morrus is already doing). The "average" voter will have a voting distribution that can be described mathematically, within a certain margin of variability. Anyone who falls well outside that range can be assumed to be trying to fix the vote, and their ballots should be deleted.

This next concept is a little difficult to follow if you haven't had stats, but essentially, when two people rate a product, their ratings aren't equal. Even with a 10-point scale, no two people use the entire scale the same way (because of the value differences described above). Some are biased toward high scores, some toward low scores; some might cluster tightly (only scoring 4-6, for example), while others use the whole range. Differing variance and mean (voting bias) can skew results and cause certain people's votes to effectively carry more weight. With a large enough sample of votes this effect is reduced somewhat -- but why not correct it right off the bat?

We can "correct" everyone's votes so that everyone uses the same distribution -- a normal (Gaussian, bell) curve with a mean (average) score of 0 and a variance of 1 (i.e., N(0,1)). Take all the ratings a person gives, calculate the mean and standard deviation of those scores, and arrive at a corrected score for each product by taking the individual's raw score, subtracting their mean, and dividing by their standard deviation. This generates a set of scores that range roughly from -4 to +4, distributed along a bell curve -- and if done for every individual, everyone's scores will be distributed along the same curve. A -4 correlates to the intended lowest score, +4 to the highest, and the middle value -- 0 -- now corresponds to the intended "average" score of 5.

That way, when you add up scores, you've eliminated individual bias, ensuring that everyone's score means the same thing.
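For the programming-inclined, here's a minimal sketch of that per-voter correction. It assumes votes arrive as a mapping from each voter to their product scores; the data structure and function name are purely illustrative, not anything from the actual ENnies tally.

```python
from statistics import mean, stdev

def normalize_votes(votes_by_voter):
    """Convert each voter's raw 0-10 scores into z-scores (mean 0, variance 1).

    votes_by_voter maps a voter id to a dict of {product: raw_score}.
    After the correction, one voter's "+1" means the same thing as
    another voter's "+1": one standard deviation above their own average.
    """
    normalized = {}
    for voter, scores in votes_by_voter.items():
        raw = list(scores.values())
        if len(raw) < 2:
            continue  # can't estimate a spread from a single vote
        m, s = mean(raw), stdev(raw)
        if s == 0:
            continue  # identical scores for everything carry no ranking information
        normalized[voter] = {prod: (x - m) / s for prod, x in scores.items()}
    return normalized
```

Averaging these normalized scores per product then compares like with like, instead of mixing one voter's generous scale with another voter's stingy one.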
A sample -- the set of votes for a given product -- also has to be big enough to be truly representative. If a product has sold a million copies, for example, and you only get 10 ratings, are those ten truly indicative of the quality of the product? Or did only the biggest whiners/fanboys vote?

Morrus has established a cutoff, which is good. You can calculate the exact number needed based on how accurate you want to be -- but for our purposes a rough "swag" estimate will probably work.
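If you did want the exact number rather than a swag, the standard sample-size formula for estimating a mean gives it. A sketch, assuming 95% confidence (the 1.96 z-value) and treating the spread of voter scores as known; the numbers in the comment are purely illustrative:

```python
import math

def required_sample_size(sigma, margin, z=1.96):
    """Votes needed so the sample mean lands within `margin` of the true
    mean with ~95% confidence: n = (z * sigma / margin)^2, rounded up."""
    return math.ceil((z * sigma / margin) ** 2)

# With a score standard deviation of ~2 points on the 0-10 scale,
# pinning the true mean down to +/- 0.5 points takes
# required_sample_size(2.0, 0.5) == 62 votes.
```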
[QUOTE="Olgar Shiverstone, post: 993826, member: 5868"] Sorry to use too much mathematical jargon. Let me see if I can translate some of this so that a person without a background in statistics can follow it. This is just trying to define the basic theory that applies. A branch of decision making analysis called utility theory examines the consequences of people's choices when they are forced to make decisions based on a deterministic system -- like when assigning scores to a ranking of products. Although ranking systems are intended to be linear (each "step" in the score is worth the same value), people's individual value systems are not: the difference in quality between a "5" and "6" product that you vote for may not be the same as the difference between a "9" and a "10" product. This is what messes up strictly numerical rating systems, particularly at low numbers -- a "2" product is exactly twice as good as a "1", but a "3" is only 1.5x as good as a "2". Value theory (and it's extension, utility theory, which deals with uncertainty -- but really doesn't apply here, since there aren't any "maybe" answers that include uncertainty) is designed to correct desicion weighting -- the scores that people given -- based on their perceived value of the score. Each individual has a different value curve -- for one person, a "5" might be twice as good as a "4", while for another it might be 4x as good. Value theory enables all those scores to be compared equally -- without the correction, you're comparing apples to oranges, in essence, an it is possible for some people's votes to carry more weight than others. But value theory is fairly complicated to apply, because it requires evaluating a set of tradeoffs for each person, so we can't apply it directly, here. What we can try to do is come close, with the goal that each person's vote essentially carries equal weight, and that products are ranked by their quality, not just popularity (or why else have the 0-10 rating?). Central to this concept is the fact that individual rankings of a product don't directly assess the overall quality of the product -- you're actually estimating the quality of the product from a sampling of people that have used the product. Done correctly, you'll estimate the real quality of the product within a certain margin of error (essentially what polls do when they sample X number of people and report an answer +/- a certain amount). First, get rid of anyone trying to screw up the voting system, by having a non-regular voting pattern (which Morrus is already doing). The "average" voter will have a certain voting distribution that can be described mathematically, within a certain margin of variability. Anyone who falls well outside that can be assumed to be trying to fix votes and should be deleted. This concept is a little difficult to follow if you haven't had stats, but essentially when two people rate a product, their rating's aren't equal. Even if you have a 10-point rating scale, no two people are going to use the entire scale in the same way (because of the value information I presented above). Some are biased toward high scores, some are biased toward low scores, some might have a tight grouping (only score 4-6, for example), others might use the whole range. Differing variance and mean (voting bias) can skew results, and cause certain poeple's votes to effectively carry more weight. With a large enough sample of votes, this tends to be reduced somwhat -- but why not correct it right off the bat? 
We can "correct" everyone's votes so that everyone uses the same distribution -- a normal (Gaussin, bell) curve with a mean (average) score of 0 and a variance of 1 (ie, N(0,1)). If you take all the ratings a person gives, calculate the mean and standard deviation of those scores , you can arrive at a corrected score for each product by taking the individual's score, subtracting the mean score, and dividing by the standard deviation. This generates a set of scores that range from -4 to +4, distributed along a bell curve -- and if done for every individual, their scores will be distributed along the same curve. A -4 correleates to the intended lowest score, +4 to the highest, and the middle value -- 0 -- will now correlate to the intended "average" score: 5. That way, when you add their scores, you've eliminated individual bias, to ensure that everyone's score means the same thing. A sample -- set of votes -- has to be big enough to generate a truly representative sampling. If a product has sold 1 million copies, for example, and you only get 10 ratings, are those ten truly indicative of the quality of the product? Or did only the biggest whiners/fanboys vote? Morrus has established a cutoff, which is good. You can calculate an exact number needed, based on how accurate you want to be -- but for our purposes a swag estimate will probably work. We want to make sure the winner is really the winner, without question. Say, for example, you have Product #2 that gets 10 votes, all 5's (we'll ingore norming for the moment, and assume these are normed scores). The total score is 50, mean 5. Product #2 gets 4 10's, 4 2's, and 2 1's: total 50, mean 5. Product #3 gets 5 7's, a 6, 2 2's, and 2 1's: total 47, mean 4.7. Who wins? Strictly by total or mean score, 1 and 2 tie, both slightly better than 3 -- but is that how we should judge it? Product 3 has more "above average" scores than either of the other two products, for example. Because of variance, the apparent winner may not be the actual winner. As sample sizes get very large, it's possible to construct scenarios where widely varying scores are actually the same due to variance. That's the purpose of the ANOVA, to test that the winning score is actually statistically different than the others. There's more to it than that, of course -- I'm trying to avoid any deeper discussion. The point is -- make sure everyone's vote counts equally, and that the winner is really far enough ahead to be the winner. There's quite an involved science behind ratings and evaluations. Multiple voter (also know as stakeholder) systems which involve individual rating schemes are one of the most complicated systems to get to work in a truly fair manner -- be glad elections are usually held on a "one-man, one-vote" plurality/majority system. If you're interested in more reading about decision making and value theory, there's a great little book written purely in layman's terms, called [i]Smart Choices[/i], by Hammond, Keeney, and Raiffa. Hope I haven't bored everyone to tears. Thanks for bearing with my pedantry. [/QUOTE]