Race/Class combinations that were cool but you avoided due to mechanics?
Cadence said:

It doesn't seem obvious to me that relative risk is necessarily more useful in this case than the difference in absolute risk. The classic example of where looking at just relative risk breaks down is in the extremes. If event A occurs with probability 0.0001 and event B occurs with probability 0.001, B's probability is 10x larger (it's gone up 900%). Even if you have A and B compete 1,000 times, A will still win around 4% of the time and will tie B around 37% of the time, in spite of B's win probability being a massive 10x that of A. Looking at that same relative risk with P(A)=0.05 and P(B)=0.50, where there's a much bigger difference in absolute risk, B wins the match essentially all the time. So, in something like treatment effectiveness, looking at just relative risk as a descriptive measure feels like it can give a very odd picture of the actual overall impact of the treatment.
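(A quick way to double-check those extreme-case numbers without simulating, added here as a sketch rather than taken from the post: A's and B's totals over the 1,000 rounds are two independent binomials, so the win and tie probabilities fall out directly. The variable names are arbitrary.)

# Exact check of the 0.0001 vs. 0.001 example over 1,000 rounds
n <- 1000
pa <- 0.0001
pb <- 0.001
k <- 0:n
fa <- dbinom(k, n, pa)   # P(A ends with k successes)
fb <- dbinom(k, n, pb)   # P(B ends with k successes)
Fb <- cumsum(fb)         # P(B <= k)
sum(fa * (Fb - fb))      # P(A wins) -- about 0.04
sum(fa * fb)             # P(they tie) -- about 0.37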
As I noted in a previous post, if P(A)=50% vs. P(B)=55% were to compete 100 times, then B would have around a 74% chance of winning the title for the session, A around 22%, and they would tie around 4%. So there will certainly be a lot of times when, even after a full hundred rounds, B hasn't shown itself better than A, let alone clearly so. (The expected number of extra hits for B over A across the 100 is of course 5, so even some of the times A is losing, it is only by a few hits.) The probability of reaching the arbitrary alpha=0.05 level of statistical significance in this case is less than 20%, even if doing the one-sided test because you think you know which is better. It feels odd to say that something is clearly noticeable based on just the successes and failures if the best test for finding a difference would only reject 20% of the time. If that was enough power to be happy with a sample size, then that means one is happy with a 20% false discovery rate (0.05/(0.05+0.20)), right?

Making it more extreme, 15% vs. 20% does make it slightly more apparent, but even then A still has about a 15% chance of doing better over 100 trials, and a 5% chance of tying. The estimated power at alpha=0.05 is still only around 22%.

Going to 5% vs. 10%, A is down to winning or tying a total of only 10% of the time, but the estimated power is still just around 30%. So where you sit in the range certainly does matter out at the extremes. If everyone was 99% at something and I was only 94%, it feels like it would stand out and the party would groan when it was my turn and I missed. What percent of combat happens off in the tails like that?

<Slap-dash R code at bottom in case my numbers are off. Also, please insert a disclaimer about the arbitrariness of alpha=0.05 and how hypothesis tests aren't usually what you want... and also that power seems like a relevant idea here anyway.>

If Legolas had a 5% bonus over Gimli and they kept track over several game sessions, it feels like Gimli would be able to say he wasn't doing nearly as well after a few of them against hard-to-hit monsters. But against things in the middle, it feels like it would take a while longer before his inner statistician would let him concede. That it's hard to be confident in the difference after just 100 rounds, but easier when you get several times more, seems to fit with what you might get in baseball - how does a .250 vs. .300 batting average feel, for making long-term decisions, after only 100 plate appearances at the beginning of the season vs. after 500+ plate appearances? (Well, I mean, except for batting average being a horrible statistic.)

All that being said, it's hard for me to argue with the fact that the human brain isn't always big on caring what the probabilities say if it fits the story that it's working on.

# nsims = number of simulation runs; I didn't feel like digging up the convolution of
#   the different binomials
# sz is the number of trials A and B each have, where they succeed with probabilities
#   pa and pb
# The first three numbers output are the estimated probabilities that B wins, that A
#   wins, and that they tie.
# The next two are the estimated power at alpha=0.05 for rejecting the null hypothesis
#   that they're equal, using either the exact McNemar's test (since we know the order
#   they were in) or the usual two-sample z-test. As the pairing explains no variance,
#   I was a bit surprised the McNemar test was as different in a few cases.
nsims <- 100000
sz <- 100
pa <- 0.5
pb <- 0.55
aplus <- rep(0, nsims)
bplus <- rep(0, nsims)
abeq <- rep(0, nsims)
pmcn <- rep(0, nsims)
pind <- rep(0, nsims)
for (i in 1:nsims) {
  x <- rbinom(sz, 1, pa)   # A's attempts this session
  y <- rbinom(sz, 1, pb)   # B's attempts this session
  aplus[i] <- sum(x > y)   # rounds where A hit and B missed
  bplus[i] <- sum(y > x)   # rounds where B hit and A missed
  abeq[i] <- sum(y == x)   # rounds where they matched
  pmcn[i] <- binom.test(aplus[i], aplus[i] + bplus[i], p = 0.5,
                        alternative = "less")$p.value
  pind[i] <- prop.test(c(aplus[i], bplus[i]), c(sz, sz),
                       alternative = "less")$p.value
}
sum(bplus > aplus) / nsims   # estimated P(B wins the session)
sum(aplus > bplus) / nsims   # estimated P(A wins the session)
sum(aplus == bplus) / nsims  # estimated P(tie)
sum(pmcn < 0.05) / nsims     # estimated power, exact McNemar
sum(pind < 0.05) / nsims     # estimated power, two-sample z-test
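(The post mentions not wanting to dig up the convolution of the different binomials; for cross-checking the simulation, here is a sketch of that exact calculation. The function name exact_session is made up for this note, not from the post.)

exact_session <- function(sz, pa, pb) {
  k <- 0:sz
  fa <- dbinom(k, sz, pa)  # distribution of A's hit total
  fb <- dbinom(k, sz, pb)  # distribution of B's hit total
  c(b_wins = sum(fb * (cumsum(fa) - fa)),  # P(A's total < B's total)
    a_wins = sum(fa * (cumsum(fb) - fb)),  # P(B's total < A's total)
    tie    = sum(fa * fb))                 # P(totals are equal)
}
round(exact_session(100, 0.50, 0.55), 3)  # roughly 0.74 / 0.22 / 0.04, matching the simulation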
[QUOTE="Cadence, post: 8077096, member: 6701124"] It doesn't seem obvious to me that relative risk is necessarily more useful in this case than the difference in absolute risk. The classic example of where looking at just relative risk breaks down is in the extremes. If event A occurs with probability 0.0001 and event B occurs with probability 0.001. B's probability is 10x larger (it's gone up 900%). Even if you have A and B compete a 1,000 times, A will still win around 4% of the time and will tie B around 37% of the time in spite of B's win probability being a massive 10x that of A. Looking at that same relative risk with P(A)=0.05 and P(B)=0.50, where there's a much bigger difference in absolute risk, B wins the match essentially all the time. So, in something like treatment effectiveness, looking at just relative risk as a descriptive feels like it can give a very odd picture of the actual overall impact of the treatment. As I noted in a previous post, if P(A)=50% vs. P(B)=55% were to compete 100 times, then B would have around a 74% chance of winning the title for the session, A around 22%, and they would tie around 4%. So there will certainly be a lot of times when even after a full hundred rounds that B hasn't shown better than A, let alone clearly so. (The expected number of extra hits for B over A over the 100 is of course 5, so even some of the times A is losing it is only by a few hits). The probability of getting the arbitrary alpha=0.05 level statistical significance in this case is less than 20%, even if doing the one sided test because you think you know which is better. It feels odd to say that something is clearly noticeable based on just the successes and failures if the best test for finding a difference would only reject 20% of the time. If that was enough power to be happy with a sample size, then that means one is happy with a 20% false discovery rate (0.05/(0.05+0.20), right? Making it more extreme, 15% vs. 20% does make it slightly more apparent, but even then A still has about a 15% chance of doing better over 100 trials, and 5% chance of tying. The estimated power at alpha=0.05 is still only around 22%. Going to 5% vs. 10% A is down to winning or tying a total of only 10%, but the estimated power is still just around 30%. So it certainly does matter in the ends. If everyone was 99% at something and I was only 94%, it feels like it would stand out and the party would groan when it was my turn and I missed. What percent of combat happens off in the tails like that? <Slap-dash R code at bottom in case my numbers are off. Also, please insert disclaimer about the arbitrariness of alpha=0.05 and how hypothesis tests aren't usually what you want... and also a that power seems like a relevant idea here anyway.> If Legalos had a 5% bonus over Gimli and they kept track over several game sessions, it feels like Gimli would be able to say he wasn't doing nearly as well after a few of them against hard to hit monsters. But against things in the middle it feels like it would take a while longer before his inner statistician would let him concede. That it's hard to be confident in the difference after just 100, but easier when you get several times more, seems to fit in with what you might get in baseball - how does a .250 vs. .300 batting average feel after only 100 plate appearances at the beginning of the season for making long term decisions vs. after 500+ plate appearances? (Well, I mean except for batting average being a horrible statistic). 
All that being said, it's hard for me to argue with the fact that the human brain isn't always big on caring what the probabilities say if it fits the story that it's working on: #nsims=number of simulation runs, I didn't feel like digging up the convolution of # different binomials #sz is the number of trials a and b have, where they succeed with probabilities pa # and pb #The first three numbers that are output are the estimated probability b wins, estimated # probability a wins, and the estimated probability they tie. #The next are the estimated power at a=0.05 for rejecting the null hypothesis that # they're equal using either the exact McNemar's test (since we know the order they # were in) or the usual two-sample z-test. As the pairing explains no variance # I was a bit surprised the McNemar test was as different in a few cases. nsims=100000 sz=100 pa<-0.5 pb<-0.55 aplus<-rep(0,nsims) bplus<-rep(0,nsims) abeq<-rep(0,nsims) pmcn<-rep(0,nsims) pind<-rep(0,nsims) for (i in 1:nsims){ x<-rbinom(sz,1,pa) y<-rbinom(sz,1,pb) aplus[I]<-sum(x>y)[/I] bplus<-sum(y>x) abeq<-sum(y==x) pmcn<-binom.test(aplus,aplus+bplus,p=0.5,alternative="less")$p.value pind<-prop.test(c(aplus,bplus),c(sz,sz),alternative="less")$p.value } sum(bplus>aplus)/nsims sum(aplus>bplus)/nsims sum(aplus==bplus)/nsims sum(pmcn<0.05)/nsims sum(pind<0.05)/nsims [/QUOTE]
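(On the batting-average aside: base R's power.prop.test gives a quick read on how much 100 vs. 500 plate appearances buys when comparing .250 to .300, under the simplifying assumption that plate appearances are independent Bernoulli trials.)

# One-sided power to distinguish .250 from .300 at alpha = 0.05
power.prop.test(n = 100, p1 = 0.25, p2 = 0.30, sig.level = 0.05,
                alternative = "one.sided")  # power roughly 0.20
power.prop.test(n = 500, p1 = 0.25, p2 = 0.30, sig.level = 0.05,
                alternative = "one.sided")  # power roughly 0.55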