It's not a coincidence, and the numbers weren't picked arbitrarily. The scaling was by the standard deviation, in order to match the first two moments of the distributions. Except that really it should have been 2*3d6-10.5, not 2*3d6-11, but
@NotAYakk acknowledged that was done because AnyDice doesn't accept non-integers. I guess we should do 4*3d6-21 vs 2*1d20 and just halve the numbers on the axis, but it won't look hugely different.
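For what it's worth, the moment matching is easy to check by exact enumeration. A minimal sketch in Python (standard library only; not anyone's actual AnyDice program):

```python
from itertools import product
from statistics import mean, pstdev

# Enumerate every outcome of 3d6 and 1d20 exactly.
three_d6 = [a + b + c for a, b, c in product(range(1, 7), repeat=3)]
d20 = list(range(1, 21))

# Both have mean 10.5; the standard deviations differ by a factor close to 2.
print(mean(three_d6), pstdev(three_d6))   # 10.5, ~2.958
print(mean(d20), pstdev(d20))             # 10.5, ~5.766

# So 2*3d6 - 10.5 matches the d20's mean exactly and its spread approximately;
# 2*3d6 - 11 is the nearest integer offset AnyDice will accept.
scaled = [2 * x - 10.5 for x in three_d6]
print(mean(scaled), pstdev(scaled))       # 10.5, ~5.916
```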
Because the relationship is coincidental and cherry-picked: it comes not from analysis and correct math, but from selecting values that make parts (PARTS!) of the curves look similar.
Rescaling doesn't make the two distributions identical, but it does self-evidently make them more similar than before the scaling. Probabilists and statisticians of a more theoretical bent do this sort of thing all the time: approximate one distribution with another by matching lower order moments and then show that the error (measured by cumulative probabilities) is bounded by a function of the higher order moments.
No: rescaling, arbitrary centering, eliminating 1/3 of the data points of one distribution, and then comparing 10 data points to 20 data points makes them look similar. The only decision made here that is remotely based on actual characteristics of the curves was the rescaling, and even that is questionable, because multiplying the 3d6 distribution gives data points spaced 2 apart, which you then compare against data spaced 1 apart. After that, it's literally making choices to achieve the goal of making the curves look similar.
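The spacing complaint is concrete, and easy to see by just listing the supports. A quick sketch (names are mine, not from the thread):

```python
# After the rescaling, 2*3d6-11 only ever lands on every other integer,
# while the d20 hits every integer, so the two curves are sampled on
# different grids.
from itertools import product

support_3d6 = sorted({a + b + c for a, b, c in product(range(1, 7), repeat=3)})
scaled_support = [2 * x - 11 for x in support_3d6]

print(scaled_support)       # [-5, -3, -1, ..., 23, 25]: spaced 2 apart
print(list(range(1, 21)))   # [1, 2, ..., 20]: spaced 1 apart
```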
I'm not sure why the comparison to the non-standard-deviation-matched version of the distribution is relevant to the argument. (I assume also that you mean CDFs rather than PDFs? Not trying to be pedantic, just making sure I'm following you.)
Technically, what's been discussed is 1 minus the cumulative distribution function (the survival function: the chance to roll X or higher). Is this the point where we actually start using proper terminology in this thread? I figured having that argument wouldn't help understanding, so I've been working informally, using language similar to what's been used previously.
So, using "probability density function" is technically imprecise, but it's close enough for this discussion.
Here I have to confess that I don't follow what you're trying to say. What same point? You need a 13 on 3d6 for 2*3d6-11 to equal 15, so that's presumably not where the 12 is coming from. Maybe you mean that 12 or higher on 3d6 has about the same likelihood as 15 or higher on 1d20, and that 15 or higher on 3d6 is less likely than a 20 on the d20. OK, but what does that imply about the argument? Just that, even after matching standard deviations, you still have a lower chance of getting a 20 with 2*3d6-11 than you do with 1d20? Everyone agrees on that point.
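Since these numbers keep getting tossed around, here's an exact check by brute-force enumeration (a sketch; the "X or higher" reading of the curves is assumed):

```python
from itertools import product
from fractions import Fraction

rolls = [a + b + c for a, b, c in product(range(1, 7), repeat=3)]

def at_least(k):
    """Exact chance that 3d6 comes up k or higher."""
    return Fraction(sum(r >= k for r in rolls), len(rolls))

print(at_least(12))   # 3/8 = 0.375, vs 6/20 = 0.30 for 15+ on a d20
# A "natural 20" on 2*3d6-11 needs 3d6 >= 16 (the first result at or
# above 20 is 2*16-11 = 21), which is rarer than the d20's flat 1/20:
print(at_least(16))   # 5/108, about 0.046, which is below 0.05
```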
Typo.
It points out that the stretched distribution behaves very differently precisely because it has been stretched, which makes relying only on the visual similarity in that range even shakier.
But nobody is arguing that the curve is actually a line... they're similar to the extent that the probabilities are similar. Now if you wanted to make an argument that linear comparison of probabilities isn't necessarily the right metric, you might have a point (I have at times argued that "rolls per success" is a more useful metric in some contexts than "successes per roll", for example), but that doesn't seem to be what you're doing.
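For anyone curious about the "rolls per success" aside: the two metrics disagree most in exactly the tail this thread keeps arguing about. A sketch (the helper names here are mine, not from the thread):

```python
def successes_per_roll(p):
    # The linear metric: chance of success on a single roll.
    return p

def rolls_per_success(p):
    # Expected number of attempts until the first success of a
    # repeated roll with per-roll success chance p.
    return 1 / p

# 0.050 vs 10/216 (~0.046) looks like a negligible linear gap, but it's
# the difference between 20.0 and 21.6 expected rolls per success.
for p in (0.050, 10 / 216):
    print(successes_per_roll(p), rolls_per_success(p))
```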
Huh. The OP (and later posts) have relied on the fact that there's a similarity that supports the argument that stretching the d20 line is functionally similar to a stretched 3d6 line, therefore 3d6 and d20 aren't much different. My point is that the math suggested by such stretching and skewing is very badly founded and an improper use of math. The point that you can alter the math of 5e to move some breakpoints on the d20 is orthogonal to my point that the math of the graphs is absolutely wrong. The justification that relies on bad math is what I'm arguing against.
I mean, the first point the OP makes is that using 3d6 is the same as rolling a d20, if you change the target numbers and the bonus to the roll. What that example really shows is only that the likelihood of rolling at least a 16 on a d20 is close to the likelihood of rolling at least a 13 on 3d6. Cool, I guess. The reason this schema works isn't any real similarity in the curves of the d20 and the 3d6, but rather the skewing of the inputs to the d20 to stretch it. What the OP did was change the math of the bonuses so that you need a 16 instead of a 13 on a d20 to hit the new-math AC with the new-math attack bonuses.
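That one near-coincidence is real, and easy to verify by enumeration (a sketch, standard library only):

```python
from itertools import product

rolls_3d6 = [a + b + c for a, b, c in product(range(1, 7), repeat=3)]

# Chance of 16+ on a d20 vs 13+ on 3d6: close, but not equal.
p_d20 = sum(1 for r in range(1, 21) if r >= 16) / 20             # 5/20 = 0.25
p_3d6 = sum(1 for r in rolls_3d6 if r >= 13) / len(rolls_3d6)    # 56/216, ~0.259

print(p_d20, p_3d6)
```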