So, your idea of "realism" is to choose a chargen method that will nearly always result in characters that are better than the standard array. Is that correct?
See, that's the *ahem* elephant in the room. Die roll methods will almost always result in higher value characters than standard array.
This is a fallacy and factually incorrect. It all depends on the die rolling method, but the standard array was selected because it is a close approximation of the average of 4d6k3.
Rolling 3d6 for stats results in only a 9.56% chance that your total will be higher than the standard array's 72 points. The average is 63 points, considerably below the standard array. Also note that on any given die roll using 3d6, the probabilities don't change. No matter how many times you roll, you have an almost 50% chance of rolling a 9, 10, 11, or 12, and a nearly 68% chance of it being an 8, 9, 10, 11, 12, or 13. The chance of rolling a 3 is less than 1/2% (.463%), and the same is true of an 18.
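If you want to check the 3d6 numbers yourself, here is a rough sketch that enumerates the dice exactly instead of simulating them. The names (convolve, score_3d6, total_3d6) are just ones I made up for the example.

```python
from fractions import Fraction
from itertools import product


def convolve(a, b):
    """Distribution of the sum of two independent rolls (dict: value -> probability)."""
    out = {}
    for x, px in a.items():
        for y, py in b.items():
            out[x + y] = out.get(x + y, 0) + px * py
    return out


# Exact distribution of a single 3d6 ability score (216 equally likely outcomes).
score_3d6 = {}
for dice in product(range(1, 7), repeat=3):
    s = sum(dice)
    score_3d6[s] = score_3d6.get(s, 0) + Fraction(1, 216)

# Exact distribution of the total of six 3d6 scores.
total_3d6 = {0: Fraction(1)}
for _ in range(6):
    total_3d6 = convolve(total_3d6, score_3d6)

mean_total = sum(t * p for t, p in total_3d6.items())
p_above_72 = sum(p for t, p in total_3d6.items() if t > 72)
p_9_to_12 = sum(score_3d6[s] for s in range(9, 13))
p_8_to_13 = sum(score_3d6[s] for s in range(8, 14))

print("average total:", float(mean_total))
print("chance of beating 72:", float(p_above_72))
print("chance of 9-12 on one roll:", float(p_9_to_12))
print("chance of 8-13 on one roll:", float(p_8_to_13))
print("chance of a 3 on one roll:", float(score_3d6[3]))
```

Fractions keep the arithmetic exact, so there is no sampling noise to argue about.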
What is true is that 4d6k3 will produce scores on average that are slightly higher than the standard array. See here:
http://anydice.com/articles/4d6-drop-lowest/
This appears to be simply because WotC rounded down instead of up. The actual average is 73.47 points, instead of the 72 points granted by the standard array. This slight variation may have been intentional as a benefit to rolling. Because the distribution of totals is nearly symmetrical, that average is also close to the median, so almost half of the characters rolled will total less than 73 points.
So the best you can claim with 4d6k3 is that about half of the characters generated will be 1 or more points higher than the standard array, which is hardly "almost always" and doesn't account for the fact that almost half will be less. Of those that are higher, most will be only a point or two higher. If using 3d6, then almost 90% of rolled characters will be at the standard array or less. Most will be considerably less.
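The same kind of check works for 4d6k3. This is only a sketch, and the variable names are my own. It enumerates all 1,296 outcomes of four dice, drops the lowest die, and then builds the exact distribution of the six-score total so you can see how it splits around 72.

```python
from fractions import Fraction
from itertools import product

# Exact distribution of one 4d6k3 score: roll four dice, drop the lowest.
score_4d6k3 = {}
for dice in product(range(1, 7), repeat=4):
    s = sum(dice) - min(dice)
    score_4d6k3[s] = score_4d6k3.get(s, 0) + Fraction(1, 6 ** 4)

# Exact distribution of the total of six such scores.
total_4d6k3 = {0: Fraction(1)}
for _ in range(6):
    nxt = {}
    for t, pt in total_4d6k3.items():
        for s, ps in score_4d6k3.items():
            nxt[t + s] = nxt.get(t + s, 0) + pt * ps
    total_4d6k3 = nxt

STANDARD_ARRAY_TOTAL = 15 + 14 + 13 + 12 + 10 + 8  # 72

mean_score = sum(s * p for s, p in score_4d6k3.items())
mean_total_4d6k3 = sum(t * p for t, p in total_4d6k3.items())
p_above = sum(p for t, p in total_4d6k3.items() if t > STANDARD_ARRAY_TOTAL)
p_equal = total_4d6k3[STANDARD_ARRAY_TOTAL]
p_below = sum(p for t, p in total_4d6k3.items() if t < STANDARD_ARRAY_TOTAL)

print("average single score:", float(mean_score))
print("average total:", float(mean_total_4d6k3))
print("above / equal / below 72:", float(p_above), float(p_equal), float(p_below))
```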
The standard array doesn't create a bell curve. Every single character has the same number of points. There is no curve. (Technically humans are 1 point less, so there are really two points - 75 for non-human, 74 for humans, still no bell curve, it's linear).
The bell curve that rolling creates (although the 4d6k3 curve isn't technically a bell curve, since it's asymmetrical) means that the most common point total will be 66 (3d6 + modifiers) or 76 (4d6k3 + modifiers). The least common would be 21 (all 3s + modifiers) or 111 (all 18s + modifiers), but those are very, very rare. The closer you get to the middle (66 or 76, depending on method), the more common the totals are.
But it's this bell curve that creates a "realistic" distribution across characters that is lacking in standard array and point buy systems. Not every character or NPC generated has the same point value for stats. Most are bunched in the middle (average), with a small number of very good and very bad characters.
For proof, I'd ask you to canvass your groups. Yup, there will be that one guy who has a lower than standard array, but that's offset by the other nineteen characters that are all higher.
Canvassing your groups tells you nothing about the die rolling method. It will tell you about the application of the method. And this is colored by the fact that many groups probably don't require you to take the first set of stats that you roll, in which case players are allowed to cherry-pick the better sets. That will skew the value considerably. It's also important to actually check the total value against the standard array's 72 points, rather than whether any numbers, or even several, are above 15.
We do that, simply because I'm not concerned about whether the characters have higher scores in terms of game balance etc. The stats that aren't selected can be used for NPCs, followers, or other characters. One of the reasons why I like the multiple character approach, is that players don't seem to mind some substandard characters if they aren't their only character. Although ironically, it's often (usually?) the substandard character that they enjoy playing the most.
Look, I get that people like random generation. Fair enough. But, at least be upfront about it. People are randomly generating their characters so they can get higher powered characters.
If that's your experience, then they aren't following the rules. If their purpose is to make higher powered characters and they are choosing to roll to do so, then they must also accept lower powered characters.
I'd recommend the following:
For players who choose to roll, they must take the first set of stats they roll (arrange as desired).
The floor for rolling is 63 points (the average of 3d6). If the total is less than 63, then you may increase stats to reach a total of 63.
No stat can be raised above 15.
No second stat can be raised above 15 if any other stat is below 8.
If you'd like a narrower distribution, then set the floor closer to 72.
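For anyone who wants to see the floor in action, here is a minimal sketch of the first three rules. Several things in it are my assumptions for the example and not part of the rules above: it rolls 4d6k3, it raises the lowest stat first (at the table the player would choose where the points go), and it leaves out the rule about stats below 8. The function names are made up.

```python
import random

FLOOR = 63  # average total of 3d6; move this toward 72 for a narrower spread
CAP = 15    # no stat may be raised above this


def roll_4d6k3(rng):
    """Roll four d6 and keep the highest three (assumed rolling method)."""
    dice = sorted(rng.randint(1, 6) for _ in range(4))
    return sum(dice[1:])


def apply_floor(stats, floor=FLOOR, cap=CAP):
    """Raise stats one point at a time until the total reaches the floor.

    Assumption: points go to the lowest stat first. Stats rolled above the
    cap are left alone; they just can't be raised further.
    """
    stats = list(stats)
    while sum(stats) < floor:
        raisable = [i for i, s in enumerate(stats) if s < cap]
        if not raisable:
            break  # only possible if floor is set above 6 * cap
        lowest = min(raisable, key=lambda i: stats[i])
        stats[lowest] += 1
    return stats


def roll_character(rng):
    """Roll one set of six stats, first set only, then apply the floor."""
    return apply_floor([roll_4d6k3(rng) for _ in range(6)])


if __name__ == "__main__":
    rng = random.Random()
    stats = roll_character(rng)
    print(stats, "total:", sum(stats))
```

Changing FLOOR to something closer to 72 is all it takes to try the narrower distribution.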
This way, folks that roll have a chance of getting higher ability scores (one of the reasons why people like rolling). Setting the floor means they will only get one roll. As soon as you allow a second roll, their stats will trend higher.
Of course, you're going to run into players who decide that if their character is less than everybody else's 72, it's not good enough. Either they'll complain, perhaps attempt to kill off the character, whatever. To me it's simple. If you can't abide by the rules, and play the character you've rolled randomly in good faith, then you can't roll randomly. You get the standard array.
However, in the experience of many of us, that's not why people are rolling their characters. In fact, I can't say that anybody here who is expressing a preference for rolling characters is doing so because it creates higher powered characters. So I agree, let's be upfront about it.
Some people randomly generate their characters so they can get higher powered characters.