If I read your test scenario right, it looked like the impact was greater at low levels, receding rapidly at levels that have large jumps in power. Did you look at high-level play?
My players love rolling stats. To create parity, I then let them use stats rolled by anyone at the table in lieu of their own.
Personally I do not like ANY stat generation method now that I have looked at them.
It was consistently a 20-30% difference at levels 4 and 5. Beyond that ... the number of options becomes more difficult to track. Someone who's maxed out their stats is going to start taking feats, for example.
So, for example, with my sample characters, at 6th level the high roller may well have maxed out their Strength. At that point they can take a feat like Shield Master to knock opponents prone and get advantage on attacks, which is pretty huge. The person with lower scores (assuming they're also maxing out Strength, for simplicity) is probably going to have to wait until 8th or 10th level. But by then the other guy has taken ___.
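To put a rough number on "pretty huge", here is a quick sketch of the d20 math for attacking with and without advantage. The function name `hit_chance` and the target AC of 16 are my own assumptions for illustration, not something from the test scenario. The +8 attack bonus assumes Strength 20 (+5) plus the +3 proficiency bonus at 6th level, and +7 is the same character with Strength 18.

```python
def hit_chance(attack_bonus, armor_class, advantage=False):
    """Chance to hit on a d20 attack roll (hypothetical helper for illustration)."""
    # Count the d20 faces that hit: a natural 20 always hits, a natural 1 always misses.
    hits = sum(
        1
        for roll in range(1, 21)
        if roll == 20 or (roll != 1 and roll + attack_bonus >= armor_class)
    )
    p = hits / 20
    # With advantage you roll twice and keep the higher die,
    # so you only miss if both rolls miss.
    return 1 - (1 - p) ** 2 if advantage else p


if __name__ == "__main__":
    ac = 16  # assumed target AC, chosen only as an example
    for label, bonus in [("STR 20", 8), ("STR 18", 7)]:
        print(label,
              f"normal {hit_chance(bonus, ac):.2%}",
              f"advantage {hit_chance(bonus, ac, advantage=True):.2%}")
```

Under those assumptions the maxed-out character goes from hitting about 65% of the time to about 88% with advantage, which is a far bigger swing than the extra +1 from the higher Strength score.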