Maybe your expectations are off for what one player should be able to accomplish, even with a very good feat. There are too many variables to really know whether your analysis was biased without hearing more elaboration on it.
I was expecting to show that the feat had a huge impact. I chalked up the early results in the testing to odd luck, but they kept coming up similarly as I progressed through testing at different levels.
I ran a party of one wizard, one cleric and two (variant human) fighters through 4 simulated encounters at each of the following levels: 1, 6, 11 and 17 (for a total of 16 encounters). At each level, one encounter was medium, one hard, one deadly, and one double the XP for deadly. I recorded every die roll and every result for each combat. I had rules about what strategies the PCs and monsters would use that I tried to follow as consistently as possible (any set of rules you try to envision falls apart when you face 'real' combat scenarios). When I ran the PCs through the simulations, the fighters had GWM and used it for EVERY attack.
Then, I ran the same simulation, but I had the fighters use GWM optimally (only where it was more likely to result in more damage). I used the same rolls for the same purposes by the same entities. In other words, the first attack by Fighter 1 in a combat used the same attack die roll. The first save by the same PC used the same save die roll result, whether it was for the same monster ability or a different one. If I ran out of previously used rolls, I added more to the series and recorded them.
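For anyone who wants to check the "optimal" rule, here is a minimal sketch of one way to decide when GWM is worth using. It compares expected damage per attack with and without the -5 to hit / +10 damage trade. The function names and the example numbers are my own assumptions, not the exact procedure used in the test, and it ignores advantage, fighting styles, the bonus-action attack and overkill damage.

```python
from fractions import Fraction

def hit_chance(attack_bonus, target_ac):
    """Chance that d20 + attack_bonus meets or beats target_ac.

    A natural 1 always misses and a natural 20 always hits,
    so the d20 result needed is clamped to the range 2..20.
    """
    needed = max(2, min(20, target_ac - attack_bonus))
    return Fraction(21 - needed, 20)

def expected_damage(attack_bonus, target_ac, dice_avg, flat_bonus, use_gwm):
    """Average damage of one attack (hypothetical helper, simplified model)."""
    penalty, extra = (5, 10) if use_gwm else (0, 0)
    p_hit = hit_chance(attack_bonus - penalty, target_ac)
    p_crit = Fraction(1, 20)
    # A critical hit rolls the weapon dice a second time;
    # flat bonuses (including the +10 from GWM) are not doubled.
    return p_hit * (dice_avg + flat_bonus + extra) + p_crit * dice_avg

def should_use_gwm(attack_bonus, target_ac, dice_avg, flat_bonus):
    """True when taking -5/+10 has the higher expected damage."""
    with_gwm = expected_damage(attack_bonus, target_ac, dice_avg, flat_bonus, True)
    without = expected_damage(attack_bonus, target_ac, dice_avg, flat_bonus, False)
    return with_gwm > without

# Assumed example: +5 to hit, greatsword (2d6, average 7), +3 damage from Strength.
for ac in (13, 18):
    print(ac, should_use_gwm(5, ac, 7, 3))
```

With those assumed numbers the rule says to use GWM against AC 13 and to skip it against AC 18, which matches the general intuition that the feat pays off against low-AC targets.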
I then repeated everything a THIRD time, but treated the fighters as if they did not have GWM. I did not replace it with an ability score bonus or other feat. I just ignored that feat. I again used the same die rolls in the same order and tried to use the same strategies.
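The roll reuse described above can be sketched in code as well. This is only an illustration of the bookkeeping, assuming a hypothetical RollLog class: each (entity, purpose) pair gets its own recorded sequence, every run replays that sequence from the start, and new rolls are appended only when a run needs more than were recorded.

```python
import random
from collections import defaultdict

class RollLog:
    """Recorded die rolls, replayed identically across runs (hypothetical helper)."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._rolls = defaultdict(list)   # (entity, purpose, sides) -> recorded rolls
        self._cursor = defaultdict(int)   # read position per key in the current run

    def start_run(self):
        """Rewind every sequence so the next run reuses the same rolls."""
        self._cursor.clear()

    def roll(self, entity, purpose, sides=20):
        """Return the next recorded roll for this key, extending the record if needed."""
        key = (entity, purpose, sides)
        i = self._cursor[key]
        if i == len(self._rolls[key]):
            # Ran out of previously used rolls: add one to the series and record it.
            self._rolls[key].append(self._rng.randint(1, sides))
        self._cursor[key] = i + 1
        return self._rolls[key][i]

log = RollLog(seed=1)

log.start_run()  # e.g. the "GWM on every attack" run
always = [log.roll("Fighter 1", "attack") for _ in range(3)]

log.start_run()  # e.g. the "no GWM" run, which happens to need two more attacks
no_gwm = [log.roll("Fighter 1", "attack") for _ in range(5)]

print(no_gwm[:3] == always)  # the shared attacks used identical rolls
```

Keeping the rolls fixed this way means any difference between runs comes from the feat and the choices made, not from one run simply rolling better than another.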
I considered doing it a FOURTH time, treating the human fighters as if they were not variant humans, but just normal humans, but I decided it was not necessary as it was unlikely to make much of a difference.
The medium and hard battles only saw a few minor changes in remaining resources in a few of the scenarios. Some extra hp lost here or there, maybe a few extra spells. I think 5 of the 8 medium and hard challenges across the levels actually resulted in no changes in used resources at all, and 3 of 8 saw minor differences that nobody would really care about.
1 or 2 of the 4 deadly encounters also had absolutely no difference in used resources. In the others, hp totals differed in at least one of the three approaches because of how the scenario played out, and spells used changed a bit more, especially at the higher levels (11 and 17)... but it would not have been something that made the party try to rest earlier (IMO).
The twice-deadly encounters saw the most change, as you would expect. All 4 saw changes, and one of the changes was quite drastic: the battle lasted several more rounds and the PCs took a lot more damage and used a lot more spells (Mariliths are resource eaters - they suck up hp). However, the differences in the other three were not earth-shaking, although they were bigger than in the easier encounters.
In the end, I concluded that GWM is generally inconsequential for anything less than deadly and, although it can have an impact on really difficult challenges, it wasn't as big as I expected it to be.