Yeah, but wouldn't a character who dabbled prior to hitting 6th level end up limiting themselves compared to a character who maxed out a class and then went gestalt after they hit the cap?
Wouldn't a 3/3 Wiz/Fighter swordmage kinda character end up way behind a level 6 fighter who then started tacking on levels of wizard?
Well no, not really. The way gestalt works is that at each level you get all the unique class features of both classes and the best of the common class features.
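To make that concrete, here's a rough Python sketch of merging one Fighter level with one Wizard level. The per-level numbers are the 3.5 SRD values as I remember them; the ClassLevel type, the gestalt() helper and the trimmed-down feature lists are names made up for illustration, not anything out of the rules text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassLevel:
    """One level of a class (hypothetical layout, for illustration only)."""
    name: str
    hit_die: int            # die size, e.g. 10 for d10
    bab_per_level: float    # 1.0 = full BAB, 0.5 = half BAB
    skill_points: int       # base points per level, before Int
    good_saves: frozenset   # saves on the good progression
    features: frozenset     # unique class features (trimmed list)


def gestalt(a: ClassLevel, b: ClassLevel) -> ClassLevel:
    """Best of the common features, union of the unique ones."""
    return ClassLevel(
        name=f"{a.name}|{b.name}",
        hit_die=max(a.hit_die, b.hit_die),
        bab_per_level=max(a.bab_per_level, b.bab_per_level),
        skill_points=max(a.skill_points, b.skill_points),
        good_saves=a.good_saves | b.good_saves,
        features=a.features | b.features,
    )


FIGHTER = ClassLevel("Fighter", 10, 1.0, 2,
                     frozenset({"Fort"}), frozenset({"bonus feat"}))
WIZARD = ClassLevel("Wizard", 4, 0.5, 2,
                    frozenset({"Will"}), frozenset({"spellcasting"}))

combined = gestalt(FIGHTER, WIZARD)
print(combined)
```

The combined level keeps the Fighter's d10 and full BAB, picks up the good Will save and spellcasting from the Wizard, and still has only 2 skill points because both classes have the same number.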
So the difference between a gestalted Wiz 1/Fighter 1 and a regular 2nd-level character with one level in each class is that the regular character has an extra hit die and higher skill caps, although he does have to deal with cross-class skills.
OTOH, if you're gaining the gestalt level later, then you'll only gain what's different between your old classes and the new ones. Since Wizards and Fighters have the same number of skill points, you won't gain any new ones. In fact you'll basically get only spells, an improved Will save and what, 2 bonus feats?
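A quick way to see the "you only gain the difference" point is to subtract what the old class already gave you from what the new one offers. This is just a sketch: the dict layout and the backfill_gain() name are my own invention, the numbers are the 3.5 SRD per-level values, and unique class features like spells are left out since you always get those.

```python
# Per-level values for the two classes (3.5 SRD); dict layout is hypothetical.
FIGHTER = {"hit_die": 10, "bab_per_level": 1.0,
           "skill_points": 2, "good_saves": {"Fort"}}
WIZARD = {"hit_die": 4, "bab_per_level": 0.5,
          "skill_points": 2, "good_saves": {"Will"}}


def backfill_gain(old, new):
    """What a level already taken in `old` picks up when `new` is
    gestalted onto it afterwards: only the part where `new` is better."""
    return {
        # Increase in hit die size (e.g. d4 -> d10 is a gain of 6).
        "hit_die": max(0, new["hit_die"] - old["hit_die"]),
        "bab_per_level": max(0.0, new["bab_per_level"] - old["bab_per_level"]),
        "skill_points": max(0, new["skill_points"] - old["skill_points"]),
        # Saves that move up to the good progression.
        "good_saves": new["good_saves"] - old["good_saves"],
    }


fighter_adds_wizard = backfill_gain(FIGHTER, WIZARD)
wizard_adds_fighter = backfill_gain(WIZARD, FIGHTER)
print(fighter_adds_wizard)
print(wizard_adds_fighter)
```

For the Fighter backfilling Wizard, everything comes out zero except the Will save, which lines up with the list above once you add spells and the Wizard feats back in. Going the other way, the Wizard picks up a bigger hit die, better BAB and a good Fort save, but still no skill points.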
So let's look at two peak level e6g characters:
One leveled up as a straight Fighter 6 and then gestalted Wizard 6 on top (Fighter 6|Wizard 6); the other leveled as Fighter 3/Wizard 3 and then gestalted to backfill the other 3 levels of each class, so he winds up as a Fighter 3/Wizard 3|Wizard 3/Fighter 3.
The second character actually winds up much stronger because he gets to spend 3 full levels of his skill points using the Wizard skill list, and therefore actually has a decent Concentration rank. The first character either buys Concentration cross-class or goes without. They both have the exact same feats, hit points, BAB and saves.
Hmmm... That actually shows that the clear optimization path is to A) level first in the class with the skill list you care more about or B) level first in the class with the fewest skill points. It also means that unless carefully plotted out, an e6g character is likely to be slightly less optimized than a regular gestalt character, because you don't combine the skill lists on your first 6 levels. Which means there really is an advantage to multi-classing.