Manbearcat
Legend
The part that's tricky is "provided the non-combat abilities are clearly meaningful in play." The problem is that D&D is (historically and critically speaking) a combat/wargame system which permits some roleplaying on the side. Which is to say, combat and non-combat utilize totally different architectures (when there is any non-combat architecture at all, anyway). This makes the application and utility of non-combat rules and features much less consistent than combat mechanics. I've run a game where the PCs were desperate to level up so they could take NWPs (2e) to learn new languages, even though there was plenty of fighting. I've also been in plenty of games like my current one where languages and non-combat are just about irrelevant, merely an irritant to be hand-waved by disbursing a Helm of Comprehend Languages. In a game where both are possible and accepted and encouraged playstyles, I don't see how you could possibly evaluate non-combat features against combat features in any meaningful way.
It is difficult, to be sure, but I think if you construct a framework of sturdy and unified constants, you have a better chance of evaluating the variables (especially as system experience gathers).
For instance, in 4e, you have conflict resolution in the arenas of (1) tactical combat resolution and (2) non-combat (skill challenges) resolution. What are our constants?
1) Encounter XP rewards and results within the fiction (framing your PC within that challenge).
2) # of successful contests versus # of failed contests ultimately dictating resolution.
With the two of those schemes in play, we have a reasonable opportunity to evaluate the potency of build choices based on how 2 affects 1.
However, there are a lot of variables when evaluating PC build choices as well. In combat, you have a lot of built-in opportunities/synergies to force-multiply and to wipe out your own tactical disadvantage or enemy advantage/momentum. In Skill Challenges, the stock foundation is not constructed in this way. Therefore, the cost of a single failure is more weighty. Accordingly, the value of an individual force-multiplier (Cat's Grace giving +2 to all Dex skills until the next extended rest), the value of turning a weakness into a strength (Secrets of the City allowing you to make a Streetwise check in place of various other checks), and the value of loss mitigation (Fast Talk allowing a reroll of a failed Bluff/Diplomacy/Intimidate check as another Bluff check) all become more weighty.
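To put rough numbers on why a single failure weighs so heavily, here is a minimal sketch rather than anything official. It assumes a challenge is won by reaching a set number of successes before a set number of failures (the defaults of 4 and 3 are my assumption for a low-complexity challenge), that every check has the same odds, and that a d20 check has no automatic success or failure. The function names, the DC, and the modifiers are all hypothetical, and the reroll is modeled simply as "spent on the first failed check."

```python
from functools import lru_cache


def check_chance(modifier, dc):
    """Chance that d20 + modifier meets or beats dc.

    Assumption: no automatic success on 20 or failure on 1.
    """
    needed_roll = dc - modifier
    return min(1.0, max(0.0, (21 - needed_roll) / 20))


def challenge_chance(p, successes_needed=4, failures_to_lose=3, rerolls=0):
    """Chance of reaching successes_needed before failures_to_lose.

    p is the per-check success chance, assumed equal for every check.
    Each reroll is spent on the first failed check that comes up.
    """
    @lru_cache(maxsize=None)
    def win(successes, failures, rerolls_left):
        if successes >= successes_needed:
            return 1.0
        if failures >= failures_to_lose:
            return 0.0
        # What happens if the check fails and no reroll is available.
        on_fail = win(successes, failures + 1, rerolls_left)
        if rerolls_left > 0:
            # Spend a reroll: one more attempt at the same odds.
            on_fail = (p * win(successes + 1, failures, rerolls_left - 1)
                       + (1 - p) * win(successes, failures + 1, rerolls_left - 1))
        return p * win(successes + 1, failures, rerolls_left) + (1 - p) * on_fail

    return win(0, 0, rerolls)


if __name__ == "__main__":
    # Hypothetical PC: +9 skill modifier against DC 15.
    base = check_chance(9, 15)
    boosted = check_chance(9 + 2, 15)  # a +2 force-multiplier
    print("baseline:   ", round(challenge_chance(base), 3))
    print("with +2:    ", round(challenge_chance(boosted), 3))
    print("with reroll:", round(challenge_chance(base, rerolls=1), 3))
```

Because the whole challenge hinges on a handful of rolls, a small per-check bump or one forgiven failure shifts the overall odds more than it would in a long combat with many chances to recover. Swap in your own table's DCs and modifiers to see how far.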
There are other variables in the evaluation of course, including how often the GM frames conflict resolution as Skill Challenges versus combat. If you're looking at a ratio that swings wildly in one direction or another, then the value of investing a PC build resource to further success in that specific arena is perturbed (possibly to the point of err...pointlessness). Whether the modality of most 4e groups tends toward combat primarily or toward a more even distribution, I don't know. My testimony is much more akin to @pemerton's in that I run a pretty even distribution and my players have strong confidence in the relative potency of their investment in non-combat resolution resources and, given those things, they invest in them regularly.