How I would do it
Here's my opinion on how this idea should be started. In the above-mentioned thread, I pointed out that most character classes pose a challenge roughly similar to that of other classes of the same level. I think this needs to be used as the starting point. It seems apparent that far more time has been spent making sure the PC classes balance against each other than making sure the monster CRs balance against NPC CRs. Before I begin, let me first address some likely objections that this is a poor place to start.
Battling a fighter requires different abilities than fighting a cleric.
Quite true. With a few exceptions (fighter/barbarian, sorcerer/wizard), most character classes provide widely different playing experiences compared to each other and require widely different tactics to overcome in combat. Against bards, the healthiest defense is a silence spell and a big weapon. Against wizards, an anti-magic field or spell resistance would be in order. The catch is, the abilities required to successfully beat these classes are found at every level. At low levels, a party might use the aid another action to improve their chances of hitting a high AC fighter NPC. At middle levels, a party might use stoneskin to lessen the fighter's blows. At high levels, they might disjoin the fighter's magical equipment. The point is, each class has weaknesses that can be exploited at every level and those are fairly balanced against each other. There's no class that is virtually impossible to defeat until you reach Xth-level.
Character classes aren't balanced against each other; monks are wimps. How is a monk as challenging as a barbarian?
This is a very good point. Many classes have different power curves. A fighter, for example, progresses very evenly from 1st to 20th level in what I would call a linear progression. A wizard, however, gains a jump in power each time he gains a new spell level. This is somewhat offset by the fact that he only gains a new spell level every other level, but his power nevertheless grows in what I would call an exponential progression. The monk follows a similar curve: by the time the barbarian is using greater rage, the monk has become one of the most difficult classes in the game to kill.
Why don't I think this is a big deal? Mostly because there are balancing factors. First, magic items are among the biggest of them. For instance, a wizard has the advantage of fly, but a barbarian can simulate that with winged boots. Second, although the power curves for the classes may be very different, they average out over time. Compare the strength of a 1st-level fighter with that of a 1st-level wizard. From a PC's point of view, the wizard is going to run out of spells long before the fighter's hit points and armor class give out. From an NPC's point of view, the wizard is going to run out of spells before he inflicts much damage; after that, his only hope is a lucky critical hit with a crossbow or staff. However, by around 10th level, the wizard gains the ability to really hold his own against the fighter. He has more defensive spells at his fingertips, and he can finally deal as much damage each round as the fighter, usually over a wider area. By 20th level, the wizard has surpassed the fighter in power. He can deal more damage, boost his AC and hp to impressive levels, transform into creatures capable of melee combat, and even alter reality on a limited scale. The fighter is still usually a crucial part of any PC party, but by this point the wizard is a bit tougher than a fighter of equivalent level. However, this difference is not enormously pronounced. Fighters of this level usually have magical defenses that protect them, so while they are not quite as versatile, they still have more staying power.
Monsters are supposed to be tougher than NPCs. Everyone knows NPCs are a weaker fight but they are supposed to be smarter.
First of all, this is an overgeneralization. Most NPCs may be smarter than a minotaur, but orc NPCs usually aren't. For that matter, even wizards have a hard time rivaling the intelligence of mind flayers. I think the intelligence of a creature (i.e. how effectively the creature ought to be played) is part of what can make it difficult or easy. However, that should apply equally to NPCs and to monsters. Barbarians are difficult because they mindlessly charge and wreak havoc upon hit points. Wizards are difficult because they plan for every contingency and always have just the right spell prepared. Likewise, you wouldn't expect a good melee from a spell-weaver any more than you'd expect an ooze to try to flank you.
Now that I've addressed these concerns, let me get to the meat of my explanation. First of all, I think the NPCs from the DMG should be used as a baseline. They are all balanced in the sense that they have the same value of magical equipment at each level, they all use an elite array for ability scores, and most of all, they are all based on what I consider to be one of the most balanced aspects of the game: the relative power of character classes to each other. Hence, each one ought to be a comparable challenge to a party of typical PCs at each level.
I think there are two ways to go about doing this. The first is a bit unscientific. It involves examining a monster's abilities and seeing which class it behaves most like. A babau might be compared to a rogue, a mind flayer to a wizard. From there, you look at the creature's most powerful abilities, AC, hp, saves, ability scores, etc. and see which level of NPC that creature is most similar to. Although this system is very useful in cut-and-dried situations (like comparing ogres to barbarians), it is tricky to use with creatures that are rather abnormal. How would you classify an aboleth? Is it more like a wizard or more like a fighter? What if it isn't really more like one class than another?
I feel the second approach is more realistic, but it involves more work. Each creature should be evaluated on the following criteria: hit dice, AC, average damage, saving throws, spells/special abilities, and magical gear. Each criterion should then be assigned a score that indicates what level that criterion approximates.
The score for hit dice is simply the number of hit dice the creature has.
The score for AC should be assigned based on the creature's primary role. If it is primarily a melee combatant, then its AC should be measured against the classes that excel in melee (barbarian, fighter, paladin, ranger, monk). The score for AC is the lowest level at which one of those classes has an AC equal to the creature's. If the creature is similar to two different types of classes (such as a fiend with lots of spell-casting and good melee), then compare it to both types of classes and take the average.
Average damage should be scored in the same way as AC. For creatures that either don't deal damage or use spells and special abilities as their primary means of winning a fight, skip this criterion.
Saving throws should be scored in the same way as AC. Each saving throw needs to be evaluated individually, based on whether it is an in-class saving throw or not. So a creature with a good Fort save progression should have its Fort save compared to classes with good Fort saves, and the lowest level at which such a save is achieved is used as the score. Once each saving throw has been scored individually in this way, the results are averaged.
Spells and special abilities should be measured against the spell-casting classes. The score is the level of the lowest-level caster who can approximately duplicate the creature's most powerful abilities.
Finally, any magic gear included in the creature's entry needs to be taken into account. The average value of a monster's treasure should be included in this part of the score if the monster usually uses its treasure to its advantage somehow (like how a dragon might wear the magical rings and amulets in his hoard).
When all of these scores are determined, they should be averaged. Finally, the size of the creature should be taken into account. Unless a creature's small size plays upon a strength, the size should modify the CR by +1 for each size category larger than medium and -1 for each size category smaller than medium. The final score reflects the CR of the creature, as based upon comparison to the character classes.
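To make the arithmetic of the second approach concrete, here is a rough Python sketch of it. Treat it as an illustration under stated assumptions, not a finished tool. The function and parameter names are my own invention. The save lookup compares base save bonuses only (good saves at 2 + level/2, poor saves at level/3, rounded down) and ignores ability modifiers and gear, which is a simplification. The AC, damage, ability, and gear scores are supplied by hand, since finding them means looking up the comparable NPCs, and the numbers in the example are made up.

```python
from typing import Optional, Sequence

# Size categories from smallest to largest; Medium is the zero point.
SIZES = ["Fine", "Diminutive", "Tiny", "Small", "Medium",
         "Large", "Huge", "Gargantuan", "Colossal"]


def lowest_level_with_save(bonus: int, good: bool,
                           max_level: int = 20) -> Optional[int]:
    """Lowest class level whose base save reaches `bonus`.

    Assumption: only base saves are compared (good: 2 + level // 2,
    poor: level // 3); ability modifiers and gear are ignored.
    """
    for level in range(1, max_level + 1):
        base = 2 + level // 2 if good else level // 3
        if base >= bonus:
            return level
    return None  # not reached by max_level; this case is left open


def size_adjustment(size: str, small_size_is_strength: bool = False) -> int:
    """+1 per size category above Medium, -1 per category below it."""
    steps = SIZES.index(size) - SIZES.index("Medium")
    if steps < 0 and small_size_is_strength:
        return 0  # small size that plays to a strength is not penalized
    return steps


def estimate_cr(hit_dice: int,
                ac_score: float,
                save_scores: Sequence[float],
                damage_score: Optional[float] = None,
                ability_score: Optional[float] = None,
                gear_score: Optional[float] = None,
                size: str = "Medium",
                small_size_is_strength: bool = False) -> float:
    """Average the criterion scores, then apply the size adjustment.

    Criteria passed as None (for example, damage for a pure caster)
    are skipped rather than counted as zero.
    """
    # The individual save scores are averaged into one criterion first.
    scores = [float(hit_dice), float(ac_score),
              sum(save_scores) / len(save_scores)]
    for optional in (damage_score, ability_score, gear_score):
        if optional is not None:
            scores.append(float(optional))
    average = sum(scores) / len(scores)
    return average + size_adjustment(size, small_size_is_strength)


# Made-up Large brute: 4 HD, AC like a 3rd-level fighter, damage like a
# 5th-level one, base saves of Fort +5 (good), Ref +1 and Will +1 (poor).
saves = [lowest_level_with_save(5, good=True),
         lowest_level_with_save(1, good=False),
         lowest_level_with_save(1, good=False)]
print(estimate_cr(4, 3, saves, damage_score=5, size="Large"))  # 5.0
```

The sketch leaves the result unrounded, since nothing above says whether a fractional score should round up or down.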