BryonD, what I would object to in your post are "shoe-horned" and "pop quiz". I can see where you're coming from, obviously, but these carry a pejorative tone that I don't want to accept. (I'm sure that pejorative tone accurately captures your aesthetic response - but I would prefer a description that doesn't build that experiential response into the description itself.) (EDIT: I'm skipping over your "different cities" point because I agree that simulationist and non-simulationist play are different experiences that satisfy different RPGing preferences.)
Starting with shoe-horned. In a sense, when I stat up a monster as AC X, hp Y I've also made a mechanical decision about how hard the encounter will be. Now, I imagine your response (along the lines upthread of our exchange about the pirate and the knight) is that the monster's stats are derived from an ingame description. So that it is story/gameworld first, mechanics second.
My response to that is that in choosing to set up that particular story/gameworld, it defies belief that you as an experienced GM don't have in mind some sense of the mechanical implications of the decisions you're making. I almost want to say: the simulationist inhabits a type of self-constructed "naivety" about the mechanics, and the non-simulationist is just being upfront and realist. Of course that would be wrong - the simulationist isn't being naive, but is deprioritising the metagame/mechanical aspects.
But I don't think that prioritising the mechanics equals shoe-horning. Imposing a structure isn't, thereby, fitting something in that wouldn't fit in by itself. Sometimes I choose a particular monster because I think the fight with it will go a certain way. I did that yesterday - the scenario called for a Large bear, and in prepping I placed a single elite level 13 dire bear, rather than a lower level solo bear (a level 7 or 8 solo would be a rough XP equivalent) because I wasn't sure exactly how many 10th level PCs would be facing it at once, and thought the slightly swingier high level elite would produce a more interesting range of outcomes across a wider range of possible PC party sizes. The fact that the decision is driven (in part) by the mechanics doesn't mean I have to shoe-horn in the outcomes.
Now a skill challenge is a bit different from a combat. Hit point attrition has a certain robust story content at a D&D table - everyone knows that we're wearing down the monster, even if the precise nature of that wearing down is a bit up for grabs (except perhaps in 3E, where at least some players interpret hit points solely as meat, in which case even the precise nature of the wearing down is probably known by all). Whereas the ingame interpretation of successes in a skill challenge is much more up for grabs every time. Nevertheless, in a 6/3 skill challenge I describe the results of successes in such a way as to give a general feel for how things are progressing, and also add a bit of quasi-mechanical commentary - "You feel like you've only just started" vs "You feel like you're pretty close to getting the job done" - to add an extra bit of information to the descriptions. So no one was shocked when the dwarf's last move turned out to be the success - even though they weren't 100% sure that it would be - just the same as if he'd struck the killing blow in a combat.
So, again, while mechanics are playing a role here, I don't see it as shoe-horning.
As to "pop quiz": the player of the dwarf had already set up the "plug the spring" idea because on his first turn he considered the situation as I'd described it, thought "as a strong guy with a big axe my best bet here is to probably plug the spring", and then I suggested that (i) if he wanted to do that he had to knock off a lot of stone, so I would require him to expend one of his encounter close burst powers, and (ii) to get the stones in the right place would also require a Dungeoneering check. (This is roughly following the model in DMG p 42.)
(EDIT: in my prep notes I'd expected the PCs to try and expunge the spirit, and had made some notes on how Religion and Arcana checks might play out. The idea of plugging the spring instead came as a surprise to me.)
When it came round to his next turn the wizard and paladin had already picked up on his idea and done more stuff with the stone. He then went in for (what he hoped would be, and what turned out to be) the last big effort. The player knew what he wanted to do - use Come and Get It to "pull" the water away from the rocks, so he could push them into the holes. I, as GM, suggested that what might make more sense is if, using his skill at timing his polearm strikes in relation to the fluid movement of the battlefield (as exemplified in part by this Come and Get It power), he waited for the water to surge up again and then pushed in the stone. The player liked that, and went for it. He made an Athletics check for being in the water, and an attack roll to actually drive the stone home. Expending Come and Get It meant that the issue of timing was not a problem for that PC (Come and Get It in this context, as on many other occasions of use, acted more as a sort of fate point - "my PC's timing perfectly matches the flow of battle" - than as a model of an ingame action like taunting or luring).
Again, the mechanics are informing the decisions here. Personally, I don't feel that they do so any more intimately than in (for example) Rolemaster, where players routinely scour the character sheets looking for an applicable skill or spell before deciding how to tackle a situation. A game with a much more stripped-down character sheet (especially for fighters), like Basic, would play a bit differently here. I don't have enough experience with 3E to really make a comparison.
But I think it's certainly not "fiction last". Why did the fighter jump into the water? Because otherwise how can he manipulate the stones at the bottom of the pool?
Another example - when the PCs actually encountered the bear, they decided to tame and befriend it instead of fighting it. The ranger and the wizard made Nature checks. The ranger was adjacent, so reached out to the bear. The wizard was at range, giving rise to the question - how does he actually calm the bear? Answer: he used Ghost Sound to make soothing noises and Mage Hand to stroke it. The sorcerer wanted (i) to back away so as not to get slammed in case the bear remained angry, and (ii) to try and intimidate the bear into submission. I (as GM) asked the player how, exactly, the PC was being intimidating while backing up? His answer: he is expending Spark Form (a lightning-based encounter power) to create a show of magical power arcing between his staff and his dagger, that would scare the bear. A successful Intimidate roll confirmed that the light show did indeed tend to subdue rather than enrage the bear.
Again, this is not fiction divorced from the mechanics. But I don't think it's fair to call it "pop-quiz" roleplaying. The engagement with the fiction is permeating the whole thing, and shaping the way that mechanical resources are deployed and that deployment adjudicated.
FINAL EDIT: I've also taken this to a new thread, to see if we can get some more actual play discussion of how mechanics and fiction interact in gameplay.