Again, how we think of what we are doing at the table matters. Thinking that it is a scene carries all kinds of implications that I believe influence GM adjudication. I know this because when I think of it as a scene, I run the game differently.
Ooh! Ooh! Ooh! Goody. We are going to get into a discussion of what a "scene" is and how thinking about how you play a game is as important to how you play a game as the rules of the game. These are like two of my favorite topics.
When I use the word "scene" with respect to a tabletop game, I mean "everything that happens between handwaves" (granted, that's a bit of a tongue-in-cheek definition). That is to say, you know a scene is starting because you begin to keep track of what is going on, and you are no longer handwaving player motion through time and place. The end of the prior handwave is marked by the "Bang", the important thing that establishes that a scene is beginning, and the end of a scene is marked by a handwave called a "Cut".
A handwave is when things happen but they aren't really important enough to describe. How a table frames its scenes together is a big part of its procedure of play, and sometimes it's not really clear-cut. Some cut->bangs are so small and have so little OOC conversation and bookkeeping between them that they are practically indistinguishable from a "long take" (sometimes called "process simulation" or "purist for process", where every detail is treated as important). In your example of play, I would tend to treat what you described as a single scene, but it could probably be broken down into three scenes, because you handwave the movement between the three or even possibly four separate establishing "shots": one in the street, one before the door, one in the entrance to the gambling hall, and finally one on the third floor where the scene proper really begins.
The fact that nothing really happens in those establishing "shots" is why I would consider this a single scene. For all practical purposes, you could have just cut and banged to the scene where you were on the third floor balcony of the gambling hall, and indeed a lot of GMs would have done so. However, the establishing shots at least give the players some minimal understanding of where they are, and if you didn't have them you'd need a larger "Bang" to explain how and where they were before confronting Iron God Meng. Since you've expressed a preference for short, less literary bangs, you've guided the players to the bang in a series of steps.
As you might have guessed from my variation on your example of play, I tend to treat each bang as establishing at least a potentially meaningful scene. If the players are stopped at the door to the gambling hall, that's because something meaningful can occur here. If the players are stopped on the street, then it's because the street vendors are potentially important. If they weren't, I wouldn't have even mentioned them, and would have just jumped to the bang where they arrive at the gambling hall confronted by doorkeepers in blue garb.
Now, all this talk of "scenes" has probably made you uncomfortable. So if you don't think of this as a scene, what are you thinking?