Let’s explain this a slightly different way. My barbarian has just found out that his detested cousin has taken advantage of his absence to usurp his position. Looking to let off steam, he comes across a bar.
The DM briefly describes the bar in two lines. Large mirror, wood panelling, halfling barman on a stepstool.
I, as player, have an image in my mind’s eye as to what this bar looks like. The image is more detailed than the description: that’s how humans work, and without that RPGs probably wouldn’t be possible.
In game:
My barbarian is spoiling for a fight. He walks up to the big guy at a table with friends and rudely knocks off his hat.
The DM didn’t establish that there was a big guy at a table with friends. He didn’t establish that he was wearing a hat to knock off.
The alternative, which would break immersion for some posters, would be to interrupt the action declaration with questions like:
1. Ok, is there anyone in the bar?
2. What does he look like?
3. Um, does he have a hat or something I can knock off?
You aren't just declaring an action. You're declaring an action, populating the bar with specific patrons, and saying what those patrons are wearing. Again, that's a perfectly fine move in some games, but it's not standard by-the-book D&D.
What happens if you envision a crowded bar, the guy next to you envisions an empty bar with just one lone drunk passed out at a table, and the DM envisions a crime scene where, as your eyes adjust to the dim interior, you realize everyone is dead and the halfling behind the bar is holding bloody knives, and the DM just hadn't gotten that far into the description?
Instead, ask if the bar is crowded and what you're hoping to find. It's just a different play loop, and not one that has ever broken my sense of immersion. On the other hand, it would break my sense of immersion if another player suddenly took over narration and description of the world.