I don't see how cyclical initiative doesn't speed up combat - all other factors remaining equal. Declaring actions, even generally, and then rolling and collecting initiatives (or even counting them down) adds steps each round after the first compared to cyclical initiative. That's cyclical initiative's primary strength compared to initiative every round.
And differences in scalability? I'm not sure what difference you're seeing between rolling initiative once and sticking with that order, and rolling every round, that would imply a difference in scalability. My experience suggests that rolling every round scales worse.
Are you assuming that the rolling each round isn't done per character but per side?
Keep declares general and loose, and they go fast. A key way to do this is to not allow gaming mechanics to creep into the declaration, even if you are using a grid. Even if the fighter can see that the door is five squares away, he doesn't say, "I run to this square." He says, "I run over by the door." I know this sounds nitpicky, but it is absolutely critical to making the declarations work. As a bonus, it translates equally well on or off the grid. It doesn't translate well to a tactical module, which is a big reason why you want to switch to cyclic initiative if using one.
With rolling, I'll do it one of several ways, but I'll generally roll every round. You can roll once for each side. You can roll once for each major group. You can roll individually for the characters but not for the monsters. Or you can do it the way I specified in the link, where the monster group(s) all get a 10 on their initiative roll, and then each player is rolling to see where they fit.
I like that, because it is ultra fast to resolve, but still lets each player roll individually. Players roll. Meanwhile, I'm double-checking the monster initiative groups. Say I have three groups, which is a rare, fairly complex fight. The initiatives are maybe 12, 15, and 17. Those are now DCs for the characters to beat. I ask, "Who beat 17?" We have a show of hands. And so on. Takes 15-20 seconds, and everyone knows where they stand. With a single group of monsters, it's as fast as everyone can roll a d20 and say whether they met the single DC or not. After the first round, everyone knows what the DC is.
Since the characters on a side are all going at the same time anyway, all we really need to know is whether character A goes before this group of monsters or after them.
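For anyone who would rather see the bookkeeping spelled out, here is a rough Python sketch of the slotting step: monster groups sit at fixed initiative values, and each player's roll only decides which side of those values they land on. The function name `slot_players`, the dictionary layout, and the character and monster names are all made up for illustration. It also assumes that meeting a DC counts the same as beating it, which is a guess at how ties would be handled.

```python
def slot_players(rolls, monster_dcs):
    """Return the round's action order as a list of (side, names) phases.

    rolls: dict of player name -> initiative total (hypothetical layout).
    monster_dcs: dict of monster group name -> fixed initiative, used as a DC.
    Assumption: a player who meets or beats a DC acts before that group.
    """
    order = []
    remaining = dict(rolls)
    # Walk the monster groups from highest initiative to lowest.
    by_dc = sorted(monster_dcs.items(), key=lambda kv: kv[1], reverse=True)
    for group, dc in by_dc:
        # Every unplaced player who met this DC acts together, ahead of the group.
        ahead = sorted(name for name, total in remaining.items() if total >= dc)
        if ahead:
            order.append(("players", ahead))
        for name in ahead:
            del remaining[name]
        order.append(("monsters", [group]))
    # Whoever is left acts after the last monster group.
    if remaining:
        order.append(("players", sorted(remaining)))
    return order


# Example table: three monster groups at 12, 15, and 17, four players.
example_rolls = {"fighter": 18, "rogue": 16, "cleric": 13, "wizard": 9}
example_dcs = {"goblins": 12, "ogre": 15, "shaman": 17}
example_order = slot_players(example_rolls, example_dcs)
```

With those example numbers the fighter goes first, then the shaman, the rogue, the ogre, the cleric, the goblins, and finally the wizard. Players who land in the same slot are listed together because they act at the same time, not in sequence.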
And of course this takes slightly longer than not rolling every round. However, this is where the savings come in. Everyone knows what they declared. Everyone knows when they go. All the players that are eligible to go can act ... now! That's why in my version, I didn't allow sequencing of such actions. If the fighter and rogue go together, they act on their declarations without seeing what the other guy is doing specifically.
What gets cut out is a lot of analysis paralysis based on what other people have done, as well as the lack of attention that happens while waiting for everyone else to go. This is particularly striking when I use a single group of monsters. You are either going now, or the monsters are whacking you, or you just went or are about to go. You also get some handling time improvements on the DM side, as you are looking at monster hit points and defenses when they are getting whacked by several people, then switching over to monster attacks when they are doing the whacking. This is basic efficiency training--arrange so that you pick up an object or look at a statistic as few times as you reasonably can.
The DM getting the results from the players does take a flexible touch. If several players go together, you may very well need to take them in some order. I just go around the table, if it is necessary. Usually, though, there are only 2-4 players going at once, and the natural speed differences in them resolving their attacks mean that the results come in staggered.
It scales because the difference between 4 players and 8 players is four extra declarations, plus the time it takes to get 4 results. It is not the time it takes for 4 additional players to learn that it is now their action, state what they are doing, roll, give the results, etc.
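The scaling argument can be put into a toy model. This is only a sketch: every timing below is a placeholder number picked for illustration, not a measurement from any real table, and the function names and parameters are hypothetical. The point is the shape of the two formulas. With turn-by-turn play each extra player adds a whole serial turn, while with declare-then-act each extra player adds only a declaration and a reported result, because the deciding and rolling happen in parallel.

```python
def cyclic_round_seconds(players, notice=5, decide=20, resolve=15, report=5):
    """Toy model: every player takes a full turn, one after another.

    All timings are invented placeholders, in seconds.
    """
    return players * (notice + decide + resolve + report)


def declared_round_seconds(players, declare=8, init_roll=20, resolve=15,
                           report=5, phases=2):
    """Toy model: declare up front, roll initiative, then act in shared phases.

    Declarations and reported results are serial (one per player).
    The initiative roll and each phase's resolution are done by everyone
    at once, so they do not grow with the number of players.
    All timings are invented placeholders, in seconds.
    """
    return players * declare + init_roll + phases * resolve + players * report


# Cost of going from 4 players to 8 under each model.
extra_cyclic = cyclic_round_seconds(8) - cyclic_round_seconds(4)
extra_declared = declared_round_seconds(8) - declared_round_seconds(4)
```

Whatever numbers you plug in, `extra_declared` is always four declarations plus four reported results, and `extra_cyclic` is always four complete turns. Whether that gap shows up at a real table depends on how well the parallel resolution actually works for your group.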
I don't pretend that someone trying this for the first time would see immediate and dramatic improvements. I had the advantage of already having played with various side-by-side initiatives in multiple systems. It does have its own minor skill set to learn. But I doubt anyone can give it a fair try with 5 or more players and not see a fairly significant improvement after a bit of practice.
