So one way to approach 4e balancing from 'first principles' is to do damage-budget balancing.
The PCs have a certain damage budget for a fight (which scales up in 'harder' fights as they spend action points and daily powers). It can be expressed as a damage curve, which captures the spike at the start as they unload encounter powers and so on.
Monsters, similarly, have a damage budget curve.
As it is not expected for PCs to go down, but it is expected for monsters to go down, you have to take into account monster attrition.
Imagine a very simple situation. Both PCs and monsters have a flat damage output curve: each "up" creature on either side deals the same damage every round. And there is no area damage.
The PCs have sufficient healing to keep themselves up, but the monsters don't.
The PCs have enough damage output to kill all 6 monsters in 6 rounds. That means they have the damage output to kill 1 monster per round, roughly.
The damage output of the monsters comes to 6+5+4+3+2+1 times the one-round damage output of a monster, or 21 times a single monster's one-round damage.
If the monsters arrived 1 at a time, they would end up doing 1+1+1+1+1+1 times the one-round damage output of a monster, or 6 times a single monster's one-round damage.
By attacking in sequence, 6 monsters became 3.5 times weaker.
As noted, this neglects huge swaths of detail -- no burst damage, no area damage, etc. But it does give a ball-park figure.
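Here is a minimal sketch of that arithmetic, assuming the party kills exactly one monster at the end of each round and every living monster deals one unit of damage per round. The function names are made up for illustration.

```python
def damage_all_at_once(n):
    """Total monster damage, in units of one monster's one-round output,
    when n monsters start together and one dies at the end of each round."""
    return sum(range(1, n + 1))  # n + (n-1) + ... + 1


def damage_one_at_a_time(n):
    """Total when each monster only shows up as the previous one drops."""
    return n  # one living monster per round, for n rounds


n = 6
together = damage_all_at_once(n)
sequential = damage_one_at_a_time(n)
print(together, sequential, together / sequential)  # 21 6 3.5
```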
Now imagine that wave of 6 monsters attacked in two groups of 3.
3+2+1+3+2+1 = 12, or a bit over half (12/21, about 57%) as much damage output as all 6 together.
Letting the waves overlap changes things. Have the second wave arrive just 2 rounds after the first:
3+2+4+3+2+1 = 15, up to about 71% of the power of every monster attacking at once.
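The wave scenarios can be checked with a small round-by-round simulation. This is only a sketch under the same toy assumptions as before (one kill per round, flat damage, no area effects); `wave_damage` and its parameters are hypothetical names, not anything from the rules.

```python
def wave_damage(waves, kills_per_round=1):
    """waves: list of (arrival_round, monster_count); rounds start at 1.
    Returns total monster damage in units of one monster's one-round output."""
    arrivals = {}
    for arrival_round, count in waves:
        if arrival_round < 1:
            raise ValueError("arrival rounds start at 1")
        arrivals[arrival_round] = arrivals.get(arrival_round, 0) + count

    still_to_arrive = sum(arrivals.values())
    alive = 0
    total = 0
    round_no = 1
    while alive > 0 or still_to_arrive > 0:
        arrived = arrivals.get(round_no, 0)
        alive += arrived
        still_to_arrive -= arrived
        total += alive                            # every living monster attacks
        alive = max(0, alive - kills_per_round)   # then the PCs drop some
        round_no += 1
    return total


print(wave_damage([(1, 6)]))          # all six at once: 21
print(wave_damage([(1, 3), (4, 3)]))  # second wave after the first is dead: 12
print(wave_damage([(1, 3), (3, 3)]))  # second wave two rounds in: 15
```

Changing `kills_per_round` or the arrival rounds gives a quick feel for how much the overlap matters.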
A naive solution is to estimate what fraction of each wave will be around when the next wave hits, and then do a simple sum of estimated damage output, then scale the XP. This, however, isn't quite right -- the reason is somewhat mathematical.
...
Now, this gets even more mathematical from here on out.
The formula for n monsters attacking compared to 1 is n(n+1)/2 -- a triangular sum.
But multiplying the size of an encounter by n generates n times as much XP -- while the encounter ends up more than n times as dangerous.
This is because the XP scale is non-linear in terms of damage-budget balancing. A doubling of XP generates an encounter that is more than twice as dangerous, in terms of damage dealt to the party.
This also means that when you halve the damage output of an encounter, you less than halve the "XP value" in terms of challenge of the encounter.
A curve that works reasonably well for the range of expected PC encounters is x^(3/2).
Thus:
Suppose you have two waves A and B, in which the first wave is expected to be defeated (or nearly completely defeated) before B arrives.
Then the rough XP value of the encounter is [ A^1.5 + B^1.5 ]^(2/3).
Let's take a level 2 followed by a level 3 encounter. They are worth 625 and 750 XP respectively. Naive addition gives you a 1375 XP encounter, or a level 6.5 encounter!
Put through the transformation, we instead get ~1100, or a level 5.5 encounter.
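As a rough sketch, the transformation is easy to compute. The exponent of 1.5 is the curve suggested above; `combined_wave_xp` is just an illustrative name.

```python
def combined_wave_xp(wave_xps, exponent=1.5):
    """Effective XP of waves fought back-to-back with little or no overlap:
    [A^e + B^e + ...]^(1/e), with e = 1.5 as the assumed danger curve."""
    return sum(xp ** exponent for xp in wave_xps) ** (1 / exponent)


naive = 625 + 750
adjusted = combined_wave_xp([625, 750])
print(naive, round(adjusted))  # 1375 vs. a bit under 1100
```

A single wave passes through unchanged, which is a handy sanity check on the formula.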
...
Another way is to exploit the XP curve.
If you have two level X encounters back-to-back, it is a level X+3 encounter.
Three level X encounters back-to-back is a level X+4 encounter.
Four level X encounters back-to-back is a level X+5 encounter.
Or, add up the XP total of the encounters in question, and then subtract 1 level for every "wave" beyond the first (assuming all waves are roughly equal).
This works out to the same thing, roughly, thanks to how the XP curve is shaped.
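To see how close the two approaches land, here is a quick comparison. It leans on one labeled assumption: that encounter XP roughly doubles every 4 levels. The function names are hypothetical.

```python
import math

LEVELS_PER_DOUBLING = 4  # assumption: encounter XP roughly doubles every 4 levels


def level_bump_formula(k, exponent=1.5):
    """Level increase for k equal waves using [k * A^e]^(1/e) = A * k^(1/e)."""
    xp_multiplier = k ** (1 / exponent)
    return LEVELS_PER_DOUBLING * math.log2(xp_multiplier)


def level_bump_rule_of_thumb(k):
    """Add up the XP naively, then subtract 1 level per wave beyond the first."""
    naive_bump = LEVELS_PER_DOUBLING * math.log2(k)
    return naive_bump - (k - 1)


for k in (2, 3, 4):
    print(k, round(level_bump_formula(k), 1), round(level_bump_rule_of_thumb(k), 1))
# 2 2.7 3.0
# 3 4.2 4.3
# 4 5.3 5.0
```

Under that assumption the two stay within about a third of a level of each other for 2 to 4 waves.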
...
Note that the above is almost entirely pure theorycraft.