4e is dead easy to re-skin, making it easy to adapt to far more concepts than one might expect from the fairly consistent class & power formats. It's also easy to 'tweak' in small ways for the reasons you state.
I'm pretty much with you so far.
It is just exhausting, though, to try to add to extensively. Creating a new class is a Herculean task.
Yes and no. If you set out to create an entire V-shaped PHB1-style class from total scratch, this might be true, but why ever would that be needed? Suspension bridges are hard to build, but that doesn't mean we don't build culverts over ditches; it's a whole other order of thing. With something like 40 classes in existence, about 20 of them full-fledged major classes with 400+ powers and 3-5 builds each, it seems like doing so would be rather unnecessary, wouldn't it? Even if you want a 'new class', it must surely be close enough to an existing class that you can borrow its powers virtually wholesale. At that point, is it really more work than adding a class to 2e?
Extensively re-working rules is a minefield, because they are already so neatly balanced.
In classic D&D, extensive re-working or additions were, in contrast, low-risk. One more broken magic item, class, race, or whatever wasn't going to break the game that much more. And, they were readily accepted by players. 5e very successfully aims for those same qualities. Really, the same should have been true in 3.x, and could have been, but for the rise of the Cult of RAW.
The entire illogic of this line of reasoning just bends my mind. First of all, what makes it 'higher risk'? The results cannot possibly be worse than most of what already exists in 'classic' D&D, regardless of which system you use. If your choice is to play and homebrew 2e, then sure, it's already broken, so what are you gaining over playing 4e with one somewhat broken piece? Surely it's still better.
There are two things to consider here. First, 4e certainly is already 'broken' to some degree. Charge-optimized characters are stupidly ridiculous, and Rain of Blows, Twin Strike, and every easily triggered minor action/reaction attack power are already pretty broken. Seriously optimized action denial is broken, and so is damage-type focus optimization (frostcheeze, etc). The difference between this and 2e or 3e is that it's quite obvious. Second, the system in 4e is QUITE ROBUST: despite all the things noted above, it still works quite well, and these 'broken' things don't break the whole system to bits.
Contrary to what you're saying, adding some bogus overpowered thing to 4e doesn't actually have that much impact. Suppose you added an item that tripled a character's damage output for one attack per encounter. That would be pretty much stupid broken in 4e; nobody could even begin to argue it wasn't. Yet it still wouldn't break the game. Oh, it would make killing one monster per encounter quite trivial, but the GM would have no trouble simply adding an extra monster! It's hard to even find something comparable to use as an example for 2e, where there are probably a dozen items and two dozen spells that, if they are in play, put any hope the GM has of making an encounter that challenges the party out the window, unless it consists of just 'Save or you die instantly'.
There's barely any 'danger' of anything in 4e. Just add whatever the hell you want to the game and don't worry about it; the whole thing will just keep rolling along. I'm sure you CAN screw it up, of course, but you have to go so utterly far beyond the bounds of what is conventional in 4e to do so that even most pretty bad idiot DMs won't go there. In 2e, all you have to do is give away any of a number of pre-existing items that already have a chance to show up randomly in low-level play if you go by RAW.