Because. It. Is. Magic.
If my players wanted to play in a setting where everything operated under the assumption of codified rules, I would break out my Twilight 2000 box set.
Aha, so how is magic in D&D not completely codified, considering that it works without fail and can be reproduced at will with exactly the same result (that is, the mechanics are always the same)?
[*]The guard's realmspike is 6ft long, steel tip affixed to a sanded oak shaft with rivets, and a red tassel dangles from the neck.
[*]The orc's urgsticker is 6.5ft long, with a treantwood shaft surmounted by a pitted duskiron point collared with tiny spikes and vulture feathers.
[/LIST]
Mechanically, both are "spears/1d8/x3", though I don't tell the players this, of course, just as I wouldn't tell them if the orc's spear was poisoned or magical.
Exactly how is this different than the Merlin's Magic Missile versus Sauron's Sorcerous Slap of Force issue? By most reckonings, there's no difference save the presentation.
I sincerely don't understand the viewpoint that a fluff difference must reflect a mechanics difference, and am very much trying to.
Man, that's like saying all battle axes have to be single-edged Nordic-style affairs with a wooden haft and Celtic knotwork for decoration, because they all have the same speed factor and damage dice.
There is a bit more of a difference between glowing magic missiles shot out from a tuba and a knife which cuts at range (both magic missiles) than between a spear with a 6ft shaft and a spear with a 6.5ft shaft.
First, the spear is clearly recognizable as a spear no matter what variant you have; not so with the magic missiles.
Second, the spears can be used for the same things and have the same limitations (except 0.5 ft length). Some reflavored spells here are vastly different from their originals.
For example, the cutting-at-range knife is much stealthier than the original magic missile, and when it's the 4E MM, which can affect nonliving matter, you can do some quite nice things with it that casters who use normal MMs cannot.
Or take the sunflower seed spitter from a previous post. Now you need ammunition for your spells. And so on.
When you only stay in "tactical combat mode", all those differences won't really matter, I agree. But by glossing over such differences you imo rob the players of the chance to be creative with those spells outside combat, either by saying that it doesn't work, which hurts the flavour you wanted to create with the refluffing in the first place, or by telling your players that they should be nice and not try to be creative.
Or you "say yes" and have unintended consequences where the refluffed spell is more useful than the original one.
Also, what do you think is "better"? The storm cleric having exactly the same spells as the sun cleric, just with a different look, or the two clerics actually having different spells?
In the end, saying that refluffing doesn't change anything doesn't work. It will change something. And that's why you should not do it on the fly just for coolness. If you want different spells, give them different spells. That's even more "cool" and you do not have to worry about the refluff being better than the original.
It also helps the consistency of the world, as imo it's pretty illogical that dozens of spells look completely different from each other but behave in exactly the same way (and no, "It's magic" doesn't really cut it, as magic in D&D is very scientific and codified).