Hic Rhodus, hic salta; if you think it’s so easy, then go spit out something great (no, really). People say the same thing about prose writing, photography, and modern art, because they can imagine doing the basic physical action; if people started out with perfect manual dexterity, they’d think traditional painting was easy too (it still wouldn’t be; there are many other skills involved).
With e.g. Stable Diffusion as a medium, taking it seriously involves promptcraft (which you already noted), selection from many drafts (which is time-intensive and mostly requires the ability to see what works in a composition), and of course knowing what you want to create in the first place. Knowing the tools themselves, art history, the tools of all other visual media (you can throw in “volumetric lighting” because that’s what many other people do, or you can learn what it means and develop an intuition about when it would be helpful), color theory, anatomy, and so on are all things that contribute to the skill ceiling. One very weird guy, FlyingFoxDemon, is behind almost all of the viral AI-assisted videos that have come out so far, even though he’s far from the only hobbyist with access to a high-end GPU.
Obviously you don’t mean this in the way that I do, but this is a good summary of my major claim here - that SD and the like are media in which humans work.
(“Fundamentally” I would actually say is false, since human brains are physical objects, but once machines can really do every part of the workflow, we have much bigger worries than anything to do with the art industry.)