I mean, to be a little fair - ChatGPT is tuned to be a chatbot. It's not really supposed to be churning out literary text, it's supposed to be responding to queries as if it were a customer service representative on the other end whose job it was to answer whatever inane question you happen to be asking them. AI art tools are tuned to generate impressive-looking output - if there were an AI art tool that was trained to churn out emojis or memes and you asked it to generate something in the style of a van Gogh, you'd have the equivalent of the chatbot being asked to generate something literary. AI art can be beautiful and stunning. The text seems really trite.
That said - text is actually harder to work with in a learning context than images, and the models around images are older and better understood than the text ones. Also, an image is static while text flows through time - asking an AI art bot to generate consistent animation is a harder task than asking it to generate still images. That would be the equivalent of asking it to generate a lengthy text.
(There are other problems too - large language models by themselves are actually a bad fit for what people are trying to make them do. The fact that they work at all is impressive, but the upper limit of what they're going to be able to do is going to disappoint a lot of folks in the coming years, I suspect.)