You’re not going to get good data. Your method as proposed in this thread will only show you how YOU interact with your system. That’s going to skew your results. You don’t know where your blind spots are; you don’t know what aspects of your system are harder to understand for people who didn’t design it.

"I am testing my ability outside of my home group to see if my system can still adapt as well as is done in my own personal circles.
Plus there are those who might have fun with the challenge itself."

Pre-publication, if you want to see if your system is as robust as you claim, you need the input of people other than you and your inner circle.
If you want to know if others will have fun using your system, you need to let others use some form of it.

A real-world illustration:
In my Criminal Law class, the professor was fresh off a gig helping legislators rewrite sections of the Texas Criminal Law Code. IOW, he knew this stuff cold.
One day in class, he was interacting with a student who was ESL, and she interpreted a statute differently than he did…and differently than most native speakers probably would.
But instead of simply correcting her, he stopped. He went quiet as he read and reread the statute as she interpreted the language. After about 5-10 minutes of silence, he addressed the class: her interpretation relied on a less common use of the vocabulary in a particular passage, but it was a completely valid one. As such, a good attorney could make an assertion in court using that reading and win. That, he told us, was a legal train wreck waiting to happen, so as soon as class was over, he had a ton of phone calls to make.
That young lady changed the state’s law with her alternative understanding of the language.