I have to agree here... It's great that someone is putting in the work to get all of this codified, but the XML could be much less verbose. The GUIDs are ugly and really unnecessary, as it would be fairly trivial to define a unique, readable source attribute to allow redefinition with new rules, or to use a namespace as above.
d) is definitely important- it's rather the point of having an ID, no?
With e) I would eliminate even more of the full text, as it could easily be generated from the attributes or even declared once as an XML entity.
I don't buy the code generation argument. There is nothing in XMLBeans that requires complex UIDs. A simpler schema will allow people to find errors, update rules, and contribute in other ways.
As far as validation goes, that's the point of the schema, isn't it? It is easy enough to find errors if an XML document is non-conformant...
Just my 2c. Good initiative with this project.
Datasets based on this schema are not intended to be handwritten. I handwrite them because I created the schema, understand what is intended with it, and have a wonderful tool to assist me in handwriting. The schema is intended to be used and written by programs.
The point of the GUIDs is to make sure anyone who adds content does not inadvertently use a key already being used by someone else in a completely separate and unrelated dataset. When yet another person decides to use both datasets for whatever reason, that person can be safe in knowing that they won't have duplicate keys. (Yes, I know, a GUID is not truly unique, but it's awfully darn close and a lot of thought went into making them that way, hence the standard.)
Naturally, XMLBeans has nothing to do with GUIDs. It does, however, have a lot to do with making objects, as I'm sure you are aware.
I specifically made some things verbose because they have different properties than other related things, as well as different uses. I wanted to be able to define the specifics of something in one area, and then load a much simpler reference in a different area. In portions that have similar data, for example a power's keywords, I wanted to make sure that the keywords are segregated into grouped categories. I don't want to mix damage types with effect types or accessory keywords. Some powers also offer a choice of keywords for the user to select and apply, such as dragon's breath. I wanted to differentiate these selectable keywords from regular keywords.
Additionally, I wanted the ability to have multiple choice groups, such that I can make choice (1) from a first group of elements and choice (2) from a second group of elements. To represent things like this, I have to make things a bit more complex than what some others above have suggested.
Back to the GUIDs: my intention with SQL is for there to be many tables, each with an entry containing the GUID. The GUID is an indexable unique key, but it shouldn't be the Primary Key, as that should be some sort of integer. Rows in other tables would use that integer as a foreign key to link themselves together and provide consistency in the DB. That begs the question, "How do I make sure new data I'm creating or importing in my local database does not unintentionally overwrite data in someone else's local database? Surely they could just as well have a different integer for the same item that I have." My solution: GUID-stamped entries. An import program would look at an item's ID attribute, then check the appropriate table via the index to see if that item already exists. If it does, the program can compare the source datasets and the versions to see if it is importing new data. It can then add that item to its tables, generating an integer Primary Key for reference in other tables. As we all know, integers are often the best SQL datatype for primary and foreign keys. As I've said above, the GUID is also used when you create your own data and want to share it with others. Your integer primary key won't be very useful to them, since it may already be in use for a different item in their database. Thus, you supply your own GUID for the item you wish to share from your tables. (Any program, when adding new data, should be creating GUIDs to insert into the DB as well.)
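To make the import flow above concrete, here is a rough sketch in Python using the standard sqlite3 module. The table layout, the column names, and the `open_db` and `import_item` helpers are all made up for illustration and are not part of the schema. The "only overwrite when the incoming version is newer" rule is likewise just one assumed way to do the version comparison.

```python
import sqlite3
import uuid


def open_db():
    """Create an in-memory database with a hypothetical 'item' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE item (
            id      INTEGER PRIMARY KEY,   -- local integer key, used for foreign keys
            guid    TEXT NOT NULL UNIQUE,  -- shared identifier; UNIQUE also indexes it
            source  TEXT NOT NULL,         -- dataset the entry came from
            version INTEGER NOT NULL,
            name    TEXT NOT NULL
        )
        """
    )
    return conn


def import_item(conn, guid, source, version, name):
    """Insert or update an item by GUID and return its local integer key."""
    row = conn.execute(
        "SELECT id, version FROM item WHERE guid = ?", (guid,)
    ).fetchone()
    if row is None:
        # Unknown GUID: new item, so the database assigns a fresh integer key.
        cur = conn.execute(
            "INSERT INTO item (guid, source, version, name) VALUES (?, ?, ?, ?)",
            (guid, source, version, name),
        )
        return cur.lastrowid
    local_id, known_version = row
    if version > known_version:
        # Same item, newer data: update in place and keep the integer key.
        conn.execute(
            "UPDATE item SET source = ?, version = ?, name = ? WHERE id = ?",
            (source, version, name, local_id),
        )
    return local_id


if __name__ == "__main__":
    conn = open_db()
    guid = str(uuid.uuid4())  # a program adding new data generates its own GUID
    first = import_item(conn, guid, "core", 1, "Dragon Breath")
    again = import_item(conn, guid, "core", 2, "Dragon Breath (revised)")
    print(first == again)  # the same GUID maps to the same local key
```

Other tables would reference `item.id`, while anything exported for sharing would carry `guid`.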
For those wondering just what in the nine hells a GUID (or UUID) is:
Globally Unique Identifier (Universally Unique Identifier)
It can be represented several ways. I use ones that are based on time and node (version 1 UUIDs), written in hex in an 8-4-4-4-12 format. This is the UUID Generator I use most often when I am inputting data by hand. There is also version 4, which uses random numbers instead of time and node. There are several libraries for various programming languages that do the same.
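As one example of such a library, Python's standard uuid module can generate both kinds. This is only a quick sketch; the `group_lengths` helper is something I made up to show the 8-4-4-4-12 layout.

```python
import uuid

# Version 1: built from a timestamp and a node value (typically the MAC address).
time_based = uuid.uuid1()

# Version 4: built from random numbers instead of time and node.
random_based = uuid.uuid4()


def group_lengths(u):
    """Return the length of each hyphen-separated hex group of a UUID."""
    return [len(part) for part in str(u).split("-")]


if __name__ == "__main__":
    print(time_based, "version", time_based.version)
    print(random_based, "version", random_based.version)
    print(group_lengths(time_based))  # the 8-4-4-4-12 layout
```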
While UUIDs are not guaranteed to be unique, the keyspace is so large (2^128) that the chance of overlap is widely considered sufficiently small. When they are used for a specific purpose, such as differentiating similar data, there is a presumption of uniqueness.
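For a rough feel of "sufficiently small", the standard birthday approximation gives the chance that at least two of n randomly generated IDs collide. This is only a back-of-the-envelope sketch: the `collision_probability` function is my own helper, and it assumes version 4 UUIDs, which have 122 random bits (6 of the 128 are fixed for version and variant), all drawn uniformly at random.

```python
import math


def collision_probability(n, random_bits=122):
    """Approximate chance that at least two of n random IDs are identical.

    Uses the birthday approximation p ~ 1 - exp(-n(n-1) / (2N)),
    where N = 2**random_bits is the number of possible values.
    """
    space = 2 ** random_bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n * (n - 1) / (2 * space))


if __name__ == "__main__":
    for count in (10**6, 10**9, 10**12):
        print(count, collision_probability(count))
```

Version 1 UUIDs avoid collisions differently, through timestamp and node, so this estimate does not apply to them.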
And knowing is half the battle.
(the other half involves lots of guns)
-J