XML: attributes vs sub-elements?
CaptainPinko
01-11-2005, 02:53 AM
When creating an XML format (i.e. an .XSD), how do you decide whether to describe an element with sub-elements (i.e. nested elements) or with attributes?
The specific example I'm thinking of now is a file format for a board game map editor. Every territory will have a unique identifier as a name, a type (land, sea, mountain, canal, etc.), a value, and a list of territories it is adjacent to.
Right now I'm thinking that the name, type and value should be properties (attributes), with the list of adjacent territories in a sub-element. However, I'm worried this will lead to unwieldy, large start tags even with reasonable names like "Western Equatorial Rain Forest". On the other hand, having everything as sub-elements seems a little inelegant and heavy-handed.
I'm using the W3 Schools XSD Tutorial (http://www.w3schools.com/schema/default.asp) but it doesn't really give any good advice on this.
Is there any cute little rule for deciding how to organise this? I'm thinking of something like OOP, where you describe your problem and then, wherever you find a noun, you make an object; an adjective/description becomes a property; and a verb becomes a method.
CaptainPinko
01-11-2005, 04:07 AM
Here is the current schema I'm considering. I was wondering if anyone had any constructive criticism or advice.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- ************************************* -->
  <xs:simpleType name="territoryID">
    <xs:restriction base="xs:string">
      <xs:minLength value="1" />
      <xs:maxLength value="6" />
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType name="borders">
    <xs:sequence>
      <xs:element name="neighbour" type="territoryID" minOccurs="1" maxOccurs="unbounded" />
    </xs:sequence>
  </xs:complexType>
  <xs:simpleType name="terrainType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="land"/>
      <xs:enumeration value="sea"/>
    </xs:restriction>
  </xs:simpleType>
  <!-- ************************************* -->
  <xs:complexType name="territory">
    <xs:sequence>
      <xs:element name="borders" type="borders" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="id" type="territoryID" use="required" />
    <xs:attribute name="name" type="xs:string" use="optional" />
    <xs:attribute name="terrain" type="terrainType" use="required"/>
    <xs:attribute name="value" type="xs:integer" use="optional" default="0"/>
  </xs:complexType>
  <xs:element name="map">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="territory" type="territory" minOccurs="2" maxOccurs="unbounded" />
      </xs:sequence>
      <xs:attribute name="name" type="xs:string" use="optional"/>
      <xs:attribute name="version" type="xs:int" use="optional" default="0"/>
      <xs:attribute name="authour" type="xs:string" use="optional"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
Also, I was wondering if it is bad style to give a type the same name as the only element that is of that type (e.g. "borders" and "territory")?
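For reference, here is a minimal sketch of an instance document that the schema above should accept. The ids, names and values are invented purely for illustration. It shows how the attribute-heavy design reads in practice: scalar properties sit in the start tag, and the repeating neighbour list sits in a sub-element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sample data: ids, names and values are made up. -->
<map name="Example map" version="1" authour="someone">
  <!-- id must be 1 to 6 characters; terrain must be "land" or "sea". -->
  <territory id="WERF" name="Western Equatorial Rain Forest" terrain="land" value="3">
    <borders>
      <neighbour>GULF</neighbour>
    </borders>
  </territory>
  <!-- name and value are optional; value defaults to 0 when omitted. -->
  <territory id="GULF" terrain="sea">
    <borders>
      <neighbour>WERF</neighbour>
    </borders>
  </territory>
</map>
```

Keeping the long display name in an optional attribute and using a short id for cross-references is one way to stop the neighbour lists from growing unwieldy.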
blingbling!!
01-11-2005, 05:29 AM
Your schema looks fine to me. I tend to write mine in the same style as yours, with named complex and simple types at the root level which are then referred to by name under the 'document element'. As far as the 'little trick' goes, I'm not sure there is a preferred way to do these things; it depends on what you want. If all you need is to verify the validity of instance documents then I'd stick with whatever works. In some cases the structure makes a difference, say if you're using XMLBeans (http://xmlbeans.apache.org).
BTW - the W3Schools site is OK, but you should check the W3C site for a meatier tutorial http://www.w3.org/TR/xmlschema-0/ - there is a bit of 'style' stuff mentioned in there I think.
hth
--Robin
bwkaz
01-11-2005, 09:18 PM
An attribute with a required value has essentially the same semantics as a required sub-element. For example, the following two are basically equivalent:
<region name="region1" type="type1" value="10000" />
and
<region>
  <name>region1</name>
  <type>type1</type>
  <value>10000</value>
</region>
The only difference would be if you want to allow more than one type or name (or any other attribute), since an attribute can appear only once per element while a sub-element can repeat. That doesn't make sense here.
If you want to select elements by their "name", then it might make sense to make "name" an ID attribute (like the id attribute in XHTML). That may not be possible with XSD, though, and it may only make sense if you're using a DOM where you have a getElementById function.
As for which is the "best", it's usually personal preference. But if you're going to be doing any kind of XPath stuff, it might be easier to make everything elements (at least, if you know the amount of XPath that I do -- which is "not hardly enough to do stuff with attributes" ;)).
CaptainPinko
01-12-2005, 01:37 AM
From what I have been looking at, it seems like XSD only ensures that the file is syntactically correct, not that it is coherent in any way (just like 5/0 is syntactically correct).
It looks like, to ensure that IDs are unique and that all <neighbour> elements refer to actual territories in the map, I'll have to resort to a script with SAX or something.
Oh, and for anyone interested, I highly recommend the free (as in beer) ALTOVA XML Spy 2005 Home Edition (http://www.altova.com/products_ide.html).*
* I'm not affiliated in any way.
bwkaz
01-12-2005, 08:53 PM
Originally posted by CaptainPinko
from what I have been looking at it seems like XSD only provides that the file is syntactically correct.... not that it is coherent in any way (just like 5/0 is syntactically correct).
That's the way I understand it. DTDs operate the same way -- all that they ensure is that your SGML or XML file conforms to the document type (or schema) that they define. They don't ensure that it makes sense.
However, the HTML DTD defines the id attribute as type ID. This means that in HTML, if two different elements in one document have identical values for the id attribute, that document does not validate against the HTML DTD. If XSD has a way to define the same thing, then it should be the case that duplicate name values (or whatever attribute is your unique identifier) make the document invalid.
As far as element versus attribute, it's usually a matter of preference. I like elements wherever possible (for the above reason: I don't know as much XPath as I'd like to).
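On the ID question raised above: XSD does have built-in xs:ID and xs:IDREF types that behave like their DTD counterparts. The fragment below is a trimmed-down sketch of the map schema from earlier in the thread, not a drop-in replacement. It assumes territory ids are valid XML names (no spaces, no leading digit), which the original string-based territoryID type did not require.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Assumption: ids are XML names, so xs:ID can replace xs:string. -->
  <xs:simpleType name="territoryID">
    <xs:restriction base="xs:ID">
      <xs:maxLength value="6"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType name="borders">
    <xs:sequence>
      <!-- Each neighbour must match the ID of some element in the document. -->
      <xs:element name="neighbour" type="xs:IDREF" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="territory">
    <xs:sequence>
      <xs:element name="borders" type="borders"/>
    </xs:sequence>
    <!-- Two territories with the same id make the document invalid. -->
    <xs:attribute name="id" type="territoryID" use="required"/>
  </xs:complexType>
  <xs:element name="map">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="territory" type="territory" minOccurs="2" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

One limitation of this approach is that ID values share a single document-wide pool, so an IDREF cannot be restricted to territories only if other elements also carry IDs.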
blingbling!!
01-13-2005, 06:57 AM
There is a way to specify Keys and Keyrefs in XSD, using the key and keyref elements. http://www.w3.org/TR/xmlschema-0/#specifyingUniqueness
I found some great articles last night at XML.com - here's some links in case you've not seen them:
This article (http://www.xml.com/pub/a/2002/11/13/normalizing.html) is about xsd design (not a simple tutorial), and towards the bottom of the page there are several links to similar articles. NOTE: There is a typo in one of the examples in the link above - see if you can spot it!
hth
--Robin
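As a rough sketch of how key and keyref could apply to the map schema from earlier in the thread (trimmed to the relevant parts): the constraint names territoryKey and neighbourRef are made up for this example, and the XPath expressions assume the schema has no target namespace, as in the original.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="territoryID">
    <xs:restriction base="xs:string">
      <xs:minLength value="1"/>
      <xs:maxLength value="6"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType name="borders">
    <xs:sequence>
      <xs:element name="neighbour" type="territoryID" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="territory">
    <xs:sequence>
      <xs:element name="borders" type="borders"/>
    </xs:sequence>
    <xs:attribute name="id" type="territoryID" use="required"/>
  </xs:complexType>
  <xs:element name="map">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="territory" type="territory" minOccurs="2" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
    <!-- Every territory needs an id, and no two ids may be equal. -->
    <xs:key name="territoryKey">
      <xs:selector xpath="territory"/>
      <xs:field xpath="@id"/>
    </xs:key>
    <!-- Every neighbour value must equal the id of some territory. -->
    <xs:keyref name="neighbourRef" refer="territoryKey">
      <xs:selector xpath="territory/borders/neighbour"/>
      <xs:field xpath="."/>
    </xs:keyref>
  </xs:element>
</xs:schema>
```

Unlike xs:ID, a key is scoped to the element that declares it and keeps the original string type, so ids with spaces or leading digits would still be allowed.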
CaptainPinko
01-13-2005, 05:04 PM
Sweet man, thanks! I'll check that out as soon as possible.