I'm building a new schema to validate XML against for my job, but I'm stuck on one question: can I create a complex element in which some child elements must appear in a set sequence and others may appear in any order, and if so, how? I expected to be able to wrap the two sets of elements in separate opening and closing "sequence" and "all" tags, but XSD doesn't seem to like that. Here's what I have:
<xsd:complexType name="Original">
<xsd:sequence>
<xsd:element maxOccurs="1" minOccurs="1" name="AssetIdentifier" type="xsd:string">
<xsd:annotation>
<xsd:documentation>Definition: The Asset Identifier element is intended to
reflect the root of all following digital filenames.</xsd:documentation>
</xsd:annotation>
</xsd:element>
<xsd:element maxOccurs="1" minOccurs="0" name="ArchiveID" type="xsd:string">
<xsd:annotation>
<xsd:documentation>Definition: The Filename element in this section is
intended to reflect the root of all the following derivative digital
filenames.</xsd:documentation>
</xsd:annotation>
</xsd:element>
<xsd:element maxOccurs="1" minOccurs="1" name="Title" type="xsd:string">
<xsd:annotation>
<xsd:documentation>Definition: The known title of the asset. If no title is
known, one can be assigned; a number or letter sequence, whichever is
the most logical. Using the value "unknown" is also
acceptable.</xsd:documentation>
</xsd:annotation>
</xsd:element>
<xsd:element maxOccurs="1" minOccurs="1" name="RecordDate" type="xsd:date">
<xsd:annotation>
<xsd:documentation>Definition: The actual recording date of the asset.
Estimates, partial dates, and date ranges (i.e. 19XX, Feb. 19-24,
1934-1935, etc.) are allowable, as is 'unknown'. Best practice, when
applicable, is to use the YYYY-MM-DD format in accordance with ISO 8601.
Even partial dates, i.e. 1990-05 should adhere to this
format.</xsd:documentation>
</xsd:annotation>
</xsd:element>
<xsd:element maxOccurs="1" minOccurs="1" name="FormatType" type="xsd:string">
<xsd:annotation>
<xsd:documentation>Definition: The format of the analog asset, i.e. Open
Reel, Grooved Disc, DAT, Cassette, VHS, 16mm film, EIAJ,
etc.</xsd:documentation>
<xsd:documentation>Best Practice: The MediaPreserve maintains a list of
controlled vocabularies organized by media type at: www.dontknowyet.com.
However, MP opted to make this an unrestricted element in the event
that other organizations have their own controlled vocabularies in
place.</xsd:documentation>
</xsd:annotation>
</xsd:element>
</xsd:sequence>
<xsd:all>
<xsd:element maxOccurs="1" minOccurs="0" name="StockBrand" type="xsd:string">
<xsd:annotation>
<xsd:documentation>If known definitively</xsd:documentation>
</xsd:annotation>
</xsd:element>
<xsd:element maxOccurs="1" minOccurs="0" name="TapeModel" type="xsd:string">
<xsd:annotation>
<xsd:documentation>If applicable. Usually applies to DAT tapes, open reels,
and wire recordings.</xsd:documentation>
</xsd:annotation>
</xsd:element>
<xsd:element maxOccurs="1" minOccurs="0" name="TapeWidth" type="xsd:string">
<xsd:annotation>
<xsd:documentation>Typically only applicable for open reel
audio</xsd:documentation>
</xsd:annotation>
</xsd:element>
</xsd:all>
</xsd:complexType>
Unfortunately, XSD doesn't allow what you're trying to do (combine <sequence/> and <all/> inside a single complex type or element). You might be able to achieve something similar with a nested content model, but note that <all> can't be nested inside <sequence> or <choice>: it has to be the top-level compositor of a complex type (or named group), so to use it you must define it in another element. You can, however, nest <sequence> and <choice> under each other.
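As a rough illustration of those nesting rules, here is a minimal XSD 1.0 sketch. The type names TapeDetails and Identified are hypothetical, made up for this example; only the element names come from the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Allowed: <all> as the only compositor directly under a complex type -->
  <xs:complexType name="TapeDetails">
    <xs:all>
      <xs:element name="StockBrand" type="xs:string" minOccurs="0"/>
      <xs:element name="TapeModel" type="xs:string" minOccurs="0"/>
    </xs:all>
  </xs:complexType>

  <!-- Allowed: <choice> nested inside <sequence> (the reverse works too) -->
  <xs:complexType name="Identified">
    <xs:sequence>
      <xs:element name="AssetIdentifier" type="xs:string"/>
      <xs:choice>
        <xs:element name="ArchiveID" type="xs:string"/>
        <xs:element name="Title" type="xs:string"/>
      </xs:choice>
    </xs:sequence>
  </xs:complexType>

  <!-- Not allowed: putting an <xs:all> inside the <xs:sequence> above,
       or adding it as a sibling after that sequence. -->
</xs:schema>
```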
From my understanding of XSDs, you have 3 viable options.
The first is to move the elements you want under <all /> into their own sub-element:
<xs:complexType name="Original">
<xs:sequence>
<!-- AssetIdentifier to FormatType left out for brevity -->
<xs:element name="Misc">
<xs:complexType>
<xs:all>
<xs:element maxOccurs="1" minOccurs="0" name="StockBrand" type="xs:string" />
<xs:element maxOccurs="1" minOccurs="0" name="TapeModel" type="xs:string" />
<xs:element maxOccurs="1" minOccurs="0" name="TapeWidth" type="xs:string" />
</xs:all>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<!-- For the above, valid XML would be: -->
<Original>
<AssetIdentifier>AssetIdentifier0</AssetIdentifier>
<Title>Title0</Title>
<RecordDate>2006-05-04</RecordDate>
<FormatType>FormatType0</FormatType>
<Misc>
<!-- Optional & order doesn't matter -->
<StockBrand>what</StockBrand>
<TapeWidth>1290</TapeWidth>
<TapeModel>Hey</TapeModel>
</Misc>
</Original>
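One detail worth noting about this first option: as written, <Misc> has the default minOccurs="1", so an empty <Misc/> would still be required even when none of its children are present. If the whole group should be omittable (an assumption about what you want, not something the question states), a small variation is to mark the wrapper itself optional. The sketch below spells out the elided elements in shortened form:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Original">
    <xs:sequence>
      <xs:element name="AssetIdentifier" type="xs:string"/>
      <xs:element name="ArchiveID" type="xs:string" minOccurs="0"/>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="RecordDate" type="xs:date"/>
      <xs:element name="FormatType" type="xs:string"/>
      <!-- minOccurs="0" lets the whole wrapper be left out -->
      <xs:element name="Misc" minOccurs="0">
        <xs:complexType>
          <xs:all>
            <xs:element name="StockBrand" type="xs:string" minOccurs="0"/>
            <xs:element name="TapeModel" type="xs:string" minOccurs="0"/>
            <xs:element name="TapeWidth" type="xs:string" minOccurs="0"/>
          </xs:all>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```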
The second is to nest those elements under another <sequence />, which lets you forgo the extra sub-element but requires the elements to appear in the order specified in the schema. Note that the nested sequence itself can be optional.
<xs:complexType name="Original">
<xs:sequence>
<!-- AssetIdentifier to FormatType left out for brevity -->
<xs:sequence minOccurs="0">
<xs:element maxOccurs="1" minOccurs="0" name="StockBrand" type="xs:string" />
<xs:element maxOccurs="1" minOccurs="0" name="TapeModel" type="xs:string" />
<xs:element maxOccurs="1" minOccurs="0" name="TapeWidth" type="xs:string" />
</xs:sequence>
</xs:sequence>
</xs:complexType>
<!-- For the above, valid XML would be: -->
<Original>
<AssetIdentifier>AssetIdentifier0</AssetIdentifier>
<Title>Title0</Title>
<RecordDate>2006-05-04</RecordDate>
<FormatType>FormatType0</FormatType>
<!-- Optional below, but must be ordered -->
<StockBrand>what</StockBrand>
<TapeModel>Hey</TapeModel>
<TapeWidth>1290</TapeWidth>
</Original>
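To make the ordering constraint concrete: because each element in the nested sequence is optional, any subset should be accepted as long as the relative order is kept. Assuming a root element declared with the type above, an instance along these lines would be expected to validate:

```xml
<Original>
  <AssetIdentifier>AssetIdentifier0</AssetIdentifier>
  <Title>Title0</Title>
  <RecordDate>2006-05-04</RecordDate>
  <FormatType>FormatType0</FormatType>
  <!-- StockBrand omitted; the remaining two keep the schema order -->
  <TapeModel>Hey</TapeModel>
  <TapeWidth>1290</TapeWidth>
</Original>
```

Putting TapeWidth ahead of TapeModel in the same document should be rejected by a validator, since the sequence fixes their order.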
There's a third option that is a bit of a 'hack', but it lets the elements be unordered and optional while still appearing directly alongside the mandatory, in-order elements. It nests a choice (with minOccurs="0" and maxOccurs="3") under a sequence inside the parent sequence (sequence > sequence > choice):
<xs:complexType name="Original">
<xs:sequence>
<!-- AssetIdentifier to FormatType left out for brevity -->
<xs:sequence>
<xs:choice maxOccurs="3" minOccurs="0">
<xs:element name="StockBrand" type="xs:string"/>
<xs:element name="TapeModel" type="xs:string"/>
<xs:element name="TapeWidth" type="xs:string"/>
</xs:choice>
</xs:sequence>
</xs:sequence>
</xs:complexType>
<!-- For the above, valid XML would be: -->
<Original>
<AssetIdentifier>AssetIdentifier0</AssetIdentifier>
<Title>Title0</Title>
<RecordDate>2006-05-04</RecordDate>
<FormatType>FormatType0</FormatType>
<!-- Optional, unordered, but there's a catch: -->
<TapeWidth>1290</TapeWidth>
<StockBrand>what</StockBrand>
<TapeModel>Hey</TapeModel>
</Original>
There's a catch with this third option, however: the maxOccurs="3" on the <choice /> element applies to the choice as a whole, which renders the minOccurs and maxOccurs on the child elements (StockBrand, TapeModel and TapeWidth) meaningless. Those elements remain optional, but each can now appear more than once, so long as the total number of elements is 3 or fewer:
This becomes valid (2 of the same element + 1 more):
<TapeWidth>1290</TapeWidth>
<TapeWidth>1291</TapeWidth>
<TapeModel>Hey</TapeModel>
And this is still valid (3 of the same):
<TapeWidth>1290</TapeWidth>
<TapeWidth>1291</TapeWidth>
<TapeWidth>1292</TapeWidth>
And also this (just 1 occurrence of 1 element):
<StockBrand>1290</StockBrand>
You could probably find another option by fiddling with the combination of sequence and choice nesting, but it's best practice to keep your schemas simple. Personally, I would recommend either of the first two options over the third for that reason.
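For completeness, here is a sketch of what that kind of fiddling could look like: a choice whose branches spell out every ordering, so each element is optional, unordered, and limited to one occurrence. This is not part of the three options above, just one possible arrangement, and I haven't run it against every validator. Each branch starts with a different element name, which should keep the content model deterministic.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Original">
    <xs:sequence>
      <xs:element name="AssetIdentifier" type="xs:string"/>
      <xs:element name="ArchiveID" type="xs:string" minOccurs="0"/>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="RecordDate" type="xs:date"/>
      <xs:element name="FormatType" type="xs:string"/>
      <!-- Each branch starts with a different element, then allows the
           other two in either order, each at most once -->
      <xs:choice minOccurs="0">
        <xs:sequence>
          <xs:element name="StockBrand" type="xs:string"/>
          <xs:choice minOccurs="0">
            <xs:sequence>
              <xs:element name="TapeModel" type="xs:string"/>
              <xs:element name="TapeWidth" type="xs:string" minOccurs="0"/>
            </xs:sequence>
            <xs:sequence>
              <xs:element name="TapeWidth" type="xs:string"/>
              <xs:element name="TapeModel" type="xs:string" minOccurs="0"/>
            </xs:sequence>
          </xs:choice>
        </xs:sequence>
        <xs:sequence>
          <xs:element name="TapeModel" type="xs:string"/>
          <xs:choice minOccurs="0">
            <xs:sequence>
              <xs:element name="StockBrand" type="xs:string"/>
              <xs:element name="TapeWidth" type="xs:string" minOccurs="0"/>
            </xs:sequence>
            <xs:sequence>
              <xs:element name="TapeWidth" type="xs:string"/>
              <xs:element name="StockBrand" type="xs:string" minOccurs="0"/>
            </xs:sequence>
          </xs:choice>
        </xs:sequence>
        <xs:sequence>
          <xs:element name="TapeWidth" type="xs:string"/>
          <xs:choice minOccurs="0">
            <xs:sequence>
              <xs:element name="StockBrand" type="xs:string"/>
              <xs:element name="TapeModel" type="xs:string" minOccurs="0"/>
            </xs:sequence>
            <xs:sequence>
              <xs:element name="TapeModel" type="xs:string"/>
              <xs:element name="StockBrand" type="xs:string" minOccurs="0"/>
            </xs:sequence>
          </xs:choice>
        </xs:sequence>
      </xs:choice>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

With only three elements this already takes about forty lines, and the number of branches grows factorially with each added element, which is a fair illustration of why the simpler options are usually the better trade.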