I put this code on Github:
This is my code to validate an XBRL file using an XSD and a catalog.xml
listing the local paths to all the XSD files:
package validate;
import java.io.IOException;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import lombok.extern.slf4j.Slf4j;
import org.apache.xerces.util.XMLCatalogResolver;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
@Slf4j
public class XsdValidatorTest {
    public static void main(String[] args) throws SAXException, IOException {
        // Compute path to the XBRL payload to be validated.
        final Path xbrlPath = Path.of(
            "/path/to/xbrl.xml"
        );
        final Source xmlStreamSource = new StreamSource(xbrlPath.toFile());
        // Create objects to do the validation.
        final SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        final Path xsdPath = Path.of(
            "/path/to/sprcnt.0001.conttrans.request.02.02.report.xsd"
        );
        final Schema schema = schemaFactory.newSchema(xsdPath.toFile());
        // Set up a CatalogResolver so that we use the local XSD files instead of looking them up on the internet.
        final String[] catalogs = { "xsd/catalog.xml" };
        final XMLCatalogResolver resolver = new XMLCatalogResolver(catalogs);
        schemaFactory.setResourceResolver(resolver);
        // Create the validator.
        javax.xml.validation.Validator validator = schema.newValidator();
        // Capture the XSD errors so that we can report them back.
        final XsdValidator.XsdErrorHandler errorHandler = new XsdValidator.XsdErrorHandler(xbrlPath);
        validator.setErrorHandler(errorHandler);
        // Validate and extract errors.
        validator.validate(xmlStreamSource);
        if (errorHandler.getErrors().isEmpty()) {
            log.info("'{}' is valid against '{}'.", xbrlPath, xsdPath);
        }
        for (SAXParseException error : errorHandler.getErrors()) {
            log.debug("Error: {}", error.toString());
        }
    }
}
The XBRL I am trying to validate is here: https://gist.github.com/robertmarkbram/d8c2b3ad7c0e23753e45ca996f561816 (removed because it's not important for my actual problem and the post is too long already).
The XSD file used is the same file referenced by link:schemaRef in the XBRL, just saved locally: http://sbr.gov.au/taxonomy/sbr_au_reports/sprstrm/sprcnt/sprcnt_0001/sprcnt.0001.conttrans.request.02.02.report.xsd
Finally, the catalog.xml file contains references to all the XSD files I found locally when validating my XBRL file using Arelle.
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
... Removed lines because the post was too long ...
<system systemId="http://sbr.gov.au/icls/py/pyin/pyin.02.17.data" uri="xsd/sbr.gov.au/taxonomy/sbr_au_taxonomy/icls/py/pyin/pyin.02.17.data.xsd" />
<system systemId="http://sbr.gov.au/rprt/sprstrm/sprcnt/sprcnt.0001.private.02.01.module" uri="xsd/sbr.gov.au/taxonomy/sbr_au_reports/sprstrm/sprcnt/sprcnt_0001/sprcnt.0001.private.02.01.module.xsd" />
<system systemId="http://sbr.gov.au/rprt/sprstrm/sprcnt/sprcnt.0001.private.02.02.module" uri="xsd/sbr.gov.au/taxonomy/sbr_au_reports/sprstrm/sprcnt/sprcnt_0001/sprcnt.0001.private.02.02.module.xsd" />
<system systemId="http://www.w3.org/1999/xlink" uri="xsd/www.xbrl.org/2003/xlink-2003-12-31.xsd" />
<system systemId="http://www.xbrl.org/2003/instance" uri="xsd/www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd" />
<system systemId="http://www.xbrl.org/2003/linkbase" uri="xsd/www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd" />
<system systemId="http://xbrl.org/2005/xbrldt" uri="xsd/www.xbrl.org/2005/xbrldt-2005.xsd" />
<system systemId="http://www.xbrl.org/2003/linkbase" uri="xsd/www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd" />
<system systemId="http://www.xbrl.org/2003/XLink" uri="xsd/www.xbrl.org/2003/xl-2003-12-31.xsd" />
<system systemId="http://www.w3.org/1999/xlink" uri="xsd/www.xbrl.org/2003/xlink-2003-12-31.xsd" />
</catalog>
This works fine when connected to the internet:
2025-04-14 22:00:53,043 [main] INFO validate.XsdValidatorTest [XsdValidatorTest.java:44] - main - '/path/to/xbrl.xml' is valid against '/path/to/sprcnt.0001.conttrans.request.02.02.report.xsd'.
Now turn the internet off and it fails, not knowing what xbrli:item is.
Exception in thread "main" org.xml.sax.SAXParseException; systemId: file:/path/to/sprcnt.0001.conttrans.request.02.02.report.xsd; lineNumber: 82; columnNumber: 171; src-resolve: Cannot resolve the name 'xbrli:item' to a(n) 'element declaration' component.
at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
... snip
at java.xml/javax.xml.validation.SchemaFactory.newSchema(SchemaFactory.java:628)
at validate.XsdValidatorTest.main(XsdValidatorTest.java:30)
FAILURE: Build failed with an exception.
Wednesday 16 April 2025, 12:40:23 pm
In response to @Ghislain Fourny's answer.
I updated catalog.xml to:
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
<rewriteURI rewritePrefix="file:/Users/rob.bram/home/dev/superstream/sscore/src/main/resources/xsd/sbr.gov.au/" uriStartString="http://sbr.gov.au/"/>
</catalog>
I verified that path is correct, but I am still getting exactly the same result.
Exception in thread "main" org.xml.sax.SAXParseException; systemId: file:/Users/rob.bram/home/dev/superstream/sscore/src/main/resources/xsd/sbr.gov.au/taxonomy/sbr_au_reports/sprstrm/sprcnt/sprcnt_0001/sprcnt.0001.conttrans.request.02.02.report.xsd; lineNumber: 82; columnNumber: 171; src-resolve: Cannot resolve the name 'xbrli:item' to a(n) 'element declaration' component.
How do I verify that the catalog is even being used? When I change rewritePrefix to a completely wrong path, the result is still the same. I added <logger name="javax.xml.catalog" level="DEBUG" /> to src/main/resources/logback-spring.xml and see no logging at all from it.
Thursday 17 April 2025, 02:43:11 pm
I modified code to log each XSD we try to resolve:
// Requires: import org.w3c.dom.ls.LSInput;
final XMLCatalogResolver resolver = new XMLCatalogResolver(catalogs) {
    @Override
    public LSInput resolveResource(
        final String type,
        final String namespaceURI,
        final String publicId,
        final String systemId,
        final String baseURI
    ) {
        final LSInput lsInput = super.resolveResource(type, namespaceURI, publicId, systemId, baseURI);
        final String resolvedSystemId = (lsInput != null) ? lsInput.getSystemId() : "null";
        log.info("Attempted to resolve '{}', resolved to: {}", systemId, resolvedSystemId);
        return lsInput;
    }
};
And modified catalog.xml in response to @Ghislain Fourny's most recent suggestion:
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
...
<uri name="http://www.xbrl.org/2003/instance" uri="xsd/www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd" />
</catalog>
I am still getting the same result (works when the internet is on, same error when it is off). I am also not seeing any output from my XMLCatalogResolver override, which leaves me unsure whether the resolver is being used at all.
Sunday 18 May 2025, 11:30:46 pm
I have marked @Ghislain Fourny's answer as correct, because it solved part of the problem: I needed to use catalog entries of the form <uri name="..." uri="..."/>. Thank you very much!
Thanks to a reply on another forum, I also learned that I needed to change the order in which I created the objects, so that the catalog resource resolver is added to the schema factory before I create the schema object.
// Requires: import java.io.File; and import org.w3c.dom.ls.LSInput;
// Create the schema factory.
final SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
// Tell the schema factory to use a catalog and resource resolver to use local XSD files.
final String[] catalogs = {"src/main/resources/xsd/catalog.xml"};
final XMLCatalogResolver resolver = new XMLCatalogResolver(catalogs) {
    @Override
    public LSInput resolveResource(
        final String type,
        final String namespaceURI,
        final String publicId,
        final String systemId,
        final String baseURI
    ) {
        final LSInput lsInput = super.resolveResource(type, namespaceURI, publicId, systemId, baseURI);
        final String resolvedSystemId = (lsInput != null) ? lsInput.getSystemId() : "null";
        System.out.println("Attempted to resolve '" + systemId + "', resolved to: " + resolvedSystemId);
        return lsInput;
    }
};
// The resolver must be registered before newSchema() is called.
schemaFactory.setResourceResolver(resolver);
// Create the validator using the XSD.
final String entryPointXsd = "src/main/resources/xsd/sbr.gov.au/taxonomy/sbr_au_reports/sprstrm/sprcnt/sprcnt_0001/sprcnt.0001.conttrans.request.02.02.report.xsd";
final Schema schema = schemaFactory.newSchema(new File(entryPointXsd));
javax.xml.validation.Validator validator = schema.newValidator();
// Add an error handler.
final XsdErrorHandler errorHandler = new XsdErrorHandler(xbrlPath);
validator.setErrorHandler(errorHandler);
// Validate and output errors.
final Source xmlStreamSource = new StreamSource(xbrlPath.toFile());
validator.validate(xmlStreamSource);
if (errorHandler.getErrors().isEmpty()) {
    System.out.println("'" + xbrlPath
        + "' is valid against '" + entryPointXsd
        + "'.");
}
for (SAXParseException error : errorHandler.getErrors()) {
    System.err.println("Error: " + error.toString());
}
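To illustrate why the order matters, here is a minimal, self-contained sketch that uses only the JDK's built-in schema validator rather than the Xerces XMLCatalogResolver from my real code. The namespace urn:example:types, the unreachable file: location, the two tiny in-memory schemas, and the class name ResolverOrderDemo are all made up for this illustration. The entry schema imports a second schema whose schemaLocation cannot be read, so compilation can only succeed if the resolver is already registered when newSchema() runs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;
import org.xml.sax.SAXException;

final class ResolverOrderDemo {

    // Hypothetical names, invented for this sketch only.
    static final String TYPES_NS = "urn:example:types";
    static final String UNREACHABLE = "file:///no-such-dir-for-demo/types.xsd";
    static final String XS = "http://www.w3.org/2001/XMLSchema";

    // The schema that the resolver serves from memory.
    static final String TYPES_XSD =
        "<xs:schema xmlns:xs='" + XS + "' targetNamespace='" + TYPES_NS + "'>"
        + "<xs:element name='item' type='xs:string'/></xs:schema>";

    // The entry schema: it imports TYPES_NS from a location that does not exist.
    static final String ENTRY_XSD =
        "<xs:schema xmlns:xs='" + XS + "' xmlns:t='" + TYPES_NS + "'>"
        + "<xs:import namespace='" + TYPES_NS + "' schemaLocation='" + UNREACHABLE + "'/>"
        + "<xs:element name='root'><xs:complexType><xs:sequence>"
        + "<xs:element ref='t:item'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Wraps a schema held in a String as an LSInput.
    static LSInput inMemory(String systemId, String xml) {
        try {
            DOMImplementationLS ls = (DOMImplementationLS)
                DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
            LSInput input = ls.createLSInput();
            input.setSystemId(systemId);
            input.setStringData(xml);
            return input;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Records every lookup in 'seen' and answers the import from memory.
    static LSResourceResolver resolver(List<String> seen) {
        return (type, namespaceURI, publicId, systemId, baseURI) -> {
            seen.add(String.valueOf(systemId));
            boolean wanted = TYPES_NS.equals(namespaceURI) || UNREACHABLE.equals(systemId);
            return wanted ? inMemory(systemId, TYPES_XSD) : null;
        };
    }

    // Returns true if the entry schema compiles with the given ordering.
    static boolean compiles(boolean resolverBeforeNewSchema, List<String> seen) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            if (resolverBeforeNewSchema) {
                factory.setResourceResolver(resolver(seen));
            }
            factory.newSchema(new StreamSource(new StringReader(ENTRY_XSD)));
            if (!resolverBeforeNewSchema) {
                // Too late: the import was already processed by newSchema().
                factory.setResourceResolver(resolver(seen));
            }
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        System.out.println("resolver first: compiles = " + compiles(true, seen) + ", lookups = " + seen);
        System.out.println("resolver last:  compiles = " + compiles(false, new ArrayList<>()));
    }
}
```

The recorded lookups play the same role as the logging override in my real code: if the list stays empty, the resolver was never consulted.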
However, this left me with a new set of problems: the catalog.xml entries were not sufficient and needed adjusting. I still have issues with the entry for xbrl-linkbase-2003-12-31.xsd, but I will open a new question for that because this one already has too much detail in it.
The catalog spec supports the rewriteURI element, which allows mapping the official URIs to local files, like so:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN" "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
<rewriteURI rewritePrefix="file:///path/to/xsd/sbr.gov.au/" uriStartString="http://sbr.gov.au/"/>
...
</catalog>
These rewrites are the official mechanism used by XBRL Taxonomy/Report Packages.
In this way, the official URIs (absolute or relative) can be used everywhere (such as in the link:schemaRef element), and the URI rewrites redirect the processor to read from local copies instead of downloading them. Note that the URIs redirected with this mechanism are the URLs of the schemas and linkbases (such as the URL present in the link:schemaRef element), not the namespaces themselves.
For example, when Arelle or any other XBRL processor finds this in the instance:
<link:schemaRef xlink:href="http://sbr.gov.au/taxonomy/sbr_au_reports/sprstrm/sprcnt/sprcnt_0001/sprcnt.0001.conttrans.request.02.02.report.xsd"
xlink:type="simple"/>
it will attempt to read the schema located at http://sbr.gov.au/taxonomy/sbr_au_reports/sprstrm/sprcnt/sprcnt_0001/sprcnt.0001.conttrans.request.02.02.report.xsd. If, however, there is a rewrite in place that maps http://sbr.gov.au/ to file:///path/to/xsd/sbr.gov.au/, then it will try to read the schema from file:///path/to/xsd/sbr.gov.au/taxonomy/sbr_au_reports/sprstrm/sprcnt/sprcnt_0001/sprcnt.0001.conttrans.request.02.02.report.xsd instead. I imagine this is the redirection that is missing from your current setup and that the rewriteURI element should take care of.
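The rewrite rule itself can be checked in isolation. The following is a minimal sketch that assumes the JDK's own javax.xml.catalog API (Java 9+) rather than Arelle or the Xerces XMLCatalogResolver; the class name RewriteUriProbe and the temporary catalog file are made up for the illustration, and file:///path/to/xsd/sbr.gov.au/ is the same placeholder as above. It only shows what the catalog maps a URL to, not whether a given processor actually consults the catalog.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.Catalog;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;

final class RewriteUriProbe {

    // Writes a one-rule catalog to a temporary file and loads it with the JDK catalog API.
    static Catalog catalogWithRewrite(String uriStartString, String rewritePrefix) {
        String xml =
            "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n"
            + "  <rewriteURI uriStartString=\"" + uriStartString
            + "\" rewritePrefix=\"" + rewritePrefix + "\"/>\n"
            + "</catalog>\n";
        try {
            Path file = Files.createTempFile("catalog", ".xml");
            file.toFile().deleteOnExit();
            Files.writeString(file, xml);
            return CatalogManager.catalog(CatalogFeatures.defaults(), file.toUri());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Catalog catalog = catalogWithRewrite("http://sbr.gov.au/", "file:///path/to/xsd/sbr.gov.au/");

        String official = "http://sbr.gov.au/taxonomy/sbr_au_reports/sprstrm/sprcnt/sprcnt_0001/"
            + "sprcnt.0001.conttrans.request.02.02.report.xsd";

        // A URL under the rewritten prefix maps to the local copy.
        System.out.println(catalog.matchURI(official));

        // A URL outside the prefix has no rule, so matchURI returns null.
        System.out.println(catalog.matchURI("http://www.xbrl.org/2003/instance"));
    }
}
```

If the first lookup prints a local file URI with the expected path, the rule is written correctly and any remaining problem lies in how the processor is wired to the catalog.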
XBRL has its own schema resolution mechanism (DTS discovery, where DTS stands for Discoverable Taxonomy Set) that starts from the link:schemaRef element and collects all files in the transitive closure of such links. Thus, if sprcnt.0001.conttrans.request.02.02.report.xsd contains further links to other schemas and linkbases that are in the http://sbr.gov.au/ domain, they will also all be read locally. It is highly likely that all schemas corresponding to the many namespaces used in the report are indeed in the transitive closure of sprcnt.0001.conttrans.request.02.02.report.xsd. It is thus not necessary to supply the correspondence between these namespaces and schema file URLs explicitly: the XBRL processor figures it all out via the DTS mechanism.
Finally, rewrites for the core xbrl.org or w3.org domains are usually superfluous because XBRL processors know these servers already and have local copies of the core schemas in their own caches. But if the DTS has files in domains other than these core domains and http://sbr.gov.au/, then rewrites must be added for them.
Edit:
I now understand that you are attempting to validate the XML file directly with Xerces, without using an XBRL processor. In this case, the DTS discovery mechanism (which would have used <rewriteURI>) will be completely ignored because link:schemaRef is specific to XBRL: Arelle knows about it, but Xerces does not.
It is thus important to supply Xerces with the list of schemas, as you suggest in your post. Taking the list from Arelle is sensible. However, I think that this list should be written with <uri> and not with <system>, as the latter is for DTD system IDs, not for XML Schema. So perhaps something like this helps:
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
<uri name="http://sbr.gov.au/icls/py/pyin/pyin.02.17.data" uri="xsd/sbr.gov.au/taxonomy/sbr_au_taxonomy/icls/py/pyin/pyin.02.17.data.xsd" />
<uri name="http://sbr.gov.au/rprt/sprstrm/sprcnt/sprcnt.0001.private.02.01.module" uri="xsd/sbr.gov.au/taxonomy/sbr_au_reports/sprstrm/sprcnt/sprcnt_0001/sprcnt.0001.private.02.01.module.xsd" />
<uri name="http://sbr.gov.au/rprt/sprstrm/sprcnt/sprcnt.0001.private.02.02.module" uri="xsd/sbr.gov.au/taxonomy/sbr_au_reports/sprstrm/sprcnt/sprcnt_0001/sprcnt.0001.private.02.02.module.xsd" />
<uri name="http://www.w3.org/1999/xlink" uri="xsd/www.xbrl.org/2003/xlink-2003-12-31.xsd" />
<uri name="http://www.xbrl.org/2003/instance" uri="xsd/www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd" />
<uri name="http://www.xbrl.org/2003/linkbase" uri="xsd/www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd" />
<uri name="http://xbrl.org/2005/xbrldt" uri="xsd/www.xbrl.org/2005/xbrldt-2005.xsd" />
<uri name="http://www.xbrl.org/2003/linkbase" uri="xsd/www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd" />
<uri name="http://www.xbrl.org/2003/XLink" uri="xsd/www.xbrl.org/2003/xl-2003-12-31.xsd" />
<uri name="http://www.w3.org/1999/xlink" uri="xsd/www.xbrl.org/2003/xlink-2003-12-31.xsd" />
...
</catalog>
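The difference between the two entry kinds can be seen with a small sketch. It assumes the JDK's javax.xml.catalog API (Java 9+), which is not the Xerces XMLCatalogResolver from the question, so it only demonstrates that URI lookups and system-identifier lookups consult different entries; the class name CatalogEntryKinds and the temporary catalog files are made up for the illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.Catalog;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;

final class CatalogEntryKinds {

    // Wraps the given entries in a <catalog> element, writes them to a temp file, and loads it.
    static Catalog load(String entriesXml) {
        String xml =
            "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n"
            + entriesXml + "\n</catalog>\n";
        try {
            Path file = Files.createTempFile("catalog", ".xml");
            file.toFile().deleteOnExit();
            Files.writeString(file, xml);
            return CatalogManager.catalog(CatalogFeatures.defaults(), file.toUri());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String namespace = "http://www.xbrl.org/2003/instance";
        String local = "xsd/www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd";

        Catalog withSystem = load("<system systemId=\"" + namespace + "\" uri=\"" + local + "\"/>");
        Catalog withUri = load("<uri name=\"" + namespace + "\" uri=\"" + local + "\"/>");

        // A <system> entry is invisible to a URI lookup...
        System.out.println("system entry, matchURI:    " + withSystem.matchURI(namespace));
        // ...but visible to a system-identifier lookup.
        System.out.println("system entry, matchSystem: " + withSystem.matchSystem(namespace));
        // A <uri> entry is what a URI lookup finds.
        System.out.println("uri entry,    matchURI:    " + withUri.matchURI(namespace));
    }
}
```

The relative uri attribute is resolved against the location of the catalog file, so the printed matches are absolute file URIs ending in the relative path.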
A small warning: validating an XBRL instance file against the relevant schemas guarantees that it is valid in the sense of XML Schema, but does not guarantee that it is valid in the sense of XBRL. XBRL specifications contain additional constraints that cannot be expressed in the schemas. XBRL validity guarantees XML-Schema-validity, but not the other way around. Only an XBRL processor can guarantee that an instance is valid in the sense of XBRL.
Edit:
Here is a standard alternative to provide the schemas recognized by most XML Schema validation processors. I am not sure if it would be suitable in a production environment as it requires modifying the file, but it would be a good intermediate sanity check that Xerces sees these files. Also, modifying the file in this way might break if processed by an XBRL processor, as this XML-Schema discovery mechanism might interfere with the XBRL-specific DTS discovery mechanism.
I formatted it with a (namespace, path) pair per line, but whether newlines or spaces separate them does not play any role as these all count as whitespace (xsi:schemaLocation is formally typed as a list of strings/URIs).
<xbrli:xbrl
xmlns:xbrli="http://www.xbrl.org/2003/instance"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
...
xsi:schemaLocation="
http://sbr.gov.au/icls/py/pyin/pyin.02.17.data xsd/sbr.gov.au/taxonomy/sbr_au_taxonomy/icls/py/pyin/pyin.02.17.data.xsd
http://sbr.gov.au/rprt/sprstrm/sprcnt/sprcnt.0001.private.02.01.module xsd/sbr.gov.au/taxonomy/sbr_au_reports/sprstrm/sprcnt/sprcnt_0001/sprcnt.0001.private.02.01.module.xsd
http://sbr.gov.au/rprt/sprstrm/sprcnt/sprcnt.0001.private.02.02.module xsd/sbr.gov.au/taxonomy/sbr_au_reports/sprstrm/sprcnt/sprcnt_0001/sprcnt.0001.private.02.02.module.xsd
http://www.w3.org/1999/xlink xsd/www.xbrl.org/2003/xlink-2003-12-31.xsd
http://www.xbrl.org/2003/instance xsd/www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd
http://www.xbrl.org/2003/linkbase xsd/www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd
http://xbrl.org/2005/xbrldt xsd/www.xbrl.org/2005/xbrldt-2005.xsd
http://www.xbrl.org/2003/linkbase xsd/www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd
http://www.xbrl.org/2003/XLink xsd/www.xbrl.org/2003/xl-2003-12-31.xsd
http://www.w3.org/1999/xlink xsd/www.xbrl.org/2003/xlink-2003-12-31.xsd
..."
>
...
</xbrli:xbrl>
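As a small illustration of the pair structure, here is a sketch that splits an xsi:schemaLocation value into (namespace, location) pairs on any run of whitespace. The class name SchemaLocationPairs is made up, and keeping the first location when a namespace is listed twice is a choice of this sketch, not a statement about what any particular validator does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SchemaLocationPairs {

    // Splits the attribute value into namespace -> location pairs, in document order.
    static Map<String, String> parse(String schemaLocation) {
        Map<String, String> pairs = new LinkedHashMap<>();
        String trimmed = schemaLocation.trim();
        if (trimmed.isEmpty()) {
            return pairs;
        }
        // Spaces, tabs and newlines are all equivalent separators.
        String[] tokens = trimmed.split("\\s+");
        if (tokens.length % 2 != 0) {
            throw new IllegalArgumentException("xsi:schemaLocation needs an even number of tokens");
        }
        for (int i = 0; i < tokens.length; i += 2) {
            // Sketch assumption: the first location listed for a namespace wins.
            pairs.putIfAbsent(tokens[i], tokens[i + 1]);
        }
        return pairs;
    }

    public static void main(String[] args) {
        String value = "\n"
            + "  http://www.xbrl.org/2003/instance xsd/www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd\n"
            + "  http://www.xbrl.org/2003/linkbase xsd/www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd\n";
        parse(value).forEach((ns, location) -> System.out.println(ns + " -> " + location));
    }
}
```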