
Can Xerces support XMLCatalogResolver and <xs:include/> at the same time?


Xerces claims to allow XML Catalog support to be added to a reader like this:

    XMLCatalogResolver resolver = new XMLCatalogResolver();
    resolver.setPreferPublic(true);
    // catalogs is a String[] of catalog file URIs
    resolver.setCatalogList(catalogs);

    XMLReader reader = XMLReaderFactory.createXMLReader(
        "org.apache.xerces.parsers.SAXParser");
    reader.setProperty("http://apache.org/xml/properties/internal/entity-resolver",
        resolver);

But as soon as I do this, any <xs:include/> elements in my schemas are no longer processed. It seems that once the XMLCatalogResolver is registered it becomes the only place entity resolution happens, so includes can no longer be resolved. Eclipse, on the other hand, validates successfully using the same catalog, so it should be possible.

Is there a way around this, or are there any other Java-based validators that support catalogs?

Thanks, Dominic.


Solution

  • I finally solved this by subclassing XMLCatalogResolver and logging the calls made to its resolveEntity() method. I observed three kinds of call, only one of which made sense to resolve through the XML catalog. For the other two, I simply read the file directly and returned it as the input source.

    Here is the code I used inside my custom XMLCatalogResolver class:

    public XMLInputSource resolveEntity(XMLResourceIdentifier resourceIdentifier)
        throws IOException
    {
        if(resourceIdentifier.getExpandedSystemId() != null)
        {
            // The location is already fully expanded: read that file directly.
            // A byte stream is used instead of a FileReader, which would decode
            // with the platform default charset despite the "UTF-8" label.
            return new XMLInputSource(resourceIdentifier.getPublicId(),
                resourceIdentifier.getLiteralSystemId(),
                resourceIdentifier.getBaseSystemId(),
                new FileInputStream(getFile(resourceIdentifier.getExpandedSystemId())),
                "UTF-8");
        }
        else if((resourceIdentifier.getBaseSystemId() != null) &&
            (resourceIdentifier.getNamespace() == null))
        {
            // No expanded location and no namespace: read the base document.
            return new XMLInputSource(resourceIdentifier.getPublicId(),
                resourceIdentifier.getLiteralSystemId(),
                resourceIdentifier.getBaseSystemId(),
                new FileInputStream(getFile(resourceIdentifier.getBaseSystemId())),
                "UTF-8");
        }
        else
        {
            // Everything else goes through the XML catalog.
            return super.resolveEntity(resourceIdentifier);
        }
    }
    
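    private File getFile(String urlString) throws IOException
    {
        try
        {
            return new File(new URL(urlString).toURI());
        }
        catch (URISyntaxException e)
        {
            // toURI() throws a checked exception; rethrow it as an IOException
            throw new IOException("Not a valid file URL: " + urlString, e);
        }
    }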
    

    I'm not sure why this wouldn't be done by default within Xerces, but hopefully this helps the next person that encounters this problem.
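
For anyone adapting this, the branching in resolveEntity() can be tried out without Xerces on the classpath. The following is only a minimal sketch under stated assumptions: ResolutionSketch, ResourceId, Route and toFile are made-up names for illustration, and ResourceId is a plain stand-in for the three XMLResourceIdentifier fields the override inspects. It makes no claim about which parser situations produce each combination of fields; it only mirrors the order of the checks and shows how a file: URL string turns into a File.

```java
import java.io.File;
import java.net.URI;

final class ResolutionSketch {

    /** Hypothetical stand-in for the XMLResourceIdentifier fields used above. */
    static final class ResourceId {
        final String expandedSystemId;
        final String baseSystemId;
        final String namespace;

        ResourceId(String expandedSystemId, String baseSystemId, String namespace) {
            this.expandedSystemId = expandedSystemId;
            this.baseSystemId = baseSystemId;
            this.namespace = namespace;
        }
    }

    /** Which branch of the override would handle a given identifier. */
    enum Route { EXPANDED_SYSTEM_ID_FILE, BASE_SYSTEM_ID_FILE, CATALOG }

    /** Same checks, in the same order, as the resolveEntity() override. */
    static Route route(ResourceId id) {
        if (id.expandedSystemId != null) {
            return Route.EXPANDED_SYSTEM_ID_FILE;
        }
        if (id.baseSystemId != null && id.namespace == null) {
            return Route.BASE_SYSTEM_ID_FILE;
        }
        return Route.CATALOG;
    }

    /**
     * Converts a file: URL string to a File. Going through URI decodes
     * percent-escapes such as %20; non-file URLs are rejected with an
     * IllegalArgumentException.
     */
    static File toFile(String fileUrl) {
        return new File(URI.create(fileUrl));
    }

    public static void main(String[] args) {
        // Example values are invented for the demonstration.
        ResourceId withExpanded = new ResourceId(
            "file:/tmp/schemas/common.xsd", "file:/tmp/schemas/main.xsd", null);
        ResourceId baseOnly = new ResourceId(
            null, "file:/tmp/schemas/main.xsd", null);
        ResourceId namespaceOnly = new ResourceId(
            null, "file:/tmp/schemas/main.xsd", "urn:example:ns");

        System.out.println(route(withExpanded));
        System.out.println(route(baseOnly));
        System.out.println(route(namespaceOnly));
        System.out.println(toFile("file:/tmp/my%20schemas/common.xsd"));
    }
}
```

One thing the sketch makes visible: the first two branches assume the identifier is a file: URL. If your schemas can also be referenced over http, those branches would need an extra check before the File conversion.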