Migration of blobs from database to the file system in jackrabbit

As proposed in the previous discussion "Using file system instead of database to store pdf files in jackrabbit", we can use FileDataStore to store blob files in the file system instead of the database (in my case, PDFs of about 100 KB each).

The next problem I face is dealing with files that were previously stored in the blob store: I want them to remain accessible after switching to FileDataStore.

After adding FileDataStore support to repository.xml, I get an ItemExistsException when using the JcrUtils method getOrAddNode:

public static Node getOrAddNode(Node parent, String name)
        throws RepositoryException {
    if (parent.hasNode(name)) {
        return parent.getNode(name);
    } else {
        return parent.addNode(name);
    }
}

That is, parent.hasNode(name) returns false (the item does not seem to exist), so we fall through to parent.addNode(name), which then throws ItemExistsException.

Any help?

Is it necessary to migrate the blobs to the FileDataStore, or is there some configuration that lets Jackrabbit look for blobs in several locations at once (in my case, the MySQL database and the file system)?

Some comments:

I have found several approaches that could help with the migration:

The wiki page http://wiki.apache.org/jackrabbit/BackupAndMigration describes using the JCR API (Session.exportSystemView(..) followed by Session.importXML(..)), the RepositoryCopier API, etc.
The jackrabbit-jcr-import-export-tool (see http://svn.apache.org/repos/asf/jackrabbit/sandbox/jackrabbit-jcr-import-export-tool/README.txt)
The Jackrabbit standalone server (http://jackrabbit.apache.org/standalone-server.html)

Solution

The repository may be corrupted: the parent node contains a child node entry for the given name (the node you want to add), but the child node itself doesn't exist. Especially in older versions of Jackrabbit, you could get into this situation if multiple sessions concurrently tried to change the same nodes.
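To make the symptom concrete, here is a toy model of how a dangling child node entry could produce exactly this behaviour. It is only a sketch: ToyParentNode and all of its methods are hypothetical, they are not Jackrabbit classes, and the assumption that hasNode resolves the child while addNode only checks the entry list is one plausible explanation rather than a description of Jackrabbit's internals.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: NOT Jackrabbit classes, and not its real storage format.
class ToyParentNode {
    // Child node entries kept by the parent: child name -> child id.
    private final Map<String, String> childEntries = new HashMap<>();
    // Stand-in for the persistence layer: child id -> stored child.
    private final Map<String, String> store = new HashMap<>();

    // Simulates the corruption: an entry is recorded, but no child is stored.
    void addDanglingEntry(String name, String id) {
        childEntries.put(name, id);
    }

    // Assumed behaviour: resolves the child, so a dangling entry looks absent.
    boolean hasNode(String name) {
        String id = childEntries.get(name);
        return id != null && store.containsKey(id);
    }

    // Assumed behaviour: checks only the entry list, so a dangling entry
    // still blocks the name.
    void addNode(String name, String id) {
        if (childEntries.containsKey(name)) {
            throw new IllegalStateException("item already exists: " + name);
        }
        childEntries.put(name, id);
        store.put(id, name);
    }

    // Drops entries whose child cannot be loaded; returns how many were removed.
    int fixDanglingEntries() {
        int before = childEntries.size();
        childEntries.values().removeIf(id -> !store.containsKey(id));
        return before - childEntries.size();
    }

    public static void main(String[] args) {
        ToyParentNode parent = new ToyParentNode();
        parent.addDanglingEntry("report.pdf", "id-1");

        System.out.println("hasNode: " + parent.hasNode("report.pdf"));
        try {
            parent.addNode("report.pdf", "id-2");
        } catch (IllegalStateException e) {
            System.out.println("addNode failed: " + e.getMessage());
        }

        // After the dangling entry is removed, the name can be used again.
        parent.fixDanglingEntries();
        parent.addNode("report.pdf", "id-2");
        System.out.println("hasNode after fix: " + parent.hasNode("report.pdf"));
    }
}
```

In this model the two checks look at different things, which is why getOrAddNode takes the addNode branch and still fails. Removing the dangling entry is what makes the name usable again.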

To fix such corruption problems, the bundle db persistence managers support a consistency check & fix feature (the consistencyCheck and consistencyFix parameters of the persistence manager). You would need to set those options in the repository.xml and workspace.xml files, and restart Jackrabbit. Once the repository is fixed, you can disable those options again.

There is also a way to fix such problems at runtime: set the system property org.apache.jackrabbit.autoFixCorruptions to true, and then traverse all nodes in the repository.
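A rough sketch of that runtime approach follows. To keep it runnable without the JCR jars, the walk is written against a generic "children of" function and demonstrated on an in-memory stand-in tree; RepositoryWalk and walk are hypothetical names. With a real repository you would start from session.getRootNode() and list children with Node.getNodes(). It is also an assumption on my part that the property should be in place before the repository starts, for example via -Dorg.apache.jackrabbit.autoFixCorruptions=true on the JVM command line.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class RepositoryWalk {

    // Visits every node reachable from root and returns the number visited.
    // Iterative on purpose, so a deep tree cannot overflow the call stack.
    static <T> long walk(T root, Function<T, Iterable<T>> childrenOf) {
        Deque<T> pending = new ArrayDeque<>();
        pending.push(root);
        long visited = 0;
        while (!pending.isEmpty()) {
            T node = pending.pop();
            visited++;
            // With JCR, childrenOf would wrap node.getNodes(); listing the
            // children is the step that touches each node's child entries.
            for (T child : childrenOf.apply(node)) {
                pending.push(child);
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // Property name taken from the answer above. Assumption: set it
        // before the repository is started (or pass it as a -D option).
        System.setProperty("org.apache.jackrabbit.autoFixCorruptions", "true");

        // Stand-in tree (path -> child paths) used instead of a JCR session.
        Map<String, List<String>> tree = Map.of(
                "/", List.of("/a", "/b"),
                "/a", List.of("/a/x"));

        long count = walk("/", path -> tree.getOrDefault(path, List.of()));
        System.out.println("visited " + count + " nodes");
    }
}
```

The walk only reads; in the real case any repair would be done by Jackrabbit itself as the nodes are accessed, so take a backup before trying this on production data.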