Tags: java, tiff, image-segmentation, openimaj, twelvemonkeys

OpenIMAJ library cannot read TIFF files?


I am using the OpenIMAJ library. It works well on JPEG and PNG files, but on TIFF files it gives me an error. Here is the code:

import java.io.File;

import org.openimaj.image.ImageUtilities;
import org.openimaj.image.MBFImage;

....

File file = new File("/home/mosab/Desktop/input/tif.tif");
MBFImage input = ImageUtilities.readMBF(file);

And here is the error:

Exception in thread "main" java.io.IOException: org.apache.sanselan.ImageReadException: Tiff: unknown compression: 7
    at org.openimaj.image.ExtendedImageIO.read(ExtendedImageIO.java:189)
    at org.openimaj.image.ExtendedImageIO.read(ExtendedImageIO.java:126)
    at org.openimaj.image.ImageUtilities.readMBF(ImageUtilities.java:355)
    at org.mosab.TestOpenIMAJ.TestKmeans.main(TestKmeans.java:49)
Caused by: org.apache.sanselan.ImageReadException: Tiff: unknown compression: 7
    at org.apache.sanselan.formats.tiff.datareaders.DataReader.decompress(DataReader.java:135)
    at org.apache.sanselan.formats.tiff.datareaders.DataReaderStrips.readImageData(DataReaderStrips.java:96)
    at org.apache.sanselan.formats.tiff.TiffImageParser.getBufferedImage(TiffImageParser.java:505)
    at org.apache.sanselan.formats.tiff.TiffDirectory.getTiffImage(TiffDirectory.java:163)
    at org.apache.sanselan.formats.tiff.TiffImageParser.getBufferedImage(TiffImageParser.java:441)
    at org.apache.sanselan.Sanselan.getBufferedImage(Sanselan.java:1264)
    at org.apache.sanselan.Sanselan.getBufferedImage(Sanselan.java:1163)
    at org.apache.sanselan.Sanselan.getBufferedImage(Sanselan.java:1136)
    at org.openimaj.image.ExtendedImageIO.read(ExtendedImageIO.java:187)
    ... 3 more

This is the TIFF file (a GeoTIFF, specifically) that I am using:

"https://drive.google.com/file/d/0ByKaCojxzNa9MWxPTUJjZURHR1E/view?usp=sharing"

Does this mean that OpenIMAJ doesn't support the TIFF format, or GeoTIFF in particular?

I assumed that OpenIMAJ doesn't support TIFF, so I tried the TwelveMonkeys library, which can read that file on its own. I then added TwelveMonkeys to my OpenIMAJ project to get TIFF support. That worked for some TIFF files, but not for this one (even though TwelveMonkeys reads it without problems in a separate project), and I got this exception:

Exception in thread "main" java.io.IOException: Resetting to invalid mark
    at java.io.BufferedInputStream.reset(BufferedInputStream.java:448)
    at org.openimaj.image.ExtendedImageIO.read(ExtendedImageIO.java:185)
    at org.openimaj.image.ExtendedImageIO.read(ExtendedImageIO.java:126)
    at org.openimaj.image.ImageUtilities.readMBF(ImageUtilities.java:355)
    at org.mosab.TestOpenIMAJ.TestKmeans.main(TestKmeans.java:49)

When I traced the error, I found that it may be related to the size of the file, which is around 26 MB. The exception originates in the "read" method of the class "org.openimaj.image.ExtendedImageIO", which I think uses a maximum size of 10 MB:

public static BufferedImage read(InputStream input) throws IOException {
    if (input == null) {
        throw new IllegalArgumentException("input == null!");
    }

    final NonClosableInputStream buffer = new NonClosableInputStream(input);
    buffer.mark(10 * 1024 * 1024); // 10 MB limit; I think the problem is here

    BufferedImage bi;
    try {
        bi = readInternal(buffer);
    } catch (final Exception ex) {
        bi = null;
    }

    if (bi == null) {
        buffer.reset();
        try {
            bi = Sanselan.getBufferedImage(buffer);
        } catch (final Throwable e) {
            throw new IOException(e);
        }
    }

    return bi;
}
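
To illustrate why a mark limit can produce "Resetting to invalid mark", here is a minimal sketch using only the JDK. It assumes that NonClosableInputStream behaves like java.io.BufferedInputStream, which the stack trace suggests but which is not confirmed here. The class name MarkResetDemo, the helper resetSurvives, the 8-byte internal buffer and the 16-byte limit are all hypothetical stand-ins chosen to keep the demo small; they are not OpenIMAJ's values.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class MarkResetDemo {

    /**
     * Marks a buffered stream with the given read limit, reads
     * bytesToRead bytes one at a time, then attempts reset().
     * Returns true if reset() succeeded, false if the mark was invalidated.
     */
    public static boolean resetSurvives(int markLimit, int bytesToRead) throws IOException {
        byte[] data = new byte[64];
        // Hypothetical tiny internal buffer (8 bytes) so that the mark limit,
        // not the default 8192-byte buffer, decides when the mark is dropped.
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data), 8);
        in.mark(markLimit);
        for (int i = 0; i < bytesToRead; i++) {
            in.read();
        }
        try {
            in.reset();
            return true;
        } catch (IOException e) {
            // BufferedInputStream reports "Resetting to invalid mark" here.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(resetSurvives(16, 10)); // true: stayed within the limit
        System.out.println(resetSurvives(16, 32)); // false: read past the limit
    }
}
```

If the first decoding attempt reads more bytes than the limit passed to mark(), the later reset() fails in the same way, before the second decoder gets a chance to run.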

So how can I fix this issue and read that TIFF file in OpenIMAJ, so that I can then apply the facilities OpenIMAJ provides (such as clustering/segmentation) to it?


Solution

  • TIFF is a horrible format, as it has many custom extensions that libraries do not always support. OpenIMAJ tries to get around some of these issues by using a batch of different libraries to read all sorts of images, but in this case it's failing. As you've noticed, there is a 10 MB buffer limit, and that is what causes the problem: increasing it to 100 MB allows the image you linked to be loaded. I'll update the code to address this (as it's only a limit, and the underlying buffer appears to be much smaller, this shouldn't cause any problems).

    As a quick work-around until the new snapshot is deployed, you can load the image you linked with this:

    MBFImage img = ImageUtilities.createMBFImage(
        Sanselan.getBufferedImage(new File("tif.tif")), false);
    

    There appears to be a separate issue: Sanselan can't load all of your images (judging by the stack trace, which reports an unknown compression type). If you can provide a link to such an image on the GitHub bug report (https://github.com/openimaj/openimaj/issues/119), it might be possible to code in a fallback that uses TwelveMonkeys for such images, or we can see whether a newer version of Sanselan fixes the issue. In the meantime, you could use TwelveMonkeys directly for those images in your code and convert the result to an MBFImage using ImageUtilities as above.
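
For the TwelveMonkeys route, a minimal sketch might look like the following. It relies on TwelveMonkeys registering itself as a javax.imageio plugin, so that a plain ImageIO.read call picks up its TIFF reader when the TwelveMonkeys TIFF jar is on the classpath. The class name ImageIOFallback and the helper readViaImageIO are hypothetical names used only for illustration. The OpenIMAJ conversion is left as a comment so that the sketch compiles with the JDK alone.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class ImageIOFallback {

    /**
     * Reads an image through javax.imageio. ImageIO.read returns null when
     * no registered reader claims the file, so that case is turned into an
     * IOException instead of a later NullPointerException.
     */
    public static BufferedImage readViaImageIO(File file) throws IOException {
        BufferedImage image = ImageIO.read(file);
        if (image == null) {
            throw new IOException("No ImageIO reader could decode " + file);
        }
        return image;
    }

    public static void main(String[] args) throws IOException {
        BufferedImage image = readViaImageIO(new File(args[0]));
        System.out.println(image.getWidth() + "x" + image.getHeight());
        // With OpenIMAJ on the classpath, the conversion from the answer above
        // would then be:
        // MBFImage img = ImageUtilities.createMBFImage(image, false);
    }
}
```

Because this bypasses ExtendedImageIO.read entirely, neither the mark limit nor Sanselan's compression support comes into play for files read this way.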