apache-tika .net-5 ikvm

TikaOnDotNet with .NET Core 3.1 / .NET 5 fails with "Method not found: 'Void System.IO.FileStream..ctor'"


I tried to use the TikaOnDotNet NuGet package from projects targeting .NET Core 3.1 and .NET 5.

The sample code is as follows.

    using System;
    using System.IO;
    using TikaOnDotNet.TextExtraction;

    namespace tika
    {
        class Program
        {
            static void Main(string[] args)
            {

                var textExtractor = new TextExtractor();

                var original = new FileInfo(Path.Combine(Directory.GetCurrentDirectory(), @"pptexamples.ppt"));

                var wordDocContents = textExtractor.Extract(original.FullName);

            }
        }
    }

The call to textExtractor.Extract throws the following exception.

TikaOnDotNet.TextExtraction.TextExtractionException: "Extraction of text from the file '/Users/serhatonal/Projects/tika/tika/bin/Debug/netcoreapp3.1/pptexamples.ppt' failed." ---> TikaOnDotNet.TextExtraction.TextExtractionException: "Extraction failed." ---> System.MissingMethodException: "Method not found: 'Void System.IO.FileStream..ctor(System.String, System.IO.FileMode, System.Security.AccessControl.FileSystemRights, System.IO.FileShare, Int32, System.IO.FileOptions)'."
  at Java_java_io_FileDescriptor.open(String name, FileMode fileMode, FileAccess fileAccess)
  at java.io.FileDescriptor.open(String , FileMode , FileAccess )
  at java.io.FileDescriptor.open(String , Int32 , Int32 )
  at java.io.FileDescriptor.openReadOnly(String )
  at Java_java_io_RandomAccessFile.open0(Object _this, String name, Int32 mode, FileDescriptor fd, Int32 O_RDWR)
  at java.io.RandomAccessFile.open0(String , Int32 )
  at java.io.RandomAccessFile.open(String , Int32 )
  at java.io.RandomAccessFile..ctor(File file, String mode)
  at java.util.zip.ZipFile..ctor(File file, Int32 mode, Charset charset)
  at java.util.zip.ZipFile..ctor(File file, Int32 mode)
  at java.util.jar.JarFile..ctor(File file, Boolean verify, Int32 mode)
  at java.util.jar.JarFile..ctor(String name)
  at IKVM.NativeCode.ikvm.runtime.AssemblyClassLoader.lazyDefinePackages(ClassLoader _this)
  at ikvm.runtime.AssemblyClassLoader.lazyDefinePackages()
  at ikvm.runtime.AssemblyClassLoader.lazyDefinePackagesCheck()
  at ikvm.runtime.AssemblyClassLoader.getPackage(String name)
  at java.lang.Package.getPackage(Class )
  at java.lang.Class.getPackage()
  at org.apache.tika.mime.MimeTypesFactory.create(String coreFilePath, String extensionFilePath, ClassLoader classLoader)
  at org.apache.tika.mime.MimeTypes.getDefaultMimeTypes(ClassLoader classLoader)
  at org.apache.tika.config.TikaConfig.getDefaultMimeTypes(ClassLoader )
  at org.apache.tika.config.TikaConfig..ctor()
  at org.apache.tika.config.TikaConfig.getDefaultConfig()
  at org.apache.tika.parser.AutoDetectParser..ctor()
  at TikaOnDotNet.TextExtraction.Stream.StreamTextExtractor.Extract(Func`2 streamFactory, Stream outputStream)
  --- End of inner exception stack trace ---
  at TikaOnDotNet.TextExtraction.Stream.StreamTextExtractor.Extract(Func`2 streamFactory, Stream outputStream)
  at TikaOnDotNet.TextExtraction.TextExtractor.Extract[TExtractionResult](Func`2 streamFactory, Func`3 extractionResultAssembler)
  at TikaOnDotNet.TextExtraction.TextExtractor.Extract[TExtractionResult](String filePath, Func`3 extractionResultAssembler)
  --- End of inner exception stack trace ---
  at TikaOnDotNet.TextExtraction.TextExtractor.Extract[TExtractionResult](String filePath, Func`3 extractionResultAssembler)
  at TikaOnDotNet.TextExtraction.TextExtractor.Extract(String filePath)
  at tika.Program.Main(String[] args) in /Users/serhatonal/Projects/tika/tika/Program.cs:16
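
The trace above wraps the real failure in two layers of TextExtractionException, so the MissingMethodException is easy to miss. A minimal sketch for surfacing the innermost exception is shown here; the `RootCause` class and `Describe` method are hypothetical names used only for illustration, and the only framework call relied on is `Exception.GetBaseException()`.

```csharp
using System;

static class RootCause
{
    // Returns "<type>: <message>" for the innermost exception in the chain.
    // GetBaseException() follows InnerException links until none remain.
    public static string Describe(Exception ex)
    {
        if (ex == null) throw new ArgumentNullException(nameof(ex));

        Exception root = ex.GetBaseException();
        return root.GetType().FullName + ": " + root.Message;
    }
}
```

Wrapping the `textExtractor.Extract(...)` call in a try/catch and printing `RootCause.Describe(e)` should report the System.MissingMethodException directly instead of the outer "Extraction of text from the file ... failed." message.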

According to the issue below, the exception System.MissingMethodException: "Method not found: 'Void System.IO.FileStream..ctor(System.String, System.IO.FileMode, System.Security.AccessControl.FileSystemRights, System.IO.FileShare, Int32, System.IO.FileOptions)'." is considered fixed as of the .NET 5 release, but the problem still persists for me.

https://github.com/dotnet/runtime/issues/30435

Is anyone else having the same issue?


Solution

  • IKVM, which the library is built on (TikaOnDotNet is a port of the Java library), is not compatible with .NET Core.

    https://github.com/KevM/tikaondotnet/issues/136#issuecomment-583695410

    Unfortunately, this is why the extraction fails on .NET Core 3.1 and .NET 5.
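
Given that explanation, one option is to fail fast with a clear message instead of hitting the MissingMethodException deep inside IKVM. The following is only a sketch under stated assumptions: `IkvmRuntimeCheck` and its methods are hypothetical names, and it assumes that only the classic .NET Framework counts as supported and that `RuntimeInformation.FrameworkDescription` begins with ".NET Framework" on that runtime (the property itself requires .NET Framework 4.7.1 or later).

```csharp
using System;
using System.Runtime.InteropServices;

static class IkvmRuntimeCheck
{
    // Assumption: a description starting with ".NET Framework" means the classic
    // framework; ".NET Core ..." and ".NET 5..." descriptions are treated as unsupported.
    public static bool LooksLikeNetFramework(string frameworkDescription)
    {
        return frameworkDescription != null
            && frameworkDescription.StartsWith(".NET Framework", StringComparison.Ordinal);
    }

    // Throws before any TikaOnDotNet call is made if the current runtime
    // does not look like the classic .NET Framework.
    public static void EnsureSupported()
    {
        string description = RuntimeInformation.FrameworkDescription;
        if (!LooksLikeNetFramework(description))
        {
            throw new PlatformNotSupportedException(
                "The IKVM-based TikaOnDotNet package is not expected to work on '" + description + "'.");
        }
    }
}
```

Calling `IkvmRuntimeCheck.EnsureSupported()` at the start of `Main`, before constructing `TextExtractor`, would turn the failure into an explicit PlatformNotSupportedException on .NET Core 3.1 and .NET 5. It does not make the library work there; it only makes the cause visible earlier.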