java, parsing, apache-tika

How to enable PDFParser in new Tika v2.9.0?


I use tika-core and tika-parsers-standard-package (v2.9.0).

I want to parse a PDF file. When I process it, I can see that Tika correctly identifies the type (application/pdf).

I expected this type to be handled by PDFParser, but I see that the processing ends up in EmptyParser instead.

Next, I decided to check whether Tika had created a PDFParser at all and whether it had been registered.

I stepped into the CompositeParser class and watched how types and parsers are added (to the MediaTypeRegistry registry and List<Parser> parsers variables).

PDFParser is never added to that list.

So when Tika processes a PDF file, it cannot find PDFParser and cannot parse the file.
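To illustrate why a missing registration ends in EmptyParser, here is a minimal sketch of type-based dispatch with a fallback. It is a simplified model, not Tika's actual code: DispatchSketch, register and select are made-up names, and parsers are represented by plain strings.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of dispatch by media type; not Tika's implementation. */
final class DispatchSketch {
    // Stand-in for the parser that is used when no parser matches the type.
    static final String FALLBACK = "EmptyParser";

    private final Map<String, String> parsersByType = new HashMap<>();

    /** Registers a (hypothetical) parser name for a media type. */
    void register(String mediaType, String parserName) {
        parsersByType.put(mediaType, parserName);
    }

    /** Returns the parser registered for the type, or the fallback if there is none. */
    String select(String mediaType) {
        return parsersByType.getOrDefault(mediaType, FALLBACK);
    }

    public static void main(String[] args) {
        DispatchSketch registry = new DispatchSketch();
        registry.register("text/html", "HtmlParser");
        // Nothing is registered for application/pdf, mirroring the situation described here.
        System.out.println(registry.select("text/html"));
        System.out.println(registry.select("application/pdf"));
    }
}
```

In this model the type detection can be perfectly correct and the file still goes to the fallback, because the lookup only knows the parsers that were registered.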

Then I dumped the types and parsers that were actually registered, which confirmed that PDFParser is missing:

    org.apache.tika.parser.microsoft.chm.ChmParser@3ef5992e = application/vnd.ms-htmlhelp
    org.apache.tika.parser.mail.RFC822Parser@681e1dee = message/rfc822
    org.apache.tika.parser.microsoft.OfficeParser@1e7ad0d1 = application/vnd.visio
    org.apache.tika.parser.feed.FeedParser@6d16750 = application/atom+xml
    org.apache.tika.parser.image.ImageParser@14ffa24d = image/x-xcf
    org.apache.tika.parser.microsoft.WMFParser@c962c51 = image/wmf
    org.apache.tika.parser.audio.MidiParser@44a2894a = audio/midi
    org.apache.tika.parser.mat.MatParser@49449d69 = application/x-matlab-data
    org.apache.tika.parser.external.CompositeExternalParser@20c0a3c6 = video/x-msvideo
    org.apache.tika.parser.pkg.CompressorParser@1d89e72b = application/deflate64
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-powerpoint.slideshow.macroenabled.12
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.openxmlformats-officedocument.presentationml.slide
    org.apache.tika.parser.iwork.IWorkPackageParser@20f3b603 = application/vnd.apple.keynote
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.spreadsheet-template
    org.apache.tika.parser.mp4.MP4Parser@576ae06d = audio/mp4
    org.apache.tika.parser.xliff.XLIFF12Parser@14307e14 = application/x-xliff+xml
    org.apache.tika.parser.wordperfect.QuattroProParser@29ce813f = application/x-quattro-pro; version=9
    org.apache.tika.parser.epub.EpubParser@6b7e49f1 = application/x-ibooks+zip
    org.apache.tika.parser.apple.PListParser@6b97b57b = application/x-plist
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-word.document.macroenabled.12
    org.apache.tika.parser.iwork.iwana.IWork13PackageParser@7c40a6c3 = application/vnd.apple.unknown.13
    org.apache.tika.parser.mp4.MP4Parser@576ae06d = application/mp4
    org.apache.tika.parser.audio.MidiParser@44a2894a = application/x-midi
    org.apache.tika.parser.feed.FeedParser@6d16750 = application/rss+xml
    org.apache.tika.parser.html.HtmlParser@21da170d = text/html
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-visio.template
    org.apache.tika.parser.csv.TextAndCSVParser@38f2e306 = text/csv
    org.apache.tika.parser.image.ImageParser@14ffa24d = image/vnd.microsoft.icon
    org.apache.tika.parser.mp4.MP4Parser@576ae06d = video/quicktime
    org.apache.tika.parser.mp4.MP4Parser@576ae06d = video/mp4
    org.apache.tika.parser.pkg.CompressorParser@1d89e72b = application/zlib
    org.apache.tika.parser.wacz.WACZParser@6e18b50e = application/x-wacz
    org.apache.tika.parser.iwork.IWorkPackageParser@20f3b603 = application/vnd.apple.numbers
    org.apache.tika.parser.pkg.PackageParser@171760b0 = application/x-archive
    org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser@66846f20 = application/vnd.ms-word2006ml
    org.apache.tika.parser.microsoft.OfficeParser@1e7ad0d1 = application/x-tika-msoffice
    org.apache.tika.parser.font.AdobeFontMetricParser@49c15a25 = application/x-font-adobe-metric
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-visio.drawing
    org.apache.tika.parser.mp4.MP4Parser@576ae06d = video/x-m4v
    org.apache.tika.parser.pkg.PackageParser@171760b0 = application/java-archive
    org.apache.tika.parser.microsoft.OfficeParser@1e7ad0d1 = application/sldworks
    org.apache.tika.parser.http.HttpParser@625a36d6 = application/x-httpresponse
    org.apache.tika.parser.microsoft.OfficeParser@1e7ad0d1 = application/x-tika-ooxml-protected
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-excel.sheet.macroenabled.12
    org.apache.tika.parser.image.HeifParser@4e64d8cd = image/heif
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-word.template.macroenabled.12
    org.apache.tika.parser.microsoft.OfficeParser@1e7ad0d1 = application/vnd.ms-outlook
    org.apache.tika.parser.image.HeifParser@4e64d8cd = image/heic
    org.apache.tika.parser.pkg.CompressorParser@1d89e72b = application/x-compress
    org.apache.tika.parser.dwg.DWGParser@bdc534e = image/vnd.dwg
    org.apache.tika.parser.code.SourceCodeParser@480f2062 = text/x-groovy
    org.apache.tika.parser.mp4.MP4Parser@576ae06d = video/3gpp
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.openxmlformats-officedocument.spreadsheetml.template
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = model/vnd.dwfx+xps
    org.apache.tika.parser.external.CompositeExternalParser@20c0a3c6 = video/mpeg
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.chart-template
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
    org.apache.tika.parser.pkg.PackageParser@171760b0 = application/zip
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.text-master
    org.apache.tika.parser.odf.FlatOpenDocumentParser@7bd04333 = application/vnd.oasis.opendocument.tika.flat.document
    org.apache.tika.parser.image.PSDParser@139a5e34 = image/vnd.adobe.photoshop
    org.apache.tika.parser.image.ImageParser@14ffa24d = image/gif
    org.apache.tika.parser.executable.ExecutableParser@7ab5292a = application/x-sharedlib
    org.apache.tika.parser.asm.ClassParser@4b880a2e = application/java-vm
    org.apache.tika.parser.image.WebPParser@32b801b7 = image/webp
    org.apache.tika.parser.microsoft.activemime.ActiveMimeParser@7392922a = application/x-activemime
    org.apache.tika.parser.indesign.IDMLParser@203d190b = application/vnd.adobe.indesign-idml-package
    org.apache.tika.parser.html.HtmlParser@21da170d = application/vnd.wap.xhtml+xml
    org.gagravarr.tika.OggParser@2b6fbe58 = video/ogg
    org.apache.tika.parser.apple.AppleSingleFileParser@5928a73b = application/applefile
    org.apache.tika.parser.audio.AudioParser@3ecc6df6 = audio/x-aiff
    org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser@76f1ce65 = application/vnd.ms-spreadsheetml
    org.apache.tika.parser.microsoft.OfficeParser@1e7ad0d1 = application/msword
    org.apache.tika.parser.iwork.iwana.IWork13PackageParser@7c40a6c3 = application/vnd.apple.numbers.13
    org.apache.tika.parser.apple.PListParser@6b97b57b = application/x-bplist-memgraph
    org.apache.tika.parser.pkg.CompressorParser@1d89e72b = application/x-java-pack200
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.image-template
    org.apache.tika.parser.warc.WARCParser@20ba239d = application/warc
    org.apache.tika.parser.microsoft.rtf.RTFParser@3cb759ce = application/rtf
    org.apache.tika.parser.image.BPGParser@61c93073 = image/bpg
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.text
    org.apache.tika.parser.microsoft.TNEFParser@19b358c9 = application/x-tnef
    org.apache.tika.parser.pkg.CompressorParser@1d89e72b = application/x-xz
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-powerpoint.template.macroenabled.12
    org.apache.tika.parser.image.ImageParser@14ffa24d = image/vnd.wap.wbmp
    org.apache.tika.parser.crypto.Pkcs7Parser@686426a9 = application/pkcs7-mime
    org.apache.tika.parser.executable.ExecutableParser@7ab5292a = application/x-executable
    org.apache.tika.parser.executable.ExecutableParser@7ab5292a = application/x-coredump
    org.apache.tika.parser.microsoft.JackcessParser@6e1b6b42 = application/x-msaccess
    org.apache.tika.parser.iwork.iwana.IWork18PackageParser@4679945a = application/vnd.apple.numbers.18
    org.apache.tika.parser.csv.TextAndCSVParser@38f2e306 = text/plain
    org.apache.tika.parser.image.ImageParser@14ffa24d = image/png
    org.apache.tika.parser.microsoft.pst.OutlookPSTParser@450fe94c = application/vnd.ms-outlook-pst
    org.apache.tika.parser.pkg.PackageParser@171760b0 = application/x-cpio
    org.apache.tika.parser.microsoft.OfficeParser@1e7ad0d1 = application/x-tika-msworks-spreadsheet
    org.apache.tika.parser.iwork.IWorkPackageParser@20f3b603 = application/vnd.apple.pages
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-xpsdocument
    org.gagravarr.tika.VorbisParser@c97ce2a = audio/ogg
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-visio.template.macroenabled.12
    org.apache.tika.parser.pkg.PackageParser@171760b0 = application/x-tar
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.presentation-template
    org.apache.tika.parser.apple.PListParser@6b97b57b = application/x-bplist
    org.apache.tika.parser.image.ImageParser@14ffa24d = image/x-jbig2
    org.apache.tika.parser.dbf.DBFParser@39e3f581 = application/x-dbf
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-excel.template.macroenabled.12
    org.apache.tika.parser.mbox.MboxParser@33326da3 = application/mbox
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.formula
    org.apache.tika.parser.microsoft.chm.ChmParser@3ef5992e = application/chm
    org.apache.tika.parser.microsoft.OldExcelParser@9cda700 = application/vnd.ms-excel.workspace.3
    org.apache.tika.parser.microsoft.OldExcelParser@9cda700 = application/vnd.ms-excel.workspace.4
    org.apache.tika.parser.image.BPGParser@61c93073 = image/x-bpg
    org.apache.tika.parser.xliff.XLZParser@3b99ce8d = application/x-xliff+zip
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.openxmlformats-officedocument.wordprocessingml.template
    org.apache.tika.parser.iwork.IWorkPackageParser@20f3b603 = application/vnd.apple.iwork
    org.apache.tika.parser.image.HeifParser@4e64d8cd = image/heic-sequence
    org.apache.tika.parser.microsoft.OldExcelParser@9cda700 = application/vnd.ms-excel.sheet.2
    org.apache.tika.parser.microsoft.OldExcelParser@9cda700 = application/vnd.ms-excel.sheet.3
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-powerpoint.presentation.macroenabled.12
    org.apache.tika.parser.pkg.CompressorParser@1d89e72b = application/x-brotli
    org.apache.tika.parser.dif.DIFParser@1df07b82 = application/dif+xml
    org.apache.tika.parser.microsoft.OfficeParser@1e7ad0d1 = application/vnd.ms-excel
    org.apache.tika.parser.microsoft.OldExcelParser@9cda700 = application/vnd.ms-excel.sheet.4
    org.apache.tika.parser.microsoft.OfficeParser@1e7ad0d1 = application/x-tika-ole-drm-encrypted
    org.apache.tika.parser.microsoft.OfficeParser@1e7ad0d1 = application/vnd.ms-project
    org.apache.tika.parser.dgn.DGN8Parser@3d0f52ea = image/vnd.dgn; version=8
    org.apache.tika.parser.epub.EpubParser@6b7e49f1 = application/epub+zip
    org.apache.tika.parser.sas.SAS7BDATParser@591e17ec = application/x-sas-data
    org.apache.tika.parser.pkg.CompressorParser@1d89e72b = application/x-snappy
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.text-template
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.openxmlformats-officedocument.presentationml.presentation
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-visio.stencil
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-visio.stencil.macroenabled.12
    org.apache.tika.parser.apple.PListParser@6b97b57b = application/x-bplist-webarchive
    org.apache.tika.parser.xml.DcXMLParser@2c0657e4 = application/xml
    org.apache.tika.parser.odf.FlatOpenDocumentParser@7bd04333 = application/vnd.oasis.opendocument.flat.presentation
    org.apache.tika.parser.image.ImageParser@14ffa24d = image/bmp
    org.apache.tika.parser.wordperfect.WordPerfectParser@32b4058d = application/vnd.wordperfect; version=6.x
    org.apache.tika.parser.html.HtmlParser@21da170d = application/xhtml+xml
    org.apache.tika.parser.crypto.Pkcs7Parser@686426a9 = application/pkcs7-signature
    org.apache.tika.parser.wordperfect.WordPerfectParser@32b4058d = application/vnd.wordperfect; version=5.1
    org.apache.tika.parser.wordperfect.WordPerfectParser@32b4058d = application/vnd.wordperfect; version=5.0
    org.apache.tika.parser.code.SourceCodeParser@480f2062 = text/x-java-source
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.sun.xml.writer
    org.apache.tika.parser.audio.AudioParser@3ecc6df6 = audio/basic
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.formula-template
    org.apache.tika.parser.tmx.TMXParser@75eb9933 = application/x-tmx
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-powerpoint.addin.macroenabled.12
    org.apache.tika.parser.microsoft.OfficeParser@1e7ad0d1 = application/vnd.ms-powerpoint
    org.apache.tika.parser.crypto.TSDParser@5ce35115 = application/timestamped-data
    org.apache.tika.parser.code.SourceCodeParser@480f2062 = text/x-c++src
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.openxmlformats-officedocument.presentationml.template
    org.apache.tika.parser.apple.PListParser@6b97b57b = application/x-bplist-itunes
    org.apache.tika.parser.image.TiffParser@42c4941a = image/tiff
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-excel.addin.macroenabled.12
    org.apache.tika.parser.microsoft.xml.WordMLParser@50ce79a2 = application/vnd.ms-wordml
    org.apache.tika.parser.executable.ExecutableParser@7ab5292a = application/x-object
    org.apache.tika.parser.html.HtmlParser@21da170d = application/x-asp
    org.apache.tika.parser.microsoft.OfficeParser@1e7ad0d1 = application/x-mspublisher
    org.apache.tika.parser.hwp.HwpV5Parser@79455089 = application/x-hwp-v5
    org.apache.tika.parser.pkg.RarParser@3f68dd4d = application/x-rar-compressed
    org.apache.tika.parser.image.HeifParser@4e64d8cd = image/heif-sequence
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.graphics-template
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.openxmlformats-officedocument.wordprocessingml.document
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-powerpoint.slide.macroenabled.12
    org.apache.tika.parser.microsoft.TNEFParser@19b358c9 = application/vnd.ms-tnef
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.text-web
    org.apache.tika.parser.mif.MIFParser@59eaeafe = application/x-maker
    org.apache.tika.parser.microsoft.EMFParser@7aca8b08 = image/emf
    org.apache.tika.parser.pkg.CompressorParser@1d89e72b = application/x-bzip
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.graphics
    org.apache.tika.parser.iptc.IptcAnpaParser@29d85dc0 = text/vnd.iptc.anpa
    org.apache.tika.parser.iwork.iwana.IWork18PackageParser@4679945a = application/vnd.apple.keynote.18
    org.apache.tika.parser.microsoft.OfficeParser@1e7ad0d1 = application/x-tika-msoffice-embedded; format=ole10_native
    org.apache.tika.parser.pkg.PackageParser@171760b0 = application/x-arj
    org.apache.tika.parser.pkg.CompressorParser@1d89e72b = application/x-lzma
    org.apache.tika.parser.mp4.MP4Parser@576ae06d = video/3gpp2
    org.apache.tika.parser.mp3.Mp3Parser@50a37f03 = audio/mpeg
    org.apache.tika.parser.iwork.iwana.IWork13PackageParser@7c40a6c3 = application/vnd.apple.keynote.13
    org.apache.tika.parser.pkg.CompressorParser@1d89e72b = application/x-lz4
    org.apache.tika.parser.odf.FlatOpenDocumentParser@7bd04333 = application/vnd.oasis.opendocument.flat.spreadsheet
    org.apache.tika.parser.audio.AudioParser@3ecc6df6 = audio/vnd.wave
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.presentation
    org.apache.tika.parser.mif.MIFParser@59eaeafe = application/vnd.mif
    org.apache.tika.parser.pkg.PackageParser@171760b0 = application/x-7z-compressed
    org.apache.tika.parser.image.JXLParser@5e10d2f3 = image/jxl
    org.apache.tika.parser.executable.ExecutableParser@7ab5292a = application/x-msdownload
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.chart
    org.apache.tika.parser.image.JpegParser@70ac80f2 = image/jpeg
    org.apache.tika.parser.image.ICNSParser@1bf1e57a = image/icns
    org.gagravarr.tika.VorbisParser@c97ce2a = audio/vorbis
    org.gagravarr.tika.OggParser@2b6fbe58 = application/ogg
    org.apache.tika.parser.xml.DcXMLParser@2c0657e4 = image/svg+xml
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-excel.sheet.binary.macroenabled.12
    org.apache.tika.parser.warc.WARCParser@20ba239d = application/warc+gz
    org.apache.tika.parser.microsoft.onenote.OneNoteParser@50e1da60 = application/onenote; format=one
    org.apache.tika.parser.video.FLVParser@8be1dc5 = video/x-flv
    org.apache.tika.parser.microsoft.MSOwnerFileParser@1c01b730 = application/x-ms-owner
    org.apache.tika.parser.pkg.CompressorParser@1d89e72b = application/gzip
    org.apache.tika.parser.pkg.PackageParser@171760b0 = application/x-tika-unix-dump
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.spreadsheet
    org.apache.tika.parser.iwork.iwana.IWork18PackageParser@4679945a = application/vnd.apple.pages.18
    org.apache.tika.parser.odf.OpenDocumentParser@588ff48 = application/vnd.oasis.opendocument.image
    org.apache.tika.parser.pkg.CompressorParser@1d89e72b = application/x-bzip2
    org.apache.tika.parser.iwork.iwana.IWork13PackageParser@7c40a6c3 = application/vnd.apple.pages.13
    org.apache.tika.parser.xml.FictionBookParser@84f4bff = application/x-fictionbook+xml
    org.apache.tika.parser.odf.FlatOpenDocumentParser@7bd04333 = application/vnd.oasis.opendocument.flat.text
    org.apache.tika.parser.executable.ExecutableParser@7ab5292a = application/x-elf
    org.apache.tika.parser.csv.TextAndCSVParser@38f2e306 = text/tsv
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.ms-visio.drawing.macroenabled.12
    org.apache.tika.parser.microsoft.ooxml.OOXMLParser@4db1d509 = application/vnd.openxmlformats-officedocument.presentationml.slideshow
    org.apache.tika.parser.microsoft.chm.ChmParser@3ef5992e = application/x-chm
    org.apache.tika.parser.font.TrueTypeParser@5623a24c = application/x-font-ttf
    org.apache.tika.parser.prt.PRTParser@2e0bcd78 = application/x-prt

tika 2.9.0

As you can see, the list does not contain PDFParser. It should be there by default, because after the build I see the jar file in my lib directory (tika-parser-pdf-module-2.9.0.jar).

With Tika 1.25 the list looks similar, except that PDFParser is included.
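One way to narrow this down without a debugger is to check whether the declared parser classes can be instantiated at all. The sketch below assumes that the parser modules declare their parsers in META-INF/services/org.apache.tika.parser.Parser files on the classpath. ParserLoadProbe and its methods are hypothetical names, and the code uses only the JDK, not the Tika API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.lang.reflect.InvocationTargetException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Collections;

final class ParserLoadProbe {
    // Assumed location of the parser declarations shipped by the Tika parser modules.
    static final String SERVICE_FILE = "META-INF/services/org.apache.tika.parser.Parser";

    /** Tries to load and instantiate a class; returns "OK" or a description of the failure. */
    static String tryLoad(String className) {
        try {
            Class.forName(className).getDeclaredConstructor().newInstance();
            return "OK";
        } catch (InvocationTargetException e) {
            // The constructor itself failed; report what it threw.
            return String.valueOf(e.getCause());
        } catch (ReflectiveOperationException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError caused by a missing or outdated dependency.
            return e.toString();
        }
    }

    /** Removes a trailing '#' comment and surrounding whitespace from a service file line. */
    static String stripComment(String line) {
        int hash = line.indexOf('#');
        return (hash >= 0 ? line.substring(0, hash) : line).trim();
    }

    /** Prints the load status of every class named in the service files visible to the loader. */
    static void probe(ClassLoader loader) throws IOException {
        for (URL url : Collections.list(loader.getResources(SERVICE_FILE))) {
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    String name = stripComment(line);
                    if (!name.isEmpty()) {
                        System.out.println(name + " -> " + tryLoad(name));
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        probe(ParserLoadProbe.class.getClassLoader());
    }
}
```

Run with the same classpath as the application, a line for org.apache.tika.parser.pdf.PDFParser that shows an error instead of OK would point at the class or dependency that prevents the parser from being created.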

My pom.xml file with Tika 2.9.0:

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.apache.tika</groupId>
            <artifactId>tika-bom</artifactId>
            <version>2.9.0</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <dependency>
        <groupId>org.apache.tika</groupId>
        <artifactId>tika-core</artifactId>
        <type>jar</type>
    </dependency>
    <dependency>
        <groupId>org.apache.tika</groupId>
        <artifactId>tika-parsers-standard-package</artifactId>
    </dependency>

    ...
</dependencies>

How can I add or activate PDFParser, which should work out of the box?

Thank you in advance.


Solution

  • I had not noticed that I had added a pdfbox dependency (2.0.21) to my pom.xml, and some package was missing in that version.

    So adding this library manually is what caused the error.

    When I saw that Tika 2.9.0 already uses pdfbox 2.0.29, I removed my own dependency,

    so that the new version 2.0.29 is loaded instead of the old 2.0.21.

    After that, Tika was able to create a PDFParser.

    So check whether your own dependencies override the ones that Tika brings in.
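To see which pdfbox jar actually ends up on the classpath, a small JDK-only check like the following can help. It is a sketch: ClasspathOrigin, resourcePath and locate are hypothetical names, and org.apache.pdfbox.pdmodel.PDDocument is used only as a representative PDFBox class.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

final class ClasspathOrigin {
    /** Converts a fully qualified class name to its classpath resource path. */
    static String resourcePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Lists every classpath location that provides the given class. */
    static List<URL> locate(String className) {
        ClassLoader loader = ClasspathOrigin.class.getClassLoader();
        try {
            return Collections.list(loader.getResources(resourcePath(className)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The printed URLs usually contain the jar file name, and with it the version.
        for (URL url : locate("org.apache.pdfbox.pdmodel.PDDocument")) {
            System.out.println(url);
        }
    }
}
```

If the printed location names an older pdfbox jar than the one Tika expects, an explicit dependency in your own pom.xml is a likely cause. Running mvn dependency:tree -Dincludes=org.apache.pdfbox should show where that version comes from.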