I'm trying to extract text from PDF files using an IFilter.
The Adobe PDF IFilter that is distributed with Adobe Reader is awful: it returns an HRESULT of E_FAIL for many PDF documents.
The FoxIt PDF IFilter works beautifully on virtually all of the PDFs I've been using for testing.
The problem is that every time the Adobe Updater runs, it replaces the awesome FoxIt IFilter with the crappy Adobe IFilter.
I've been using the LoadIFilter function to get the registered IFilter for PDF files. Is there a way to force the Win32 API to load the FoxIt IFilter instead of the Adobe IFilter?
NOTE: This question about determining which IFilters are installed asks a related -- but not identical -- question.
The IFilter seems to be registered with Windows as a COM object, so you should be able to create an instance of it directly through COM.
From http://msdn.microsoft.com/en-us/library/ms692565 : the filter DLL is structured so that it provides an IFilter and an IClassFactory.
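Given the CLSID of the FoxIt filter, you should be able to get its IClassFactory and have it create the IFilter for you, instead of letting LoadIFilter pick whichever filter is currently registered for PDF files.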
Check out http://msdn.microsoft.com/en-us/library/ms684007 and http://msdn.microsoft.com/en-us/library/ms680760
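
A rough sketch of that idea, in Python with ctypes, might look like the following. It is untested against the actual FoxIt filter. `FOXIT_FILTER_CLSID` is a placeholder rather than the real CLSID, and the `guid_from_string` and `create_filter` helpers are made up for illustration. It uses `CoCreateInstance`, which obtains the class factory and calls `IClassFactory::CreateInstance` in one step, and it assumes the filter DLL can be created in-process this way.

```python
import ctypes
import sys
import uuid

# IID of the IFilter interface (declared in filter.h in the Windows SDK).
IID_IFILTER = "{89BCB740-6119-101A-BCB7-00DD010655AF}"

# PLACEHOLDER, not a real CLSID: substitute the CLSID under which the
# FoxIt filter DLL is registered on your machine (see HKCR\CLSID).
FOXIT_FILTER_CLSID = "{00000000-0000-0000-0000-000000000000}"

CLSCTX_INPROC_SERVER = 0x1


class GUID(ctypes.Structure):
    """Binary layout of a Win32 GUID (used for both CLSIDs and IIDs)."""
    _fields_ = [
        ("Data1", ctypes.c_uint32),
        ("Data2", ctypes.c_uint16),
        ("Data3", ctypes.c_uint16),
        ("Data4", ctypes.c_ubyte * 8),
    ]


def guid_from_string(text):
    """Convert a '{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}' string to a GUID."""
    u = uuid.UUID(text)  # accepts the surrounding braces
    tail = (ctypes.c_ubyte * 8)(*u.bytes[8:])
    return GUID(u.time_low, u.time_mid, u.time_hi_version, tail)


def create_filter(clsid_text):
    """Create an IFilter from one specific COM class (Windows only).

    This bypasses the per-extension lookup that LoadIFilter performs.
    Returns the raw IFilter interface pointer as a c_void_p.
    """
    ole32 = ctypes.OleDLL("ole32")  # raises OSError on a failing HRESULT
    ole32.CoInitialize(None)
    clsid = guid_from_string(clsid_text)
    iid = guid_from_string(IID_IFILTER)
    filter_ptr = ctypes.c_void_p()
    ole32.CoCreateInstance(
        ctypes.byref(clsid),
        None,                   # no aggregation
        CLSCTX_INPROC_SERVER,   # load the filter DLL into this process
        ctypes.byref(iid),
        ctypes.byref(filter_ptr),
    )
    return filter_ptr


if __name__ == "__main__" and sys.platform == "win32":
    # With the placeholder CLSID this raises OSError (class not registered).
    ptr = create_filter(FOXIT_FILTER_CLSID)
    print("IFilter interface pointer:", ptr.value)
```

The pointer you get back is only the starting point: you would still need to query it for IPersistFile or IPersistStream to load the document before calling the IFilter methods, and release it when done. In practice that part is probably easier from C++ or through a COM wrapper library than through raw ctypes.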