Is there a good library for extracting text from a PDF? I'm willing to pay for it if I have to.
Something that works with C# or classic ASP (VBScript) would be ideal. I also need to be able to separate the pages of the PDF.
This question had some interesting stuff, especially pdftotext, but I'd like to avoid calling an external command-line app if I can.
You can use the IFilter interface built into Windows to extract text and properties (author, title, etc.) from any supported file type. It's a COM interface, so you would have to use the .NET interop facilities.
You'd also have to download the free PDF IFilter driver from Adobe.