pdf google-search-appliance

How do you index large files in Google Search Appliance?


I have a customer whose content is primarily PDF documents of scanned contracts and other documents. The PDFs have been OCR'd and the recognized text inserted as body text. We are having an issue where documents over 100MB produce a convert text error, and their text content is not indexed by the GSA.

We are using the external File Share Connector to feed and crawl the documents.

How can we increase the maximum size and process PDF documents in excess of 100MB?


Solution

  • As per the documentation, you can change the maximum file size on the Host Load settings page of the Admin Console.

    [Screenshot: Maximum File Size setting]
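
    Before raising the limit, it may help to know which PDFs on the share actually exceed the current maximum. The following is a minimal sketch, not part of the GSA or the File Share Connector: it assumes the share is mounted at a local path, the function name find_large_pdfs is hypothetical, and the 100MB default is taken from the question (interpreted here as 100 * 1024 * 1024 bytes, which is an assumption).

```python
import os


def find_large_pdfs(root, limit_bytes=100 * 1024 * 1024):
    """Return (path, size_in_bytes) for every PDF under root larger than limit_bytes.

    Results are sorted from largest to smallest.
    """
    results = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Match on extension only; this does not inspect file contents.
            if not name.lower().endswith(".pdf"):
                continue
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Skip files that vanish or cannot be read during the walk.
                continue
            if size > limit_bytes:
                results.append((path, size))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

    Calling find_large_pdfs with the mount point of the share (whatever path applies in your environment) returns the files to check against the configured maximum, largest first.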