google-cloud-platform google-cloud-python cloud-document-ai

Upload PDF from Local to Document AI


I am building a web app and want to know how to upload PDFs from my laptop (local disk) to a Document AI Summarizer processor.

with open(file_path, "rb") as image: image_content = image.read()

uploaded_file = st.file_uploader('Choose your .pdf file', type="pdf")

process_document_sample(project_id="XXXX",location="us",processor_id="XXX", file_path=uploaded_file,mime_type="application/pdf" )

https://i.imgur.com/fqf30Xn.png

I want to upload a PDF from my laptop through Streamlit (uploaded_file) and have it read the same way the with open(file_path, "rb") as image block reads a file from disk.

What I expect: a PDF uploaded from my local machine through Streamlit (a web app library) is passed to the Document AI Python SDK and read successfully.


Solution

  • The Document AI REST API for online processing requests requires the input file to be sent as a base64-encoded string. The Python client library takes the raw bytes instead (what you get from reading a file opened in "rb" mode) and handles the encoding for you.

    For Streamlit, you'll need to get the bytes of the uploaded file and put that value directly in the API request, rather than passing the uploaded file object to

    with open(file_path, "rb") as image:
    

    In the Streamlit documentation, it looks like you are able to get the bytes data from an uploaded file. I'm not familiar with this framework, but you should be able to do something like this, using the code sample from Send a processing request.

    from typing import Optional
    
    from google.api_core.client_options import ClientOptions
    from google.cloud import documentai
    import streamlit as st
    
    # TODO(developer): Uncomment these variables before running the sample.
    # project_id = "YOUR_PROJECT_ID"
    # location = "YOUR_PROCESSOR_LOCATION" # Format is "us" or "eu"
    # processor_id = "YOUR_PROCESSOR_ID" # Create processor before running sample
    # mime_type = "application/pdf" # Refer to https://cloud.google.com/document-ai/docs/file-types for supported file types
    # field_mask = "text,entities,pages.pageNumber"  # Optional. The fields to return in the Document object.
    # processor_version_id = "YOUR_PROCESSOR_VERSION_ID" # Optional. Processor version to use
    
    
    def process_document_sample(
        project_id: str,
        location: str,
        processor_id: str,
        mime_type: str,
        field_mask: Optional[str] = None,
        processor_version_id: Optional[str] = None,
    ) -> None:
        # You must set the `api_endpoint` if you use a location other than "us".
        opts = ClientOptions(api_endpoint=f"{location}-documentai.googleapis.com")
    
        client = documentai.DocumentProcessorServiceClient(client_options=opts)
    
        if processor_version_id:
            # The full resource name of the processor version, e.g.:
            # `projects/{project_id}/locations/{location}/processors/{processor_id}/processorVersions/{processor_version_id}`
            name = client.processor_version_path(
                project_id, location, processor_id, processor_version_id
            )
        else:
            # The full resource name of the processor, e.g.:
            # `projects/{project_id}/locations/{location}/processors/{processor_id}`
            name = client.processor_path(project_id, location, processor_id)
    
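        # Show the upload widget; it returns None until the user picks a file
        uploaded_file = st.file_uploader('Choose your .pdf file', type="pdf")
        if uploaded_file is None:
            return
    
        # Load binary data: getvalue() returns the uploaded file's raw bytes
        raw_document = documentai.RawDocument(
            content=uploaded_file.getvalue(), mime_type=mime_type
        )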
    
        # Configure the process request
        request = documentai.ProcessRequest(
            name=name, raw_document=raw_document, field_mask=field_mask
        )
    
        result = client.process_document(request=request)
    
        # For a full list of `Document` object attributes, reference this page:
        # https://cloud.google.com/document-ai/docs/reference/rest/v1/Document
        document = result.document
    
        # Read the text recognition output from the processor
        print("The document contains the following text:")
        print(document.text)
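
If you want the same function to work with both a path on disk and a Streamlit upload, one option is a small helper that always hands back raw bytes. This is only a sketch: `read_document_bytes` and `to_rest_content` are names I made up for illustration, not part of Streamlit or the Document AI SDK, and an `io.BytesIO` stands in for the object `st.file_uploader` returns (which also exposes `getvalue()`).

```python
import base64
import io
from pathlib import Path
from typing import BinaryIO, Union


def read_document_bytes(source: Union[str, Path, BinaryIO]) -> bytes:
    """Return raw bytes from a file path or an in-memory file-like object.

    Hypothetical helper, not part of any library.
    """
    if isinstance(source, (str, Path)):
        # A path on disk: equivalent to open(path, "rb").read()
        return Path(source).read_bytes()
    if hasattr(source, "getvalue"):
        # In-memory buffers such as io.BytesIO expose the whole content
        # via getvalue(), regardless of the current read position
        return source.getvalue()
    # Any other binary file object
    return source.read()


def to_rest_content(data: bytes) -> str:
    """Base64-encode bytes as text.

    Only relevant if you build the REST request body yourself; with the
    Python client you pass the raw bytes to RawDocument(content=...).
    """
    return base64.b64encode(data).decode("ascii")


# Stand-in for an uploaded file (assumption: behaves like io.BytesIO)
fake_upload = io.BytesIO(b"%PDF-1.4 example")
pdf_bytes = read_document_bytes(fake_upload)
```

With the client library you would then pass `pdf_bytes` straight to `documentai.RawDocument(content=pdf_bytes, mime_type="application/pdf")`; `to_rest_content` is only there to show where base64 comes in if you call the REST endpoint directly.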