python adobe

ValueError: client_id must not be blank using PDF Extract API


I have followed the steps in Getting Started with PDF Extract API (Python) successfully. The required libraries have been installed in the virtual environment (venv_smed). The source code has no syntax errors. However, when I run the python extract.py command the following error is displayed:

(venv_smed) bash-3.2$ python extract.py
Traceback (most recent call last):
  File "/Users/username/Documents/allAboutPython/python3.11/myapp/pkg_app/adobe_pdfservices_sdk/extract.py", line 56, in <module>
    credentials = Credentials.service_principal_credentials_builder().with_client_id(os.getenv('PDF_SERVICES_CLIENT_ID')).with_client_secret(os.getenv('PDF_SERVICES_CLIENT_SECRET')).build();
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username/Documents/allAboutPython/python3.11/myapp/venv_smed/lib/python3.11/site-packages/adobe/pdfservices/operation/auth/service_principal_credentials.py", line 82, in build
    return ServicePrincipalCredentials(self._client_id, self._client_secret)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username/Documents/allAboutPython/python3.11/myapp/venv_smed/lib/python3.11/site-packages/adobe/pdfservices/operation/auth/service_principal_credentials.py", line 28, in __init__
    self._client_id = _is_valid(client_id, 'client_id')
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username/Documents/allAboutPython/python3.11/myapp/venv_smed/lib/python3.11/site-packages/adobe/pdfservices/operation/auth/credentials.py", line 17, in _is_valid
    raise ValueError(f'{name} must not be blank')
ValueError: client_id must not be blank
(venv_smed) bash-3.2$ 

The example code has been taken from the following link: https://developer.adobe.com/document-services/docs/overview/pdf-extract-api/quickstarts/python/

I have reviewed the documentation in Adobe Developer and it seems ServiceAccountCredentials.Builder is deprecated.

The section of the code that has problems is the following:

credentials = Credentials.service_principal_credentials_builder().with_client_id(os.getenv('PDF_SERVICES_CLIENT_ID')).with_client_secret(os.getenv('PDF_SERVICES_CLIENT_SECRET')).build();

I then split the line that builds the API credentials across several lines:

credentials = Credentials.service_principal_credentials_builder().with_client_id(
    os.getenv('PDF_SERVICES_CLIENT_ID')).with_client_secret(
    os.getenv('PDF_SERVICES_CLIENT_SECRET')).build();

Running python extract.py again gives the same error. The traceback now points at the line that reads PDF_SERVICES_CLIENT_SECRET, so the problem appears to be in obtaining the values of the environment variables:

Traceback (most recent call last):
  File "/Users/username/Documents/allAboutPython/python3.11/myapp/pkg_app/adobe_pdfservices_sdk/extract.py", line 58, in <module>
    os.getenv('PDF_SERVICES_CLIENT_SECRET')).build();
                                             ^^^^^^^
  File "/Users/username/Documents/allAboutPython/python3.11/myapp/venv_smed/lib/python3.11/site-packages/adobe/pdfservices/operation/auth/service_principal_credentials.py", line 82, in build
    return ServicePrincipalCredentials(self._client_id, self._client_secret)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username/Documents/allAboutPython/python3.11/myapp/venv_smed/lib/python3.11/site-packages/adobe/pdfservices/operation/auth/service_principal_credentials.py", line 28, in __init__
    self._client_id = _is_valid(client_id, 'client_id')
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username/Documents/allAboutPython/python3.11/myapp/venv_smed/lib/python3.11/site-packages/adobe/pdfservices/operation/auth/credentials.py", line 17, in _is_valid
    raise ValueError(f'{name} must not be blank')
ValueError: client_id must not be blank

The pdfservices-api-credentials.json file has the following content in JSON format,

{
 "client_credentials": {
  "client_id": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
  "client_secret": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
 },
 "service_principal_credentials": {
 "organization_id": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
 }
}

Any guidance on resolving this error would be appreciated:

ValueError: client_id must not be blank

Solution:

import json

# Read the credentials from the JSON file instead of environment variables.
# The with statement closes the file automatically.
with open('pdfservices-api-credentials.json') as f:
    # json.load returns the JSON object as a dictionary
    data = json.load(f)

# Get the client ID and client secret
CLIENT_ID = data["client_credentials"]["client_id"]
CLIENT_SECRET = data["client_credentials"]["client_secret"]

print(CLIENT_ID)
print(CLIENT_SECRET)

# Pass the values read from the file to the credentials builder
credentials = Credentials.service_principal_credentials_builder().with_client_id(
    CLIENT_ID).with_client_secret(
    CLIENT_SECRET).build()

The result of executing the script is as expected:

(venv_smed) bash-3.2$ python extract.py
"your client_id"
"your client_secret"
Structured Information Output Format 
Introduction 
List of key components 
(venv_smed) bash-3.2$ 
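
The same idea can be wrapped in a small function that fails early if a value is missing and avoids printing the secret. This is only a sketch: read_client_credentials is a hypothetical helper name, not part of the Adobe SDK, and it assumes the file has the client_credentials layout shown in the question.

```python
import json
from pathlib import Path


def read_client_credentials(path):
    """Return (client_id, client_secret) from a credentials JSON file.

    Hypothetical helper for illustration; not part of the Adobe SDK.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    section = data.get("client_credentials", {})
    client_id = section.get("client_id", "")
    client_secret = section.get("client_secret", "")
    # Fail early with a message that names the file, rather than letting
    # a blank value reach the SDK's credentials builder.
    if not client_id or not client_secret:
        raise ValueError(f"client_id or client_secret is missing in {path}")
    return client_id, client_secret
```

In extract.py this would presumably be called as CLIENT_ID, CLIENT_SECRET = read_client_credentials('pdfservices-api-credentials.json'), with the two values passed to with_client_id and with_client_secret as above.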

Solution

  • Your code never reads the JSON file. os.getenv looks up environment variables, so the code expects PDF_SERVICES_CLIENT_ID and PDF_SERVICES_CLIENT_SECRET to be set in the environment of the process (typically the shell you launch the script from).

    What this means is that running the following in your shell:

    echo "$PDF_SERVICES_CLIENT_SECRET"
    

    should print the secret. That is the value your current Python code expects to get when it runs this:

    os.getenv("PDF_SERVICES_CLIENT_SECRET")
    

    os.getenv reads the value from the environment of the running process and returns None if the variable is not set. It has nothing to do with the JSON file.

    So the error ValueError: client_id must not be blank is correct: the variable was never set, so your code passed an empty client_id to the credentials builder, which rejected it.

    You seem to want to read the values from the JSON file. That file does not define environment variables; it is plain data that you need to load into memory and parse, which you can do with the json library. Here is a sample:

    import json
    
    with open("pdfservices-api-credentials.json") as f:
        data = json.load(f)
     
    print(data["client_credentials"]["client_id"])
    

    The code above passes the file object returned by open to json.load, which gives you a Python dictionary (assuming the file has the structure you posted). From there you access the values you need by key, as in the print call.
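
If you would rather keep the quickstart's os.getenv calls unchanged, one option is to copy the values from the JSON file into the process environment before building the credentials. The sketch below assumes the file layout from the question; export_credentials_to_env and its environ parameter are hypothetical names introduced here, not part of the Adobe SDK.

```python
import json
import os


def export_credentials_to_env(path, environ=os.environ):
    """Copy client_id and client_secret from the JSON file into environ.

    Hypothetical helper for illustration; values already present in the
    environment are left untouched.
    """
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    creds = data["client_credentials"]
    # setdefault keeps a value that was already exported in the shell.
    environ.setdefault("PDF_SERVICES_CLIENT_ID", creds["client_id"])
    environ.setdefault("PDF_SERVICES_CLIENT_SECRET", creds["client_secret"])


# os.getenv does not raise for a variable that was never set; it returns None.
# The variable name below is made up for this demonstration.
print(os.getenv("SOME_VARIABLE_THAT_IS_NOT_SET_ANYWHERE"))
```

Calling export_credentials_to_env('pdfservices-api-credentials.json') near the top of extract.py should make the later os.getenv('PDF_SERVICES_CLIENT_ID') lookups return the values from the file. The change only affects the running Python process, not the shell that started it.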