google-cloud-vertex-ai vertex-ai-search unstructured-data

INVALID_ARGUMENT error when pushing a new unstructured document into a data store



Hi all, I am new to Google Vertex AI Search. As a hands-on exercise, I created a new unstructured document (a PDF in my case) and tried to push it to an existing data store that already contains some unstructured documents.

When I send a request from Postman to the API endpoint below

https://discoveryengine.googleapis.com/v1alpha/projects/548443691128/locations/global/collections/default_collection/dataStores/kaggle-movie_1698160258041/branches/0/documents

with documentId = 01 added in the params, I get an INVALID_ARGUMENT error.

Below is the unstructured document that I created by following the reference at

https://cloud.google.com/discovery-engine/docs/reference/rest/v1alpha/projects.locations.collections.dataStores.branches.documents#content

{
    "name": "projects/548443691128/locations/global/collections/default_collection/dataStores/alphabet-investor_1698161197344/branches/0/documents/01",
    "id": "01",
    "schemaId": "default_schema",
    "structData": {},
    "parentDocumentId": "01",
    "content": {
        "mimeType": "application/pdf",
        "uri": "gs://personal-beta/testing-doc/Global iJobs Policy.pdf"
    }
}

In this case, the document Global iJobs Policy.pdf is present in the Google Cloud Storage bucket. I think there might be a problem with the format of the document, but I have not been able to figure out what it is.

I tried following the official documentation but did not find any hint about what could have gone wrong.


Solution

  • Here's the REST API sample for importing data from Cloud Storage after creating a data store in the Cloud Console.

    You will need to use the import method of the projects.locations.collections.dataStores.branches.documents resource.

    https://cloud.google.com/generative-ai-app-builder/docs/create-data-store-es#discoveryengine_v1_generated_DocumentService_ImportDocuments_sync-drest
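
    For illustration, the import call described above can be sketched in Python as follows. This is a minimal sketch, not the official sample: the helper names build_import_request and send_import_request are hypothetical, the v1 endpoint is assumed because the linked sample is a v1 one (the question used v1alpha), and the "content" data schema and "INCREMENTAL" reconciliation mode are assumed choices for adding unstructured files to a data store that already holds documents. The access token is assumed to come from something like gcloud auth print-access-token.

```python
import json
import urllib.request

# Assumption: v1 endpoint, as in the linked sample; the question used v1alpha.
API_ROOT = "https://discoveryengine.googleapis.com/v1"


def build_import_request(project, data_store, input_uris,
                         location="global",
                         collection="default_collection",
                         branch="0"):
    """Return (url, body) for a documents:import call (hypothetical helper)."""
    parent = (
        f"projects/{project}/locations/{location}/collections/{collection}"
        f"/dataStores/{data_store}/branches/{branch}"
    )
    url = f"{API_ROOT}/{parent}/documents:import"
    body = {
        "gcsSource": {
            # Cloud Storage URIs of the files to import.
            "inputUris": list(input_uris),
            # Assumption: "content" treats each file as one unstructured document.
            "dataSchema": "content",
        },
        # Assumption: INCREMENTAL adds to the documents already in the data store.
        "reconciliationMode": "INCREMENTAL",
    }
    return url, body


def send_import_request(url, body, access_token):
    """POST the import request and return the parsed JSON response.

    Needs network access and a valid OAuth access token.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Values taken from the question; only builds and prints the request.
    url, body = build_import_request(
        project="548443691128",
        data_store="kaggle-movie_1698160258041",
        input_uris=["gs://personal-beta/testing-doc/Global iJobs Policy.pdf"],
    )
    print(url)
    print(json.dumps(body, indent=2))
```

    Running the script only prints the URL and JSON body, so they can be pasted into Postman. Note that the import method takes Cloud Storage URIs rather than a single Document object, and the service is expected to respond with a long-running operation rather than the imported document itself.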