arangodb pyarango

Batch Requests into ArangoDB failing


I am trying to import many thousands of records into ArangoDB. I am attempting to use the batch request feature described at https://docs.arangodb.com/3.11/develop/http/batch-requests/ to send a combination of PUT and POST requests that either insert new records or update records that already exist. My end solution needs to run from a Python script, presumably using pyArango. I have created a sample HTTP request

POST http://<arango_server>:8529/_db/myDB/_api/batch

that looks something like the following:

Content-Type: multipart/form-data; boundary=P1X7QNCB
Content-Length: <calculated by python or REST Client>
Authorization: Basic <calculated by python requests session or REST Client>

--P1X7QNCB
Content-type: application/x-arango-batchpart
Content-Id: 1

POST /_api/document/model/foo HTTP/1.1


{"data": "bar"}
--P1X7QNCB

I have not been able to get ArangoDB to process this successfully. I have tried using Python code similar to the following (it generates the above request, even if my approximation of the code below has typos):

url = "/_api/document/" + collection + "/" + nodeKey + " HTTP/1.1"
postString = ("--P1X7QNCB\r\n"
              "Content-type: application/x-arango-batchpart\r\n"
              "Content-Id: " + str(counter) +  "\r\n"
              "\r\n"
              "\r\n"
              "PUT " + url+ "\r\n\r\n\r\n" + json.dumps(nodeData) + "\r\n")
batchHeaders = {"Content-Type": "multipart/form-data; boundary=P1X7QNCB"}
response = self.db.connection.session.post(self.db.URL + "/batch", data=postString, headers=batchHeaders)

and using a REST client where I manually post the content. In both cases I get the following response back:

{"error":true,"errorMessage":"invalid multipart message received","code":400,"errorNum":400}

And the following is logged in the arango log file:

WARNING received a corrupted multipart message

Is it obvious to anyone what I am doing wrong, or where I can look for more details on why ArangoDB is rejecting the requests?

Thanks!


Solution

  • ArangoDB returns this error when it tries to extract the next part of a multipart MIME container and fails.

    Inspect your boundary strings, and check that the final boundary properly terminates the container with two trailing dashes (--). In the sample request from the question, that means the last line should be --P1X7QNCB-- rather than --P1X7QNCB.

    ngrep or Wireshark tend to be very useful for inspecting what a program really sends, which is sometimes not what you think. You can also use them to capture working samples from other programs.
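A minimal sketch of building a batch body whose last boundary is properly terminated. The helper name build_batch_body and the example paths are hypothetical; the part layout (part headers, one blank line, request line, one blank line, JSON body) is assumed from the example in the batch-request documentation linked in the question. It only builds the string and sends nothing.

```python
import json


def build_batch_body(operations, boundary="P1X7QNCB"):
    """Build a multipart batch body (hypothetical helper, not a pyArango API).

    operations is a list of (method, path, document) tuples.
    """
    parts = []
    for content_id, (method, path, document) in enumerate(operations, start=1):
        parts.append(
            f"--{boundary}\r\n"
            "Content-Type: application/x-arango-batchpart\r\n"
            f"Content-Id: {content_id}\r\n"
            "\r\n"  # single blank line ends the part headers
            f"{method} {path} HTTP/1.1\r\n"
            "\r\n"  # single blank line ends the embedded request line
            f"{json.dumps(document)}\r\n"
        )
    # The closing delimiter carries two trailing dashes.
    parts.append(f"--{boundary}--\r\n")
    return "".join(parts)


# Illustrative operations; collection and key names are assumptions.
body = build_batch_body([
    ("POST", "/_api/document/model", {"data": "bar"}),
    ("PUT", "/_api/document/model/foo", {"data": "baz"}),
])
```

Printing repr(body) before posting it is a cheap first check on the exact bytes, including the \r\n sequences, before reaching for ngrep or Wireshark. The body would then be posted with the same multipart/form-data; boundary=P1X7QNCB Content-Type header used in the question.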