Tags: pdf, ibm-cloud, ibm-watson, document-conversion

Bluemix PDF Document Conversion


I'm trying to convert a PDF document, but I'm having problems with accented characters in the output. The PDF is in Brazilian Portuguese.

This is the command I'm running:

curl -X POST -u "OMITTED":"OMITTED" -F config="{\"conversion_target\":\"answer_units\"}" -F file=@876.pdf "https://gateway.watsonplatform.net/document-conversion/api/v1/convert_document?version=2015-12-15"

And this is the output I get:

 "id":"8ade6b42-e619-4aa8-b8bb-333f3e659874",
  "type":"h4",
  "parent_id":"c34123d4-675b-48b7-111a-1cc7e4ac32ec",
  "title":"5.3 Exce├º├Áes n├úo tratadas pelo sistema",
  "direction":"ltr",
  "content":[
    {
      "media_type":"text/plain",
      "text":"Alguns sistemas n├úo fazem o tratamento completo de exce├º├Áes (cancelamentos de notas fiscais, ordens de servi├ºo, devolu├º├Áes, fechamentos etc.), gerando a necessidade de interven├º├úo da ├írea de inform├ítica, por meio de programas \"quebra-galho\" ou por manipula├º├úo direta de bases de dados, o que pode causar atrasos no processo e desvio de fun├º├úo. Normalmente s├úo necess├írios os estornos cont├íbeis feitos por meio de lan├ºamentos manuais (n├úo autom├íticos)."
    }
  ]
},

The accented letters all come out garbled. Is there an option I can change to fix this? I have already tested multiple PDF files, and they all give the same result.

Thank you!


Solution

  • Solved.

    The problem was with the terminal's character encoding settings, not with the output from the service. I had to run the curl command again and send the output to a file, using the -o option:

    curl -X POST -u "USER":"PASS" -F config="{\"conversion_target\":\"answer_units\"}" -F file=@"Text.pdf" "https://gateway.watsonplatform.net/document-conversion/api/v1/convert_document?version=2015-12-15" -o text1.json

    After that, everything worked perfectly.

    Thanks to Jeff L. from IBM Bluemix support for identifying and solving the problem.
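
For anyone who wants to see why the terminal was to blame, here is a minimal Python sketch of the effect. It assumes the service returns UTF-8 and the terminal decodes it with code page 850; the code page is my guess from the pattern of garbled characters in the question, not something confirmed by the thread. The names `as_shown_by_terminal` and `load_answer_units` are hypothetical helpers for illustration only.

```python
import json

# Assumptions (not confirmed by the thread): the service emits UTF-8,
# and the terminal interprets those bytes using code page 850.
SERVICE_ENCODING = "utf-8"
TERMINAL_CODE_PAGE = "cp850"


def as_shown_by_terminal(text: str) -> str:
    """Encode text the way the service would, then decode it the way the terminal would."""
    return text.encode(SERVICE_ENCODING).decode(TERMINAL_CODE_PAGE)


def load_answer_units(raw: bytes) -> dict:
    """Parse response bytes (for example, the contents of text1.json) as UTF-8 JSON."""
    return json.loads(raw.decode(SERVICE_ENCODING))


if __name__ == "__main__":
    title = "5.3 Exceções não tratadas pelo sistema"

    # Garbled exactly like the "title" field in the question.
    print(as_shown_by_terminal(title))

    # Stand-in for the bytes curl writes with -o; decoding as UTF-8 keeps the accents.
    raw = json.dumps({"title": title}, ensure_ascii=False).encode(SERVICE_ENCODING)
    print(load_answer_units(raw)["title"])
```

The bytes never change; only the decoding differs. Writing the response to a file with -o bypasses the terminal's decoding, so any editor or program that opens text1.json as UTF-8 shows the accents correctly.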