I'm trying to convert a PDF document, but the accented characters in the words come out wrong. The PDF is in Brazilian Portuguese.
This is the command I'm running:
curl -X POST -u "OMITTED":"OMITTED" -F config="{\"conversion_target\":\"answer_units\"}" -F file=@876.pdf "https://gateway.watsonplatform.net/document-conversion/api/v1/convert_document?version=2015-12-15"
And this is the output I get:
"id":"8ade6b42-e619-4aa8-b8bb-333f3e659874",
"type":"h4",
"parent_id":"c34123d4-675b-48b7-111a-1cc7e4ac32ec",
"title":"5.3 Exce├º├Áes n├úo tratadas pelo sistema",
"direction":"ltr",
"content":[
{
"media_type":"text/plain",
"text":"Alguns sistemas n├úo fazem o tratamento completo de exce├º├Áes (cancelamentos de notas fiscais, ordens de servi├ºo, devolu├º├Áes, fechamentos etc.), gerando a necessidade de interven├º├úo da ├írea de inform├ítica, por meio de programas \"quebra-galho\" ou por manipula├º├úo direta de bases de dados, o que pode causar atrasos no processo e desvio de fun├º├úo. Normalmente s├úo necess├írios os estornos cont├íbeis feitos por meio de lan├ºamentos manuais (n├úo autom├íticos)."
}
]
},
The accented letters all come out garbled. Is there an option I can change to fix this? I have already tested multiple PDF files and they all give the same result.
Thank you!
Solved.
The problem was with the terminal's character encoding settings, not with the output from the service. I ran the curl command again with the -o option so that the output is written to a file instead of being printed to the terminal:
curl -X POST -u "USER":"PASS" -F config="{\"conversion_target\":\"answer_units\"}" -F file=@"Text.pdf" "https://gateway.watsonplatform.net/document-conversion/api/v1/convert_document?version=2015-12-15" -o text1.json
After that, the accented characters in the output file were correct.
Thanks to Jeff L. from IBM Bluemix support for finding and solving the problem.
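For anyone who wants to see why the terminal showed those characters, here is a small Python sketch. It assumes the service returns UTF-8 and that the console was using code page 850; neither is stated in the thread, but that combination reproduces the garbled title shown in the question. The variable names are only for illustration.

```python
# Sketch only: UTF-8 from the service and a cp850 console are assumptions,
# chosen because together they reproduce the garbled text in the question.
original = "exceções não tratadas"

# The bytes the service would send if it answers in UTF-8.
utf8_bytes = original.encode("utf-8")

# What a cp850 console displays when it misreads those UTF-8 bytes.
garbled = utf8_bytes.decode("cp850")
print(garbled)    # exce├º├Áes n├úo tratadas

# Undoing the misreading recovers the text, so the bytes themselves were fine.
recovered = garbled.encode("cp850").decode("utf-8")
print(recovered)  # exceções não tratadas
```

If this is what happened, the data was never damaged, only displayed with the wrong encoding, which is why writing the response to a file with -o and opening it as UTF-8 gives the right result.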