My task is to download a JSON file from a website (PubChem) using only a query string (h2o, for example) and JS. I know it's possible by parsing the pages, but that is too much code because of the number of pages I would need to parse to reach the destination. Are there any other options to solve the problem? Googling didn't give me any ideas ):
You will still need to do some parsing if you really want to automate this: the query parameter alone only gets you to the main page that lists the 'articles', and you need to go into one of them to find the URL that returns the JSON format. But! I think you can "reverse engineer" it, since the URLs for an article and for its JSON format are very similar.
I checked out the website and tried to download one of the files, the one for https://pubchem.ncbi.nlm.nih.gov/compound/3076959. It turns out that the URL for the JSON representation was https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/748328/JSON/
As you can see, they are very similar, and you might be able to figure out how the different topics (compound, for example) construct their JSON output endpoint.
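The pattern above can be sketched as a small helper. This is only an illustration: compoundPageToJsonUrl is a hypothetical name, and it assumes that the numeric ID at the end of a compound page URL is the same ID the JSON endpoint expects, which you should verify for the pages you care about.

```javascript
// Sketch: derive the JSON endpoint from a compound page URL.
// Assumption: the trailing number in /compound/<id> is the ID
// used by /rest/pug_view/data/compound/<id>/JSON/.
function compoundPageToJsonUrl(pageUrl) {
  const url = new URL(pageUrl);
  // Accept only paths of the form /compound/<digits>, with an optional trailing slash.
  const match = url.pathname.match(/^\/compound\/(\d+)\/?$/);
  if (!match) {
    throw new Error(`Not a compound page URL: ${pageUrl}`);
  }
  return `${url.origin}/rest/pug_view/data/compound/${match[1]}/JSON/`;
}

console.log(
  compoundPageToJsonUrl("https://pubchem.ncbi.nlm.nih.gov/compound/748328")
);
// prints "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/748328/JSON/"
```

Other topics would need their own path pattern, so the regular expression is the part to adapt.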
To download the JSON files with Node.js, use the node-fetch module or the axios library to send an HTTP request to the JSON endpoint; from there you can save the response to a file on your machine.
Here is an example of how you can do this with node-fetch and the Node.js fs module to save the file to your machine.
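const fs = require("fs");
// node-fetch v2 can be loaded with require(); v3 is ESM-only.
const fetch = require("node-fetch");

// Download the resource at `url` and save it as `<fileName>.json`.
async function downloadAsJson(url, fileName) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  // Read the body as text so the "utf8" encoding below applies.
  const jsonContent = await response.text();
  await fs.promises.writeFile(`${fileName}.json`, jsonContent, "utf8");
  console.log("JSON file has been saved.");
}

// The function is async, so errors are handled with .catch()
// rather than a synchronous try/catch around the call.
downloadAsJson(
  "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/748328/JSON/",
  "2-Methyl-3-(5'-bromobenzofuroyl-2')-4-dimethylaminomethyl-5-hydroxybenzofuran HCl H20"
).catch((err) => {
  console.log("An error occurred while downloading or writing the JSON file.");
  console.log(err);
});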
Save the code above in a file called app.js, for example, and run it with node app.js. Don't forget to install the dependencies first (npm install node-fetch@2 if you load it with require).
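If you want to start from a name such as h2o instead of a page URL, one option worth checking is PubChem's PUG REST name lookup, which maps a compound name to its numeric ID (CID). The following is a rough sketch rather than a definitive implementation: the helper names are hypothetical, the IdentifierList.CID response shape is an assumption you should confirm against the PubChem documentation, and the default fetch assumes Node 18 or later (pass node-fetch otherwise).

```javascript
// Sketch: resolve a compound name to a CID, then build the JSON URL.
const PUBCHEM = "https://pubchem.ncbi.nlm.nih.gov";

// URL of the PUG REST lookup that returns the CIDs for a name.
function buildCidLookupUrl(name) {
  return `${PUBCHEM}/rest/pug/compound/name/${encodeURIComponent(name)}/cids/JSON`;
}

// Assumed response shape: { IdentifierList: { CID: [<number>, ...] } }
function extractFirstCid(body) {
  const cids = body && body.IdentifierList && body.IdentifierList.CID;
  if (!Array.isArray(cids) || cids.length === 0) {
    throw new Error("No CID found in lookup response");
  }
  return cids[0];
}

// Same endpoint pattern as the JSON URL shown in the answer.
function buildJsonUrl(cid) {
  return `${PUBCHEM}/rest/pug_view/data/compound/${cid}/JSON/`;
}

// fetchImpl is injectable so node-fetch (or a stub) can be passed in.
async function nameToJsonUrl(name, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(buildCidLookupUrl(name));
  if (!response.ok) {
    throw new Error(`Lookup failed with status ${response.status}`);
  }
  return buildJsonUrl(extractFirstCid(await response.json()));
}

// Usage (performs a network request):
// nameToJsonUrl("h2o").then(console.log).catch(console.error);
```

The resulting URL can then be handed to the download function from app.js. Taking only the first CID is a simplification, since a name can match more than one compound.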