google-sheetsgoogle-apps-scriptweb-scraping

Scraping last 12 month's data for a webpage with dynamic form fields using Google Apps Script


Tanaike helped me get a solution to my problem of scraping data into Google Sheets in this earlier question here: Using IMPORTXML to scrape dropdown data from webpage into Google Sheets.

However, now I am having trouble using Google Apps Script scraping data for the webpage (https://www.scstrade.com/stockscreening/SS_CompanySnapShotHP.aspx?symbol=ABOT) which carries a table. I would like to scrape all the data from the table into Google Sheets using Google Apps Script but have to pick dates in the form such that I have data for the past 12 months.


Solution

  • Although I'm not sure whether I could correctly understand your expected values, how about the following approach? When I check the site https://www.scstrade.com/stockscreening/SS_CompanySnapShotHP.aspx?symbol=ABOT, it seems that the values are retrieved using the POST method with the JSON payload. In that case, the built-in function cannot be used. So, in this answer, I used Google Apps Script. The sample script is as follows.

    Sample script:

    Please copy and paste the following script into the script editor of Google Spreadsheet and save the script.

    When you use this script, please put a custom function like =SAMPLE("01/01/2024","06/09/2025") into a cell. By this, the script works.

    function SAMPLE(from = "01/01/2024", to = "06/09/2025", sort = "asc") {
      const url = "https://www.scstrade.com/stockscreening/SS_CompanySnapShotHP.aspx/chart";
      const obj = { "par": "ABOT", "date1": from, "date2": to, "sidx": "trading_Date" };
      const res = UrlFetchApp.fetch(url, { contentType: "application/json", payload: JSON.stringify(obj), muteHttpExceptions: true });
      if (res.getResponseCode() != 200) {
        return [["No values were obtained."]];
      }
      const resObj = JSON.parse(res.getContentText());
      const keys = Object.keys(resObj.d[0]);
      const values = resObj.d.map(o => {
        return keys.map(k => {
          if (k == "trading_Date") {
            const r = o[k].match(/\((\d*)\)/);
            return r ? new Date(Number(r[1])) : null;
          }
          return o[k];
        });
      });
      if (sort == "asc") {
        values.sort((a, b) => a[0].getTime() > b[0].getTime() ? 1 : -1);
      }
      return [keys, ...values];
    }
    

    Testing:

    When this script is run, the following result is obtained.

    enter image description here

    Note: