google-sheetsgoogle-apps-scriptweb-scrapingdynamicdrop-down-menu

Importing a table from website into google sheets


I am trying to produce a spreadsheet which gives me football teams recent performance, much quicker than searching different HTML pages for each league.

I found a previous website that showed each leagues last 4 results, and started to use sheets importHTML function, which worked perfectly and I started to create a tidy little sheet with league tables

I then found a website which can go as low as results for last 3 games instead of last 4/5 (here, if it helps). I imported the HTML table without problems using :

=IMPORTHTML("https://fcstats.com/table,eliteserien-norway,50,5.php","table",1)

However, I get the full table and cannot get Google Sheets to recognise that I have selected last 3 recent games from the drop down box on the website, even if I use this XPath in cell A20 of the sheet

=IMPORTXML("https://fcstats.com/table,eliteserien-norway,50,5.php","/html/body/div[1]/div[6]/div[2]/div/div[3]")

I've watched many api, json, scraping videos, I understand basics how to use devtools, but it's all getting pretty confusing for me. Can anyone help me with an import function or appscript?


Solution

  • I believe your expected result is as follows.

    When I check the site, I confirmed that the values when the amount is changed are loaded by giving the values of custom_type and custom_amount as the form data. This is the request of the POST method. In this case, the built-in functions cannot be achieved. But when Google Apps Script is used, your expected values can be obtained. In this answer, I would like to suggest a sample script for retrieving the values. And, the script is used as a custom function on Google Sheets.

    Sample script:

    Please copy and paste the following script into the script editor of Google Sheets and save the script.

    function SAMPLE(amount = 5, type = "recent") {
      const url = "https://fcstats.com/table,eliteserien-norway,50,5.php";
      const res = UrlFetchApp.fetch(url, { payload: `custom_type=${type}&custom_amount=${amount}` });
      const text = res.getContentText();
      const table = text.match(/<table class\="table evenOdd sortable tips">[\w\s\S]*?<\/table>/);
      if (!table) {
        return [["No table"]];
      }
      try {
        return XmlService.parse(table[0]).getRootElement().getChildren().flatMap(c => c.getChildren().map(d => d.getChildren().map(e => e.getValue().trim())));
      } catch ({ message }) {
        console.error(message);
        return [["The table couldn't be parsed."]];
      }
    }
    

    Testing:

    When =SAMPLE(5) is put into "A1", the following result is obtained. In this case, 5 is the number 5 of the dropdown list. This is the same with your current situation.

    enter image description here

    From I have selected last 3 recent games from the drop down box on the website, in this case, please put a custom function of =SAMPLE(3) into "A1". 3 is the number 3 of the dropdown list. By this, the following result is obtained. This is the same with the following site.

    enter image description here

    enter image description here

    When you want to retrieve other tables, please change the number of =SAMPLE(3). By the way, the 2nd argument is the type of the dropdown list to the left of the number. It seems that recent and first can be used. The default is recent.

    Note:

    Reference: