How can I extract a variable from a script tag in a returned HTML page in JavaScript/TypeScript?
My API request to the server:
const response = await fetch( ... )
The response contains a large HTML page; here is a shortened example:
<h1>Welcome to the Steam app data page</h1>
<script type="text/javascript">
var g_rgAppContextData = {
"730": {
"appid": 730,
"name": "Counter-Strike 2",
"icon": "https://cdn.fastly.steamstatic.com/steamcommunity/public/images/apps/730/8dbc71957312bbd3baea65848b545be9eae2a355.jpg",
"link": "https://steamcommunity.com/app/730"
}
};
var g_rgCurrency = [];
</script>
I only want to extract the variable g_rgAppContextData and nothing else. I know that I can select the script tag with getElementsByTagName("script"), but what if there are two script tags? And how do I select only the variable?
Since the pages you want to scrape follow a certain pattern, it seems possible to make a number of simplifying assumptions about the structure of the returned HTML:
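- The value assigned to the variable is in JSON format (that is, property names like "730" are quoted).
- The value is an object literal, so it ends with }.
- The first }; after the assignment closes the value, so the value itself does not contain the sequence };.
Let me know if these assumptions are not justified in your case.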
Under these assumptions, you can extract the variable value with a regular expression and parse it as JSON:
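const response = await fetch("...");
const html = await response.text();
// Capture the object literal assigned to g_rgAppContextData, up to the first "};".
// The "s" flag lets "." match newlines, so the value may span several lines.
const match = html.match(/g_rgAppContextData\s*=\s*(\{.*?\});/s);
if (!match) {
  throw new Error("g_rgAppContextData not found in the response");
}
const g_rgAppContextData = JSON.parse(match[1]);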
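To try the approach without a network request, here is a minimal sketch that wraps the same regular expression in a reusable function and runs it against an inline sample page. The helper name extractJsonVariable, the sample HTML, and the extra script tag are assumptions made for illustration; they are not part of the original page or of any library.

```javascript
// Hypothetical helper (not from the original page): finds `<name> = { ... };`
// anywhere in an HTML string and parses the object literal as JSON.
function extractJsonVariable(html, name) {
  // Escape regex metacharacters so names such as "$data" are matched literally.
  const escapedName = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Non-greedy capture from the first "{" up to the first "};"
  // (the "s" flag lets "." span newlines).
  const pattern = new RegExp(escapedName + "\\s*=\\s*(\\{.*?\\});", "s");
  const match = html.match(pattern);
  if (match === null) {
    return null; // variable not present in this page
  }
  return JSON.parse(match[1]); // throws SyntaxError if the value is not valid JSON
}

// Sample page modeled on the question, with a second script tag added (assumption).
const sampleHtml = `
<h1>Steam app data</h1>
<script type="text/javascript">var unrelated = 1;</script>
<script type="text/javascript">
var g_rgAppContextData = {
  "730": {
    "appid": 730,
    "name": "Counter-Strike 2"
  }
};
var g_rgCurrency = [];
</script>`;

const appData = extractJsonVariable(sampleHtml, "g_rgAppContextData");
console.log(appData["730"].name); // Counter-Strike 2
```

Because the regular expression searches the whole HTML string rather than a particular element, it does not matter how many script tags the page contains. If the value ever contains the sequence }; (for example inside a string), the non-greedy match stops too early and JSON.parse throws, which is the point where the assumptions above no longer hold.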