[SOLVED] Programmatically get movie name from hulu url?

I am using JavaScript.

Is there a programmatic way to fetch the movie name from a Hulu URL?

For example, for the URL

https://www.hulu.com/watch/78974b54-1feb-43ce-9a99-1c1e9e5fce3f

The response should be

My Favorite Girlfriend

The URL itself contains only a UUID. I tried fetching the page and inspecting the HTTP response headers and the HTML meta tags, but found nothing useful.

Solution

The document returned from that URL contains a script tag with the information you need:

<script type="application/ld+json"> 
   {"@context":"http://schema.org","@type":"Movie","name":"My Favorite Girlfriend","description":"A chef's life gets complicated when he falls for a beautiful young woman who has multiple personalities.",
...
</script>

You can parse this with the npm package cheerio and a little JavaScript:

// Requires Node.js 18+ for the built-in fetch (or a fetch polyfill)
const cheerio = require('cheerio');

const getMovieName = async (url) => {

    const htmlContent = await (await fetch(url)).text();

    // Load the HTML content into cheerio
    const $ = cheerio.load(htmlContent);

    // Find the script element with type "application/ld+json"
    const scriptElement = $('script[type="application/ld+json"]').first();

    // A cheerio selection is always truthy, so check its length instead
    if (scriptElement.length > 0) {
        try {
            // Parse the JSON content
            const jsonData = JSON.parse(scriptElement.html());

            // Access the parsed data; other properties are available too,
            // for example jsonData['@context'] and jsonData['@type']
            console.log(jsonData.name);
            return jsonData.name;
        } catch (error) {
            console.error('Error parsing JSON:', error);
        }
    } else {
        console.error('Script element not found');
    }
}

// Catch network errors thrown by fetch, which happen outside the try block
getMovieName("https://www.hulu.com/watch/78974b54-1feb-43ce-9a99-1c1e9e5fce3f").catch(console.error);
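
If you would rather not add a dependency, the same idea can be sketched without cheerio. The helper below is only an illustration: extractMovieName and sampleHtml are hypothetical names, the regex is a simplification rather than a real HTML parser, and it assumes the page keeps embedding its schema.org data in a script tag of type application/ld+json, possibly as an array of objects. It works on an HTML string, so you would still fetch the page yourself first.

```javascript
// Hypothetical helper: pull the "name" field out of JSON-LD blocks in an HTML string.
// Assumes the page embeds schema.org data in <script type="application/ld+json"> tags.
const extractMovieName = (html) => {
    // Match every JSON-LD script tag (a simplification, not a full HTML parser)
    const pattern = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;

    for (const match of html.matchAll(pattern)) {
        let data;
        try {
            data = JSON.parse(match[1]);
        } catch (error) {
            continue; // Skip blocks that are not valid JSON
        }

        // A block may hold a single object or an array of objects
        const items = Array.isArray(data) ? data : [data];
        const named = items.find((item) => item && typeof item.name === 'string');
        if (named) {
            return named.name;
        }
    }

    return null; // No JSON-LD block with a name was found
};

// Made-up sample document, modelled on the snippet shown above
const sampleHtml = `
<html><head>
<script type="application/ld+json">
   {"@context":"http://schema.org","@type":"Movie","name":"My Favorite Girlfriend"}
</script>
</head><body></body></html>`;

console.log(extractMovieName(sampleHtml)); // prints "My Favorite Girlfriend"
```

Returning null instead of logging lets the caller decide what to do when the tag is missing or the page layout changes.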