I am using JavaScript.
Is there any programmatic way to fetch the movie name from a Hulu URL?
For example, for the URL
https://www.hulu.com/watch/78974b54-1feb-43ce-9a99-1c1e9e5fce3f
The response should be
My Favorite Girlfriend
The URL itself is just a UUID. I tried fetching the page and looking at the HTTP response headers and the HTML meta tags, but found nothing useful.
Looking at the document returned from that URL, there is a script tag that contains the information you need:
<script type="application/ld+json">
{"@context":"http://schema.org","@type":"Movie","name":"My Favorite Girlfriend","description":"A chef's life gets complicated when he falls for a beautiful young woman who has multiple personalities.",
...
</script>
You can parse this with the npm package cheerio and a little JavaScript:
const cheerio = require('cheerio');

// Uses the global fetch, which is built into Node 18+
const getMovieName = async (url) => {
  const htmlContent = await (await fetch(url)).text();
  // Load the HTML content into cheerio
  const $ = cheerio.load(htmlContent);
  // Find the first script element with type "application/ld+json"
  const scriptElement = $('script[type="application/ld+json"]').first();
  // A cheerio selection is always truthy, so check its length instead
  if (scriptElement.length > 0) {
    try {
      // Parse the JSON content
      const jsonData = JSON.parse(scriptElement.html());
      // Access the parsed data
      console.log(jsonData.name);
      // You can access other properties as well
      // For example: jsonData['@context'], jsonData['@type']
      return jsonData.name;
    } catch (error) {
      console.error('Error parsing JSON:', error);
    }
  } else {
    console.error('Script element not found');
  }
};

getMovieName("https://www.hulu.com/watch/78974b54-1feb-43ce-9a99-1c1e9e5fce3f");
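One caveat: a page can carry more than one JSON-LD block, and a block can be an array or wrap its entries in "@graph", so taking the first script tag may not always give you the movie. Below is a rough, dependency-free sketch of a more defensive lookup. The helper names (extractJsonLdBlocks, findTitle), the regex-based extraction, and the list of accepted @type values are my own assumptions for illustration, not anything Hulu documents, so adjust them to what the page actually returns.

```javascript
// Hypothetical helpers: names and accepted types are illustrative assumptions.
const TITLE_TYPES = new Set(['Movie', 'TVSeries', 'TVEpisode']);

// Return the parsed contents of every <script type="application/ld+json"> block.
function extractJsonLdBlocks(html) {
  const pattern =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const blocks = [];
  for (const match of html.matchAll(pattern)) {
    try {
      blocks.push(JSON.parse(match[1]));
    } catch {
      // Skip blocks that are not valid JSON instead of failing the whole lookup.
    }
  }
  return blocks;
}

// Return the `name` of the first node whose @type is in TITLE_TYPES, or null.
function findTitle(blocks) {
  // A block may be a single object, an array, or an object with an "@graph" array.
  const nodes = blocks.flatMap((block) => {
    if (Array.isArray(block)) return block;
    if (block && Array.isArray(block['@graph'])) return block['@graph'];
    return [block];
  });
  const hit = nodes.find(
    (node) =>
      node &&
      typeof node.name === 'string' &&
      // @type can be a single string or an array of strings.
      [].concat(node['@type']).some((type) => TITLE_TYPES.has(type))
  );
  return hit ? hit.name : null;
}
```

With the HTML from the question you would call it as findTitle(extractJsonLdBlocks(htmlContent)). It returns null when nothing matches, so the caller can decide how to handle a missing title. If you already depend on cheerio, the same findTitle logic works on the blocks you collect with it; the regex is only there to keep the sketch self-contained.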