python beautifulsoup imdb

Web Scraping the Storyline Section from IMDb using BeautifulSoup


I am trying to scrape the storyline section of an IMDb title page, but the following code does not return it. Can someone please help me out?

import re

import imdb
import requests
from tqdm import tqdm
from bs4 import BeautifulSoup

ia = imdb.IMDb()


def get_summary(url):
    r = requests.get(url=url)
    soup = BeautifulSoup(r.text, 'html.parser')

    summ = soup.find_all("div", attrs={'data-testid': 'storyline-plot-summary'})
    print(summ)
    return summ


get_summary("https://www.imdb.com/title/tt0114709/")

Solution

  • You can use Cinemagoer (the package formerly known as IMDbPY) instead of parsing the page yourself. Pass the movie title to search_movie:

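    from imdb import Cinemagoer

    ia = Cinemagoer()
    # Search by title; the result is a list of candidate movies
    movies = ia.search_movie('Toy Story')
    if movies:
        # Fetch the full record for the first match
        movie = ia.get_movie(movies[0].movieID)
        print(movie['plot outline'])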
    

    OUTPUT:

    A little boy named Andy loves to be in his room, playing with his toys, especially his doll named "Woody". But, what do the toys do when Andy is not with them, they come to life. Woody believes that his life (as a toy) is good. However, he must worry about Andy's family moving, and what Woody does not know is about Andy's birthday party. Woody does not realize that Andy's mother gave him an action figure known as Buzz Lightyear, who does not believe that he is a toy, and quickly becomes Andy's new favorite toy. Woody, who is now consumed with jealousy, tries to get rid of Buzz. Then, both Woody and Buzz are now lost. They must find a way to get back to Andy before he moves without them, but they will have to pass through a ruthless toy killer, Sid Phillips.
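
Not every title necessarily has a 'plot outline' entry, and indexing a missing key raises KeyError. Here is a minimal sketch of a more defensive lookup. It assumes the movie object supports .get() like a dict and that 'plot', when present, is a list of summary strings. The helper names pick_storyline and fetch_storyline are hypothetical, not part of Cinemagoer.

```python
from typing import Any, Mapping, Optional


def pick_storyline(movie: Mapping[str, Any]) -> Optional[str]:
    """Return 'plot outline' if present, else the first 'plot' entry, else None."""
    outline = movie.get("plot outline")
    if outline:
        return outline
    # Assumption: 'plot' is a list of summary strings when it exists
    plots = movie.get("plot") or []
    return plots[0] if plots else None


def fetch_storyline(title: str) -> Optional[str]:
    """Search for a title and return a storyline for the first match, if any."""
    # Imported here so pick_storyline can be used without the package installed
    from imdb import Cinemagoer

    ia = Cinemagoer()
    movies = ia.search_movie(title)
    if not movies:
        return None
    movie = ia.get_movie(movies[0].movieID)
    return pick_storyline(movie)
```

Calling print(fetch_storyline('Toy Story')) needs network access. pick_storyline can be tried offline on a plain dict.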