IMDbPY - Director name is printed with every character separated


I'm trying to get some details about movies from IMDb.

For that I'm using IMDbPY with the following code:

import imdb

ia = imdb.IMDb()
top250 = ia.get_top250_movies()
i = 0
for topmovie in top250:
    # First, retrieve the movie object using its ID
    movie = ia.get_movie(topmovie.movieID)
    cast = movie.get('cast')
    topActors = 3
    i = i + 1
    actor_names = [actor['name'] for actor in cast[:topActors]]
    #director_name = [director['director'] for director in cast[:topActors]]
    if i <= 10:
        print(movie, ';', ' | '.join(movie['genres']),
              ';', ' | '.join(actor_names),
              ';', ' | '.join(str(movie['director'])))
    else:
        break

However, when I run my code, I get results in this format:

The Shawshank Redemption ; Drama ; Tim Robbins | Morgan Freeman | Bob Gunton ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 1 | 1 | 0 | 4 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | D | a | r | a | b | o | n | t | , |   | F | r | a | n | k | _ | > | ]
The Godfather ; Crime | Drama ; Marlon Brando | Al Pacino | James Caan ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 0 | 3 | 3 | 8 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | C | o | p | p | o | l | a | , |   | F | r | a | n | c | i | s |   | F | o | r | d | _ | > | ]
The Godfather: Part II ; Crime | Drama ; Al Pacino | Robert Duvall | Diane Keaton ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 0 | 3 | 3 | 8 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | C | o | p | p | o | l | a | , |   | F | r | a | n | c | i | s |   | F | o | r | d | _ | > | ]
The Dark Knight ; Action | Crime | Drama | Thriller ; Christian Bale | Heath Ledger | Aaron Eckhart ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 6 | 3 | 4 | 2 | 4 | 0 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | N | o | l | a | n | , |   | C | h | r | i | s | t | o | p | h | e | r | _ | > | ]
12 Angry Men ; Crime | Drama ; Martin Balsam | John Fiedler | Lee J. Cobb ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 1 | 4 | 8 | 6 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | L | u | m | e | t | , |   | S | i | d | n | e | y | _ | > | ]
Schindler's List ; Biography | Drama | History ; Liam Neeson | Ben Kingsley | Ralph Fiennes ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 0 | 2 | 2 | 9 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | S | p | i | e | l | b | e | r | g | , |   | S | t | e | v | e | n | _ | > | ]
The Lord of the Rings: The Return of the King ; Action | Adventure | Drama | Fantasy ; Noel Appleby | Ali Astin | Sean Astin ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 1 | 3 | 9 | 2 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | J | a | c | k | s | o | n | , |   | P | e | t | e | r | _ | > | ]
Pulp Fiction ; Crime | Drama ; Tim Roth | Amanda Plummer | Laura Lovelace ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 0 | 2 | 3 | 3 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | T | a | r | a | n | t | i | n | o | , |   | Q | u | e | n | t | i | n | _ | > | ]
The Good, the Bad and the Ugly ; Western ; Eli Wallach | Clint Eastwood | Lee Van Cleef ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 1 | 4 | 6 | 6 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | L | e | o | n | e | , |   | S | e | r | g | i | o | _ | > | ]
Fight Club ; Drama ; Edward Norton | Brad Pitt | Meat Loaf ; [ | < | P | e | r | s | o | n |   | i | d | : | 0 | 0 | 0 | 0 | 3 | 9 | 9 | [ | h | t | t | p | ] |   | n | a | m | e | : | _ | F | i | n | c | h | e | r | , |   | D | a | v | i | d | _ | > | ]

As you can see, the Director column comes back split into individual characters.

How can I solve this?

I have already solved this issue:

import imdb

ia = imdb.IMDb()
top250 = ia.get_top250_movies()
i = 0
for topmovie in top250:
    # First, retrieve the movie object using its ID
    movie = ia.get_movie(topmovie.movieID)
    cast = movie.get('cast')
    directors = movie.get('director')
    topActors = 3
    i = i + 1
    actor_names = [actor['name'] for actor in cast[:topActors]]
    # Keep only the first listed director
    director_names = [director['name'] for director in directors[:1]]
    if i <= 10:
        print(movie, '   ;    ', ' | '.join(movie['genres']),
              '   ;    ', ' | '.join(actor_names),
              '   ;    ', ' | '.join(director_names))
    else:
        break

Thanks!


Solution

  • movie['director'] is a list of Person objects; casting it to str gives a single string such as "[<Person ...>, <Person ...>]". The join method then treats that string as an iterable and inserts ' | ' between every one of its characters.

    You should get the directors' names exactly like you do with the cast names.

    For example: print(movie, ';', ' | '.join(movie['genres']), ';', ' | '.join(actor_names), ';', ' | '.join([d['name'] for d in movie['director']]))
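
The behaviour described in this answer can be reproduced without IMDbPY or a network connection. The following is a minimal sketch, assuming a hypothetical stand-in Person class that only mimics the dictionary-style access and the repr format seen in the question's output; it is not the library's real class, and the variable names are illustrative.

```python
class Person:
    """Hypothetical stand-in for an IMDbPY Person object (not the real class)."""

    def __init__(self, person_id, name):
        self.person_id = person_id
        self._data = {"name": name}

    def __getitem__(self, key):
        # Dictionary-style access, e.g. person['name']
        return self._data[key]

    def __repr__(self):
        # Imitates the shape of the repr shown in the question's output
        return f"<Person id:{self.person_id} name:_{self._data['name']}_>"


directors = [Person("0001104", "Frank Darabont")]

# str() turns the whole list into ONE string; join() then iterates over
# that string character by character, which produces the split output.
broken = " | ".join(str(directors))

# Extract each name first, then join the resulting list of strings.
fixed = " | ".join([d["name"] for d in directors])

print(broken[:17])  # [ | < | P | e | r
print(fixed)        # Frank Darabont
```

The same pattern applies to any list of objects: build a list of strings first, and only then pass it to join.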