I'm trying to get some details about movies from IMDb.
For that I'm using IMDbPY with the following code:
import imdb
ia = imdb.IMDb()
top250 = ia.get_top250_movies()
i = 0
for topmovie in top250:
    # First, retrieve the movie object using its ID
    movie = ia.get_movie(topmovie.movieID)
    cast = movie.get('cast')
    topActors = 3
    i = i + 1
    actor_names = [actor['name'] for actor in cast[:topActors]]
    #director_name = [director['director'] for director in cast[:topActors]]
    if i <= 10:
        print(movie, ';', ' | '.join(movie['genres']),
              ';', ' | '.join(actor_names),
              ';', ' | '.join(str(movie['director']))
              )
    else:
        break
However, when I run my code, I get results in this format:
The Shawshank Redemption ; Drama ; Tim Robbins | Morgan Freeman | Bob Gunton ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 1 | 1 | 0 | 4 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | D | a | r | a | b | o | n | t | , | | F | r | a | n | k | _ | > | ]
The Godfather ; Crime | Drama ; Marlon Brando | Al Pacino | James Caan ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 0 | 3 | 3 | 8 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | C | o | p | p | o | l | a | , | | F | r | a | n | c | i | s | | F | o | r | d | _ | > | ]
The Godfather: Part II ; Crime | Drama ; Al Pacino | Robert Duvall | Diane Keaton ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 0 | 3 | 3 | 8 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | C | o | p | p | o | l | a | , | | F | r | a | n | c | i | s | | F | o | r | d | _ | > | ]
The Dark Knight ; Action | Crime | Drama | Thriller ; Christian Bale | Heath Ledger | Aaron Eckhart ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 6 | 3 | 4 | 2 | 4 | 0 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | N | o | l | a | n | , | | C | h | r | i | s | t | o | p | h | e | r | _ | > | ]
12 Angry Men ; Crime | Drama ; Martin Balsam | John Fiedler | Lee J. Cobb ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 1 | 4 | 8 | 6 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | L | u | m | e | t | , | | S | i | d | n | e | y | _ | > | ]
Schindler's List ; Biography | Drama | History ; Liam Neeson | Ben Kingsley | Ralph Fiennes ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 0 | 2 | 2 | 9 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | S | p | i | e | l | b | e | r | g | , | | S | t | e | v | e | n | _ | > | ]
The Lord of the Rings: The Return of the King ; Action | Adventure | Drama | Fantasy ; Noel Appleby | Ali Astin | Sean Astin ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 1 | 3 | 9 | 2 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | J | a | c | k | s | o | n | , | | P | e | t | e | r | _ | > | ]
Pulp Fiction ; Crime | Drama ; Tim Roth | Amanda Plummer | Laura Lovelace ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 0 | 2 | 3 | 3 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | T | a | r | a | n | t | i | n | o | , | | Q | u | e | n | t | i | n | _ | > | ]
The Good, the Bad and the Ugly ; Western ; Eli Wallach | Clint Eastwood | Lee Van Cleef ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 1 | 4 | 6 | 6 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | L | e | o | n | e | , | | S | e | r | g | i | o | _ | > | ]
Fight Club ; Drama ; Edward Norton | Brad Pitt | Meat Loaf ; [ | < | P | e | r | s | o | n | | i | d | : | 0 | 0 | 0 | 0 | 3 | 9 | 9 | [ | h | t | t | p | ] | | n | a | m | e | : | _ | F | i | n | c | h | e | r | , | | D | a | v | i | d | _ | > | ]
As you can see, the Director column comes back split into individual characters...
How can I solve this?
I've already solved this issue:
import imdb
ia = imdb.IMDb()
top250 = ia.get_top250_movies()
i = 0
for topmovie in top250:
    # First, retrieve the movie object using its ID
    movie = ia.get_movie(topmovie.movieID)
    cast = movie.get('cast')
    directors = movie.get('director')
    topActors = 3
    i = i + 1
    actor_names = [actor['name'] for actor in cast[:topActors]]
    director_names = [director['name'] for director in directors[:1]]
    if i <= 10:
        print(movie, ' ; ', ' | '.join(movie['genres']),
              ' ; ', ' | '.join(actor_names),
              ' ; ', ' | '.join(director_names)
              )
    else:
        break
Thanks!
movie['director'] is a list of Person objects. Casting it to str gives a single string like "[<Object1>, <Object2>]", and the join method then treats that string as an iterable, so the separator is inserted between every character.
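The effect can be reproduced without IMDbPY or a network connection. In this minimal sketch, FakePerson is a hypothetical stand-in for the library's Person class; it only imitates the dictionary-style ['name'] lookup and a repr similar to the one in the output above.

```python
class FakePerson:
    """Hypothetical stand-in for an IMDbPY Person (not the real class)."""

    def __init__(self, name):
        self.name = name

    def __getitem__(self, key):
        # Mimic dictionary-style access such as person['name'].
        if key == 'name':
            return self.name
        raise KeyError(key)

    def __repr__(self):
        return '<Person name:_{}_>'.format(self.name)


directors = [FakePerson('Frank Darabont')]

# str() turns the whole list into ONE string, and join() then iterates
# over that string character by character.
broken = ' | '.join(str(directors))

# Joining a list of name strings gives the intended result.
fixed = ' | '.join([d['name'] for d in directors])

print(broken[:17])
print(fixed)
```

The first print shows the same "[ | < | P | e | r" prefix as the output in the question; the second prints the plain name.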
You should get the directors' names exactly like you do with the cast names.
For example:
print(movie, ';', ' | '.join(movie['genres']),
      ';', ' | '.join(actor_names),
      ';', ' | '.join([d['name'] for d in movie['director']])
      )
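If some titles lack a director or cast entry, the list comprehension fails on a missing value. A possible way to guard against that is a small helper; join_names is a hypothetical name, not part of IMDbPY, and the sketch assumes movie.get('director') returns None when the key is absent, as a dict-style get does. Plain dicts stand in for Person objects here so the example runs on its own.

```python
def join_names(people, limit=None, sep=' | '):
    """Join the 'name' field of each entry; tolerate a missing list."""
    if not people:  # covers None (key absent) and empty lists
        return ''
    # people[:None] is the whole list, so limit=None means "no limit".
    return sep.join(p['name'] for p in people[:limit])


# Plain dicts used as stand-ins for Person objects (assumption for the demo).
sample_directors = [{'name': 'Lana Wachowski'}, {'name': 'Lilly Wachowski'}]

print(join_names(sample_directors))
print(join_names(sample_directors, limit=1))
print(join_names(None))
```

Inside the loop from the question this would be called as join_names(movie.get('director')) and join_names(movie.get('cast'), limit=3).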