I'm writing a Goodreads information fetcher app in Python using the Goodreads API. I'm currently working on the app's first function, which fetches information from the API; the API returns an XML file.
I parsed the XML and converted it to JSON, then converted that to a dictionary, but I still can't extract the information from it. I've looked at other posts here, but nothing works.
main.py
def get_author_books(authorId):
    url = "https://www.goodreads.com/author/list/{}?format=xml&key={}".format(authorId, key)
    r = requests.get(url)
    xml_file = r.content
    json_file = json.dumps(xmltodict.parse(xml_file))
    data = json.loads(json_file)
    print("Book Name: " + str(data[0]["GoodreadsResponse"]["author"]["books"]["book"]))
I expect the output to give me the name of the first book in the dictionary.
Here is a sample XML file provided by Goodreads.
I think you're misunderstanding how XML works, or at the very least how the response you're getting is structured.
The XML file you linked to has the following format:
<GoodreadsResponse>
    <Request>...</Request>
    <author>
        <id>...</id>
        <name>...</name>
        <link>...</link>
        <books>
            <book> [some stuff about the first book] </book>
            <book> [some stuff about the second book] </book>
            [More books]
        </books>
    </author>
</GoodreadsResponse>
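To make the nesting concrete, here is a minimal sketch that walks a document of this shape. It uses the standard library's xml.etree.ElementTree rather than xmltodict, purely so it runs without extra packages. The sample XML, its titles, and the list_titles helper are made up for illustration; they are not taken from the real Goodreads response. The author tag is lowercase here to match the dictionary keys used in this answer.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down response with the same nesting as the outline above.
SAMPLE_XML = """
<GoodreadsResponse>
  <Request>...</Request>
  <author>
    <id>1</id>
    <name>Example Author</name>
    <books>
      <book><id>10</id><title>First Book</title></book>
      <book><id>11</id><title>Second Book</title></book>
    </books>
  </author>
</GoodreadsResponse>
"""


def list_titles(xml_text):
    """Return the text of every <title> found under author/books/book."""
    # The parsed root is the <GoodreadsResponse> element itself.
    root = ET.fromstring(xml_text.strip())
    # The path is relative to the root, so it starts at <author>.
    return [book.findtext("title") for book in root.findall("author/books/book")]


titles = list_titles(SAMPLE_XML)
print(titles[0])
```

The path author/books/book follows the same route as the dictionary keys: each step goes one level deeper, and the last step matches every <book> element rather than just one.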
This means that in your data object, data["GoodreadsResponse"]["author"]["books"]["book"] is a collection of all the books in the response (all the elements surrounded by the <book> tags). Note that data itself is a dictionary, not a list, so it is indexed by key ("GoodreadsResponse") rather than by data[0]. So:
data["GoodreadsResponse"]["author"]["books"]["book"][0] is the first book.
data["GoodreadsResponse"]["author"]["books"]["book"][1] is the second book, and so on.
Looking back at the XML, each book element has an id, isbn, title, and description, among other tags. So you can print the title of the first book by printing:
data["GoodreadsResponse"]["author"]["books"]["book"][0]["title"]
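One thing to watch for: xmltodict only produces a list when a tag repeats. If an author has a single <book>, the "book" key holds one dictionary instead of a list, and indexing with [0] fails. The sketch below shows one way to handle both cases. The data dictionary is hand-written to mirror the parsed shape (it is not real API output), and as_list and book_titles are hypothetical helper names.

```python
# Hand-written dictionary mirroring the shape xmltodict produces for the
# outline above; the values are invented for illustration.
data = {
    "GoodreadsResponse": {
        "author": {
            "name": "Example Author",
            "books": {
                "book": [
                    {"id": "10", "title": "First Book"},
                    {"id": "11", "title": "Second Book"},
                ]
            },
        }
    }
}


def as_list(value):
    """Wrap a lone dict in a list so single-book responses index the same way."""
    return value if isinstance(value, list) else [value]


def book_titles(parsed):
    """Return the title of every book in a parsed author response."""
    books = as_list(parsed["GoodreadsResponse"]["author"]["books"]["book"])
    return [book["title"] for book in books]


print(book_titles(data)[0])
```

If you'd rather not normalize by hand, xmltodict.parse also accepts a force_list argument (for example force_list=("book",)) that should make the named tags always come back as lists.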
For reference, I'm running the following code against the XML file you linked to; you'd normally fetch this from the API:
import json
import xmltodict

# Read the sample XML file linked in the question
with open("source.xml", "r") as f:
    xml_file = f.read()

# Parse the XML, then round-trip through JSON as in the question's code
json_file = json.dumps(xmltodict.parse(xml_file))
data = json.loads(json_file)

# "book" holds a list with one entry per <book> element
books = data["GoodreadsResponse"]["author"]["books"]["book"]
print(books[0]["title"])  # The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary