The below code is from chapter 3 of Python Data Science Handbook by Jake VanderPlas. Each line in the file is a valid JSON. While I don't think the specifics of the file are critical to answering this question, the url for the file is https://github.com/fictivekin/openrecipes.
# read the entire file into a Python array
with open('recipeitems-latest.json', 'r') as f:
# Extract each line
data = (line.strip() for line in f)
# Reformat so each line is the element of a list
data_json = "[{0}]".format(','.join(data))
# read the result as a JSON
recipes = pd.read_json(data_json)
Two questions:
You have two questions in here:
>>> f = open('things_which_i_should_know')
>>> data = (line.strip() for line in f)
>>> type(data)
<class 'generator'>
>>> data = [line.strip() for line in f]
>>> type(data)
<class 'list'>
>>>
Please see official documentation for more info.
With a list comprehension, you get back a Python list; stripped_list is a list containing the resulting lines, not an iterator. Generator expressions return an iterator that computes the values as necessary, not needing to materialize all the values at once. This means that list comprehensions aren’t useful if you’re working with iterators that return an infinite stream or a very large amount of data. Generator expressions are preferable in these situations.