I'm new to working with APIs. I'm running an iterator in a Jupyter Notebook that calls the Box.com API for some data (.docx and .pdf files). The main function cell prints a lot of HTTP header responses per directory scraped while iterating. This builds up as I iterate through roughly 9,000 files, making the notebook very heavy (over 100 MB). At that point the notebook becomes unresponsive, even with 16 GB of RAM.
Is there a way to suppress those header responses and prevent them from printing in the cell output, or an alternative approach to this?
I have tried the semicolon (;) at the end of the Box API calls, and %%capture.
I'm not sure what I'm doing wrong here. I need the output for training a word2vec model, and I have built the whole data processing pipeline.
I figured it out. You can use Python's logging module to control the level of logs/header output in a notebook cell. The one thing to note (which I missed) is that you must add the logging statement at the top of the particular cell whose output you want to trim. Its scope is limited to that cell, not the whole Jupyter Notebook.
Various levels of logging are described here: https://docs.python.org/3/howto/logging.html
Sample logging statement (the one that worked in my case):
import logging
logging.getLogger().setLevel(logging.CRITICAL)
Note: any Python print() statements are not affected by this.
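For context, here is a minimal sketch of how this looks in a single cell using the official boxsdk client. The credentials, folder ID, and download logic are placeholders (not my actual pipeline); the point is only where the logging statement goes relative to the API calls.

import logging

# Must sit at the top of the cell whose output you want to trim.
logging.getLogger().setLevel(logging.CRITICAL)

from boxsdk import OAuth2, Client

# Placeholder credentials -- replace with your own Box app's values.
auth = OAuth2(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET",
    access_token="YOUR_ACCESS_TOKEN",
)
client = Client(auth)

# Iterate a folder's items; with the root logger at CRITICAL, the
# per-request header/response noise is suppressed, but print() still works.
for item in client.folder(folder_id="0").get_items():
    if item.name.endswith((".docx", ".pdf")):
        content = client.file(item.id).content()  # raw bytes of the file
        # ... feed into the word2vec preprocessing pipeline ...

Setting the level on the root logger also quiets the library's own loggers (which inherit from it), which is why the header dumps stop appearing in the cell output.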