python web-scraping artificial-intelligence duckduckgo

Traceback (most recent call last): File "<string>", line 1, in <module> error with download_images(dest, urls=search_images(f'{o} photo'))


I wrote this code, and it was working.

from duckduckgo_search import DDGS
from fastcore.all import *

def search_images(keywords, max_images=200):
    return L(DDGS().images(keywords, max_results=max_images)).itemgot('image')
urls = search_images('bird photos', max_images=1)
print(urls[0])
from fastdownload import download_url
dest = 'bird.jpg'
download_url(urls[0], dest, show_progress=False)
from fastai.vision.all import *
im = Image.open(dest)
im.to_thumb(256,256)
download_url(search_images('forest photos', max_images=1)[0], 'forest.jpg', show_progress=False)
Image.open('forest.jpg').to_thumb(256,256)

searches = 'forest','bird'
path = Path('bird_or_not')

I added

from time import sleep
for o in searches:
    dest = (path/o)
    dest.mkdir(exist_ok=True, parents=True)
    download_images(dest, urls=search_images(f'{o} photo'))
    sleep(10)  # Pause between searches to avoid over-loading server
    download_images(dest, urls=search_images(f'{o} sun photo'))
    sleep(10)
    download_images(dest, urls=search_images(f'{o} shade photo'))
    sleep(10)
    resize_images(path/o, max_size=400, dest=path/o)

and got

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\crist\AppData\Local\Programs\Python\Python312\Lib\multiprocessing\spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\crist\AppData\Local\Programs\Python\Python312\Lib\multiprocessing\spawn.py", line 131, in _main
    prepare(preparation_data)
  File "C:\Users\crist\AppData\Local\Programs\Python\Python312\Lib\multiprocessing\spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\crist\AppData\Local\Programs\Python\Python312\Lib\multiprocessing\spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 286, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\crist\PycharmProjects\AI Deep Learning\main.py", line 23, in <module>
    download_images(dest, urls=search_images(f'{o} photo'))
  File "C:\Users\crist\PycharmProjects\Complaint Bot\.venv\Lib\site-packages\fastai\vision\utils.py", line 44, in download_images
    parallel(partial(_download_image_inner, dest, timeout=timeout, preserve_filename=preserve_filename),
  File "C:\Users\crist\PycharmProjects\Complaint Bot\.venv\Lib\site-packages\fastcore\parallel.py", line 130, in parallel
    r = ex.map(f,items, *args, timeout=timeout, chunksize=chunksize, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\crist\PycharmProjects\Complaint Bot\.venv\Lib\site-packages\fastcore\parallel.py", line 85, in map
    if self.not_parallel == False: self.lock = Manager().Lock()
                                               ^^^^^^^^^
  File "C:\Users\crist\AppData\Local\Programs\Python\Python312\Lib\multiprocessing\context.py", line 57, in Manager
    m.start()
  File "C:\Users\crist\AppData\Local\Programs\Python\Python312\Lib\multiprocessing\managers.py", line 562, in start
    self._process.start()
  File "C:\Users\crist\AppData\Local\Programs\Python\Python312\Lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "C:\Users\crist\AppData\Local\Programs\Python\Python312\Lib\multiprocessing\context.py", line 337, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "C:\Users\crist\AppData\Local\Programs\Python\Python312\Lib\multiprocessing\popen_spawn_win32.py", line 46, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\crist\AppData\Local\Programs\Python\Python312\Lib\multiprocessing\spawn.py", line 164, in get_preparation_data
    _check_not_importing_main()

Can someone explain why this is happening, and how I can fix and avoid it?

I tried reinstalling the libraries, but for some reason I couldn't update pip. I also ran the debugger, which showed nothing useful, and closed other applications with Task Manager, but that wasn't the issue.


Solution

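  • Wrap your script's main logic inside:

    if __name__ == '__main__':

    On Windows, multiprocessing starts child processes with the spawn method, which re-imports your main script in every child. Without the guard, the top-level code runs again during that import, including the download_images call that starts worker processes, and Python aborts with the error above. I faced a similar issue using Selenium, and this solution worked for me.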
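    A minimal sketch of the guard, using only the standard library so it runs without fastai. The names square and main are hypothetical stand-ins: in the question's script, the body of main would be the for loop that calls download_images.

```python
from concurrent.futures import ProcessPoolExecutor


def square(n):
    # Work done in a child process. It must live at module level so the
    # child can find it after re-importing this file.
    return n * n


def main():
    # Anything that starts worker processes belongs here, not at top level.
    with ProcessPoolExecutor(max_workers=2) as executor:
        return list(executor.map(square, range(5)))


if __name__ == '__main__':
    # True only when the file is run directly. When a spawned child
    # re-imports the file, this block is skipped, so no new pool is started.
    print(main())  # prints [0, 1, 4, 9, 16]
```

    Imports and function definitions can stay at the top level; only the code that actually does the work needs to move under the guard.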