I have an interesting problem that is driving me nuts. I have a Python program that uses watchdog.observers.Observer. This program (the "watcher") watches a folder and responds when files appear in it. I have another program (the "parser") that periodically populates the watched folder with files. Both run in Docker containers.
Here's my watcher code:
import os
import sys
import time

from watchdog.observers import Observer

from event_handler import ImagesEventHandler
from constants import ROOT_FOLDER, IMAGES_FOLDER, CWD


class ImagesWatcher:
    def __init__(self, src_path):
        self.__src_path = src_path
        print(self.__src_path)
        self.__event_handler = ImagesEventHandler()
        self.__event_observer = Observer()
        print("********** Inside ImagesWatcher __init__ method just after instantiating ImagesEventHandler and Observer **************")

    def run(self):
        print("********** Inside ImagesWatcher run method **************")
        self.start()
        try:
            # Keep the main thread alive; the observer runs in its own thread.
            while True:
                time.sleep(1)
        except KeyboardInterrupt:
            self.stop()

    def start(self):
        print("********** Inside ImagesWatcher start method **************")
        self.__schedule()
        self.__event_observer.start()

    def stop(self):
        print("********** Inside ImagesWatcher stop method **************")
        self.__event_observer.stop()
        self.__event_observer.join()

    def __schedule(self):
        print("********** Inside ImagesWatcher __schedule method **************")
        print(self.__src_path)
        self.__event_observer.schedule(
            self.__event_handler,
            self.__src_path,
            recursive=True
        )


if __name__ == "__main__":
    src_path = sys.argv[1] if len(sys.argv) > 1 else CWD
    src_path = os.path.abspath(src_path)
    watch_path = os.path.join(src_path, ROOT_FOLDER)
    watch_path = os.path.join(watch_path, IMAGES_FOLDER)
    print('watch_path: ' + watch_path)
    if not os.path.exists(watch_path):
        os.makedirs(watch_path)
        print('just created: ' + watch_path)
    ImagesWatcher(watch_path).run()
Here's the associated event handler code:
import os
from os.path import dirname, abspath
from time import sleep

from PIL import Image
from watchdog.events import FileSystemEventHandler

from lambda_function import lambda_handler


class ImagesEventHandler(FileSystemEventHandler):
    def __init__(self):
        super().__init__()
        print("********** Inside event handler __init__ method **************")

    def on_created(self, event):
        print("********** Inside event handler on_created method **************")
        self.process(event)

    def process(self, event):
        print("********** Inside event handler process method **************")
        # Give the parser time to finish writing the file before opening it.
        sleep(2)
        image = Image.open(event.src_path)
        tracking_dir = os.path.join(dirname(dirname(abspath(event.src_path))), 'Tracking')
        print("******************** tracking_dir: " + tracking_dir + " ********************")
        lambda_handler(image, tracking_dir)
The stop method of the watcher is never executed. The init method of the event handler is executed, but neither the on_created nor the process methods are executed.
Here's how I build and run the docker containers:
docker build -t watcher -f docker/watcher/Dockerfile .
docker run -d --network onprem_network -v c:\My_MR:/code/My_MR --name watcher watcher
docker build -t parser -f docker/parser/Dockerfile .
docker run -d --network onprem_network -v c:\My_MR:/code/My_MR --name parser parser
My watcher Dockerfile:
FROM python:3.7.9
ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1
COPY requirements.txt /requirements.txt
RUN pip install --upgrade pip -r /requirements.txt && mkdir /code
WORKDIR /code
COPY . /code/
RUN apt update && apt-get update && apt install tesseract-ocr -y && apt-get install ffmpeg libsm6 libxext6 -y
CMD ["python", "/code/watcher.py"]
My parser Dockerfile:
FROM python:3.7.9
ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1
COPY requirements.txt /requirements.txt
RUN pip install --upgrade pip -r /requirements.txt && mkdir /code
WORKDIR /code
COPY . /code/
RUN apt update && apt-get update && apt-get install ffmpeg -y
CMD ["python", "/code/parser.py"]
My requirements.txt:
Pillow == 5.4.1
gql == 3.0.0a5
matplotlib == 3.0.3
numpy == 1.16.2
opencv_python == 4.4.0.44
pandas == 0.24.2
pytesseract == 0.2.6
python_ffmpeg_video_streaming == 0.1.14
watchdog == 2.0.2
requests
tesseract
Any help would be greatly appreciated.
The underlying API that watchdog uses to monitor Linux filesystem events is called inotify. The Docker for Windows WSL 2 backend documentation notes:
Linux containers only receive file change events (“inotify events”) if the original files are stored in the Linux filesystem.
The directory you're mounting, c:\My_MR, resides on the Windows file system, so inotify inside the watcher container never receives events for it. That is why on_created is never called.
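If you want to confirm which filesystem backs the watched directory from inside the container, one option is to look it up in /proc/mounts. The following is only a rough sketch: fs_type_for is a hypothetical helper, and the sample mount table (including the hostshare device name and the 9p type) is made up for illustration, so the type reported on your setup may differ.

```python
import posixpath


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the mount containing `path`.

    `mounts_text` is the content of /proc/mounts, where each line is:
    device mount_point fs_type options dump pass
    (Mount points containing spaces are escaped in the real file;
    this sketch does not handle that.)
    """
    path = posixpath.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        inside = (
            mount_point == "/"
            or path == mount_point
            or path.startswith(mount_point.rstrip("/") + "/")
        )
        # The longest matching mount point is the one that actually holds the path.
        if inside and len(mount_point) > len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


# Illustrative, invented mount table (not taken from a real container).
SAMPLE_MOUNTS = """\
overlay / overlay rw,relatime 0 0
proc /proc proc rw,nosuid 0 0
hostshare /code/My_MR 9p rw,relatime 0 0
"""

if __name__ == "__main__":
    print(fs_type_for("/code/My_MR/images", SAMPLE_MOUNTS))
```

Inside the container you would pass open("/proc/mounts").read() instead of the sample text. A type other than a native Linux filesystem for the watched path is a hint that inotify events may not arrive.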
Instead, you can run docker from inside your WSL 2 default distribution with a Linux filesystem path, e.g., ~/my_mr:
docker run -d --network onprem_network -v ~/my_mr:/code/My_MR --name watcher watcher
docker run -d --network onprem_network -v ~/my_mr:/code/My_MR --name parser parser
This directory can be accessed from Windows while that WSL 2 distribution is running using the \\wsl$\ network path, i.e., \\wsl$\<Distro name>\home\<username>\my_mr (more info here). Accordingly, I believe docker run could also be used from Windows using the \\wsl$\ path with -v, though I have not verified this.
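If moving the folder into the Linux filesystem is not an option, a different technique is to poll the directory instead of relying on inotify. Polling only lists the directory contents, so it does not depend on change events reaching the container. Below is a minimal standard-library sketch of that idea; PollingWatcher and poll_once are hypothetical names, not part of watchdog. watchdog itself ships watchdog.observers.polling.PollingObserver, which follows the same approach and can be scheduled like Observer, at the cost of extra I/O and some latency.

```python
import os
import time


class PollingWatcher:
    """Detect new files by comparing directory snapshots (no inotify needed)."""

    def __init__(self, folder):
        self.folder = folder
        # Files that already exist at start-up are not reported as new.
        self.seen = self._snapshot()

    def _snapshot(self):
        """Return the set of all file paths under the folder (recursive)."""
        found = set()
        for root, _dirs, files in os.walk(self.folder):
            for name in files:
                found.add(os.path.join(root, name))
        return found

    def poll_once(self):
        """Return a sorted list of files that appeared since the last poll."""
        current = self._snapshot()
        new_files = sorted(current - self.seen)
        self.seen = current
        return new_files

    def run(self, on_created, interval=1.0):
        """Poll forever, calling on_created(path) for each new file."""
        while True:
            for path in self.poll_once():
                on_created(path)
            time.sleep(interval)
```

In the watcher, this would be used roughly as PollingWatcher(watch_path).run(handle_image), where handle_image is a hypothetical function that opens the image and calls lambda_handler, as process does in the event handler.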