python flask wfastcgi

wFastCGI / Flask - Restarting webserver on IIS


I'm building a web app that fetches data from an API and displays it. For that I'm using Flask and the requests library. Because the API is not well laid out, I need to make a bunch of API calls to get all the data I need.

Here is what the simplified folder structure looks like:

app.py
api/
  api.py

To avoid overloading the API by sending hundreds of API requests on every GET request, I tried to implement a function that fetches the data on webserver start, stores it in a variable, and refreshes the data after a specific time. Here is a simplified API class and refresh function:

import threading
import time

import requests


class API:
    """The API class gets initialized on webserver start."""

    def __init__(self):
        self.API_KEY = 'xxx-xxx'
        self.BASE_URL = 'https://xxxxxxxx.com/3'
        self.HEADER = {
            'X-Api-Key': self.API_KEY,
            'Accept': 'application/json'
        }

        # One shared session so every request carries the same headers
        self.session = requests.Session()
        self.session.headers.update(self.HEADER)

        self.data = {}
        self.refresh_time = 900  # seconds to wait until the next refresh

        # Start the background refresh loop in its own thread
        threading.Thread(target=self.refresh_data).start()

    def refresh_data(self):
        while True:
            self._refresh()  # fetches the data from the API and stores/refreshes it in self.data
            time.sleep(self.refresh_time)

I know it's probably not the best way to handle this, but in my venv it works without problems.

When I make this web app production-ready by deploying it to Windows IIS with wFastCGI, the webserver gets restarted randomly (I didn't notice any pattern), so the API class gets initialized multiple times, which means the refresh function gets started multiple times.

Here is some logging of the webserver:

2023-06-05 07:54:29,298 [MainThread  ] [            <module>()] [INFO ]  Setting up APIs...         # Log from webserver
2023-06-05 07:54:29,299 [MainThread  ] [            __init__()] [DEBUG]  API Class init             > debug log in API class
2023-06-05 07:54:29,377 [MainThread  ] [               index()] [INFO ]  GET from 192.168.18.125    # GET request 
2023-06-05 07:54:30,001 [MainThread  ] [            <module>()] [INFO ]  Setting up APIs...         # Log from webserver
2023-06-05 07:54:30,001 [MainThread  ] [            <module>()] [INFO ]  Setting up APIs...         # Log from webserver
2023-06-05 07:54:30,001 [MainThread  ] [            __init__()] [DEBUG]  API Class init             > 
2023-06-05 07:54:30,001 [MainThread  ] [            __init__()] [DEBUG]  API Class init             > debug log from the same API class
2023-06-05 07:54:30,002 [Thread-1 (_s] [        refresh_data()] [INFO ]  Checking data...           
2023-06-05 07:54:30,002 [Thread-1 (_s] [        refresh_data()] [INFO ]  Checking data...
2023-06-05 07:54:30,006 [Thread-1 (_s] [            _refresh()] [INFO ]  Refreshing data...
2023-06-05 07:54:30,007 [Thread-1 (_s] [       get_something()] [INFO ]  Getting data...
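
One way to check whether the duplicated "Setting up APIs..." lines come from separate processes is to add the process ID to the log format. The following is only a minimal sketch, assuming a log format similar to the one above; the logger name `pid_demo` and the helper `build_logger` are made up for illustration and are not part of the original app.

```python
import io
import logging
import os


def build_logger(stream) -> logging.Logger:
    """Return a logger whose lines include the ID of the emitting process."""
    logger = logging.getLogger("pid_demo")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()   # avoid stacking handlers on repeated calls
    logger.propagate = False  # keep the demo output in our handler only
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s [pid %(process)d] [%(threadName)-12s] "
        "[%(funcName)20s()] [%(levelname)-5s]  %(message)s"
    ))
    logger.addHandler(handler)
    return logger


# Write to an in-memory buffer here; a real app would log to a file instead.
buffer = io.StringIO()
log = build_logger(buffer)
log.info("Setting up APIs...")
print(buffer.getvalue(), end="")
```

If lines logged at the same timestamp show different `pid` values, the class is being initialized once per process rather than several times within one process.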

I already did some research; maybe this helps:

  1. wfastcgi GitHub question: I thought the server was being restarted because I was writing the logs to a file inside the webserver folder, so I moved the logs outside the folder, but the server kept restarting (I also tried editing the web.config, but nothing worked for me).
  2. Microsoft dev network question: a similar question I found.

Can anyone explain this behavior to me? I would appreciate any suggestions on how to handle a timed API call or, in other words, a queue.

EDIT:

I found out that IIS has a feature that can either load a website (or web app) on demand or keep it running all the time.

Here is what I found: IIS - "Always On" Application Pool

But this feature has no impact on wFastCGI; the application is still restarting.


Solution

  • After various attempts and recommendations to use some kind of cache or file export, I implemented caching in the web app, and since then it works great.

    I was already using a session for my API requests, so I simply switched from a normal session to a cached session from requests_cache.

    Here is an example of what I did:

    from requests_cache import CachedSession
    
    class Api:
        def __init__(self):
            self.API_KEY = 'xxx-xxx'
            self.BASE_URL = 'https://xxxxxxxx.com/3'
            self.HEADER = {
                'X-Api-Key': f'{self.API_KEY}',
                'Accept': 'application/json'
            }
    
            # Session cache setup ( when data expires )
            self.default_expire_after = 900
            self.urls_expire_after = {
                f'{self.BASE_URL}/endpoint1/': 900,
                f'{self.BASE_URL}/endpoint2/': 1800,
                f'{self.BASE_URL}/endpoint1': 3600
            }
    
    
            # Session that creates a cache file in the root dir in sqlite format
            self.session = CachedSession('cache',
                                         backend='sqlite',
                                         expire_after=self.default_expire_after,
                                         urls_expire_after=self.urls_expire_after)
            self.session.headers.update(self.HEADER)
    

    The API responses with all the data are stored in the cache, and when the data expires the session sends out a new API request. If the data hasn't expired, everything is taken from the cache.
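
The expire-then-refetch behavior described above can be illustrated with the standard library alone. This is only a rough sketch of the idea, not how requests_cache is implemented; `TTLCache`, `fetch_endpoint1`, and the fake clock are hypothetical names introduced for the example.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after a number of seconds."""

    def __init__(self, default_ttl: float,
                 clock: Callable[[], float] = time.monotonic):
        self.default_ttl = default_ttl
        self._clock = clock  # injectable so the expiry logic can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get(self, key: str, fetch: Callable[[], Any],
            ttl: Optional[float] = None) -> Any:
        """Return the cached value, calling fetch() only if it is missing or expired."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh: served from the cache
        value = fetch()      # missing or expired: go to the real source
        lifetime = self.default_ttl if ttl is None else ttl
        self._entries[key] = (now + lifetime, value)
        return value


# Usage with a fake clock, so no real waiting is needed.
calls = []
fake_now = [0.0]
cache = TTLCache(default_ttl=900, clock=lambda: fake_now[0])


def fetch_endpoint1():
    """Stand-in for a real API request (hypothetical)."""
    calls.append("endpoint1")
    return {"value": len(calls)}


first = cache.get("endpoint1", fetch_endpoint1)   # not cached yet: fetched
second = cache.get("endpoint1", fetch_endpoint1)  # within 900 s: from cache
fake_now[0] = 901.0
third = cache.get("endpoint1", fetch_endpoint1)   # expired: fetched again
```

Unlike the SQLite-backed session, this in-memory version lives inside a single process, so its contents are lost whenever that process is restarted.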

    The cached session brings two major improvements: