python python-multiprocessing betfair apiclient

Error when trying to send data from APIClient to a function using multiprocessing


Betfairlightweight API: https://github.com/betcode-org/betfair

To work with this module, you first create an APIClient with your credentials and log in:

trading = betfairlightweight.APIClient(username, pw, app_key=app_key, cert_files=('blablabla.crt','blablabla.key'))
trading.login()

To speed up the data collection process, I use multiprocessing:

import multiprocessing

import betfairlightweight

trading = betfairlightweight.APIClient(username, pw, app_key=app_key, cert_files=('blablabla.crt','blablabla.key'))
trading.login()

def main():
    matches_bf = ...  # DataFrame...
    try:
        max_process = multiprocessing.cpu_count()-1 or 1
        pool = multiprocessing.Pool(max_process)
        list_pool = pool.map(data_event, matches_bf.iterrows())
    finally:
        pool.close()
        pool.join()

    trading.logout()

def data_event(event_bf):
    _, event_bf = event_bf
    event_id = event_bf['event_id']
    filter_catalog_markets = betfairlightweight.filters.market_filter(
        event_ids=[event_id],
        market_type_codes = [
            'MATCH_ODDS'
            ]
        )

    catalog_markets = trading.betting.list_market_catalogue(
        filter=filter_catalog_markets,
        max_results='100',
        sort='FIRST_TO_START',
        market_projection=['RUNNER_METADATA']
    )

    ...  # some more code
    ...  # some more code
    ...  # some more code

This results in 12 logins, which is not an ideal way to access an API.

Why 12 logins?

When I run the code it logs in once, and when the multiprocessing pool is created it logs in 11 more times, once for each process. If I put print(trading) directly below trading.login(), one print appears in the terminal when the code starts, and another 11 appear simultaneously when the pool is created.
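This pattern is what one would expect under the spawn start method (the default on Windows and macOS), where each worker re-imports the main module and therefore re-runs any module-level code, including trading.login(). The sketch below is only an illustration under that assumption: the print call stands in for the login, and the script, the work function, and the worker count of 3 are all hypothetical. It writes a tiny script to a temporary file, runs it in a fresh interpreter, and counts how many times the module-level line executed.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical child script: the print stands in for trading.login().
SCRIPT = textwrap.dedent(
    """
    import multiprocessing

    print("login", flush=True)  # module level, like trading.login()

    def work():
        pass

    if __name__ == "__main__":
        # Force "spawn" so the behaviour is the same on every platform.
        ctx = multiprocessing.get_context("spawn")
        workers = [ctx.Process(target=work) for _ in range(3)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
    """
)


def count_logins():
    """Run the script and count how often its module-level line executed."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "spawn_demo.py"
        path.write_text(SCRIPT)
        result = subprocess.run(
            [sys.executable, str(path)],
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout.split().count("login")


logins = count_logins()
print(logins)  # one run in the parent plus one per spawned worker
```

Code placed under the `if __name__ == "__main__":` guard runs only in the parent, but then the workers no longer see a module-level `trading`, so the client has to be passed to them explicitly.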

So I need to find a way to do the same work using only ONE login.

I tried moving the login into main() and passing the client to the function as an argument:

import multiprocessing
from itertools import repeat

import betfairlightweight

def main():
    trading = betfairlightweight.APIClient(username, pw, app_key=app_key, cert_files=('blablabla.crt','blablabla.key'))
    trading.login()

    matches_bf = ...  # DataFrame...
    try:
        max_process = multiprocessing.cpu_count()-1 or 1
        pool = multiprocessing.Pool(max_process)
        # starmap unpacks each (trading, row) tuple into the two arguments of data_event
        list_pool = pool.starmap(data_event, zip(repeat(trading), matches_bf.iterrows()))
    finally:
        pool.close()
        pool.join()

    trading.logout()

def data_event(trading, event_bf):
    _, event_bf = event_bf
    event_id = event_bf['event_id']
    filter_catalog_markets = betfairlightweight.filters.market_filter(
        event_ids=[event_id],
        market_type_codes = [
            'MATCH_ODDS'
            ]
        )

    catalog_markets = trading.betting.list_market_catalogue(
        filter=filter_catalog_markets,
        max_results='100',
        sort='FIRST_TO_START',
        market_projection=['RUNNER_METADATA']
    )

    ...  # some more code
    ...  # some more code
    ...  # some more code

But the error encountered is:

TypeError: cannot pickle 'module' object

I also tried creating trading inside the function data_event:

import multiprocessing

import betfairlightweight

def main():
    trading = betfairlightweight.APIClient(username, pw, app_key=app_key, cert_files=('blablabla.crt','blablabla.key'))
    trading.login()

    matches_bf = ...  # DataFrame...
    try:
        max_process = multiprocessing.cpu_count()-1 or 1
        pool = multiprocessing.Pool(max_process)
        list_pool = pool.map(data_event, matches_bf.iterrows())
    finally:
        pool.close()
        pool.join()

    trading.logout()

def data_event(event_bf):
    trading = betfairlightweight.APIClient(username, pw, app_key=app_key, cert_files=('blablabla.crt','blablabla.key'))
    _, event_bf = event_bf
    event_id = event_bf['event_id']
    filter_catalog_markets = betfairlightweight.filters.market_filter(
        event_ids=[event_id],
        market_type_codes = [
            'MATCH_ODDS'
            ]
        )

    catalog_markets = trading.betting.list_market_catalogue(
        filter=filter_catalog_markets,
        max_results='100',
        sort='FIRST_TO_START',
        market_projection=['RUNNER_METADATA']
    )

    ...  # some more code
    ...  # some more code
    ...  # some more code

But the error encountered is:

'errorCode': 'INVALID_SESSION_INFORMATION'

The reason is logical: the client created inside each worker process never logged in.

How can I use only one login and still process the events in parallel? Going through them one by one without multiprocessing takes too long to be feasible.

Solution

  • Pass a session to betfairlightweight.APIClient so that the instance can be pickled.

    import requests

    trading = betfairlightweight.APIClient(
        username,
        pw,
        app_key=app_key,
        cert_files=('blablabla.crt','blablabla.key'),
        session=requests.Session(),  # Add this
    )
    

    Explanation

    TypeError: cannot pickle 'module' object

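    When no session is passed, APIClient (through BaseClient) sets self.session to the requests module itself. Modules cannot be pickled, so multiprocessing fails as soon as it tries to send the client to a worker process. A requests.Session instance can be pickled, so passing one avoids the error.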

    class APIClient(BaseClient):
        ...
    
    class BaseClient:
        ...
    
        def __init__(
            self,
            username: str,
            password: str = None,
            app_key: str = None,
            certs: str = None,
            locale: str = None,
            cert_files: Union[Tuple[str], str, None] = None,
            lightweight: bool = False,
            session: requests.Session = None,
        ):
            ...
    
            self.session = session if session else requests
            ...
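
The effect of that default can be reproduced without the library. The following is a minimal sketch, not betfairlightweight code: make_client is a hypothetical stand-in that copies only the `session if session else <module>` fallback, the json module plays the role of requests, a SimpleNamespace plays the role of requests.Session, and the session_token attribute is an assumption about how a logged-in client might carry its token. It mimics what a pool does with its arguments, which is to pickle them in the parent and unpickle them in the worker.

```python
import json  # any module works as a stand-in for `requests` here
import pickle
from types import SimpleNamespace


def make_client(session=None):
    """Hypothetical stand-in for APIClient; mirrors only the session fallback."""
    return SimpleNamespace(
        session=session if session else json,  # a module when no session is given
        session_token=None,  # assumed attribute, set by a (pretend) login
    )


def can_pickle(obj):
    """Try what a pool does with every argument it sends to a worker."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:  # e.g. "cannot pickle 'module' object"
        return False


default_client = make_client()
explicit_client = make_client(session=SimpleNamespace(headers={}))
explicit_client.session_token = "token-from-login"  # pretend login() ran once

default_ok = can_pickle(default_client)    # client holds a module
explicit_ok = can_pickle(explicit_client)  # client holds a plain object

# What a worker would receive: a copy that still carries the token.
worker_copy = pickle.loads(pickle.dumps(explicit_client))
print(default_ok, explicit_ok, worker_copy.session_token)
```

If the real client behaves like the stand-in, each worker gets a copy of the already logged-in client, so a single login in main() should be enough.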