python pandas sqlalchemy modin

Modin shows the warning "Perhaps you already have a cluster running?"


I am using Modin to read a SQL table, but I get this warning:

    import pyodbc
    import sqlalchemy as sal
    from sqlalchemy import create_engine
    import modin.pandas as pd
    from distributed import Client
    client = Client()
    UserWarning: Port 8787 is already in use.
    Perhaps you already have a cluster running?
    Hosting the HTTP server on port 57475 instead

I am new to Modin and can't figure out what's going on. Jupyter Lab slows down for about 5 minutes and then the dataframe is loaded. Any suggestions or recommendations?


Solution

  • It seems that your version of Modin initializes its engine on import, i.e. when import modin.pandas as pd runs. You don't need to create a Dask client yourself after that, because the Dask environment has already been initialized, which is likely why the second cluster reports that port 8787 is already in use. If you do want to create the Dask client yourself, move it above the Modin import:

    import pyodbc
    import sqlalchemy as sal
    from sqlalchemy import create_engine
    from distributed import Client
    client = Client()
    import modin.pandas as pd  # Modin will connect to the existing Dask environment

    Does this make sense?
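
To make the ordering above a bit more concrete, here is a minimal sketch, not a definitive setup. It assumes modin and distributed are installed and that Modin picks up an already running client, as described above. The helper names port_in_use and start_client_then_modin are hypothetical and only for illustration. Passing dashboard_address=":0" asks Dask to choose a free dashboard port, which should avoid the "Port 8787 is already in use" warning when some other cluster already holds that port.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds
        return s.connect_ex((host, port)) == 0


def start_client_then_modin():
    """Create the Dask client first, then import Modin (hypothetical helper).

    Imports are kept inside the function so that the ordering is explicit.
    Assumes distributed and modin are installed.
    """
    from distributed import Client

    # ":0" lets Dask pick a random free port for the dashboard
    client = Client(dashboard_address=":0")

    import modin.config as cfg

    cfg.Engine.put("dask")  # tell Modin to use the Dask engine

    import modin.pandas as pd  # imported only after the client exists

    return client, pd


if __name__ == "__main__":
    # 8787 is the default Dask dashboard port named in the warning
    if port_in_use(8787):
        print("Port 8787 is taken, probably by an earlier Dask cluster.")
    client, pd = start_client_then_modin()
    print(client.dashboard_link)
```

In a notebook, restarting the kernel before running this avoids leftover clusters from earlier cells holding the default port.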