python amazon-web-services pyspark

Pyspark Session did not reach idle status in time


I am running a long Jupyter notebook with PySpark on AWS. I have run into a strange behavior: whenever I stop a running cell and rerun it, I get an error similar to "An error was encountered: Session 111 did not reach idle status in time. Current status is busy." I then have to restart the kernel and rerun everything from the first cell, which is too cumbersome.


Solution

  • Place the code below at the start of the cell to ensure the previous session is stopped before a new one is created.

    try:
        spark.stop()
    except Exception as e:
        print(f"Spark stop failed: {e}")
    

    For long-running jobs, relying on spark-submit is a better approach than running from a Jupyter environment.
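
If the same guard is needed in several cells, the try/except above could be wrapped in a small helper. This is only a sketch: `stop_session_quietly` is a hypothetical name, not part of PySpark, and it assumes the notebook exposes the session as a global called `spark`. It calls nothing beyond the session's own `stop()` method.

```python
def stop_session_quietly(session):
    """Stop a Spark session if one exists; return True if stop() succeeded.

    `session` is assumed to be a SparkSession (or any object with a
    stop() method), or None when no session has been created yet.
    """
    if session is None:
        return False
    try:
        session.stop()
        return True
    except Exception as exc:
        # Report the failure instead of aborting the cell.
        print(f"Spark stop failed: {exc}")
        return False


# At the top of a notebook cell: look the session up by name so the cell
# also works when `spark` has not been defined yet (assumed variable name).
stopped = stop_session_quietly(globals().get("spark"))
```

The return value makes it easy to log whether a previous session was actually stopped before the cell continues.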