I'm following the tutorial "Using Apache Spark 2.0 to Analyze the City of San Francisco's Open Data", where it's claimed that the "local mode" Spark cluster available in Databricks "Community Edition" provides you with 3 executor slots. (So 3 tasks should be able to run concurrently.)
However, when I look at the "Event Timeline" visualization for job stages with multiple tasks in my own notebook on Databricks "Community Edition", it looks like up to 8 tasks were running concurrently.
Is there a way to query the number of executor slots from PySpark or from a Databricks notebook? Or can I directly see the number in the Spark UI somewhere?
"Slots" is a term Databricks uses (or used?) for the threads available to do parallel work for Spark. The Spark documentation and Spark UI calls the same concept "cores", even though they are unrelated to physical CPU cores.
(See this answer on the Hortonworks community, and this "Spark Tutorial: Learning Apache Spark" Databricks notebook.)
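To illustrate the slots-are-threads idea outside of Databricks (where you create the SparkSession yourself), here is a minimal sketch; the app name is just a placeholder:

```python
from pyspark.sql import SparkSession

# In local mode, the number inside "local[N]" is the number of worker
# threads, i.e. the number of slots ("cores") available for running
# tasks concurrently, regardless of how many physical CPU cores exist.
spark = (SparkSession.builder
         .master("local[3]")      # 3 slots, so at most 3 concurrent tasks
         .appName("slots-demo")   # placeholder name
         .getOrCreate())
```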
To see how many there are in your Databricks cluster, click "Clusters" in the navigation area to the left, then hover over the entry for your cluster and click the "Spark UI" link. In the Spark UI, click the "Executors" tab.
You can see the number of executor cores (= executor slots) both in the summary and for each individual executor¹ in the "Cores" column of the respective table there.
¹ There's only one executor in "local mode" clusters, which is the kind of cluster available in Databricks Community Edition.
I'm not sure how to query this number from within a notebook.
```python
spark.conf.get('spark.executor.cores')
```

results in `java.util.NoSuchElementException: spark.executor.cores`, because that configuration key is simply not set on a local-mode cluster.
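One possible workaround, a sketch I haven't verified on Community Edition: pass a default to `conf.get` so the missing key doesn't raise, and fall back to `defaultParallelism`, which in local mode equals the number of worker threads, i.e. the slot count:

```python
# Untested sketch: read spark.executor.cores if it is set, otherwise
# fall back to defaultParallelism (as on local-mode clusters, where the
# key is absent but defaultParallelism equals the number of slots).
cores = spark.conf.get('spark.executor.cores', None)
if cores is None:
    cores = spark.sparkContext.defaultParallelism
print(cores)
```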