apache-spark, apache-spark-sql, azure-data-lake, delta-lake, data-lakehouse

External vs Internal table in Delta Lake


Does an internal (managed) table in Delta Lake have any performance benefit over an external table, given that in both cases the underlying files reside in the data lake?


Solution

  • There should not be much difference, in performance or otherwise, between managed and unmanaged (external) tables. They differ only in the storage path (the default location vs. an explicitly specified one) and in what happens when you drop the table (the data is dropped as well vs. only the table definition is dropped).

    Update Oct 2023: Things can be a bit different when you use Unity Catalog. Right now, managed tables have somewhat more functionality, such as automatic maintenance, but it should eventually come to external tables as well.
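
To make the path and drop-behavior difference concrete, here is a minimal sketch in Python. It only builds the Spark SQL statements and, if you pass in a SparkSession that already has Delta Lake configured, runs them. The table names (`demo_managed`, `demo_external`), the two-column schema, and the helper functions are hypothetical and chosen only for illustration; the drop behavior in the comments is as described in the answer above and may differ under Unity Catalog.

```python
def managed_table_ddl(table: str) -> str:
    """DDL for a managed Delta table: no LOCATION clause,
    so the data lands in the metastore's default storage location."""
    return f"CREATE TABLE IF NOT EXISTS {table} (id BIGINT, name STRING) USING DELTA"


def external_table_ddl(table: str, location: str) -> str:
    """DDL for an external (unmanaged) Delta table:
    the storage path is given explicitly with LOCATION."""
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (id BIGINT, name STRING) "
        f"USING DELTA LOCATION '{location}'"
    )


def demo(spark, location: str) -> None:
    """Create one table of each kind, then drop both.

    Assumes `spark` is an existing SparkSession with Delta Lake enabled
    and `location` is a path you are allowed to write to.
    """
    spark.sql(managed_table_ddl("demo_managed"))
    spark.sql(external_table_ddl("demo_external", location))

    # Managed table: dropping it drops the data as well.
    spark.sql("DROP TABLE demo_managed")

    # External table: dropping it removes only the table definition;
    # the Delta files under `location` are left in place.
    spark.sql("DROP TABLE demo_external")
```

Both statements create the same kind of Delta table on the same kind of storage; the only difference in the DDL is the `LOCATION` clause, which is why reads and writes are not expected to perform differently. In a notebook you would call `demo(spark, "<some path in your data lake>")` with a path of your own.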