python pyspark debian delta-lake data-lake

How do I install the Delta Lake package in an on-premises environment?


I want to build a data lake for myself without using any cloud service. I have a Debian server, and I want to create the data lake with Delta Lake, the Databricks solution.

All the examples I have found set up Delta Lake on a cloud service.

How can I do this on my own server?

I may also want to create a cluster for storing data and doing machine learning, and I want to use only Python to create the Delta Lake.


Solution

  • It's a broad question. Delta Lake itself is just a library that allows you to work with data in a specific format. To use it, you need a few things: