
External tables in Hive


  1. I added a CSV file to HDFS using an R script.

  2. I update this CSV by replacing it with a new CSV or appending data to it.

  3. I created a table in Hive over this CSV using Hue.

  4. I altered it to be an external table.

Now, when the data in the HDFS location changes, will the Hive table be updated automatically?


Solution

  • Yes. That's the thing with external (and also managed) tables in Hive: they're not really tables. You can think of them as links to an HDFS location. Whenever you query an external table, Hive reads the data from the location you specified when you created the table, so changes to the files there show up the next time you run a query.

    From the Hive documentation:

    An EXTERNAL table points to any HDFS location for its storage, rather than being stored in a folder specified by the configuration property hive.metastore.warehouse.dir.
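
To make this concrete, here is a minimal HiveQL sketch of what such a table might look like. The table name, columns, and HDFS path are hypothetical placeholders, not taken from the question, and the delimiter assumes a plain comma-separated file without quoted fields.

```sql
-- Hypothetical table name, columns, and path; adjust to match the actual CSV.
CREATE EXTERNAL TABLE IF NOT EXISTS sales_csv (
  id      INT,
  amount  DOUBLE,
  sold_on STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/example/sales_csv';  -- a directory in HDFS, not a single file

-- If the table was first created as a managed table (as in step 4 of the
-- question), it can be switched to external through a table property:
ALTER TABLE sales_csv SET TBLPROPERTIES ('EXTERNAL' = 'TRUE');

-- Each query scans whatever files are in LOCATION at that moment, so
-- appended rows or replaced files are picked up without reloading anything.
SELECT * FROM sales_csv;
```

Note that LOCATION refers to a directory, so Hive reads every file it finds there. If the R script writes a second CSV into the same directory instead of overwriting the first, both files end up in the query results.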