apache-spark, apache-spark-sql, apache-hudi

How to insert struct and map types in Apache Hudi


I checked the official documentation, but it has no examples of inserting complex types such as struct and map.

So, what is the syntax?

My table definition:

spark-sql> desc struct_map;
_hoodie_commit_time     string  NULL
_hoodie_commit_seqno    string  NULL
_hoodie_record_key      string  NULL
_hoodie_partition_path  string  NULL
_hoodie_file_name       string  NULL
uuid    int     NULL
col1    struct<col11:int,col12:struct<col121:int>>      NULL
col2    map<string,int> NULL

Solution

  • Hudi uses the Spark SQL syntax, so you can rely on the Spark SQL documentation (examples from the Databricks docs: ex1, ex2)

    For a map, use the function map('<key 1>', val1, '<key 2>', val2, ...). For a struct, use either struct(val1, val2, ...), which takes only the values in field order, or named_struct('<field 1>', val1, '<field 2>', val2, ...), which takes alternating field names and values. A nested struct is built by nesting these calls.

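    -- Positional form: struct() takes only the values, in field order
    INSERT INTO struct_map VALUES
    (0, struct(0, struct(0)), map('key1', 1, 'key2', 2));
    -- Named form: the nested struct must also use named_struct, not struct('col121', 0)
    INSERT INTO struct_map VALUES
    (0, named_struct('col11', 0, 'col12', named_struct('col121', 0)), map('key1', 1, 'key2', 2));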
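
    To try the inserts end to end, here is a minimal sketch of a matching table definition and a query that reads the nested values back. The column types come from the desc output in the question; the USING hudi clause and the primaryKey table property are assumptions about how the table was created, so adjust them to your setup.

    ```sql
    -- Assumed DDL: a Hudi table keyed on uuid (the original CREATE statement is not shown in the question)
    CREATE TABLE IF NOT EXISTS struct_map (
      uuid INT,
      col1 STRUCT<col11: INT, col12: STRUCT<col121: INT>>,
      col2 MAP<STRING, INT>
    ) USING hudi
    TBLPROPERTIES (primaryKey = 'uuid');

    -- Read the complex columns back:
    -- dot notation reaches into structs, brackets look up a map key
    SELECT
      uuid,
      col1.col11        AS col11,
      col1.col12.col121 AS col121,
      col2['key1']      AS key1_value,
      map_keys(col2)    AS all_keys
    FROM struct_map;
    ```

    Dot access and bracket lookup are standard Spark SQL, so the same SELECT works on any Spark table with these column types, not only on Hudi.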