hivemapreducehiveqlreducehive-configuration

How to export hive query result to single local file?


I want to export hive query result to single local file with pipe delimiter.

Hive query contains order by clause.

I have tried below solutions.

Solution1:

hive -e 'insert overwrite local directory '/problem1/solution' fields terminated by '|' select * from table_name order by rec_date'

This solution is creating multiple files. After merging files, it loosing data order.

Solution2:

beeline -u 'jdbc:hive2://server_ip:10000/db_name' --silent --outputformat=dsv --delimiterForDSV='|' -e 'select * from table_name order by rec_date' > /problem1/solution

This solution is creating single file but it has empty 2 lines at top and 2 lines at bottom.

I am removing empty lines using sed command. It takes very long time.

Is there any other efficient way to achieve this?


Solution

  • Try these settings for executing ORDER BY on single reducer:

    set hive.optimize.sampling.orderby=false; --disable parallel ORDER BY
    

    Or try to set the number of reducers manually:

    set mapred.reduce.tasks=1;