Is there a lightweight way to change the datatype of a specific column in an ORC file without converting the whole column and rewriting the entire ORC file?
The following is a heavy-weight solution:
I'm looking for a lightweight solution where I can just alter the embedded metadata.
Thanks!
It's not the answer you're looking for, but no, you can't change a column type in ORC without regenerating the file. What you're suggesting is the correct way to do it.
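ORC stores indexes and per-column statistics inside the file (row indexes within each stripe, plus stripe-level and file-level statistics near the file footer, not in a header), and the column data itself is written with type-specific encodings. Changing string -> double would therefore require the entire column to be scanned and re-encoded so that the min/max/sum etc. could be recalculated for what is now a numerical column.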
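To see why the stored statistics can't simply be relabelled, here is a rough sketch in plain Python. It is not ORC's actual on-disk format, and the helper names `string_stats` and `double_stats` are made up for illustration. It only shows that the same values produce different min/max/sum depending on whether the column is treated as string or double.

```python
# Simplified illustration only: this mimics the kind of per-column statistics
# ORC keeps, not its real encoding. Helper names are hypothetical.

def string_stats(values):
    """Stats for a string column: lexicographic min/max and total length."""
    return {
        "min": min(values),
        "max": max(values),
        "sum_length": sum(len(v) for v in values),
    }


def double_stats(values):
    """Stats for a double column: numeric min/max and sum."""
    nums = [float(v) for v in values]
    return {"min": min(nums), "max": max(nums), "sum": sum(nums)}


column = ["10", "9", "100"]

as_string = string_stats(column)
as_double = double_stats(column)

print(as_string)  # {'min': '10', 'max': '9', 'sum_length': 6}
print(as_double)  # {'min': 9.0, 'max': 100.0, 'sum': 119.0}
```

As strings the minimum is "10" and the maximum is "9"; as doubles they are 9.0 and 100.0. Any reader relying on the old statistics for predicate pushdown would skip the wrong data, which is why every value has to be read and the statistics rebuilt.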