I have set the 'partitionSize' option to several different values, but I get the same number of partitions regardless of the value. According to the documentation, it should correspond to the HDFS block size. Is there something I am missing?
HDFS block size: 64 MB
CREATE TABLE TABLE_TEST (DEFINITION_INFO) USING com.sap.spark.vora OPTIONS ( tablename "TABLE_TEST", partitionSize "64", paths "/load_from_here/combined.csv", eagerLoad "true" )
The CSV file is about 680 MB.
The name of the parameter is a bit misleading. It does not control how tables are partitioned; it only influences load performance when data is loaded into tables. In newer versions it might be renamed to avoid this confusion.
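For context, the expectation in the question probably comes from the HDFS side: a file of about 680 MB stored with a 64 MB block size spans 11 blocks. Below is a minimal sketch of that arithmetic in Python. The function name `hdfs_block_count` is hypothetical and is not part of Vora or Hadoop. As explained above, 'partitionSize' does not make the table follow this number.

```python
import math


def hdfs_block_count(file_size_mb: float, block_size_mb: float) -> int:
    """Return how many HDFS blocks a file of the given size occupies.

    Hypothetical helper for illustration only; it is plain ceiling
    division and does not query HDFS or Vora.
    """
    if block_size_mb <= 0:
        raise ValueError("block_size_mb must be positive")
    # The last block may be only partially filled, so round up.
    return math.ceil(file_size_mb / block_size_mb)


# Values from the question: a 680 MB CSV and a 64 MB block size.
print(hdfs_block_count(680, 64))  # prints 11
```

So the file occupies 11 HDFS blocks, but that count describes storage in HDFS, not the number of partitions of the resulting table.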