I have around 2,000,000 messages in a Kafka topic and I want to write these records to HDFS using NiFi, so I am using the PutHDFS
processor along with ConsumeKafka_0_10.
Because this generates many small files in HDFS, I am using the MergeContent processor to merge the records before pushing the file.
This works fine for a small number of messages, but with topics carrying massive data it still writes a single file for every record. Please let me know if the configuration needs changes.
Thank you!!
The Minimum Number of Entries is set to 1, which means a bin can be merged with anywhere from 1 entry up to the Maximum Number of Entries, so under load it happily merges tiny bins. Try raising it to something much higher, like 100k.
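For reference, a MergeContent setup along these lines is a reasonable starting point (the values below are illustrative, not exact requirements, and should be tuned to your data volume and latency needs):

    Merge Strategy              : Bin-Packing Algorithm
    Minimum Number of Entries   : 100000        (don't merge a bin until it has ~100k records)
    Maximum Number of Entries   : 1000000
    Minimum Group Size          : 128 MB        (optional size-based floor, e.g. to target an HDFS block)
    Max Bin Age                 : 5 min         (flush a bin even if the minimums aren't reached)

Setting a Max Bin Age is important once you raise the minimums, otherwise a slow or tail-end topic can leave a partially filled bin sitting in the processor indefinitely.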