I have a question about Storm functionality. Suppose I have a spout that reads a CSV file and emits records chunk by chunk, that is, it emits 100 records at a time to the bolt.
My question is whether a single chunk, when received by the bolt, will be sent to only one executor or will be divided among different executors for the sake of parallelism.
Note: The bolt has 5 executors.
What do you mean by "it emits 100 records at a time"? Does it mean that a single tuple contains 100 CSV lines? Or do you emit 100 tuples (each containing a single CSV line) in a single nextTuple() call?
One side remark: it is considered bad practice to emit multiple tuples in a single call to nextTuple(). If nextTuple() blocks for any reason, the spout thread is blocked and cannot (for example) react to incoming acks. Best practice is to emit a single tuple for each call to nextTuple(). If no tuple is available to be emitted, you should return (without emitting) rather than block waiting until a tuple becomes available.
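To make the "one tuple per call, never block" pattern concrete, here is a minimal sketch in plain Java. It deliberately does not depend on Storm: OneTuplePerCallSpout, offer() and the Consumer-based emitter are hypothetical stand-ins I made up for illustration. In a real topology you would typically extend BaseRichSpout and call emit() on the SpoutOutputCollector instead, and nextTuple() would return void (it returns a boolean here only so the behaviour is easy to observe).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

// Hypothetical stand-in for a spout; NOT a Storm class.
class OneTuplePerCallSpout {
    // CSV lines waiting to be emitted, e.g. filled by a separate reader thread.
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();
    // Assumption: stands in for the collector's emit call of a real spout.
    private final Consumer<String> emitter;

    OneTuplePerCallSpout(Consumer<String> emitter) {
        this.emitter = emitter;
    }

    // Hands one CSV line to the spout without blocking.
    void offer(String csvLine) {
        pending.add(csvLine);
    }

    // Emits at most one tuple per call and never blocks.
    // Returns true if a tuple was emitted, false if nothing was available.
    boolean nextTuple() {
        String line = pending.poll(); // poll() returns null immediately when empty
        if (line == null) {
            return false;             // nothing to emit: just return, do not wait
        }
        emitter.accept(line);         // exactly one tuple for this call
        return true;
    }

    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        OneTuplePerCallSpout spout = new OneTuplePerCallSpout(emitted::add);
        spout.offer("a,1");
        spout.offer("b,2");
        // Three calls: two emit one line each, the third finds the queue empty.
        for (int i = 0; i < 3; i++) {
            System.out.println("emitted? " + spout.nextTuple());
        }
        System.out.println(emitted);
    }
}
```

Because each call returns quickly, the calling thread is free between calls to do its other work, which is the reason for the recommendation above.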