I'm using Heron for streaming analytics on IoT data. The architecture currently has only one spout, with a parallelism factor of 1.
I'm trying to benchmark how much data Heron can hold in the queue it uses internally at the spout.
I'm experimenting with setMaxSpoutPending() by passing different values to it. Is there any limit on the value that can be passed to this method?
Can this parameter be raised further by changing the system configuration or giving the topology more resources?
If you have one spout and one bolt, max spout pending is the best way to control the number of pending tuples. There is no hard upper limit; it can be increased indefinitely. However, increasing it beyond a certain point raises the probability of timeout errors, because more tuples are in flight and each one waits longer before it is acked. In the worst case there may be no forward progress at all. A higher max spout pending also typically requires more heap for the spout and for the other components of the topology.
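To make the trade-off concrete, here is a rough back-of-the-envelope sketch in plain Java. It is not part of the Heron API: the class name, the method names, and the sample numbers (ack rate, timeout, tuple size) are all hypothetical, and the model is a simplification that assumes a steady ack rate and that worst-case latency is roughly the number of pending tuples divided by that rate.

```java
// Rough estimate only; a hypothetical helper, not part of the Heron API.
final class MaxSpoutPendingEstimate {

    // Approximate time a tuple waits when the pipeline is full:
    // tuples in flight divided by tuples acked per second.
    static double worstCaseLatencySecs(long maxSpoutPending, double ackedTuplesPerSec) {
        return maxSpoutPending / ackedTuplesPerSec;
    }

    // Largest max spout pending whose estimated latency stays within the message timeout.
    static long largestSafeMaxSpoutPending(double ackedTuplesPerSec, int messageTimeoutSecs) {
        return (long) Math.floor(ackedTuplesPerSec * messageTimeoutSecs);
    }

    // Approximate heap held by pending tuples, ignoring per-object overhead.
    static long pendingHeapBytes(long maxSpoutPending, long avgTupleBytes) {
        return Math.multiplyExact(maxSpoutPending, avgTupleBytes);
    }

    public static void main(String[] args) {
        // Assumed sample numbers: measure your own topology before relying on them.
        double ackedTuplesPerSec = 2000.0; // observed ack rate of the bolt
        int messageTimeoutSecs = 30;       // configured message timeout
        long avgTupleBytes = 1024;         // average serialized tuple size

        long msp = largestSafeMaxSpoutPending(ackedTuplesPerSec, messageTimeoutSecs);
        System.out.println("Largest safe max spout pending: " + msp);
        System.out.println("Estimated latency at that value (s): "
                + worstCaseLatencySecs(msp, ackedTuplesPerSec));
        System.out.println("Approximate heap for pending tuples (bytes): "
                + pendingHeapBytes(msp, avgTupleBytes));
    }
}
```

Under these assumptions, any value passed to setMaxSpoutPending() above the computed bound would push the estimated latency past the timeout, which is where timeout errors start to dominate. Adding resources helps only insofar as it raises the ack rate or the available heap.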