Are there any real benchmarks that compare the performance of Apache Flink and Apache Storm for real-time processing?
Also, if I want to run this performance comparison and implement it myself, is there an open-source stream API (like the Twitter API) that offers higher throughput than Twitter's?
Thank you!
There are some benchmarks for stream processing in general, but they are not always as broadly applicable or accessible as the ones you can find for RDBMSs.
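The main question you should answer for yourself first is: what exactly do you mean by performance? There are different metrics for benchmarking such a system, for example sustained throughput and end-to-end latency, and a system that does well on one does not necessarily do well on the other.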
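To make the metric question concrete, here is a minimal sketch of how throughput and latency percentiles could be computed from one benchmark run. It assumes you can record, for every result, the time the event was created and the time the result left the system; the names `Record`, `percentile`, and `summarize` are hypothetical and not taken from any of the benchmarks mentioned here.

```python
import math
from dataclasses import dataclass


@dataclass
class Record:
    event_time: float  # seconds, when the event was created at the source
    emit_time: float   # seconds, when the result left the system under test


def percentile(sorted_values, p):
    """Nearest-rank percentile of a sorted, non-empty list (0 < p <= 100)."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(records):
    """Return throughput (records/s) and latency percentiles (s) for one run."""
    latencies = sorted(r.emit_time - r.event_time for r in records)
    start = min(r.event_time for r in records)
    end = max(r.emit_time for r in records)
    duration = end - start
    return {
        "throughput": len(records) / duration if duration > 0 else float("inf"),
        "p50": percentile(latencies, 50),
        "p99": percentile(latencies, 99),
        "max": latencies[-1],
    }
```

Reporting a high percentile next to the median matters because a single slow record barely moves the median but can dominate what a user of the pipeline actually experiences.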
Here are some benchmarking works that helped me:
A recent benchmarking framework that is implemented for Storm and Flink is the Yahoo Streaming Benchmark. It has a fixed internal architecture using Kafka and Redis and a predefined query/topology. Still, it is a good starting point.
Karimov et al. have a nice paper on benchmarking these systems. It is worth a read since it really helps to understand the possible metrics. Unfortunately, I cannot find any implementation or further information on the workload (data and queries) they use, so I would say it is more helpful for understanding than for reuse.
van Dongen et al. do a more in-depth analysis of several stream processing systems and offer their source code on GitHub. Unfortunately, there is no implementation for Storm. Still, there are some interesting ideas and contributions on how to build such a framework.
As you can see, stream processing is highly diverse in the ways you can set up and benchmark your systems...
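If you end up implementing the comparison yourself, one option (my suggestion, not something the works above prescribe) is to generate the input synthetically at a controlled rate instead of depending on an external API such as Twitter's. The sketch below is an assumption-laden illustration: the event fields, `synthetic_events`, and `replay` are hypothetical names, and `send` stands for whatever function writes one event to your ingestion layer, for example a Kafka topic.

```python
import json
import random
import time


def synthetic_events(n, seed=0, start_time=0.0, rate=1000.0):
    """Yield n JSON-encoded events whose timestamps are spaced for `rate` events/s."""
    rng = random.Random(seed)  # seeded so both systems can receive identical input
    for i in range(n):
        yield json.dumps({
            "id": i,
            "key": rng.randrange(100),  # hypothetical grouping key
            "event_time": start_time + i / rate,
        })


def replay(events, rate, send, clock=time.monotonic, sleep=time.sleep):
    """Call send(event) for each event, pacing to roughly `rate` events/s."""
    begin = clock()
    count = 0
    for count, event in enumerate(events, start=1):
        due = begin + (count - 1) / rate  # time at which this event should go out
        delay = due - clock()
        if delay > 0:
            sleep(delay)
        send(event)
    return count  # number of events sent
```

Because the generator is seeded, the same sequence can be replayed against Storm and Flink, and the target rate can be raised step by step until one of them can no longer keep up.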