hazelcast-jet

Hazelcast Jet Kafka with a non-serializable event handler


I want to use hazelcast-jet-kafka in my app because, in my case, the number of Kafka partitions is limited. As I understand it, jet-kafka parallelism doesn't depend on the number of Kafka partitions. It would be nice to find an explanation of how jet-kafka achieves this independence.

But my main question is how to handle events in Jet when my event handler is not serializable. One solution I've found is to use a map sink and add a local entry listener to that map, but this feels like a workaround because I don't need to store the events in a map. Is it possible to set the map size to zero in such a setup?

I also see a new type of sink in the docs, the observable. It looks like what I want, but an observable listener cannot be restricted to local entries only, so it doesn't suit my case.

Could you help me find the right solution? Or is hazelcast-jet-kafka not a good choice in this case?


Solution

  • It would be nice to find an explanation of how jet-kafka achieves independence from the number of Kafka partitions.

    One Jet thread can handle any number of partitions, so it's easy to achieve this independence. Jet just distributes all the partitions fairly among all the Kafka connector threads.
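
    To illustrate the idea, here is a minimal sketch of spreading partitions over a fixed number of connector threads. The round-robin rule and the names `PartitionAssignmentSketch` and `assign` are assumptions made for this example; this is not Jet's actual assignment code.

```java
import java.util.ArrayList;
import java.util.List;

final class PartitionAssignmentSketch {

    // Assigns partition p to worker (p % workerCount), a simple round-robin rule.
    static List<List<Integer>> assign(int partitionCount, int workerCount) {
        List<List<Integer>> byWorker = new ArrayList<>();
        for (int w = 0; w < workerCount; w++) {
            byWorker.add(new ArrayList<>());
        }
        for (int p = 0; p < partitionCount; p++) {
            byWorker.get(p % workerCount).add(p);
        }
        return byWorker;
    }

    public static void main(String[] args) {
        // More partitions than workers: each worker reads several partitions.
        System.out.println(assign(10, 4)); // [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
        // Fewer partitions than workers: some workers get nothing to read.
        System.out.println(assign(2, 4));  // [[0], [1], [], []]
    }
}
```

    With fewer partitions than workers, some source workers simply stay idle. The parallelism of the later pipeline stages is configured separately from the source, so it is not tied to the partition count.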

  • But my question is how I can handle events in Jet when my event handler is not serializable.

    Hazelcast Jet doesn't require your event handler to be serializable. If you need a stateful handler, you have to supply a function that creates the state object. The function must be serializable, but the state doesn't have to be. If you just want a stateless mapping function, it must be serializable, but usually there's no problem with that.
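
    The factory idea can be shown with plain Java serialization, without any Jet classes. In this sketch, `EventHandler`, `SerializableSupplier`, and `demo` are hypothetical names invented for the illustration. Only the factory is serialized; the handler is created after deserialization, which is roughly what happens when a job is sent to a cluster member.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Supplier;

final class SerializableFactorySketch {

    // Hypothetical handler; the Thread field means it cannot be serialized.
    static final class EventHandler {
        private final Thread owner = Thread.currentThread();
        private int handled;

        int handle(String event) {
            return ++handled; // mutable, purely local state
        }
    }

    // A Supplier that can be serialized (hypothetical helper interface).
    interface SerializableSupplier<T> extends Supplier<T>, Serializable {}

    static byte[] serialize(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            return bytes.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    @SuppressWarnings("unchecked")
    static int demo() {
        // Only the factory crosses the wire, never the handler itself.
        SerializableSupplier<EventHandler> factory = EventHandler::new;
        byte[] wire = serialize(factory);
        Supplier<EventHandler> received = (Supplier<EventHandler>) deserialize(wire);

        // The non-serializable state object is created on the receiving side.
        EventHandler handler = received.get();
        handler.handle("first");
        return handler.handle("second");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2
    }
}
```

    In Jet 4.x, as far as I recall, this role is played by a `ServiceFactory` passed to `mapUsingService`; check the documentation for your version for the exact API.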

    If you are getting an error that says a function is non-serializable, this can be due to a common pitfall of capturing more state than you actually need in the lambda. You should show your code in that case.
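
    A minimal sketch of that pitfall, again in plain Java so it runs without Jet. `JobBuilder` and `SerializableFunction` are hypothetical names for this example. Reading a field inside a lambda captures `this`, so the whole enclosing object must be serializable. Copying the field to a local variable first captures only the value.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Function;

final class LambdaCaptureSketch {

    // A Function that can be serialized (hypothetical helper interface).
    interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}

    // Hypothetical enclosing class that is NOT Serializable.
    static final class JobBuilder {
        private final String prefix;

        JobBuilder(String prefix) {
            this.prefix = prefix;
        }

        // Reads the field inside the lambda, so the lambda captures `this`.
        SerializableFunction<String, String> capturesThis() {
            return s -> prefix + s;
        }

        // Copies the field to a local first, so only the String is captured.
        SerializableFunction<String, String> capturesLocal() {
            String localPrefix = prefix;
            return s -> localPrefix + s;
        }
    }

    // Returns false if Java serialization rejects the object graph.
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        JobBuilder builder = new JobBuilder("event-");
        System.out.println(isSerializable(builder.capturesThis()));  // false
        System.out.println(isSerializable(builder.capturesLocal())); // true
    }
}
```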