apache-kafka streaming mapr bigdata

What are the differences between Kafka and MapR Streams from a coding perspective?


I will need to implement MapR Streams in the future, but currently I only have access to Kafka. Is exploring Kafka now useful, so that I can easily pick up MapR Streams once I get access?


Solution

  • There is no big difference between the Kafka and MapR Streams APIs in terms of coding.

    But there are some differences in configuration and API arguments:

    1. Spark Streaming's Kafka integration supports both the Receiver and the Direct approach, but MapR Streams supports only the Direct approach.
    2. The offset reset value for reading the data from the start is smallest in Kafka, but earliest in MapR Streams.
    3. The Kafka API lets you pass the key and value decoder classes as method arguments, but with the MapR Streams API you have to configure them in the Kafka params map under the key.deserializer and value.deserializer keys.

    Examples of the Direct approach, showing the Kafka and MapR Streams API calls that create the DStream:

    Kafka API:

    // setting the topic.
    HashSet<String> topicsSet = new HashSet<String>(Arrays.asList("myTopic"));
    
    // setting the broker list.
    Map<String, String> kafkaParams = new HashMap<String, String>();
    kafkaParams.put("metadata.broker.list", "localhost:9092");
    
    // To read the messages from start.
    kafkaParams.put("auto.offset.reset", "smallest");
    
    // creating the DStream
    JavaPairInputDStream<byte[], byte[]> kafkaStream = KafkaUtils.createDirectStream(streamingContext, byte[].class, byte[].class, DefaultDecoder.class, DefaultDecoder.class, kafkaParams, topicsSet);
    

    MapR Streams API:

    // setting the topic.
    HashSet<String> topicsSet = new HashSet<String>(Arrays.asList("myTopic"));
    
    // setting the broker list.
    Map<String, String> kafkaParams = new HashMap<String, String>(); 
    kafkaParams.put("metadata.broker.list", "localhost:9092"); 
    
    // To read the messages from start.
    kafkaParams.put("auto.offset.reset", "earliest");
    
    // setting up the key and value deserializers (byte arrays, to match the byte[] key and value types of the stream)
    kafkaParams.put("key.deserializer", ByteArrayDeserializer.class.getName());
    kafkaParams.put("value.deserializer", ByteArrayDeserializer.class.getName());
    
    // creating the DStream
    JavaPairInputDStream<byte[], byte[]> kafkaStream = KafkaUtils.createDirectStream(streamingContext, byte[].class, byte[].class, kafkaParams, topicsSet);
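
Since the two snippets differ only in a few parameters, one way to prepare for the switch is to keep those differences in a single place. The following is a minimal sketch using only java.util, not something provided by Spark, Kafka, or MapR: the class name DirectStreamParams and the method build are hypothetical, and it assumes byte-array keys and values, matching the byte[] stream types used above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Spark, Kafka, or MapR) that collects the
// configuration differences described in points 2 and 3 of this answer.
final class DirectStreamParams {

    // Fully qualified name of Kafka's byte-array deserializer, kept as a
    // string so this sketch compiles without the Kafka client on the classpath.
    private static final String BYTE_ARRAY_DESERIALIZER =
            "org.apache.kafka.common.serialization.ByteArrayDeserializer";

    private DirectStreamParams() {
    }

    // Builds the params map that is passed to createDirectStream.
    // maprStreams = false gives the Kafka settings shown in this answer,
    // maprStreams = true gives the MapR Streams settings.
    static Map<String, String> build(String brokers, boolean maprStreams) {
        Map<String, String> params = new HashMap<>();
        params.put("metadata.broker.list", brokers);

        // Point 2: the "read from the start" value differs between the two.
        params.put("auto.offset.reset", maprStreams ? "earliest" : "smallest");

        // Point 3: MapR Streams takes the deserializers from the map, while
        // the Kafka call receives decoder classes as method arguments.
        if (maprStreams) {
            params.put("key.deserializer", BYTE_ARRAY_DESERIALIZER);
            params.put("value.deserializer", BYTE_ARRAY_DESERIALIZER);
        }
        return params;
    }
}
```

With such a helper, the rest of the job only has to choose the matching createDirectStream overload. For example, DirectStreamParams.build("localhost:9092", true) yields the map used in the MapR Streams example.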
    

    I hope the above explanation helps you understand the differences between the Kafka and MapR Streams APIs.

    Thanks,
    Hokam
    www.streamanalytix.com