hadoop mapreduce google-bigquery abstraction

What is Google's Dremel? How is it different from MapReduce?


Google's Dremel is described here. What's the difference between Dremel and MapReduce?


Solution

  • Check this article out. Dremel is what the future of Hive should (and will) be.

    The major issue with MapReduce and the tools built on top of it, such as Pig and Hive, is the inherent latency between submitting a job and getting the answer. Dremel takes a different approach (published by Google in a 2010 paper) which...

    ...uses a novel query execution engine based on aggregator trees...

    ...to run almost real-time queries that are both interactive and ad hoc, neither of which MapReduce supports. Pig and Hive aren't real time either.

    You should keep an eye on projects coming out of this. It is pretty new to me too, so comments from other experts are welcome!

    Edit: Dremel is what the future of Hive (and not MapReduce, as I mentioned before) should be. Hive currently provides a SQL-like interface for running MapReduce jobs. It has very high latency, so it is not practical for ad hoc data analysis. Dremel provides a very fast SQL-like interface to the data by using a different technique than MapReduce.
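
To make the "aggregator trees" idea quoted above a bit more concrete, here is a toy sketch in Python. It is not Dremel's actual implementation, only an illustration under simple assumptions: each leaf scans its own shard and produces a partial result, and parent nodes merge the partial results of their children until a single answer reaches the root. The function names (`leaf_aggregate`, `merge`, `query_tree`), the `fanout` parameter, and the sample data are all made up for this example.

```python
from collections import Counter


def leaf_aggregate(rows, key, value):
    """Leaf node: scan one local shard and return partial sums per group."""
    partial = Counter()
    for row in rows:
        partial[row[key]] += row[value]
    return partial


def merge(partials):
    """Intermediate or root node: add up the partial sums from its children."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts key by key
    return total


def query_tree(shards, key, value, fanout=2):
    """Answer 'SELECT key, SUM(value) GROUP BY key' through a tree of mergers.

    `fanout` is a hypothetical knob: how many children each parent merges.
    """
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    # Bottom level: every shard is aggregated independently (in a real
    # system these scans would run in parallel on different machines).
    level = [leaf_aggregate(shard, key, value) for shard in shards]
    # Upper levels: merge groups of `fanout` children until one result is left.
    while len(level) > 1:
        level = [merge(level[i:i + fanout]) for i in range(0, len(level), fanout)]
    return dict(level[0]) if level else {}


if __name__ == "__main__":
    shards = [
        [{"country": "US", "clicks": 3}, {"country": "DE", "clicks": 1}],
        [{"country": "US", "clicks": 2}],
        [{"country": "FR", "clicks": 5}, {"country": "DE", "clicks": 4}],
    ]
    print(query_tree(shards, "country", "clicks"))
```

The point of the sketch is that only small partial aggregates travel up the tree, never the raw rows, which is one reason this style of execution can answer aggregate queries with much lower latency than launching a full batch job.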