What are some distributed and scalable alternatives to Hadoop? I am looking for a distributed file system like HDFS that can serve as cheap and effective storage, and I would like a data processing engine (batch or real-time) on top of it. I know Spark can be a good alternative, but I would like to use this system as a file archive that is distributed, fault tolerant, and scalable. Are there any suitable solutions?
Some other alternatives to Hadoop and Apache Spark are Cluster Map Reduce and Hydra. Both are reasonably good options for big data projects. Read more here: https://datafloq.com/read/Big-Data-Hadoop-Alternatives/1135