I am working on a thesis proposal on how to optimize big data architecture using blockchain/IPFS, by comparing data availability on blockchain/IPFS with data availability on Hadoop/HDFS. The challenge is: how do I calculate or measure availability on both architectures?
Here is a thesis on using IPFS with Hadoop for big data analysis: https://www.cse.unsw.edu.au/~hpaik/thesis/showcases/16s2/scott_brisbane.pdf (one-slide summary) https://s3-ap-southeast-2.amazonaws.com/scott-brisbane-thesis/decentralising-big-data-processing.pdf (actual paper)
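As for putting a number on availability, here is a rough sketch of one common approach. It is not taken from the thesis linked above. It assumes you can run the same periodic "can I read this file right now?" probe against both systems and record each result as success or failure. The function names are hypothetical, and the second function assumes replicas fail independently, which real clusters and peer-to-peer networks only approximate.

```python
def measured_availability(probe_results):
    """Empirical availability: fraction of read probes that succeeded.

    probe_results is an iterable of booleans, one per probe
    (True = the data could be read at that moment).
    """
    results = list(probe_results)
    if not results:
        raise ValueError("no probes recorded")
    return sum(1 for ok in results if ok) / len(results)


def replicated_availability(node_availability, replicas):
    """Theoretical availability of data stored on several nodes.

    Assumes (hypothetically) that each node is up with the same
    probability and that nodes fail independently, so the data is
    unavailable only when every replica is down at once.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - node_availability) ** replicas


if __name__ == "__main__":
    # Illustrative, made-up probe log: 3 successes out of 4 probes.
    print(measured_availability([True, True, False, True]))
    # Illustrative inputs: nodes up 90% of the time, 3 copies.
    print(replicated_availability(0.9, 3))
```

You could run the same probe schedule against an HDFS path and an IPFS content hash, then compare the two measured values. The replica count for the theoretical estimate would come from the HDFS replication factor on one side and from however many peers hold or pin the content on the IPFS side.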