I have a large amount of static data that needs to offer random access. Since I'm using Disco to digest it, I'm using the very impressive-looking Discodex (key, value) store on top of the Disco Distributed File System (DDFS). However, Disco's documentation is rather sparse, so I can't figure out how to use my Discodex indices as input to a Disco job.
Is this even possible? If so, how do I do this?
Alternatively, am I thinking about this incorrectly? Would it be better to just store the data as a text file on DDFS?
Never mind, it appears that Discodex isn't really meant to be used this way. It might be possible, but it would be far better to simply use semantic DDFS tags to refer to blobs of data.
The correct use case for Discodex is to store indices built by a Disco map-reduce program that do not need to be the input of another map-reduce program.
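For anyone landing here, this is roughly what the DDFS-tag approach looks like. Treat it as a sketch rather than a definitive recipe: the tag name `data:mydata`, the tab-separated line format, and the helper names `tag_input`, `map_record`, and `run_job` are all assumptions for illustration. It also assumes the data was already pushed to DDFS under that tag (for example with `ddfs chunk`) and that a Disco cluster is running.

```python
def tag_input(tag_name):
    """Build the tag:// URL form that a Disco job accepts as input."""
    return "tag://" + tag_name


def map_record(line, params):
    # Assumption: each stored line looks like "key<TAB>value".
    # partition() leaves value empty if the line has no tab.
    key, _, value = line.partition("\t")
    yield key, value


def run_job(tag_name):
    # Imported lazily so the helpers above can be tried without Disco installed.
    from disco.core import Job, result_iterator

    # Map-only job over every blob attached to the tag.
    job = Job().run(input=[tag_input(tag_name)], map=map_record)
    return list(result_iterator(job.wait()))


if __name__ == "__main__":
    # "data:mydata" is a hypothetical tag name.
    for key, value in run_job("data:mydata"):
        print(key, value)
```

The point is that the job's `input` is just a list of tag URLs, so no Discodex index is involved when feeding data into map-reduce.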