scala sdk here-olp

Is reading unpublished Volatile layer partitions against best practice?


I'm using an OLP Volatile layer as the back end of a real-time dashboard (the average update cadence is about 5 seconds). The data is partitioned by source ID, and the set of source IDs varies a lot over time.

I understand that the documentation recommends publishing your volatile layer partitions; however, unlike upload, publish is an expensive operation, and I believe it is not designed to be performed every few seconds.

So what I've been doing so far is skipping the publish step when writing data to the layer:

val writeEngine =
    DataEngine().writeEngine("hrn:of:my:catalog", new StableBlobIdGenerator(123L))
writeEngine.put(
  NewPartition(
    partition = "source-id-1",
    layer = "my-volatile-layer",
    data = someData
  )
)

and reading the data back using the same blob ID generator as before:

readEngine
  .getDataAsBytes(new ReferencePartition(
    version = 123L,
    partition = "source-id-1",
    layer = "my-volatile-layer",
    dataHandle = (new StableBlobIdGenerator(123L)).generateBlobId(NewPartition(
      partition = "source-id-1",
      layer = "my-volatile-layer",
      data = NewPartition.ByteArrayData(Array.emptyByteArray)
    ))
  ))

I realize I'm treating the Volatile layer as an in-memory key-value store, and I understand that this way I won't be able to see my data in the OLP console UI; programmatically, though, the data is still uploaded and readable. Is this a legitimate use of the Volatile layer API?


Solution

  • It is valid to use a volatile layer as a key-value store even without publishing metadata. As long as the data handles are known, this is fine. Metadata is useful if you need to query by timestamp or partition ID, that is, when the data handle is unknown.

    If you do choose to publish metadata for your volatile layer, the most efficient approach is to initialize a publication once and upload metadata as partitions are added or removed; just do not submit the publish job for finalization.
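
To illustrate why known data handles are enough for the key-value pattern, here is a minimal, self-contained sketch that does not use the HERE SDK at all. Everything in it is an assumption made for illustration: `VolatileKvSketch`, `handleFor`, `put` and `get` are hypothetical names, the SHA-256 derivation is not the algorithm used by `StableBlobIdGenerator`, and the in-memory map merely stands in for the volatile blob store. The point it shows is that a writer and a reader who agree on the seed, layer and partition name arrive at the same handle with no metadata lookup.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest
import scala.collection.concurrent.TrieMap

// Hypothetical stand-in for the pattern in the question; NOT the HERE SDK.
object VolatileKvSketch {

  // Derives a deterministic handle from a seed, a layer ID and a partition name.
  // Assumption: any stable function of these inputs would do; SHA-256 is used
  // here only because it ships with the JDK.
  def handleFor(seed: Long, layer: String, partition: String): String = {
    val digest = MessageDigest.getInstance("SHA-256")
    val bytes  = digest.digest(s"$seed/$layer/$partition".getBytes(UTF_8))
    bytes.map(b => f"${b & 0xff}%02x").mkString
  }

  // In-memory map standing in for the volatile blob store, keyed by handle.
  private val blobStore = TrieMap.empty[String, Array[Byte]]

  // "Upload": store the payload under the derived handle, no metadata involved.
  def put(seed: Long, layer: String, partition: String, data: Array[Byte]): Unit =
    blobStore.update(handleFor(seed, layer, partition), data)

  // "Read": recompute the same handle and fetch the payload, if present.
  def get(seed: Long, layer: String, partition: String): Option[Array[Byte]] =
    blobStore.get(handleFor(seed, layer, partition))
}
```

For example, after `VolatileKvSketch.put(123L, "my-volatile-layer", "source-id-1", bytes)`, a call to `VolatileKvSketch.get(123L, "my-volatile-layer", "source-id-1")` finds the same payload, while a partition name that was never written yields `None`. That miss case is where published metadata would help, since it lets a reader discover which partitions exist.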