I have a data-fetch stage where I get multiple DataFrames and serialize them. I'm currently treating OutputPath as a directory: I create it if it doesn't exist, then serialize all the DataFrames into that path, each under a different name.
In a subsequent pipeline stage (say, predict) I need to retrieve all of them through InputPath.
From the documentation, it seems InputPath/OutputPath are meant to be files. Does Kubeflow have any limitation if I use them as directories?
The ComponentSpec's {inputPath: input_name} and {outputPath: output_name} placeholders and their Python analogs (input_name: InputPath() / output_name: OutputPath()) are designed to support both files/blobs and directories.
They provide the path for the input/output data, regardless of whether that data is a blob/file or a directory.
The only limitation is that the UI might not be able to preview such artifacts, but the pipeline itself will work.
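As a rough illustration of the directory pattern described in the question, here is a minimal sketch in plain Python. The function names (fetch_data, predict), the file names, and the sample rows are all hypothetical, and the standard csv module stands in for DataFrame serialization so the snippet runs without extra dependencies. In an actual component, the two path parameters would be annotated with OutputPath() and InputPath() respectively instead of str.

```python
import csv
from pathlib import Path


def fetch_data(output_dir_path: str) -> None:
    """Hypothetical data-fetch step that writes several tables into one directory.

    In a real KFP component, output_dir_path would be annotated as OutputPath().
    """
    out_dir = Path(output_dir_path)
    # Treat the provided path as a directory and create it if it is missing,
    # as the question describes.
    out_dir.mkdir(parents=True, exist_ok=True)

    # Made-up sample tables standing in for the fetched DataFrames.
    tables = {
        "train": [{"x": 1, "y": 2}, {"x": 3, "y": 4}],
        "test": [{"x": 5, "y": 6}],
    }
    for name, rows in tables.items():
        # One file per table, each with its own name inside the directory.
        with open(out_dir / f"{name}.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=["x", "y"])
            writer.writeheader()
            writer.writerows(rows)


def predict(input_dir_path: str) -> dict:
    """Hypothetical downstream step that reads every table back.

    In a real KFP component, input_dir_path would be annotated as InputPath().
    """
    tables = {}
    for csv_file in sorted(Path(input_dir_path).glob("*.csv")):
        with open(csv_file, newline="") as f:
            # Key each table by its file name without the extension.
            tables[csv_file.stem] = list(csv.DictReader(f))
    return tables
```

The downstream step only needs to know the naming convention used inside the directory (here, one .csv file per table), since the pipeline passes the directory through as a single artifact.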