I would like to configure a process function and pass it custom configuration, such as a database hostname.
I know I can create a config file and then have it in my global configuration. Having that, I can pass it as a constructor parameter to the process function.
Is there any other way to pass configuration to the process function?
There is the open() method, but I did not manage to pass parameters to it.
You are correct: passing configuration via the function's constructor is the most common approach, and the Configuration parameter of the open() method doesn't work for passing in configuration values.
Two issues with using the function's constructor and then saving the configuration in a non-transient member are that (a) all of your configuration data has to be serializable, and (b) this data is serialized with your job graph before being sent by the JobManager to all of the TaskManagers, which can slow things down if it gets large.
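To make the serialization point concrete, here is a minimal sketch of the constructor approach. It uses plain Java serialization as a stand-in for Flink shipping the function to the workers, so it runs without Flink on the classpath. DbConfig, MyFunction, and roundTrip are hypothetical names, and in a real job the function would extend a Flink function class and open() would be Flink's lifecycle method rather than a hand-written one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ConfigViaConstructor {

    // Hypothetical config holder. It must be Serializable because it travels with the function.
    static class DbConfig implements Serializable {
        private static final long serialVersionUID = 1L;
        final String hostname;
        final int port;

        DbConfig(String hostname, int port) {
            this.hostname = hostname;
            this.port = port;
        }
    }

    // Stand-in for a process function (assumption: no Flink dependency in this sketch).
    static class MyFunction implements Serializable {
        private static final long serialVersionUID = 1L;

        // Non-transient: serialized together with the function.
        private final DbConfig config;

        // Transient: not serialized, so it is rebuilt in open() on the worker.
        private transient StringBuilder connection;

        MyFunction(DbConfig config) {
            this.config = config;
        }

        // Mimics the open() lifecycle hook: build resources from the shipped config.
        void open() {
            connection = new StringBuilder("connected to " + config.hostname + ":" + config.port);
        }

        String describe() {
            return connection == null ? "not opened" : connection.toString();
        }
    }

    // Serializes and deserializes an object, imitating the trip from client to worker.
    static <T> T roundTrip(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        MyFunction shipped = roundTrip(new MyFunction(new DbConfig("db.example.com", 5432)));
        System.out.println(shipped.describe()); // transient field is empty after the trip
        shipped.open();
        System.out.println(shipped.describe()); // config survived and was used in open()
    }
}
```

The config object survives the round trip because it is a serializable, non-transient field, while anything that cannot be serialized (a connection, a client) is marked transient and created in open().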
For larger data sizes, I have just passed an S3 URL to the function's constructor and then downloaded the data in the open() call.
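A rough sketch of that idea, again without Flink so it can run on its own: only a URL string is stored in the function, and the actual settings are loaded in open(). LookupFunction and setting() are hypothetical names, and the example reads a local file: URL because java.net.URL does not understand s3:// addresses; a real job would need an S3 client or a file system abstraction that supports S3 in its place.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Serializable;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class ConfigViaUrl {

    // Stand-in for a process function (assumption: no Flink dependency in this sketch).
    static class LookupFunction implements Serializable {
        private static final long serialVersionUID = 1L;

        // Only this small string is serialized with the function.
        private final String configUrl;

        // Loaded on the worker in open(), so it never travels with the job graph.
        private transient Properties settings;

        LookupFunction(String configUrl) {
            this.configUrl = configUrl;
        }

        // Mimics the open() lifecycle hook: fetch the data the URL points to.
        void open() throws IOException {
            Properties loaded = new Properties();
            try (InputStream in = URI.create(configUrl).toURL().openStream()) {
                loaded.load(in);
            }
            settings = loaded;
        }

        String setting(String key) {
            return settings.getProperty(key);
        }
    }

    public static void main(String[] args) throws Exception {
        // A temporary properties file plays the role of the remote object.
        Path file = Files.createTempFile("job-config", ".properties");
        Files.write(file, "db.hostname=db.example.com\n".getBytes(StandardCharsets.UTF_8));

        LookupFunction fn = new LookupFunction(file.toUri().toString());
        fn.open();
        System.out.println(fn.setting("db.hostname"));

        Files.delete(file);
    }
}
```

The trade-off is that every parallel instance fetches the data itself when it opens, in exchange for keeping the serialized function small.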
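Or you can use Flink's broadcast state, where the configuration arrives as a broadcast stream that every parallel instance of the function receives.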