I have a pipeline that reads 20 files from storage, extracts the path of each file, and loads it into a table. Ideally the record count should be 20, but when I execute the pipeline, the same records are loaded over and over, so the total record count keeps increasing indefinitely. I am wondering whether I am making a mistake here.
I just replicated the issue. My guess is that you are inserting one record into BigQuery for each record inside the files, rather than one record per file. If you choose, for example, the Blob format in the source, you will get only one record per file.
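To illustrate the difference, here is a minimal Python sketch, not the pipeline's actual implementation; the `read_as_text` / `read_as_blob` helpers and the `input_dir` directory are hypothetical names used only to show why the record counts diverge:

```python
from pathlib import Path

def read_as_text(path: Path) -> list[str]:
    # A "text"-style format emits one record per line in the file,
    # so a file with N lines contributes N records.
    return path.read_text().splitlines()

def read_as_blob(path: Path) -> list[bytes]:
    # A "blob"-style format emits exactly one record per file,
    # regardless of how many lines the file contains.
    return [path.read_bytes()]

# e.g. the 20 files sitting in the storage bucket / local folder
files = sorted(Path("input_dir").glob("*"))

text_records = [rec for f in files for rec in read_as_text(f)]
blob_records = [rec for f in files for rec in read_as_blob(f)]

print(len(text_records))  # can be far larger than 20 (one record per line)
print(len(blob_records))  # exactly len(files), i.e. 20
```

So if each file happens to contain many lines (or the same content repeated), a line-oriented format will keep emitting records, while a per-file format gives you the one-record-per-file count you expect.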