Let's say I create a stream like this:
stream create --name ftpstream --definition "ftp --username=ftpuser --password=pass123 --host=host.com --remoteDir=/home/ftpuser --mode=ref | log" --deploy
If the FTP server contains two files, both will be transferred and the log sink will log something like this:
2017-10-24T20:43:08+0200 1.3.1.RELEASE INFO task-scheduler-3 sink.ftpstream - /tmp/xd/ftp/dummy.txt
2017-10-24T20:43:08+0200 1.3.1.RELEASE INFO task-scheduler-3 sink.ftpstream - /tmp/xd/ftp/test.txt
However, the message is only logged the first time a new file is processed (if I add a new file to the FTP server, the log sink logs a message). If I delete already-transferred files from the local /tmp/xd/ftp directory, the files are transferred again, but no log message is written.
How do I correctly get the file reference each time a file is transferred?
Thanks.
You need a custom FTP source - the default source has an AcceptOnceFileListFilter in its local-filter, so it won't pass the same file twice (even though it's fetched).
If you use a FileSystemPersistentAcceptOnceFileListFilter, it will pass the file if the lastModified date changes.
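
To illustrate the difference between the two filters, here is a minimal plain-Java sketch of the two behaviours. It does not use the Spring Integration classes; the names FilterSketch, AcceptOnceSketch and PersistentAcceptOnceSketch are made up for this example and only model the idea (remembering the file name versus remembering the file name together with its last-modified timestamp).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified model of the two filter behaviours; not the real Spring Integration classes.
public class FilterSketch {

    // Accept-once behaviour: a file name passes only the first time it is seen.
    static class AcceptOnceSketch {
        private final Set<String> seen = new HashSet<>();

        boolean accept(String name, long lastModified) {
            // The timestamp is ignored; Set.add returns false for a name already seen.
            return seen.add(name);
        }
    }

    // Persistent accept-once behaviour: a file passes if it is new
    // or if its last-modified timestamp differs from the stored one.
    static class PersistentAcceptOnceSketch {
        private final Map<String, Long> seen = new HashMap<>();

        boolean accept(String name, long lastModified) {
            Long previous = seen.put(name, lastModified);
            return previous == null || previous != lastModified;
        }
    }

    public static void main(String[] args) {
        AcceptOnceSketch once = new AcceptOnceSketch();
        System.out.println(once.accept("dummy.txt", 1000L)); // true: first time seen
        System.out.println(once.accept("dummy.txt", 2000L)); // false: same name, re-fetched copy is dropped

        PersistentAcceptOnceSketch persistent = new PersistentAcceptOnceSketch();
        System.out.println(persistent.accept("dummy.txt", 1000L)); // true: first time seen
        System.out.println(persistent.accept("dummy.txt", 1000L)); // false: unchanged timestamp
        System.out.println(persistent.accept("dummy.txt", 2000L)); // true: timestamp changed
    }
}
```

In this model, a file that is deleted locally and fetched again only reaches the sink with the second filter, and only if the new local copy ends up with a different timestamp than the one recorded earlier.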