java sftp jsch

How to do an atomic SFTP file transfer using JSch, so that the file is not accessible until the write process has finished?


I've written a small Java program which uses JSch to transfer multiple text files to a remote server. Since individual files can get quite big, a transfer can take up to 20 seconds.

On the remote server, the resulting file is read at various times which I have no control over. As a test, I copied the file on the server while the SFTP transfer was still running. The copy was incomplete, because the file had not been fully written at the time of the copy command.

How can I ensure that the file can only be accessed after the transfer has fully finished, so that it is always read completely? Since I can't control file access on the remote server, I need a way to do this from my Java program.

Here is the relevant part of the code I wrote:

InputStream contentInputStream = null;
try {
    contentInputStream = new ByteArrayInputStream(Files.readAllBytes(Paths.get("test1.txt")));
} catch (IOException e) {
    e.printStackTrace();
}
sftpChannel.put(contentInputStream, "abc.txt");

Solution

  • You're writing the file into a directory on the remote server, and some process on the remote server is watching for files to appear there. You don't want the remote process to act on the file until the file has been fully written to the server.

    Whoever set up this system should have designed it with this issue in mind. It takes time to transfer files between servers, and transfers can also fail before they complete. There needs to be a designed-in way to transfer files to the server and then make them available to this remote process.

    There are three common ways to do this:

    Write the file to a different folder: Write the file into a "temporary" or "working" directory on the remote server, which is on the same filesystem as the target folder but isn't being monitored by the remote process. Once the file transfer is completed, move the file from the temporary directory to the actual destination directory.

    Moving the file from one directory to another on the same filesystem should be "atomic", meaning it appears instantaneous to other processes. SFTP permits moving files from one directory to another on the remote server.
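
    As a rough sketch of this staging-directory approach, the code below uploads into a staging directory and then moves the finished file into place. `RemoteFiles`, `InMemoryRemoteFiles`, `StagedUpload` and all directory names are made up for illustration. With JSch, the two operations would correspond to `ChannelSftp.put(InputStream, String)` and `ChannelSftp.rename(String, String)`. The in-memory fake is only there so the example runs without a server.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical stand-in for the two SFTP operations used here.
 * With JSch these would map to ChannelSftp.put(InputStream, String)
 * and ChannelSftp.rename(String, String).
 */
interface RemoteFiles {
    void put(InputStream src, String remotePath) throws IOException;
    void rename(String oldPath, String newPath) throws IOException;
}

/** In-memory fake (an assumption for this sketch) so it runs without a server. */
class InMemoryRemoteFiles implements RemoteFiles {
    final Map<String, byte[]> files = new HashMap<>();

    @Override
    public void put(InputStream src, String remotePath) throws IOException {
        files.put(remotePath, src.readAllBytes());
    }

    @Override
    public void rename(String oldPath, String newPath) throws IOException {
        byte[] data = files.remove(oldPath);
        if (data == null) {
            throw new IOException("no such file: " + oldPath);
        }
        files.put(newPath, data);
    }
}

class StagedUpload {
    /** Uploads into stagingDir, then moves the finished file into targetDir. */
    static void upload(RemoteFiles remote, InputStream content,
                       String stagingDir, String targetDir, String fileName)
            throws IOException {
        String stagingPath = stagingDir + "/" + fileName;
        String targetPath = targetDir + "/" + fileName;
        // Slow part: the file grows in a directory the remote process does not watch.
        remote.put(content, stagingPath);
        // Only after the upload has returned is the file moved into the watched directory.
        remote.rename(stagingPath, targetPath);
    }
}
```

    With a real connection you would wrap your `ChannelSftp` in a small `RemoteFiles` implementation and call something like `StagedUpload.upload(remote, contentInputStream, "/upload/staging", "/upload/incoming", "abc.txt")` (paths are placeholders). Whether a rename replaces an already existing destination file can depend on the server, so that case is worth testing.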

    Write the file to a special filename: Write the file to the destination directory on the remote system, but use a special file name which the remote process would ignore. Once the file transfer is complete, rename the file to have the correct name.

    For example, if the remote process is looking for file names ending in ".xml", you'd create a file named "foo.xml.tmp" on the remote server, write your data to it, and then rename it from "foo.xml.tmp" to "foo.xml". SFTP permits renaming files on the remote server.
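
    A minimal sketch of the temporary-name variant, including cleanup of the partial file if the transfer fails. `SftpOps`, `TempNameUpload`, `RecordingSftp` and the `.tmp` suffix are assumptions for illustration. With JSch the equivalents would be `ChannelSftp.put(InputStream, String)`, `ChannelSftp.rename(String, String)` and `ChannelSftp.rm(String)`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical minimal view of an SFTP channel. With JSch the equivalents are
 * ChannelSftp.put(InputStream, String), ChannelSftp.rename(String, String)
 * and ChannelSftp.rm(String).
 */
interface SftpOps {
    void put(InputStream src, String path) throws IOException;
    void rename(String from, String to) throws IOException;
    void rm(String path) throws IOException;
}

class TempNameUpload {
    // Assumption: the remote process ignores names with this suffix.
    static final String TEMP_SUFFIX = ".tmp";

    static String tempNameFor(String finalName) {
        return finalName + TEMP_SUFFIX;
    }

    /** Uploads under a temporary name and renames only after the upload succeeded. */
    static void upload(SftpOps sftp, InputStream content, String finalName)
            throws IOException {
        String tempName = tempNameFor(finalName);
        try {
            sftp.put(content, tempName);
        } catch (IOException e) {
            // Best effort: try not to leave a partial file behind.
            try {
                sftp.rm(tempName);
            } catch (IOException ignored) {
                // The original failure is the one worth reporting.
            }
            throw e;
        }
        sftp.rename(tempName, finalName);
    }
}

/** Records calls instead of talking to a server, so the sketch runs standalone. */
class RecordingSftp implements SftpOps {
    final List<String> calls = new ArrayList<>();
    boolean failPut = false;

    @Override
    public void put(InputStream src, String path) throws IOException {
        calls.add("put " + path);
        if (failPut) {
            throw new IOException("simulated transfer failure");
        }
    }

    @Override
    public void rename(String from, String to) {
        calls.add("rename " + from + " -> " + to);
    }

    @Override
    public void rm(String path) {
        calls.add("rm " + path);
    }
}
```

    The point of the ordering is that the final name only ever appears via the rename, so a reader matching on ".xml" should never see a half-written file under that name.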

    Use modification timestamps: The remote process can check the last-modified timestamp on the files it is about to process, and ignore files that have been modified in the last minute or so. This behavior would have to be built into the remote process.

    SFTP has an operation to set the modification timestamp on a remote file, and JSch supports it, but you wouldn't normally call this function explicitly. Normally, you'd depend on the remote file's last-modified timestamp to reflect your process writing to the file.
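
    To illustrate what such a check could look like on the reading side, here is a small sketch in Java using only the standard library. It assumes the remote process is something you could change and that it reads from the local filesystem. `QuietPeriodCheck` and `isSettled` are hypothetical names, and the quiet period is whatever you pass in.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

class QuietPeriodCheck {
    /**
     * Returns true if the file was last modified at least quietPeriod before "now",
     * i.e. nobody appears to have written to it recently.
     */
    static boolean isSettled(Path file, Duration quietPeriod, Instant now)
            throws IOException {
        Instant lastModified = Files.getLastModifiedTime(file).toInstant();
        return !lastModified.plus(quietPeriod).isAfter(now);
    }
}
```

    A reader would call something like `QuietPeriodCheck.isSettled(path, Duration.ofMinutes(1), Instant.now())` and skip the file for now if it returns false.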

    Note this method is less reliable than the others for a couple of reasons: