I want HDFS commands to fail if a parent directory doesn't exist when making subdirectories. When I use any of FileSystem#mkdirs
, I find that an exception isn't risen, instead creating non-existent parent directories:
import java.util.UUID
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
val conf = new Configuration()
conf.set("fs.defaultFS", s"hdfs://$host:$port")
val fileSystem = FileSystem.get(conf)
val cwd = fileSystem.getWorkingDirectory
// Guarantee non-existence by appending two UUIDs.
val dirToCreate = new Path(cwd, new Path(UUID.randomUUID.toString, UUID.randomUUID.toString))
fileSystem.mkdirs(dirToCreate)
Without the cumbersome burden of checking for the existence, how can I force HDFS to throw an exception if a parent directory doesn't exist?
The FileSystem API does not support this type of behavior. Instead, FileContext#mkdir
should be used; for example:
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileContext, FileSystem, Path}
import org.apache.hadoop.fs.permission.FsPermission
val files = FileContext.getFileContext()
val cwd = files.getWorkingDirectory
val permissions = new FsPermission("644")
val createParent = false
// Guarantee non-existence by appending two UUIDs.
val dirToCreate = new Path(cwd, new Path(UUID.randomUUID.toString, UUID.randomUUID.toString))
files.mkdir(dirToCreate, permissions, createParent)
The above example will throw:
java.io.FileNotFoundException: Parent directory doesn't exist: /user/erip/f425a2c9-1007-487b-8488-d73d447c6f79