r tiff

How do I speed up code to read exif data?


I am reading exif data from a large number of tif files in order to get the date each file was modified and return it as a list. I am using the following code:

ref_img_path <- file_list[1]

exif_data <- exifr::read_exif(ref_img_path)
exif_data
modification_date <- exif_data$FileModifyDate
print(modification_date)


read_tif_modification_dates <- function(folder_path) {
  # Escape the dot and anchor the pattern so only names ending in ".tif" match
  tif_files <- list.files(folder_path, pattern = "\\.tif$", full.names = TRUE)

  timestamps <- list()

  # Read the metadata of each file, one at a time
  for (tif_file in tif_files) {
    exif_data <- exifr::read_exif(tif_file)
    modification_date_str <- exif_data$FileModifyDate
    modification_date <- as.POSIXct(strptime(modification_date_str, "%Y:%m:%d %H:%M:%S"))
    timestamps[[tif_file]] <- modification_date
  }

  return(timestamps)
}

timestamps_list <- read_tif_modification_dates(folder_path)

It works fine with a folder that contains 11 images. It also works with a folder that contains 100,000 images, but it takes hours. Is there a way to speed this up?


Solution

  • Note that "FileModifyDate" isn't actually an EXIF metadata tag. The underlying exiftool is just pulling that info from the file system. If that's all the information you need, you don't need to parse the EXIF data at all. You can run

    file.info(tif_files)$mtime
    

    as one statement to get the modified date/time values for all the files in one go. They are already POSIXct values, so you won't need to convert them explicitly to date/time values.
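
    Putting that together, a minimal sketch of a drop-in replacement for the loop might look like the following. The function name `read_tif_mtimes` is made up for illustration, and the pattern assumes you want both `.tif` and `.tiff` files regardless of case; adjust it if that is not what you need.

```r
# Hypothetical helper (name chosen for illustration only).
# Returns a POSIXct vector of modification times, named by file path.
read_tif_mtimes <- function(folder_path) {
  # Assumption: match names ending in .tif or .tiff, case-insensitively
  tif_files <- list.files(folder_path, pattern = "\\.tiff?$",
                          full.names = TRUE, ignore.case = TRUE)

  # file.info() is vectorised, so one call covers every file;
  # no EXIF parsing is involved
  mtimes <- file.info(tif_files)$mtime

  # Keep the path-to-timestamp association that the original list had
  names(mtimes) <- tif_files
  mtimes
}

# Small self-contained demo using empty files in a temporary folder
demo_dir <- tempfile("tif_demo_")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir, c("a.tif", "b.TIFF", "notes.txt"))))

timestamps <- read_tif_mtimes(demo_dir)
print(timestamps)
```

    The result is a named vector rather than a list; if the rest of your code expects a list, wrapping it in `as.list()` should work. Base R also has `file.mtime()`, which returns the same column directly.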