Is there a way to encrypt SPSS files (.sav) using the cyphr package? Encrypting .csv files works fine, but when I try to encrypt a .sav file, I get the following error message:
Error in db_lookup(dat$ns, dat$name, file_arg) :
Rewrite rule for haven::write_sav not found
I have found a workaround: I first convert the original files (*.csv and *.sav) into *.rds files and then encrypt those. This works as intended.
With this procedure, an encrypted *.rds file with the same base name is created in a separate folder for every *.csv and *.sav file in the original folder.
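The error message suggests that cyphr has no rewrite rule for haven::write_sav, i.e. encrypt() does not know which argument holds the file name. A possible alternative to the conversion is sketched below. It assumes that cyphr::rewrite_register() takes the package name, the function name and the name of the file argument, and that the haven and sodium packages are installed. The data frame, the temporary file and the throwaway key are made up for illustration.

```r
# Sketch only: runs when cyphr, haven and sodium are available.
if (requireNamespace("cyphr", quietly = TRUE) &&
    requireNamespace("haven", quietly = TRUE) &&
    requireNamespace("sodium", quietly = TRUE)) {
  # Tell cyphr which argument of each function refers to the file
  cyphr::rewrite_register("haven", "write_sav", "path")
  cyphr::rewrite_register("haven", "read_sav", "file")

  # Throwaway symmetric key, used here only for the demonstration
  demo_key <- cyphr::key_sodium(sodium::keygen())

  demo_df <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))
  demo_sav <- tempfile(fileext = ".sav")

  # Write an encrypted .sav file and read it back with the same key
  cyphr::encrypt(haven::write_sav(demo_df, demo_sav), demo_key)
  back <- cyphr::decrypt(haven::read_sav(demo_sav), demo_key)
}
```

If this works in your setup, the conversion to *.rds is not strictly required; I have only tested the *.rds route described below.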
Load packages:
library(rio)
library(stringr)
library(cyphr)
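Set paths to the folder with the original unencrypted data (data_originals) and to the folder in which the encrypted data will be stored (data_encrypted):
# Absolute paths, so that the setwd() calls below do not invalidate them
path_originals <- file.path(getwd(), "data_originals")
path_encrypted <- file.path(getwd(), "data_encrypted")
dir.create(path_encrypted, showWarnings = FALSE)  # make sure the output folder exists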
Set working directory:
setwd(path_originals)
Specify the directory in which the encrypted files are to be stored:
data_dir <- path_encrypted
Set path of personal key:
path_key_user <- "~/.ssh/"
Create a key for the data and encrypt that key with the personal key:
data_admin_init(data_dir, path_user = path_key_user)
Get the data key, which is used below to add encrypted data to the directory:
key <- cyphr::data_key(data_dir, path_user = path_key_user)
For *.csv files:
List the names of all *.csv files in the folder data_originals:
# pattern is a regular expression, not a shell glob
filenames_csv <- list.files(path = path_originals, pattern = "\\.csv$")
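The pattern argument of list.files() is a regular expression, so an unanchored pattern can match more files than intended. A small base-R sketch with made-up file names shows the difference between a loose pattern, an anchored one, and a glob converted with glob2rx():

```r
# Hypothetical file names, for illustration only
files <- c("survey.csv", "notes_csv.txt", "survey.csv.bak", "panel.sav")

# Unanchored pattern: "." matches any character, and the match may
# occur anywhere in the name
loose <- grepl(".csv", files)

# Escaped dot plus end anchor: only names that end in ".csv"
strict <- grepl("\\.csv$", files)

# glob2rx() converts a shell-style wildcard into the equivalent regex
glob_regex <- utils::glob2rx("*.csv")
from_glob <- grepl(glob_regex, files)

files[loose]      # "survey.csv" "notes_csv.txt" "survey.csv.bak"
files[strict]     # "survey.csv"
files[from_glob]  # "survey.csv"
```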
Read in the *.csv files located in the folder data_originals:
df_csv <- lapply(filenames_csv, read.csv2)
Build the names the *.csv files should get as *.rds files:
# Escaped dot and end anchor: only a trailing ".csv" is replaced
filenames_csv2rds <- str_replace(filenames_csv, "\\.csv$", ".rds")
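When swapping the file extension, an unescaped dot in the pattern matches any character, which can change the wrong part of a name. The sketch below uses base R's sub() so that it has no dependencies (stringr::str_replace() likewise replaces the first regex match); the file name is made up:

```r
# Hypothetical file name that contains "csv" twice
name <- "my_csv_data.csv"

# Unescaped ".": the first match is "_csv", not the extension
wrong <- sub(".csv", ".rds", name)

# Escaped dot plus end anchor: only the extension is replaced
right <- sub("\\.csv$", ".rds", name)

# Alternative without a hand-written regular expression
also_right <- paste0(tools::file_path_sans_ext(name), ".rds")

wrong       # "my.rds_data.csv"
right       # "my_csv_data.rds"
also_right  # "my_csv_data.rds"
```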
Save the contents of the *.csv files as *.rds files in the folder created for the encrypted files (data_encrypted):
setwd(path_encrypted)
for (i in seq_along(df_csv)) {
  # [[i]] extracts the data frame itself; [i] would pass a one-element list
  export(df_csv[[i]], filenames_csv2rds[i])
}
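When exporting the elements of a list in a loop, it matters whether an element is extracted with [i] or with [[i]]. A small base-R sketch with made-up data frames:

```r
# Two hypothetical data frames stored in a list, as lapply() would return them
dfs <- list(data.frame(a = 1:2), data.frame(b = 3:4))

single_bracket <- dfs[1]    # still a list, of length one
double_bracket <- dfs[[1]]  # the data frame itself

class(single_bracket)  # "list"
class(double_bracket)  # "data.frame"

# What ends up in the file depends on which of the two is saved
tmp <- tempfile(fileext = ".rds")
saveRDS(double_bracket, tmp)
restored <- readRDS(tmp)
is.data.frame(restored)  # TRUE
```

Saving the [i] version would store a list that wraps the data frame, so every reader of the file would have to unwrap it again.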
For *.sav files:
Set working directory:
setwd(path_originals)
List the names of all *.sav files in the folder data_originals:
# pattern is a regular expression, not a shell glob
filenames_sav <- list.files(path = path_originals, pattern = "\\.sav$")
Read in the *.sav files located in the folder data_originals:
df_sav <- lapply(filenames_sav,
                 Hmisc::spss.get,
                 use.value.labels = TRUE,  # convert labelled values to factors
                 lowernames = TRUE)        # convert variable names to lower case
Build the names the *.sav files should get as *.rds files:
# Escaped dot and end anchor: only a trailing ".sav" is replaced
filenames_sav2rds <- str_replace(filenames_sav, "\\.sav$", ".rds")
Save the contents of the *.sav files as *.rds files in the folder created for the encrypted files (data_encrypted):
setwd(path_encrypted)
for (i in seq_along(df_sav)) {
  # [[i]] extracts the data frame itself; [i] would pass a one-element list
  export(df_sav[[i]], filenames_sav2rds[i])
}
List the names of the *.rds files that are now in the data_encrypted folder and still have to be encrypted:
filenames <- list.files(path = path_encrypted, pattern = "\\.rds$")
Read in all *.rds files located in the folder data_encrypted:
ldf <- lapply(filenames, readRDS)
Define the paths under which the encrypted files are saved:
paths <- file.path(data_dir, filenames)
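Encrypt and save all files in the folder data_encrypted:
# One loop is enough: ldf and paths have the same length and order.
# Each encrypted file replaces the unencrypted *.rds file saved earlier.
for (i in seq_along(ldf)) {
  encrypt(saveRDS(ldf[[i]], paths[i]), key)
}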
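To check that an encrypted file can be read back, decrypt() can wrap readRDS() in the same way that encrypt() wraps saveRDS(). The sketch below is self-contained: instead of the data key from above it uses a throwaway key created with cyphr::key_sodium() and sodium::keygen() (an assumption for the demonstration, requiring the sodium package), and the data frame and temporary file are made up.

```r
# Sketch only: runs when cyphr and sodium are available.
if (requireNamespace("cyphr", quietly = TRUE) &&
    requireNamespace("sodium", quietly = TRUE)) {
  # Throwaway symmetric key, used here only for the demonstration
  demo_key <- cyphr::key_sodium(sodium::keygen())
  demo_df <- data.frame(id = 1:3, group = c("a", "b", "c"))
  demo_path <- tempfile(fileext = ".rds")

  # Write the encrypted file
  cyphr::encrypt(saveRDS(demo_df, demo_path), demo_key)

  # A plain readRDS() is expected to fail, since the content is encrypted
  plain_read <- try(readRDS(demo_path), silent = TRUE)

  # With the key, the original data frame comes back
  restored_df <- cyphr::decrypt(readRDS(demo_path), demo_key)
}
```

In the workflow above, the analogous call would be decrypt(readRDS(paths[1]), key).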