I'm looking for a way to split up a large Excel file into a number of smaller Excel files using R.
Specifically, there are three things I would like to do:
My data is structured like this:
Area | School | Student ID | Test score A | Test score B |
---|---|---|---|---|
North | A | 134 | 24 | 31 |
North | A | 221 | 26 | 33 |
South | B | 122 | 22 | 21 |
South | B | 126 | 25 | 25 |
I have data covering roughly 200 schools located in 5 different areas.
Any guidance on how to do this would be greatly appreciated!
As some of the comments have referenced, this will be hard to solve without knowing your specific operating environment & folder structure, I solved this using Windows 10/ C drive user folder but you can customize to your system. You're going to need a folder with all the images from the school saved by the name (or the ID I created) of the school and they will all need to be the same format (JPG or PNG). Plus, you need folders created for each Area you want to output to (openxlsx can write the files but not create the folders for you). Once you have those setup, something similar to this should work for you, but I would highly recommend referring to the openxlsx documentation for more info:
library(dplyr)
library(openxlsx)
# Load your excel file into a df
# g0 = openxlsx::read.xlsx(<your excel file & sheet..see openxlsx documentation>)
# Replace this tibble with your actual excel file, this was just for an example
g0 = tibble(
Area = c("North","North","North","North"),
School = c("A","A","B","B"),
Student_ID = c(134,221,122,126),
test_score_a = c(24,26,22,25),
test_score_b = c(31,33,21,25))
# Need a numeric school id for the loop
g0$school_id = as.numeric(as.factor(g0$School))
# Loop through schools, filter using dplyr and create a sheet per school
for (i in 1:n_distinct(g0$school_id)){
g1 = g0 %>%
filter(school_id == i)
## Create a new workbook
wb <- createWorkbook(as.character(g1$School))
## Add some worksheets
addWorksheet(wb, as.character(g1$School))
## Insert images
## I left the image as a direct path for example but you can change to a
## relative path once you get it working
img <- system.file("C:","Users","your name","Documents","A","A.jpg", package = "openxlsx")
insertImage(wb, as.character(g1$School), img, startRow = 5, startCol = 3, width = 6, height = 5)
## Save workbook
saveWorkbook(wb, paste0("C://Users//your name//Documents//",g0$Area,"//",g0$school,".xlsx"), overwrite = TRUE)
}