I have a similar question to u/Ananas here: Sentinel3 OLCI (chl) Average of netcdf files on Python I am running into similar problems, in so much that I cannot seem to extract the necessary information from the .nc-files and then merge them to create a time-series. In my case,I am trying to do this in R. My current code, which I have followed and customised from here: https://www.youtube.com/watch?v=jWRszWCVWLc&t=1504s , returns an error:
Error in `[<-.data.frame`(`*tmp*`, variable, value = c(0, 0, 0, 0, 0, :
replacement has 1927 rows, data has 2202561
Maybe I am going at it the wrong way from the start and R-s capabilities wiht .nc files are not suited for this? Any suggestions are welcomed.
Here is my code
extract_variable_from_netcdf<- function(nc,variable){
tryCatch(
{
result<-var.get.nc(nc,variable)
return(result)
},
error=function(cond){
message(paste(variable,"attribute not found"))
message("Here is the original error message")
message(cond)
}
)
}
extract_global_attribute_from_netcdf<- function(nc,global_attribute){
tryCatch(
{
result<-att.get.nc(nc,"NC_GLOBAL",global_attribute)
return(result)
},
error=function(cond){
message(paste(global_attribute,"attribute not found"))
message("Here is the original error message")
message(cond)
}
)
}
folder<- "path to folder"
files<- list.files(folder, pattern= ".nc", full.names = TRUE)
variables<- c("conc_chl", "iop_bpart","lat", "lon") #variables I need to extract
global_attrs<- c("start_date", "stop_date")
headers<-c(global_attrs,variables)
df<-data.frame(matrix(ncol=length(headers), nrow=0))
colnames(df)<- headers
for(file in files) {
nc<- open.nc(file)
chl<- var.get.nc(nc, "conc_chl")
num_chl<- length(chl)
newdf<- data.frame(matrix(ncol=length(headers), nrow=num_chl))
colnames(newdf)<- headers
for (global_attribute in global_attrs) {
newdf[global_attribute]<-extract_global_attribute_from_netcdf(nc,global_attribute)
}
for (variable in variables) {
newdf[variable]<-extract_variable_from_netcdf(nc,variable)
}
df<-merge(df,newdf,all=TRUE)
}
The way I have used ".nc" files with satellite data, in R. Have been reading it in with the "raster" library as a raster file.
library(raster)
r <- raster("yuor_file.nc")
plot(r) # quick plot to see if everything is as it should be
The way I read in my timeseries was with a loop, and in addition I used a function found from this site somewhere, to covert the raster into a sensible r-data frame
stack overflow function, to convert the loaded raster to data frame
gplot_data <- function(x, maxpixels = 50000) {
x <- raster::sampleRegular(x, maxpixels, asRaster = TRUE)
coords <- raster::xyFromCell(x, seq_len(raster::ncell(x)))
## Extract values
dat <- utils::stack(as.data.frame(raster::getValues(x)))
names(dat) <- c('value', 'variable')
dat <- dplyr::as.tbl(data.frame(coords, dat))
if (!is.null(levels(x))) {
dat <- dplyr::left_join(dat, levels(x)[[1]],
by = c("value" = "ID"))
}
dat
}
Read in one file at a time, convert with function and return data.frame
files<- list.files(folder, pattern= ".nc", full.names = TRUE)
fun <- function(i) {
#read in one file at a time
r <- raster(files[i])
#convert to normal data frame
temp <- gplot_data(r)
temp #output
}
dat <- plyr::rbind.fill(lapply(1:length(files), fun)) #bind each iteration
Here a plot using ggplot2 and ggforce.
ggplot() +
geom_tile(data = dat,
aes(x = x, y = y, fill = value))
Alternatively if you do not know the context of you file, the following, from the "ncdf4" package, will help you inspect it. https://towardsdatascience.com/how-to-crack-open-netcdf-files-in-r-and-extract-data-as-time-series-24107b70dcd
library(ncdf4)
our_nc_data <- nc_open("/your_file.nc")
print(our_nc_data)
# look for the variable names and assign them to vectors that can be bound together in dataframes
lat <- ncvar_get(our_nc_data, "lat") #names of latitude column
lon <- ncvar_get(our_nc_data, "lon") #name of longitude column
time <- ncvar_get(our_nc_data, "time") #the time was called time
tunits <- ncatt_get(our_nc_data, "time", "units")# check units
lswt_array <- ncvar_get(our_nc_data, "analysed_sst") #select the relevant variable, this is temperature named "analysed_sst"