I have a .csv file that I’m having trouble processing: the numbers use commas as decimal separators, which causes problems when I read the file into R. Excel writes decimals with commas, while R expects dots. I want to replace the commas with dots as I load the data into R. Here are the approaches I’ve tried:
Approach 1:
data <- read.csv2("cleaned_data.csv", stringsAsFactors = FALSE)
data$Daily_Loss <- as.numeric(gsub(",", ".", data$Daily_Loss))
data$Total_Loss <- as.numeric(gsub(",", ".", data$Total_Loss))
data$loss_2023 <- as.numeric(gsub(",", ".", data$loss_2023))
data$loss_2024 <- as.numeric(gsub(",", ".", data$loss_2024))
Approach 2:
data <- read.csv("cleaned_data.csv", sep = ";", dec = ",", stringsAsFactors = FALSE)
data$Daily_Loss <- as.numeric(gsub(",", ".", trimws(data$Daily_Loss)))
data$Total_Loss <- as.numeric(gsub(",", ".", trimws(data$Total_Loss)))
data$loss_2023 <- as.numeric(gsub(",", ".", trimws(data$loss_2023)))
data$loss_2024 <- as.numeric(gsub(",", ".", trimws(data$loss_2024)))
Any suggestions on how I can fix this more efficiently? What I want to do afterwards is:
total_loss_2023 <- sum(data$loss_2023, na.rm = TRUE)
total_loss_2024 <- sum(data$loss_2024, na.rm = TRUE)
cat("Total Loss in 2023: ", total_loss_2023, "\n")
cat("Total Loss in 2024: ", total_loss_2024, "\n")
Any help is welcome! That is part of the table I am trying to convert to a different format.
To handle comma decimals efficiently, try:
data <- read.csv2("cleaned_data.csv", dec = ",", stringsAsFactors = FALSE)
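read.csv2() is designed for European-style CSV files with semicolon separators and comma decimals, so sep = ";" and dec = "," are already its defaults and the columns should arrive as numeric without any gsub() step.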
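To see what read.csv2() does without touching the real file, here is a minimal sketch that parses a small in-memory table. The column names are borrowed from the question, but the Date column and all values are made up for illustration, so treat this as an assumption about how the file is laid out rather than a copy of it.

```r
# Small in-memory stand-in for the file; the Date column and the values are
# invented for illustration and are not taken from the real data.
csv_text <- "Date;loss_2023;loss_2024
2024-01-01;1,5;2,25
2024-01-02;3,0;
2024-01-03;0,75;4,5"

# read.csv2() defaults to sep = ";" and dec = ",", so the loss columns are
# parsed as numeric directly and the empty field becomes NA.
demo <- read.csv2(text = csv_text, stringsAsFactors = FALSE)

# Inspect the column types: the two loss columns should be numeric.
str(demo)

total_loss_2023 <- sum(demo$loss_2023, na.rm = TRUE)
total_loss_2024 <- sum(demo$loss_2024, na.rm = TRUE)
cat("Total Loss in 2023: ", total_loss_2023, "\n")
cat("Total Loss in 2024: ", total_loss_2024, "\n")
```

If str() shows a loss column as character instead of numeric, the file probably contains something other than plain comma-decimal numbers in that column (stray text, currency symbols, thousands separators), and that is the case where manual cleaning is worth the effort.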
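If that doesn't work, you can read the file with the default decimal setting, so the affected columns arrive as text, and replace the commas yourself:
# sep = ";" assumes a semicolon-delimited file, as in the question; with the
# default sep = "," the decimal commas would be treated as column breaks
data <- read.csv("cleaned_data.csv", sep = ";", stringsAsFactors = FALSE)
# Convert all numeric columns at once
numeric_cols <- c("Daily_Loss", "Total_Loss", "loss_2023", "loss_2024")
data[numeric_cols] <- lapply(data[numeric_cols], function(x) as.numeric(gsub(",", ".", x)))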
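When the manual conversion produces unexpected NA values, it helps to see which raw entries failed to parse. The following is a rough sketch, not a definitive fix: to_numeric_comma is a hypothetical helper name and the sample vector is invented for illustration.

```r
# Hypothetical helper: convert comma-decimal strings to numeric and report
# entries that were non-empty but still failed to parse.
to_numeric_comma <- function(x) {
  cleaned <- trimws(as.character(x))
  # fixed = TRUE treats "," as a literal character rather than a regex
  converted <- suppressWarnings(as.numeric(gsub(",", ".", cleaned, fixed = TRUE)))
  bad <- !is.na(cleaned) & cleaned != "" & is.na(converted)
  if (any(bad)) {
    message("Could not convert: ", paste(unique(cleaned[bad]), collapse = " | "))
  }
  converted
}

# Made-up sample: two valid numbers, an empty field, a text placeholder, an NA.
raw <- c("1,5", " 2,25 ", "", "n/a", NA)
vals <- to_numeric_comma(raw)
print(vals)
```

The same helper can be dropped into the lapply() call in place of the anonymous function, for example data[numeric_cols] <- lapply(data[numeric_cols], to_numeric_comma), assuming those columns exist in the data frame.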