I have a data frame in R with a column that looks like this:
Venue
AAA 2001
BBB 2016
CCC 1996
... ....
ZZZ 2007
To make the data frame slightly easier to work with, I want to split the Venue column into two columns, Location and Year, like so:
Location Year
AAA 2001
BBB 2016
CCC 1996
... ....
ZZZ 2007
I have tried several variations of the cSplit() function to achieve this:
df = cSplit(df, "Venue", " ") #worked somewhat, however issues with places with multiple words (e.g. Los Angeles, Rio de Janeiro)
df = cSplit(df, "Venue", "[:digit:]")
df = cSplit(df, "Venue,", "[0-9]+")
None of these has worked so far. I'd appreciate it if anyone could point me in the right direction.
How about this?
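d <- data.frame(Venue = c("AAA 2001", "BBB 2016", "CCC 1996", "cc d 2001"),
                stringsAsFactors = FALSE)
# Drop the digits, then trim the space that was left in front of the year
d$Location <- trimws(gsub("[[:digit:]]", "", d$Venue))
# Drop everything that is not a digit, leaving only the year (as a string)
d$Year <- gsub("[^[:digit:]]", "", d$Venue)
d
#       Venue Location Year
# 1  AAA 2001      AAA 2001
# 2  BBB 2016      BBB 2016
# 3  CCC 1996      CCC 1996
# 4 cc d 2001     cc d 2001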
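If a location could itself contain digits, removing every digit would damage it. A possible alternative, sketched here on the assumption that every Venue ends in whitespace followed by a four-digit year, is to anchor a single pattern at the end of the string and pull out both parts with sub(). The sample rows and the name `pattern` are made up for illustration.

```r
# Hypothetical sample data, including multi-word locations
d <- data.frame(Venue = c("AAA 2001", "Los Angeles 1984", "Rio de Janeiro 2016"),
                stringsAsFactors = FALSE)

# Assumed format: <location><whitespace><four-digit year>, nothing after the year.
# Group 1 is the location (must end in a non-space character), group 2 is the year.
pattern <- "^(.*\\S)\\s+([0-9]{4})$"

d$Location <- sub(pattern, "\\1", d$Venue)
d$Year     <- as.integer(sub(pattern, "\\2", d$Venue))
```

Rows that do not match the pattern are returned unchanged by sub(), so Year becomes NA (with a warning) for them; checking grepl(pattern, d$Venue) first would flag such rows.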