Based on the documentation for read.csv, the parameter stringsAsFactors, when set, should cause quoted data values to be interpreted as factors. Consider the following data file, which we will call test.csv.
"a",b,c
"1",2,3
"3",2,3
When I try to read this data using read.csv, it does not appear to parse the first column as a factor.
foo = read.csv("test.csv", stringsAsFactors=TRUE)
is.factor(foo$a)
Output:
[1] FALSE
I tried to use the column name without quotes, but that did not work either. How can I correct this?
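Your example data are coercible to numeric, so read.csv reads column a as numbers even though the values are quoted; stringsAsFactors only affects columns that are read as character. Try with data that are not so coercible: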
foo <- read.csv(text='"a",b,c
"1",2,3
"3",2,3
"a",2,3', stringsAsFactors=TRUE)
foo$a
# [1] 1 3 a
# Levels: 1 3 a
Otherwise, use colClasses:
foo <- read.csv(text='"a",b,c
"1",2,3
"3",2,3', colClasses=c('factor','numeric','numeric'))
foo$a
# [1] 1 3
# Levels: 1 3
Or you could convert using as.factor after reading the data in.
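A minimal sketch of that last option, assuming the same data as in the question (the text= argument stands in for test.csv here, so the snippet runs without the file):

```r
# Same three lines as test.csv in the question, passed inline via text=.
foo <- read.csv(text = '"a",b,c
"1",2,3
"3",2,3')

# Quoted digits are still converted to numbers on read.
class(foo$a)
# [1] "integer"

# Convert the column to a factor after the data are read in.
foo$a <- as.factor(foo$a)

is.factor(foo$a)
# [1] TRUE
levels(foo$a)
# [1] "1" "3"
```

This leaves the other columns numeric, and it is probably the simplest route when only one or two columns need to be factors.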