The latest release (0.15.0
) of the Dataframe library for Kotlin Notebook uses the method readCsv
(note: lowercase sv
), and in the following code, I'm using it to read in a CSV of my Discover card financials, which has dates in the Trans. Date
column that look like 12/31/2024
, but it is throwing the following CellConversionException
parsing this format:
org.jetbrains.kotlinx.dataframe.exceptions.CellConversionException: Failed to convert '12/31/2024' from kotlin.String to kotlinx.datetime.LocalDate in column 'Trans. Date' in column [Trans. Date], row 0
at Cell In[6], line 17
Here is the Kotlin notebook cell:
// import kotlin dataframe library
%use dataframe(v=0.15.0, enableExperimentalCsv=true)
import kotlinx.datetime.format.DateTimeFormat
import kotlinx.datetime.format.char
import kotlinx.datetime.format.Padding.ZERO as ZEROpad
// seems like I need to define the DateTimeFormat,
// but I don't know how to pass it to the CSV parser.
val dateFormat : DateTimeFormat<LocalDate> = LocalDate.Format {
monthNumber(ZEROpad)
char('/')
dayOfMonth(ZEROpad)
char('/')
year(ZEROpad)
}
// read the csv file into a dataframe
var discover = DataFrame.readCsv("2024 discover.csv",
colTypes = mapOf(
// how do I get this conversion to work for given mm/dd/yyyy format?
"Trans. Date" to ColType.LocalDate
)
)
discover.getRows(0..4)
Note that the dateFormat
val is unused! I don't know what to do with it! How do I pass this format (assuming it is correct) to the parser to get a column of machine-readable dates?
@Adrian Klaver is correct.
DataFrame.readCsv()
works mostly the same as DataFrame.readCSV()
. This also applies to their parserOptions
argument.
To read a CSV file with a custom date-time pattern, you can use:
DataFrame.readCsv("2024 discover.csv",
// optional, should not be required if the specified column can successfully be parsed
colTypes = mapOf("Trans. Date" to ColType.LocalDate),
// proving custom ParserOptions
parserOptions = ParserOptions(dateTimePattern = "MM/dd/yyyy"),
)
At the moment it's not (yet) possible to use the kotlinx-datetime DateTimeFormat
with ParserOptions
. DataFrame still uses the java.time DateTimeFormatter
, like:
val dateTimeFormatter = DateTimeFormatter.ofPattern("MM/dd/yyyy")
DataFrame.readCsv("2024 discover.csv",
...,
// proving custom ParserOptions
parserOptions = ParserOptions(dateTimeFormatter = dateTimeFormatter),
)
If you do require kotlinx-datetime DateTimeFormat
, you can parse the column as String
and manually parse the column:
val dateFormat: DateTimeFormat<LocalDate> = ...
DataFrame.readCsv("2024 discover.csv",
// force the csv parser to read the column as String
colTypes = mapOf("Trans. Date" to ColType.String),
)
// manually convert the column to LocalDate with kotlinx-datetime DateTimeFormat.parse
.convert { "Trans. Date"<String>() }.with { dateFormat.parse(it) }