I have an excelfile, which I read into a Dataframe.
using DataFrames, XLSX
df = DataFrame(XLSX.readtable("Stocks/WIHL-Wihlborgs.xlsx", "PriceMonth")...)
│ Row │ Date │ Openprice │ Highprice │ Lowprice │ Closeprice │ Volume │
│ │ Any │ Any │ Any │ Any │ Any │ Any │
├─────┼────────────┼───────────┼───────────┼──────────┼────────────┼──────────┤
│ 1 │ 2020-12-23 │ 189.1 │ 189.7 │ 170.3 │ 181.5 │ 4170122 │
│ 2 │ 2020-11-30 │ 160.0 │ 191.6 │ 158.0 │ 189.1 │ 8006506 │
│ 3 │ 2020-10-30 │ 178.4 │ 184.3 │ 151.2 │ 160.6 │ 6760931 │
│ 4 │ 2020-09-30 │ 138.5 │ 178.8 │ 137.9 │ 177.5 │ 9005351 │
│ 5 │ 2020-08-31 │ 147.2 │ 152.1 │ 137.2 │ 138.3 │ 4865386 │
sort(df)
But when I try to use the sort()
function on this DateFrame, I get LoadError: MethodError: no method matching isless(::String, ::Dates.Date)
I think this is weird as I have another tab in the same excelfile where I have prices/day, and that works okey to sort. And the only difference that I see, is that in the Excel file, the prices/month tab has the Date field as Date, and the prices/day tab has the Date field as Any. But in the DataFrame, both tabs the Date field as Any.
I have a hard time understanding why the prices/day tab is working with sort()
and the prices/month tab isn't, instead it gives me the LoadError
.
Changing the columns to their appropriate data type should fix this (infer_eltypes):
using DataFrames, XLSX
df = DataFrame(XLSX.readtable("./dat.xlsx", "Sheet1", infer_eltypes=true)..., )
#5×6 DataFrame
# Row │ Date Openprice Highprice Lowprice Closeprice Volume
# │ Date Any Float64 Any Float64 Int64
#─────┼─────────────────────────────────────────────────────────────────
# 1 │ 2020-12-23 189.1 189.7 170.3 181.5 4170122
# 2 │ 2020-11-30 160 191.6 158 189.1 8006506
# 3 │ 2020-10-30 178.4 184.3 151.2 160.6 6760931
# 4 │ 2020-09-30 138.5 178.8 137.9 177.5 9005351
# 5 │ 2020-08-31 147.2 152.1 137.2 138.3 4865386
Now, sorting the Dates works fine:
df[sortperm(df.Date),1:2]
#5×2 DataFrame
# Row │ Date Openprice
# │ Date Any
#─────┼───────────────────────
# 1 │ 2020-08-31 147.2
# 2 │ 2020-09-30 138.5
# 3 │ 2020-10-30 178.4
# 4 │ 2020-11-30 160
# 5 │ 2020-12-23 189.1
Setting the type of Dates manually is also possible:
using Dates
df.Date = Dates.Date.(string.(df.Date), "yyyy-mm-dd")
#5-element Array{Date,1}:
# 2020-12-23
# 2020-11-30
# 2020-10-30
# 2020-09-30
# 2020-08-31