I am trying to extract the data for a specific stock symbol from a dataset of all stocks using a for loop. The code works when I run it outside the for loop, but the same code fails inside the loop.
Below is the code -
Working -
df = fh_5[fh_5.symbol .== "GOOG", ["date","close"]]
Not working -
for s in unique!(fh_5.symbol)
    df = fh_5[fh_5.symbol .== s, ["date","close"]]
    date_range = leftjoin(date_range, df, on =:"dates" => :"date")
end
Error
ERROR: BoundsError: attempt to access 6852038×8 DataFrame at index [Bool[1, 0, 0, 0, 0, 0, 0, 0, 0, 0 … 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], ["date", "close"]]
Stacktrace:
[1] getindex(df::DataFrame, row_inds::BitVector, col_inds::Vector{String})
@ DataFrames ~\.julia\packages\DataFrames\3mEXm\src\dataframe\dataframe.jl:448
[2] top-level scope
@ .\REPL[349]:2
After I run the for loop, the code that was working outside the loop no longer works either. I have to re-import the CSV file, and then the code outside the for loop works again as long as I run it first. Am I changing the base dataset fh_5 when I run the for loop?
Just to add the reproducible example - Data for the example
Below is the code used -
using DataFrames
using DataFramesMeta
using CSV
using Dates
using Query
fh_5 = CSV.read("D:\\Julia_Dataframe\\JuliaCon2020-DataFrames-Tutorial\\fh_5yrs.csv", DataFrame)
min_date = minimum(fh_5[:, "date"])
max_date = maximum(fh_5[:, "date"])
date_seq = string.(collect(Dates.Date(min_date) : Dates.Day(1) : Dates.Date(max_date)))
date_range = df = DataFrame(dates = date_seq)
date_range.dates = Date.(date_range.dates, "yyyy-mm-dd")
for s in unique(fh_5.symbol)
    df = fh_5[fh_5.symbol .== s, ["date","close"]]
    date_range = leftjoin(date_range, df, on =:"dates" => :"date")
    rename!(date_range, Dict(:close => s))
end
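Don't use unique! for this, because it mutates the fh_5.symbol column. fh_5.symbol returns the column vector itself rather than a copy, so unique! removes the duplicate values from that column in place and changes its length. The Boolean mask fh_5.symbol .== s then no longer matches the number of rows in fh_5, which is what triggers the BoundsError, and fh_5 stays broken until you reload the CSV. Use unique instead, which returns a new vector and leaves the column untouched. So, something like this: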
for s in unique(fh_5.symbol)
    df = fh_5[fh_5.symbol .== s, ["date","close"]]
    date_range = leftjoin(date_range, df, on =:"dates" => :"date")
end
In Julia, by convention, functions with names that end in ! will mutate (some of) their arguments.
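To see the difference without loading the stock data, here is a minimal sketch on a plain vector. The name symbols and its values are made up for illustration; the vector stands in for the fh_5.symbol column.

```julia
# Hypothetical stand-in for the fh_5.symbol column.
symbols = ["GOOG", "AAPL", "GOOG", "MSFT", "AAPL"]

# unique returns a new vector and leaves its argument unchanged.
u = unique(symbols)
@assert u == ["GOOG", "AAPL", "MSFT"]
@assert length(symbols) == 5

# unique! removes the duplicates in place, so the original vector shrinks.
unique!(symbols)
@assert symbols == ["GOOG", "AAPL", "MSFT"]
@assert length(symbols) == 3
```

When the vector being shrunk is a DataFrame column, the other columns keep their original length, which is why row indexing with a mask built from that column fails afterwards.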