Stock quotes obtained from yfinance don't start the 1st of each month, so I need to extract the the Date
and Close
for the first available qupte for each month in a csv.
Date,Open,High,Low,Close,Adj Close,Volume
2021-01-04 00:00:00-05:00,88.0,88.12449645996094,85.35700225830078,86.30650329589844,86.30650329589844,37324000
2021-01-05 00:00:00-05:00,86.25450134277344,87.34149932861328,85.84500122070312,87.00250244140625,87.00250244140625,20360000
[...]
2021-02-01 00:00:00-05:00,92.22949981689453,95.7770004272461,92.22949981689453,94.65350341796875,94.65350341796875,40252000
2021-02-02 00:00:00-05:00,95.65650177001953,97.46849822998047,95.31849670410156,95.95600128173828,95.95600128173828,66332000
2021-02-03 00:00:00-05:00,103.28050231933594,105.33100128173828,100.67749786376953,102.94400024414062,102.94400024414062,97882000
2021-02-04 00:00:00-05:00,103.03099822998047,103.46499633789062,101.75499725341797,102.68150329589844,102.68150329589844,48596000
[...]
2021-04-29 00:00:00-04:00,119.46150207519531,120.20800018310547,118.69249725341797,119.63800048828125,119.63800048828125,41234000
2021-04-30 00:00:00-04:00,118.4010009765625,119.09249877929688,117.3280029296875,117.67500305175781,117.67500305175781,44856000
2021-05-03 00:00:00-04:00,118.24549865722656,119.07749938964844,116.7750015258789,117.15399932861328,117.15399932861328,28242000
2021-05-04 00:00:00-04:00,115.67749786376953,116.24949645996094,112.83399963378906,115.34149932861328,115.34149932861328,44818000
So far I managed to only get the days starting with 0, but this is still not what I want:
import csv
with open("quotes.csv", "r") as file:
data = csv.reader(file)
for line in data:
if len(line[0]) > 4 and int(line[0][8]) < 1:
print([line[0], line[4]])
The following is an example of what I want to obtain from the example csv above (I skipped some months only for convenience, I still need one day for each month):
2021-01-04 00:00:00-05:00, 86.30650329589844
2021-02-01 00:00:00-05:00, 94.65350341796875
2021-05-03 00:00:00-04:00, 117.15399932861328
I would recommend you convert the first value into a Python datetime
object. You can then easily test the date as needed. For example, the following approach detects whenever the month changes:
from datetime import datetime
import csv
with open("quotes.csv", "r") as file:
data = csv.reader(file)
header = next(data) # skip the header row
cur_month = None
for line in data:
try:
date = datetime.strptime(line[0][:19], "%Y-%m-%d %H:%M:%S")
if date.month != cur_month:
cur_month = date.month
print(line[0], line[4])
except:
pass # skip over badly formatted dates or empty rows
For the example given it would give:
2021-01-04 00:00:00-05:00 86.30650329589844
2021-02-01 00:00:00-05:00 94.65350341796875
2021-04-29 00:00:00-04:00 119.63800048828125
2021-05-03 00:00:00-04:00 117.15399932861328
(as there are many missing rows)