python csv matplotlib mplfinance

Cannot draw a candlestick chart


I am trying to read a CSV file and draw a candlestick chart, but I get an error:

Traceback (most recent call last):
  File "/media/shutruk/Data/projects/money/qwerty.py", line 28, in <module>
    plot_candlestick(data_json)
  File "/media/shutruk/Data/projects/money/qwerty.py", line 14, in plot_candlestick
    ohlc = [(mdates.datestr2num(d[0]), d[2], d[3], d[4], d[1]) for d in data]
  File "/media/shutruk/Data/projects/money/qwerty.py", line 14, in <listcomp>
    ohlc = [(mdates.datestr2num(d[0]), d[2], d[3], d[4], d[1]) for d in data]
KeyError: 0

The whole code is below:

import matplotlib.pyplot as plt
from mplfinance.original_flavor import candlestick_ohlc
import matplotlib.dates as mdates
import csv

file_path = 'data.csv'

def plot_candlestick(data):
  if len(data) > 100:
    data = data[-100:]

  fig, ax = plt.subplots()
  
  ohlc = [(mdates.datestr2num(d[0]), d[2], d[3], d[4], d[1]) for d in data]
  candlestick_ohlc(ax, ohlc, width=0.6, colorup='red', colordown='green')

  ax.xaxis_date()
  ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
  plt.gcf().autofmt_xdate()
  plt.subplots_adjust(left=0.05, right=0.85, top=0.90, bottom=0.10)
  
  plt.show()

with open(file_path, 'r') as file:
  reader = csv.DictReader(file)    
  data_json = [row for row in reader]

plot_candlestick(data_json)

The CSV file contents look like this:

"Date","Price","Open","High","Low","Vol.","Change %"
"03/27/2024","70,285.4","69,999.2","70,714.1","69,459.5","82.93K","0.41%"
"03/26/2024","69,999.3","69,896.3","71,490.7","69,366.4","90.98K","0.15%"
"03/25/2024","69,892.0","67,216.4","71,118.8","66,395.0","124.72K","3.99%"
"03/24/2024","67,211.9","64,036.5","67,587.8","63,812.9","65.59K","4.96%"
"03/23/2024","64,037.8","63,785.6","65,972.4","63,074.9","35.11K","0.40%"
"03/22/2024","63,785.5","65,501.5","66,633.3","62,328.3","72.43K","-2.62%"
"03/21/2024","65,503.8","67,860.0","68,161.7","64,616.1","75.26K","-3.46%"
"03/20/2024","67,854.0","62,046.8","68,029.5","60,850.9","133.53K","9.35%"

Please tell me how to fix it.


Solution

  • Because the file is read with csv.DictReader, each d in the data list is a dictionary keyed by column name, so d[0] raises KeyError: 0. Access the values by key instead: d['Date'], d['Open'], and so on. I have changed that part.

    The numeric columns (Open, High, Low, Price) also contain thousands separators, so I strip the commas and convert the values to floating-point numbers. This replaces your line ohlc = [(mdates.datestr2num(d[0]), d[2], d[3], d[4], d[1]) for d in data].

    I tested this with your CSV example and it worked.

    def plot_candlestick(data):
        if len(data) > 100:
            data = data[-100:]
    
        fig, ax = plt.subplots()
        
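        # Look up each column by name and strip the thousands separators
        # before converting to float; 'Price' is used as the closing value.
        ohlc = [
            (
                mdates.datestr2num(d['Date']),
                float(d['Open'].replace(',', '')),
                float(d['High'].replace(',', '')),
                float(d['Low'].replace(',', '')),
                float(d['Price'].replace(',', '')),
            )
            for d in data
        ]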
        candlestick_ohlc(ax, ohlc, width=0.6, colorup='red', colordown='green')
    
        ax.xaxis_date()
        ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
        plt.gcf().autofmt_xdate()
        plt.subplots_adjust(left=0.05, right=0.85, top=0.90, bottom=0.10)
        
        plt.show()
    
    
    # data_json is the list of rows built with csv.DictReader in the question
    plot_candlestick(data_json)
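To see the two problems in isolation (integer indexing on a DictReader row, and numbers stored as strings with commas), here is a minimal sketch that uses only the standard library. The helper names to_float and row_to_ohlc are mine, not part of csv or mplfinance. It assumes the dates are month/day/year and that the "Price" column is the closing price. It also sorts the rows by date, since the sample file lists the newest day first.

```python
import csv
import io
from datetime import datetime

# Two rows copied from the question's CSV sample.
SAMPLE = '''"Date","Price","Open","High","Low","Vol.","Change %"
"03/27/2024","70,285.4","69,999.2","70,714.1","69,459.5","82.93K","0.41%"
"03/26/2024","69,999.3","69,896.3","71,490.7","69,366.4","90.98K","0.15%"
'''


def to_float(text):
    """Turn a string such as '70,285.4' into the float 70285.4."""
    return float(text.replace(',', ''))


def row_to_ohlc(row):
    """Convert one DictReader row to (date, open, high, low, close)."""
    return (
        datetime.strptime(row['Date'], '%m/%d/%Y'),  # assumes month/day/year
        to_float(row['Open']),
        to_float(row['High']),
        to_float(row['Low']),
        to_float(row['Price']),  # assumption: 'Price' is the closing price
    )


rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Each row is a dict keyed by header name, so an integer index fails.
try:
    rows[0][0]
except KeyError as exc:
    print('KeyError:', exc)

# The file lists the newest day first; sort so the oldest comes first.
ohlc = sorted(row_to_ohlc(r) for r in rows)
print(ohlc[0])
```

To pass these tuples to candlestick_ohlc, the datetime in the first position would still need to be converted to a Matplotlib date number, for example with mdates.date2num.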