I have a price table in CSV format with separate date and time columns:
Date Time o h l c v
0 2020-07-09 15:10:00 8 8 7.5 7.94 41
1 2020-07-09 15:00:00 7.61 8.24 7.61 8.24 10
2 2020-07-09 14:50:00 8.3 8.3 7.7 7.7 7
3 2020-07-09 14:40:00 8.72 8.72 8.3 8.3 7
4 2020-07-09 14:30:00 8.72 8.72 8.39 8.39 8
5 2020-07-09 14:20:00 8.35 8.6 8.3 8.6 6
6 2020-07-09 14:10:00 8.18 8.46 8.18 8.45 22
7 2020-07-09 14:00:00 8.5 8.5 8.5 8.5 1
ValueError: time data '0' does not match format '%Y-%m-%d %H:%M:%S'
This is the error I get from running these code snippets.
data = bt.feeds.GenericCSVData(dataname='ticks2.csv',
params = (
('nullvalue', float('NaN')),
('dtformat', '%Y/%m/%d'),# %H:%M:%S
('tmformat', '%H:%M:%S'),
('datetime', 0),
('time', 1),
('open', 2),
('high', 3),
('low', 4),
('close', 5),
('volume', 6),
I tried merging the Date and Time columns to fix this problem, but to no avail: the error stays the same.
df = pd.read_csv('ticks.csv', parse_dates=[['Date', 'Time']])
print(df)
del df["Unnamed: 0"]
First, your CSV has an index as its first column (0, 1, 2, 3, 4, ...), but the header line has no name for that column. Add a name for it to the header, for example Index, so that the modified first line reads: Index Date Time o h l c v
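To see where the '0' in the error message comes from, here is a minimal sketch that uses only the standard library. It assumes the reader hands the value found in column 0 to a strptime-style parser; the row string and variable names are illustrative and are not taken from backtrader's internals.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# One data row from the question, split on whitespace (illustrative only).
row = "0 2020-07-09 15:10:00 8 8 7.5 7.94 41".split()

try:
    # Column 0 holds the index value "0", not a date, so parsing fails.
    datetime.strptime(row[0], FMT)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Skipping the index column and joining date + time parses cleanly.
parsed = datetime.strptime(f"{row[1]} {row[2]}", FMT)
print(parsed)
```

The same reasoning suggests that any reader configured with column positions has to count the index column too, so the date would sit at position 1 and the time at position 2 in a file laid out like the one in the question.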
Second, it looks like your CSV uses tabs rather than commas as the cell separator, so you need to pass sep = '\t' to read_csv, i.e. pd.read_csv('test.csv', sep = '\t', parse_dates = [['Date', 'Time']]).
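As a rough sketch of the tab-separated case, the snippet below reads a small in-memory sample instead of a real file (the sample text and the use of io.StringIO are stand-ins, not part of the original data). It also combines the date and time columns with pd.to_datetime rather than the nested parse_dates list, on the assumption that you may be on a newer pandas release where, as far as I know, the nested form is deprecated.

```python
import io

import pandas as pd

# Tab-separated sample with "Index" added to the header
# (a stand-in for the real file).
tsv = (
    "Index\tDate\tTime\to\th\tl\tc\tv\n"
    "0\t2020-07-09\t15:10:00\t8\t8\t7.5\t7.94\t41\n"
    "1\t2020-07-09\t15:00:00\t7.61\t8.24\t7.61\t8.24\t10\n"
)

# sep="\t" tells pandas that cells are separated by tabs, not commas.
df = pd.read_csv(io.StringIO(tsv), sep="\t", index_col="Index")

# Join the two text columns and parse them with an explicit format.
df["Date_Time"] = pd.to_datetime(
    df["Date"] + " " + df["Time"], format="%Y-%m-%d %H:%M:%S"
)
df = df.drop(columns=["Date", "Time"])
print(df)
```

Passing index_col="Index" is a design choice here: it keeps the row numbers out of the data columns, so nothing has to be deleted afterwards.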
Below is a corrected, working example. It uses sep = ',' because StackOverflow strips tabs from text, so I can't show them; for your case, change sep = ',' to sep = '\t' inside read_csv(...). Note that the CSV in my example has Index added at the beginning of its first line. The example also starts with a block that writes the test CSV file; you don't need that block, since you already have your file.
To conclude, you have to do two things:
1. Add Index plus a tab to the beginning of the first line of your CSV.
2. Add sep = '\t' to your read_csv(...) if your CSV is tab-separated, and it looks like it is.
# This file-writing block is not needed; it only creates the example file
with open('test.csv', 'w', encoding = 'utf-8') as f:
    f.write("""
Index,Date,Time,o,h,l,c,v
0,2020-07-09,15:10:00,8,8,7.5,7.94,41
1,2020-07-09,15:00:00,7.61,8.24,7.61,8.24,10
2,2020-07-09,14:50:00,8.3,8.3,7.7,7.7,7
3,2020-07-09,14:40:00,8.72,8.72,8.3,8.3,7
4,2020-07-09,14:30:00,8.72,8.72,8.39,8.39,8
5,2020-07-09,14:20:00,8.35,8.6,8.3,8.6,6
6,2020-07-09,14:10:00,8.18,8.46,8.18,8.45,22
7,2020-07-09,14:00:00,8.5,8.5,8.5,8.5,1
""")
# This code is needed to solve task
# Change to "sep = '\t'" for your case of tab-separated CSV
import pandas as pd
df = pd.read_csv('test.csv', sep = ',', parse_dates = [['Date', 'Time']])
print(df)
Output:
Date_Time Index o h l c v
0 2020-07-09 15:10:00 0 8.00 8.00 7.50 7.94 41
1 2020-07-09 15:00:00 1 7.61 8.24 7.61 8.24 10
2 2020-07-09 14:50:00 2 8.30 8.30 7.70 7.70 7
3 2020-07-09 14:40:00 3 8.72 8.72 8.30 8.30 7
4 2020-07-09 14:30:00 4 8.72 8.72 8.39 8.39 8
5 2020-07-09 14:20:00 5 8.35 8.60 8.30 8.60 6
6 2020-07-09 14:10:00 6 8.18 8.46 8.18 8.45 22
7 2020-07-09 14:00:00 7 8.50 8.50 8.50 8.50 1
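If the goal is to hand the merged table back to a CSV reader that expects a single datetime column, one possible follow-up is sketched below. It rebuilds the first three rows of the output above by hand (so it runs on its own), drops the Index column, sorts oldest-first, and writes the result to an in-memory buffer. The oldest-first ordering and the single-column layout are assumptions about what the downstream reader wants, not something stated in the question.

```python
import io

import pandas as pd

# A small frame shaped like the output above (first three rows only).
df = pd.DataFrame({
    "Date_Time": pd.to_datetime([
        "2020-07-09 15:10:00",
        "2020-07-09 15:00:00",
        "2020-07-09 14:50:00",
    ]),
    "Index": [0, 1, 2],
    "o": [8.00, 7.61, 8.30],
    "h": [8.00, 8.24, 8.30],
    "l": [7.50, 7.61, 7.70],
    "c": [7.94, 8.24, 7.70],
    "v": [41, 10, 7],
})

# Drop the row counter and put the rows in chronological order.
clean = (
    df.drop(columns=["Index"])
      .sort_values("Date_Time")
      .reset_index(drop=True)
)

# Write with an explicit timestamp format so a reader configured with
# '%Y-%m-%d %H:%M:%S' can parse the first column.
buffer = io.StringIO()
clean.to_csv(buffer, index=False, date_format="%Y-%m-%d %H:%M:%S")
csv_out = buffer.getvalue()
print(csv_out)
```

With a file written this way, the datetime sits in column 0 and the price columns follow it directly, so no index column is left to be mistaken for a date.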