Need Statement: I have performed a cursor.fetchall Select from a SQLite database, returning 'id' and 'date_time', the later of which is text. I want to create additional columns using pd.to_date of year, dayOfWeek, dayOfYear, hourOfDay
Issue: Following the example of a no-loop column add and population approach, I've tried multiple call combinations, none of which work.
I first tested a series of calls to confirm I could split the test date correctly;
sr = pd.Series(['2015-02-08 20:00:00'])
sr = pd.to_datetime(sr)
#Year: Series.dt.year The year of the datetime
#Day of week: Series.dt.dayofweek The day of the week with Monday=0, Sunday=6
#Day of year: Series.dt.dayofyear The ordinal day of the year
#Hour: Series.dt.hour The hours of the datetime
print(sr)
print(sr.dt.year )
print(sr.dt.dayofweek )
print(sr.dt.dayofyear )
print(sr.dt.hour )
Everything came out as expected;
0 2015-02-08 20:00:00
dtype: datetime64[ns]
0 2015 dtype: int64
0 6
dtype: int64
0 39
dtype: int64
0 20
dtype: int64
The code I've tried works perfectly through the lines below, returning 105,861 rows x 2 columns;
def splitDateTime():
try:
sqliteConnection = sqlite3.connect('TestElecConsump.db')
cursor = sqliteConnection.cursor()
print("Connected to SQLite")
sqlite_select_query = """SELECT id, date_time from WeatherRecord;"""
cursor.execute(sqlite_select_query)
records = cursor.fetchall()
print("Total rows are: ", len(records))
print("Printing first row:", records[0])
splitDatepd = pd.DataFrame(records, columns=['id','date_time'])
print("Dataframe shape:", splitDatepd.shape)
print("Dataframe : " , splitDatepd, sep='\n')
print ('records: ' + str(type(records)))
print ('splitDatepd: ' + str(type(splitDatepd)))
However, the next lines execute with no output whatsoever;
#Add new column of Pandas datetime year
splitDatepd["pd-datetime"] = splitDatepd.to-datetime["date_time"].dt.year
print("Dataframe shape:", splitDatepd.shape)
print("Dataframe : " , splitDatepd, sep='\n')
So I decided to simplfy matters by repeated the above by leaving off the .year parsing;
splitDatepd["pd-datetime"] = splitDatepd.to-datetime["date_time"]
Still there was no change to splitDatepd.
When the def finalizes and returns the dataframe, a printout of it looks exactly like the original dataframe from the Select statement.
What am I doing wrong?
You could try with the pd.to_datetime
function in a single column, for example:
splitDatepd["pd_datetime"] = pd.to_datetime(splitDatepd["date_time"])
PS: Remember that functions names use underscode, I mean, it is pd.to_datetime
not pd.to-datetime
.