I am using an embedded Bokeh app in Jupyter to label sections of time series. Let's say we have the following example dataframe:
                 Time         Y Label
0 2018-02-13 13:14:05  0.401028     a
1 2018-02-13 13:30:46  0.900101     a
2 2018-02-13 13:40:06 -0.648143     a
3 2018-02-14 16:33:27  1.111675     a
4 2018-03-13 11:43:16 -0.986025     a
where Time is datetime64[ns], Y is float64 and Label is of type object.
Now I use the following bokeh app to change the entries of Label by using a user input and trigger the callback by a button click.
import pandas as pd
from bokeh.io import show
from bokeh.layouts import column
from bokeh.models import Button, ColumnDataSource, TextInput
from bokeh.plotting import figure

def modify_doc(doc):
    p = figure(tools=["pan, box_zoom, wheel_zoom, reset, save, xbox_select"])
    source = ColumnDataSource(df_test)
    p.line(x="index", y="Y", source=source)
    # Invisible circles so that xbox_select has glyphs to select
    p.circle(x="index", y="Y", source=source, alpha=0)

    def callback():
        global list_new
        list_new = []
        inds = source.selected.indices
        # Write the entered label into every selected row
        for j in inds:
            source.data["Label"][j] = label_input.value.strip()
        list_new.append(pd.DataFrame(source.data))

    label_input = TextInput(title="Label")
    button = Button(label="Label Data")
    button.on_click(callback)
    layout = column(p, label_input, button)
    doc.add_root(layout)

show(modify_doc)
Don't be puzzled by list_new; it is needed because I use multiple time series plots and ColumnDataSource objects.
After the callback I get the expected Label output:
  Label          Time         Y  index
0     a  1.518528e+12  0.401028      0
1     a  1.518529e+12  0.900101      1
2     b  1.518529e+12 -0.648143      2
3     b  1.518626e+12  1.111675      3
4     b  1.520941e+12 -0.986025      4
But why does Time get converted to float? I know how to reconstruct the timestamps by using datetime.datetime.utcfromtimestamp() or by matching the indices, but how can I change the callback to keep the original entries in Time?
how can I change the callback to keep the original entries in Time?
You can't. The actual underlying wire format for datetime values is milliseconds since epoch, and that is what Bokeh will automatically convert any and all datetime types into when serializing, sending to, or synchronizing with BokehJS in the browser. In standalone (non-server) cases this is never a concern, because the data never "comes back" to a Python process. But it can be surprising in Bokeh server contexts. You will either need to convert the timestamp values back to whatever datetime type you want (there are many), or if you just want to make sure the original values are undisturbed, make a copy beforehand.
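As a rough sketch of both options with pandas: the snippet below does not run a Bokeh server. It simulates what pd.DataFrame(source.data) looks like after a callback by converting Time to float milliseconds by hand. The names df_original, df_returned, df_restored and df_labeled are illustrative only, and the timestamps are assumed to be timezone-naive, so they are interpreted as UTC.

```python
import pandas as pd

# Example frame mirroring the first rows of the question's data.
df_test = pd.DataFrame({
    "Time": pd.to_datetime([
        "2018-02-13 13:14:05",
        "2018-02-13 13:30:46",
        "2018-02-13 13:40:06",
    ]),
    "Y": [0.401028, 0.900101, -0.648143],
    "Label": ["a", "a", "a"],
})

# Option 1: keep an untouched copy before handing the data to Bokeh.
df_original = df_test.copy()

# Stand-in for pd.DataFrame(source.data) after a callback (simulated here):
# Time has become float milliseconds since the epoch, one label was edited.
df_returned = df_test.copy()
df_returned["Time"] = (df_test["Time"] - pd.Timestamp(0)) / pd.Timedelta(milliseconds=1)
df_returned.loc[2, "Label"] = "b"

# Option 2: convert the millisecond floats back to pandas timestamps.
df_restored = df_returned.copy()
df_restored["Time"] = pd.to_datetime(df_returned["Time"], unit="ms")

# With option 1, take only the edited column from the returned data,
# so the original Time values are never touched.
df_labeled = df_original.copy()
df_labeled["Label"] = df_returned["Label"].to_numpy()
```

Inside the callback from the question, the same idea would apply to the frame built from source.data: either convert its Time column with pd.to_datetime(..., unit="ms"), or copy only the Label column back onto a dataframe saved before the ColumnDataSource was created.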