I have a list of time strings in different formats, as shown:
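import pandas as pd

# Note the comma after "9:14 am"; without it Python joins the two adjacent strings into one.
time = ["1:5 am", "1:35 am", "8:1 am", "9:14 am", "14:23 pm", "20:2 pm"]
data = {'time': time}  # named `data` to avoid shadowing the built-in `dict`
df = pd.DataFrame(data)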
and I want to replace the strings in the list so that hours and minutes are zero-padded, as shown below:
["01:05 am", "01:35 am", "08:01 am", "09:14 am", "14:23 pm", "20:02 pm"]
I am not sure how to write a regex that formats the strings in the DataFrame.
A possible solution, based on regex:
(df['time'].str.replace(r'^(\d):', r'0\1:', regex=True)
.str.replace(r':(\d)\s', r':0\1 ', regex=True))
The main ideas are:
With r'^(\d):', one matches a single digit at the beginning of the string followed by a colon (e.g., 1: in 1:5 am).
With r'0\1:', one adds a 0 before the captured single digit and retains the colon.
With r':(\d)\s', one matches a single digit after a colon and before a space (e.g., :5 in 1:5 am).
With r':0\1 ', one adds a 0 before the captured single digit and retains the colon and space.
Output:
0 01:05 am
1 01:35 am
2 08:01 am
3 09:14 am
4 14:23 pm
5 20:02 pm
Name: time, dtype: object
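To check the two patterns outside of pandas, the same substitutions can be sketched with Python's standard re module. This is only a minimal illustration: the helper name pad_time is hypothetical and not part of the answer above, and it assumes every input looks like "H:M am" or "H:M pm".

```python
import re

def pad_time(value: str) -> str:
    """Zero-pad a single-digit hour and/or minute (hypothetical helper)."""
    # Step 1: a single digit at the start followed by a colon is the hour.
    value = re.sub(r'^(\d):', r'0\1:', value)
    # Step 2: a single digit between a colon and whitespace is the minute.
    return re.sub(r':(\d)\s', r':0\1 ', value)

times = ["1:5 am", "1:35 am", "8:1 am", "9:14 am", "14:23 pm", "20:2 pm"]
padded = [pad_time(t) for t in times]
print(padded)
```

If a plain function is preferred over chained .str.replace calls, it could also be applied to the column with df['time'].map(pad_time).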