python pandas dataframe parsing reindex

Formatting a file with hours and dates in the same column


Our electricity provider apparently thinks it is fun to make the CSV files they provide difficult to read.

The file contains precise electricity consumption readings, one every 30 minutes, but the times and the dates share the SAME column. Example:

[EDIT: here is the raw version of the CSV file, my bad]

;
"Récapitulatif de mes puissances atteintes en W";
;
"Date et heure de relève par le distributeur";"Puissance atteinte (W)"
;
"19/11/2022";
"00:00:00";4494
"23:30:00";1174
"23:00:00";1130
[...]
"01:30:00";216
"01:00:00";2672
"00:30:00";2816
;
"18/11/2022";
"00:00:00";4494
"23:30:00";1174
"23:00:00";1130
[...]
"01:30:00";216
"01:00:00";2672
"00:30:00";2816

How on earth can I obtain a nicely formatted file like this:

2022-11-19 00:00:00 2098
2022-11-19 23:30:00 218
2022-11-19 23:00:00 606

etc.


Solution

  • Okay, I have a crude brute-force solution for you, so don't take it as a coding recommendation, just as something that gets the job done:

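    import itertools
    # every possible date label in 2022, zero-padded like the file ("19/11/2022")
    dList = [f"{d:02d}/{m:02d}/2022" for d, m in itertools.product(range(1, 32), range(1, 13))]
    

    I assume you have a text file with that content, so I'm just going to use that:

    file = 'yourfilename.txt'
    # make sure you're running the program in the same directory as the .txt file
    with open(file, "r") as f:
        lines = [line.strip() for line in f]
    curD = None
    # open the output once, so earlier rows are not overwritten on every iteration
    with open('output.txt', 'w') as g:
        for i in lines:
            # split on ';' and drop the surrounding double quotes
            parts = [p.strip('"') for p in i.split(';')]
            if parts[0] in dList:
                curD = parts[0]  # remember the date for the readings that follow
            elif curD and len(parts) == 2 and parts[1].isdigit():
                day, month, year = curD.split('/')
                g.write(f'{year}-{month}-{day} {parts[0]} {parts[1]}\n')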
    

    output.txt is created (or overwritten) in the same directory, and everything gets written into that file.
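
As an alternative to listing every possible date up front, here is a minimal sketch that recognises date rows and time rows by trying to parse them. It assumes the file looks exactly like the sample in the question (semicolon-separated, a `dd/mm/yyyy` row followed by `hh:mm:ss;watts` rows). The function name `parse_readings` and the `sample` string are made up for illustration and are not part of the original answer.

```python
import csv
import io
from datetime import datetime


def parse_readings(lines):
    """Yield (timestamp, watts) pairs from the semicolon-separated export."""
    current_date = None
    for row in csv.reader(lines, delimiter=";"):
        label = row[0].strip() if row else ""
        value = row[1].strip() if len(row) > 1 else ""
        try:
            # A row such as "19/11/2022"; starts a new day.
            current_date = datetime.strptime(label, "%d/%m/%Y").date()
            continue
        except ValueError:
            pass
        try:
            time_of_day = datetime.strptime(label, "%H:%M:%S").time()
        except ValueError:
            continue  # title, header, or separator row: nothing to emit
        if current_date is not None and value.isdigit():
            yield datetime.combine(current_date, time_of_day), int(value)


# Hypothetical sample shaped like the file in the question.
sample = '''"Date et heure de relève par le distributeur";"Puissance atteinte (W)"
;
"19/11/2022";
"00:00:00";4494
"23:30:00";1174
;
"18/11/2022";
"00:30:00";2816
'''

for stamp, watts in parse_readings(io.StringIO(sample)):
    print(stamp, watts)
```

For a real file, an open file object can be passed in place of the `io.StringIO` wrapper. If the data is needed in pandas afterwards, the pairs can be collected with `list(...)` and handed to `pd.DataFrame(rows, columns=["timestamp", "power_w"])`, where the column names are again only a suggestion.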