I have a file with numerous timestamps and events, but I'm unsure of the data structure. I can read the events and have an approximate idea of the dates.
Does anyone have tips on possible formats for this data? The software is from around 1996, but it's unknown and there is no documentation.
By analyzing a screenshot, I combined the date and bits to get this (relevant) information (times are in UTC+2, so the actual times could also be 09:00 in the binary):
0000000100010000101011001111000110001111010001001010111100000000 = 13.06.2024 11:23:24
0000000100010000111101111111000110001111010001000100101000000000 = 13.06.2024 11:24:39
0000000100010000111101111111000110001111010001000100100000000000 = 13.06.2024 11:24:39
0000000100010000111101111111000110001111010001000100101100000000 = 13.06.2024 11:24:39
0000000100010000001100111111001010001111010001000100100000000000 = 13.06.2024 11:25:39
0000000100010000010011111111010010001111010001000100101100000000 = 13.06.2024 11:34:39
0000000100100000111001101111000110001111010001001010111100000000 = 13.06.2024 11:24:22
0000000100100000110100111111001010001111010001001010111100000000 = 13.06.2024 11:28:19
If anyone has experience with older data formats or insights on how these binary sequences could represent timestamps, I'd appreciate your help.
I tried to deduce the encoding by examining the differences between the bytes. I also searched the binary strings for patterns, for recognizable numbers in binary form, and for Unix-epoch timestamps.
Thanks in advance!
The timestamp is in the middle 4 bytes (bytes 2 to 5, counting from 0), stored least-significant byte first, so you have to take them in reverse order before converting from binary. The result is the number of seconds since the start of 1988.
Here's some Python code to extract the timestamp:
import datetime

print("computed date from file:")
print("------------------- -------------------")

with open('raw_data', 'r') as f:
    for line in f:
        # Each line looks like "<64 bits> = <date>"
        binary = line[0:64]
        date_from_file = line[67:].rstrip('\n')

        # chunk the binary data into 8-bit bytes:
        chunks = [
            binary[8*i : 8*i+8]
            for i in range(8)
        ]

        # reverse the order of the bytes
        revchunks = [* reversed(chunks) ]

        # The timestamp bits are the middle 4 bytes
        timestamp_bits = ''.join(revchunks[2:6])

        # Interpret those as a 32-bit integer
        s = int(timestamp_bits, base=2)

        # This is the number of seconds since the epoch
        sec_since_epoch = datetime.timedelta(seconds=s)

        # where the epoch is the start of 1988
        epoch = datetime.datetime(1988, 1, 1)

        computed_date = epoch + sec_since_epoch

        print(f"{computed_date} vs {date_from_file}")
Output:
computed date from file:
------------------- -------------------
2024-06-13 11:23:24 vs 13.06.2024 11:23:24
2024-06-13 11:24:39 vs 13.06.2024 11:24:39
2024-06-13 11:24:39 vs 13.06.2024 11:24:39
2024-06-13 11:24:39 vs 13.06.2024 11:24:39
2024-06-13 11:25:39 vs 13.06.2024 11:25:39
2024-06-13 11:34:39 vs 13.06.2024 11:34:39
2024-06-13 11:24:22 vs 13.06.2024 11:24:22
2024-06-13 11:28:19 vs 13.06.2024 11:28:19
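For reference, the same decoding can be written more compactly by treating the middle four bytes as a little-endian integer. This is only a sketch: `decode_record` and `EPOCH` are names made up for the example, and the 1988 epoch is the one inferred above, not something taken from any documentation.

```python
import datetime

# Assumed epoch, inferred from the sample data: 1 January 1988.
EPOCH = datetime.datetime(1988, 1, 1)

def decode_record(bits: str) -> datetime.datetime:
    """Decode one 64-character bit string into a datetime (hypothetical helper)."""
    # Turn the bit string into the 8 bytes in the order they appear in the file.
    raw = int(bits, 2).to_bytes(8, byteorder="big")
    # Bytes 2..5 hold the seconds count, least-significant byte first.
    seconds = int.from_bytes(raw[2:6], byteorder="little")
    return EPOCH + datetime.timedelta(seconds=seconds)

sample = "0000000100010000101011001111000110001111010001001010111100000000"
print(decode_record(sample))  # 2024-06-13 11:23:24
```

Since the decoded values match the times shown in the screenshot without any offset, the field appears to hold local time rather than UTC, at least for these samples.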
Approach: I put the lines in order by timestamp, and then looked for a range of bit positions in which the bits were increasing (or rather, non-decreasing). That didn't work, so I looked at bit differences between adjacent timestamps instead. In particular, 11:24:22 and 11:24:39 are separated by only 17 seconds, and I noticed a range of positions that went from 00110 (6) to 10111 (23), which is a difference of 17. From there it was a matter of confirming the hypothesis and locating the remaining bits.
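The difference check can also be automated. The sketch below is not what I actually did (I compared the bits by eye); it assumes the field is byte-aligned and 4 bytes wide, and `find_fields` is a hypothetical helper. It tries every offset in both byte orders and reports where two records differ by the expected number of seconds.

```python
# Two records from the question whose displayed times differ by 17 seconds.
a_bits = "0000000100100000111001101111000110001111010001001010111100000000"  # 11:24:22
b_bits = "0000000100010000111101111111000110001111010001000100101000000000"  # 11:24:39

def find_fields(a_bits, b_bits, delta, width=4):
    """Return (offset, byteorder) pairs where the two records differ by delta."""
    a = int(a_bits, 2).to_bytes(8, "big")
    b = int(b_bits, 2).to_bytes(8, "big")
    hits = []
    for offset in range(len(a) - width + 1):
        for order in ("little", "big"):
            va = int.from_bytes(a[offset:offset + width], order)
            vb = int.from_bytes(b[offset:offset + width], order)
            if vb - va == delta:
                hits.append((offset, order))
    return hits

print(find_fields(a_bits, b_bits, 17))  # [(2, 'little')]
```

With the offset and byte order known, the epoch follows from subtracting the decoded number of seconds from one of the known dates.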