I have a LAS file that I am trying to read in Python using the lasio library. One of the columns is TIME,
which is in the following format: 00:00:00.22-04-23
Sample of data copied from the LAS file:
TIME col1 col2
00:00:00.22-06-23 1010 20
00:00:05.22-06-23 1020 25
00:00:10.22-06-23 1015 32
My code to read the data:
df = lasio.read(file_path).df().reset_index()
This returns the df in the following format:
TIME col1 col2 UNKNOWN:1 UNKNOWN:2
00:00:00.22 -06 -23 1010 20
00:00:05.22 -06 -23 1020 25
00:00:10.22 -06 -23 1015 32
As you can see, my TIME column has been split into three columns at every -. The data from col1 and col2 has been shifted to UNKNOWN:1 and UNKNOWN:2 (these columns are probably created by lasio during reading). I need the TIME column returned in its original form, without shifting the values of col1 and col2, so I can strip, split and manipulate TIME using pandas once it is read into a dataframe.
Any advice is appreciated.
You can try pd.read_csv with the correct delimiter. For example:
import pandas as pd

# Split on runs of whitespace so the hyphens inside TIME are left alone
df = pd.read_csv('your_file.txt', sep=r"\s+", engine="python")
print(df)
Prints:
TIME col1 col2
0 00:00:00.22-06-23 1010 20
1 00:00:05.22-06-23 1020 25
2 00:00:10.22-06-23 1015 32
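Once TIME survives as a single string column, it can be turned into real timestamps. Here is a minimal sketch, assuming the value is laid out as HH:MM:SS, a dot, then DD-MM-YY (that reading of the format is my guess from the sample, and the TIMESTAMP column name is just illustrative):

```python
from io import StringIO

import pandas as pd

# Sample rows copied from the question
sample = """TIME col1 col2
00:00:00.22-06-23 1010 20
00:00:05.22-06-23 1020 25
00:00:10.22-06-23 1015 32
"""

df = pd.read_csv(StringIO(sample), sep=r"\s+")

# Assumption: time of day, a dot, then day-month-two-digit-year
df["TIMESTAMP"] = pd.to_datetime(df["TIME"], format="%H:%M:%S.%d-%m-%y")
print(df)
```

If the part after the dot turns out to mean something else (for example YY-MM-DD), only the format string needs to change.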
EDIT: With updated file:
import re
import pandas as pd
from io import StringIO
with open('your_file.txt', 'r') as f_in:
    # Remove everything up to and including the last "~A" marker (the LAS header)
    data = re.sub(r'\A.*~A', '', f_in.read(), count=1, flags=re.S)
df = pd.read_csv(StringIO(data), sep=r"\s+", engine="python")
print(df)
Prints:
TIME col1 col2 col3
0 00:00:00.23-04-23 1977.47 160 160.5
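A line-based variant of the same idea, in case the regex feels too greedy. This is only a sketch: the updated file is not shown here, so the las_text below is a made-up stand-in, read_las_data is a hypothetical helper, and it assumes the column names follow the ~A marker on the same line.

```python
from io import StringIO

import pandas as pd

# Hypothetical, heavily shortened LAS-like text (a real file has more header sections)
las_text = """~Version
VERS. 2.0 :
~A TIME col1 col2 col3
00:00:00.23-04-23 1977.47 160 160.5
"""


def read_las_data(text):
    lines = text.splitlines()
    # Index of the first line that opens the ~A (data) section
    start = next(i for i, line in enumerate(lines) if line.lstrip().startswith("~A"))
    # Assumption: column names follow the ~A marker on the same line
    names = lines[start].split()[1:]
    body = "\n".join(lines[start + 1:])
    return pd.read_csv(StringIO(body), sep=r"\s+", names=names, header=None)


df2 = read_las_data(las_text)
print(df2)
```

For a file on disk, pass the result of f_in.read() to read_las_data instead of las_text.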