I want to load a csv file as a dataframe in Python with pd.read_table. This file contains data from 60 parameters at different time points. The ‘Time’ column looks like this in Excel:
The time should actually be 39.7318, 39.75808 etc. However, when I load the file into Python, the values in the Time column are read as strings. I tried adding dtype=np.float so that the values would be parsed as floats:
df = pd.read_table(path, delimiter=";", dtype=np.float)
But this results in ValueError: could not convert string to float: '3.975.808'.
Is it possible to specify that this column contains values with 5 decimal places?
For a file table.txt
that looks like
Time;B
397.318;abc
3.975.808;efg
you could try
import pandas as pd

df = pd.read_table("table.txt", sep=";")
# Drop the dots, then shift the decimal point five places to the left.
df["Time"] = (
    df["Time"].str.replace(".", "", regex=False).astype("float").div(100_000)
)
to get
       Time    B
0   3.97318  abc
1  39.75808  efg
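Note that 397.318 comes out as 3.97318 here, whereas the question expects 39.7318: the export has apparently dropped the trailing zero of 39.73180, so a fixed division by 100_000 is off by a factor of ten for such values. If every time value is known to have exactly two digits before the decimal point (an assumption; the question does not say so), a minimal sketch that re-inserts the decimal point by position could look like this. The in-memory file and dtype={"Time": str} are only there to make the example self-contained and to keep pandas from parsing a value like 397.318 as a float.

```python
import io

import pandas as pd

# In-memory stand-in for table.txt, using the sample rows from above.
raw = "Time;B\n397.318;abc\n3.975.808;efg\n"

# Read Time as text so that no value is silently parsed as a number.
df = pd.read_table(io.StringIO(raw), sep=";", dtype={"Time": str})

# Remove the dots, leaving only the digits.
digits = df["Time"].str.replace(".", "", regex=False)

# ASSUMPTION: every value has exactly two digits before the decimal point.
df["Time"] = (digits.str[:2] + "." + digits.str[2:]).astype(float)

print(df)
```

If the times can grow past 99.99999 or drop below 10, this positional rule no longer holds and the file would need to be re-exported with a proper decimal separator.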
You could also use a converter
def to_float(string):
    return float(string.replace(".", "")) / 100_000

df = pd.read_table("table.txt", sep=";", converters={"Time": to_float})
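but I think the first solution is more efficient on large files, since it works on the whole column at once, whereas a converter is called once per value.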
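To check that the two approaches agree without creating table.txt, here is a small self-contained sketch that runs both on the sample rows from an in-memory buffer. The variable names (raw, vectorized, converted) are just illustrative.

```python
import io

import pandas as pd

# In-memory stand-in for table.txt, using the sample rows from above.
raw = "Time;B\n397.318;abc\n3.975.808;efg\n"


def to_float(string):
    # Same converter as above: drop the dots, then divide by 100_000.
    return float(string.replace(".", "")) / 100_000


# Approach 1: read Time as text, then clean the whole column at once.
vectorized = pd.read_table(io.StringIO(raw), sep=";", dtype={"Time": str})
vectorized["Time"] = (
    vectorized["Time"].str.replace(".", "", regex=False).astype("float").div(100_000)
)

# Approach 2: let the converter handle each value while the file is parsed.
converted = pd.read_table(io.StringIO(raw), sep=";", converters={"Time": to_float})

# Largest absolute difference between the two Time columns.
max_diff = (vectorized["Time"] - converted["Time"]).abs().max()
print(max_diff)
```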