I would like to change some values in an MDF file (specifically, I would like to check for consistency, since the measurement instrument for some reason writes 10**10 when no value could be found). I can't figure out how to access specific values and change them. I did figure out how to include the channel units in the channel names, which works reasonably fast:
from asammdf import MDF  # assuming MDF is the asammdf class

with MDF(file) as mdf:
    # add units to channel names (faster than using pandas)
    for i, gp in enumerate(mdf.groups):
        for j, ch in enumerate(gp.channels):
            mdf.groups[i].channels[j].name = ch.name + " [" + ch.unit + "]"
Unfortunately, gp.channels doesn't seem to have a way to access the data, only some metadata for each channel (or at least I can't figure out the attribute or method).
I already tried converting to a DataFrame, where this is rather easy, but the file is quite large, so it takes far too long to sift through all the data points. My guess is that this could be quite a bit faster if it were done on the MDF directly.
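# slow method with dataframe conversion
import numpy as np

data = mdf.to_dataframe()
columns = data.columns.tolist()
for j, col in enumerate(columns):
    for i, val in enumerate(data[col]):
        if val == 10**10:
            # positional indexing, in case the index holds timestamps rather than 0..n-1
            data.iloc[i, j] = np.nan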
Downsampling fixes the speed problem, but it is not really a solution either, since I need the original sample rate.
Accessing the data is not a problem, since I can use the select() or get() methods, but I can't change the values - I don't know how. Ideally, I'd change any 10**10 to np.nan.
OK, I figured out how to do it efficiently in pandas, which works for me.
I used a combination of a lambda function and the applymap method of a pandas DataFrame:
data = data.applymap(lambda x: np.nan if x==10**10 else x)
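A vectorized variant might be faster still, since applymap calls the lambda once per cell. Below is a minimal sketch, assuming the placeholder is stored as exactly 10**10. The small DataFrame and its column names are made up for illustration and stand in for the result of mdf.to_dataframe():

```python
import numpy as np
import pandas as pd

SENTINEL = 10**10  # value the instrument writes when no value could be found

# Hypothetical stand-in for mdf.to_dataframe(); the index plays the role of timestamps
data = pd.DataFrame(
    {"speed [km/h]": [1.0, 1e10, 3.0], "temp [C]": [1e10, 20.5, 21.0]},
    index=[0.0, 0.1, 0.2],
)

# mask() replaces every cell where the condition is True; the default replacement is NaN
cleaned = data.mask(data == SENTINEL)
```

data.replace(10**10, np.nan) should give the same result. Note also that in pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map, so the one-liner above may emit a warning on recent versions.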