python, python-3.x, pandas, data-consistency, asammdf

Change specific values in an MDF object (Python, asammdf)


I would like to change some values in an MDF file. Specifically, I want to check the data for consistency, because the measurement instrument for some reason writes 10**10 whenever no value could be found. I can't figure out how to access specific values and change them. I did figure out how to include the channel units in the channel names, which works reasonably fast:

from asammdf import MDF

with MDF(file) as mdf:  # file: path to the measurement file
    # add units to channel names (faster than using pandas)
    for i, gp in enumerate(mdf.groups):
        for j, ch in enumerate(gp.channels):
            mdf.groups[i].channels[j].name = f"{ch.name} [{ch.unit}]"

Unfortunately, gp.channels doesn't seem to provide a way to access the data, only some metadata for each channel (or at least I can't find the right attribute or method).

I already tried converting to a DataFrame, where this is rather easy, but the file is quite large, so it takes far too long to sift through all the data points. My guess is that this could be quite a bit faster if it were done on the MDF object directly.

# slow method with dataframe conversion
import numpy as np

data = mdf.to_dataframe()
for col in data.columns:
    # to_dataframe() indexes rows by timestamp, so pass the index label
    # (not a positional counter) to .loc
    for idx, val in data[col].items():
        if val == 10**10:
            data.loc[idx, col] = np.nan

Downsampling fixes the runtime problem, but it is not really a solution either, since I need the original sample rate. Reading the data is not a problem, since I can use the select() or get() methods, but I don't know how to change the values. Ideally, I'd change every 10**10 to np.nan.


Solution

  • OK, I figured out how to do it efficiently in pandas, which works for me. I used a combination of a lambda function and the applymap method of a pandas DataFrame:

    data = data.applymap(lambda x: np.nan if x==10**10 else x)
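
  • A possible alternative, as a sketch rather than a benchmarked solution: applymap calls the lambda once per cell, whereas a boolean comparison plus DataFrame.mask handles the whole frame in vectorized operations, which is usually faster on large data. The helper name mask_sentinel, the SENTINEL constant, and the small demo frame (a stand-in for mdf.to_dataframe()) are made up for illustration.

```python
import numpy as np
import pandas as pd

SENTINEL = 10**10  # placeholder the instrument writes when no value was found


def mask_sentinel(df: pd.DataFrame, sentinel: float = SENTINEL) -> pd.DataFrame:
    """Return a copy of df with every cell equal to sentinel replaced by NaN."""
    # df == sentinel builds a boolean frame in one vectorized pass;
    # mask() replaces the True positions with NaN (its default fill value).
    return df.mask(df == sentinel)


# Hypothetical stand-in for mdf.to_dataframe(): float time index, two channels.
demo = pd.DataFrame(
    {"speed [km/h]": [1.0, 1e10, 3.0], "temp [C]": [1e10, 20.5, 21.0]},
    index=[0.0, 0.1, 0.2],
)
cleaned = mask_sentinel(demo)
```

    With the real data this would be called as data = mask_sentinel(data). Note that integer columns are upcast to float, since NaN has no integer representation. Also, on pandas 2.1 and later applymap is deprecated in favor of DataFrame.map, so the one-liner above may emit a warning there.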