In the following code, based on an example I found using py2store, I use with_key_filt to make two daccs (one with train data, the other with test data). I do get a filtered annots store, but the wfs store is not filtered.
What am I doing wrong?

from py2store import cached_keys


class Dacc:
    """Waveform and annotation data access"""

    def __init__(self, wfs, annots, annot_to_tag=lambda x: x['tag']):
        self.wfs = wfs  # waveform store (keys: filepaths, values: numpy arrays)
        self.annots = annots  # annotation store (keys: filepaths, values: dicts or pandas series)
        self.annot_to_tag = annot_to_tag  # function to compute a tag from an annotation item

    @classmethod
    def with_key_filt(cls, key_filt, wfs, annots, annot_to_tag, chunker):
        """
        Make an instance of the dacc class where the data is filtered out.
        You could also filter out externally, but this can be convenient
        """
        filtered_annots = cached_keys(annots, keys_cache=key_filt)
        return cls(wfs, filtered_annots, annot_to_tag)

    def wf_tag_gen(self):
        """Generator of (wf, tag) tuples"""
        for k in self.annots:
            try:
                wf = self.wfs[k]
                annot = self.annots[k]
                yield wf, self.annot_to_tag(annot)
            except KeyError:
                pass

The intent of with_key_filt seems to be to filter annots, which is itself used as the seed of the wf_tag_gen generator (and probably of the other generators you didn't post). As such, it does indeed filter everything those generators yield.
But I agree with your expectation that wfs should be filtered as well. To achieve this, you just need to add one line that filters wfs.

class TheDaccYouWant(Dacc):
    @classmethod
    def with_key_filt(cls, key_filt, wfs, annots, annot_to_tag, chunker):
        filtered_annots = cached_keys(annots, keys_cache=key_filt)
        wfs = cached_keys(wfs, keys_cache=key_filt)  # here's what was added
        # Dacc.__init__ as posted takes no chunker, so it is not forwarded here;
        # pass it along if your full Dacc.__init__ accepts one.
        return cls(wfs, filtered_annots, annot_to_tag)
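
To see why filtering annots alone already restricts what wf_tag_gen yields, while wfs itself still lists every key, here is a minimal self-contained sketch. It uses plain dicts in place of real stores, and KeyFiltered is a hypothetical stand-in written for this illustration; it is not py2store's cached_keys and makes no claim about how that function is implemented. The file names and tags are made up as well.

```python
from collections.abc import Mapping


class KeyFiltered(Mapping):
    """Read-only view of a mapping restricted to an explicit list of keys.

    Hypothetical stand-in for a key-filtered store (not py2store code).
    """

    def __init__(self, store, keys):
        self._store = store
        # Keep only the requested keys that actually exist in the store.
        self._keys = [k for k in keys if k in store]

    def __getitem__(self, k):
        if k not in self._keys:
            raise KeyError(k)
        return self._store[k]

    def __iter__(self):
        return iter(self._keys)

    def __len__(self):
        return len(self._keys)


def wf_tag_gen(wfs, annots, annot_to_tag=lambda x: x['tag']):
    """Yield (wf, tag) pairs, driven by the keys of annots."""
    for k in annots:
        try:
            yield wfs[k], annot_to_tag(annots[k])
        except KeyError:
            pass


# Made-up data: three waveforms and their annotations.
wfs = {'a.wav': [0, 1], 'b.wav': [2, 3], 'c.wav': [4, 5]}
annots = {
    'a.wav': {'tag': 'dog'},
    'b.wav': {'tag': 'cat'},
    'c.wav': {'tag': 'dog'},
}
train_keys = ['a.wav', 'b.wav']

train_annots = KeyFiltered(annots, train_keys)
train_wfs = KeyFiltered(wfs, train_keys)

# The generator iterates over annots, so filtering annots is enough here.
pairs_with_unfiltered_wfs = list(wf_tag_gen(wfs, train_annots))

# Listing wfs directly still shows every key unless wfs is filtered too.
all_wf_keys = list(wfs)
train_wf_keys = list(train_wfs)
```

In this sketch the generator output is the same whether or not wfs is filtered, which matches the observation above; the difference only shows up when you iterate over wfs directly.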