I have a transcript with different speakers, for instance (new.txt):
spk_0: Default transcript, containing many sentences. Such as this.
spk_1: Blablabla
spk_2: Blablablaba fjdslf
I want to create a separate string for each speaker that contains only the text said by that speaker, for instance:
new_spk_0 = "Default transcript, containing many sentences. Such as this."
new_spk_1 = "Blablabla"
How could I go about doing this?
I fixed it using the method from the question "Reading only the words of a specific speaker and adding those words to a list".
There, a regex match at the beginning of each line is used to detect the speaker label, and the text is then split into key-value pairs in a dictionary, with one entry per speaker.
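For anyone landing here later, this is a minimal sketch of that idea, not the exact code from the linked question. It assumes every speaker label looks like `spk_<number>:` at the start of a line, and that a line without a label continues the previous speaker's turn. The names `SPEAKER_RE`, `split_by_speaker` and `by_speaker` are my own.

```python
import re
from collections import defaultdict

# Assumed label format: "spk_<number>:" at the start of a line.
SPEAKER_RE = re.compile(r"^(spk_\d+):\s*(.*)$")


def split_by_speaker(lines):
    """Return a dict mapping each speaker label to that speaker's joined text."""
    parts = defaultdict(list)
    current = None  # label of the most recent speaker seen
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        match = SPEAKER_RE.match(line)
        if match:
            current = match.group(1)
            text = match.group(2)
        else:
            # Assumption: an unlabeled line belongs to the previous speaker.
            text = line
        if current is not None and text:
            parts[current].append(text)
    # Join each speaker's pieces into one string.
    return {speaker: " ".join(chunks) for speaker, chunks in parts.items()}


sample = [
    "spk_0: Default transcript, containing many sentences. Such as this.",
    "spk_1: Blablabla",
    "spk_2: Blablablaba fjdslf",
]
by_speaker = split_by_speaker(sample)
new_spk_0 = by_speaker["spk_0"]
new_spk_1 = by_speaker["spk_1"]
```

To run it on the file instead of the sample list, passing the open file object should work, since it iterates line by line: `with open("new.txt") as f: by_speaker = split_by_speaker(f)`. Keeping the result in a dictionary rather than in separate `new_spk_N` variables means it still works when the number of speakers is not known in advance.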