python-3.x google-colaboratory audio-analysis

Folder is being created redundantly


I have the following structure. I want to iterate through the subfolders (machine, gunshot), process the .wav files, and build an mfccresult folder in each category with the .csv files in it. I have the following code, but the MFCC folder keeps being created inside the already created MFCC folder.

parent_dir = 'sound'
for subdirs, dirs, files in os.walk(parent_dir):

    resultsDirectory = subdirs + '/MFCC/'
    if not os.path.isdir("resultsDirectory"):
        os.makedirs(resultsDirectory)
    for filename in os.listdir(subdirs):
        if filename.endswith('.wav'):
            (rate,sig) = wav.read(subdirs + "/" +filename)
            mfcc_feat = mfcc(sig,rate)
            fbank_feat = logfbank(sig,rate)
            outputFile = resultsDirectory + "/" + os.path.splitext(filename)[0] + ".csv"
            file = open(outputFile, 'w+')
            numpy.savetxt(file, fbank_feat, delimiter=",")
            file.close()

Solution

  • What version of Python are you using? I'm not sure whether this has changed in the past, but the first element of each tuple that os.walk yields is not a subdirectory ("subdirs") but the dirpath, the path of the directory currently being visited. See here for Python 3.6.
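To see what os.walk yields, here is a minimal sketch run on a throwaway directory tree. The folder and file names are made up for illustration, and `walk_summary` is a hypothetical helper, not part of the original code:

```python
import os
import tempfile

def walk_summary(root):
    """Collect (dirpath relative to root, dirnames, filenames) for each visited directory."""
    summary = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting in place makes the top-down traversal order deterministic.
        dirnames.sort()
        summary.append((os.path.relpath(dirpath, root), list(dirnames), sorted(filenames)))
    return summary

# Hypothetical layout: sound/gunshot/a.wav and sound/machine/a.wav
with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "sound")
    for category in ("gunshot", "machine"):
        os.makedirs(os.path.join(root, category))
        open(os.path.join(root, category, "a.wav"), "w").close()
    demo_walk = walk_summary(root)
    for row in demo_walk:
        print(row)
```

The first tuple describes the top directory itself, and each subdirectory then gets a tuple of its own, so the loop body runs once per directory in the tree.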

    I don't know your absolute path, but since you are passing in sound as a relative path, I assume it is a folder inside the directory where you run your Python code. So, for example, let's say you are running this file (let's call it mycode.py) from

    /home/username/myproject/mycode.py

    and you have some subdirectory:

    /home/username/myproject/sound/

    So:

    resultsDirectory = subdirs + '/MFCC/'

    as written in your code above would resolve to:

    /home/username/myproject/sound/MFCC/

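    So your first if statement will be entered since this is not an existing directory. Note also that the check tests the literal string "resultsDirectory" rather than the variable, so it does not actually tell you whether the MFCC directory exists. Thereby you create a new directory: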

    /home/username/myproject/sound/MFCC/

    From there, you take

    filename in os.listdir(subdirs)

    Be careful with the output of this function as well. os.listdir() returns the names of every entry in the directory, subdirectories included, not just files. See here for the documentation on that.
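A quick way to check what os.listdir() returns for a given directory is to print each name with its type. This is a minimal sketch on a temporary directory; `classify_entries` and the entry names are hypothetical:

```python
import os
import tempfile

def classify_entries(directory):
    """Map each name returned by os.listdir to 'dir' or 'file'."""
    kinds = {}
    for name in os.listdir(directory):
        # os.listdir gives bare names, so join them with the directory first.
        full_path = os.path.join(directory, name)
        kinds[name] = "dir" if os.path.isdir(full_path) else "file"
    return kinds

with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "machine"))
    open(os.path.join(tmp, "clip.wav"), "w").close()
    demo_kinds = classify_entries(tmp)
    print(demo_kinds)
```

Both the subdirectory and the file show up in the result, which is why filtering on the '.wav' suffix (or using the filenames list that os.walk already provides) matters.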

    So now you are looping through the directories in:

    /home/username/myproject/sound/

    Here, I assume you have some of the directories from your diagram already made. So I assume you have:

    /home/username/myproject/sound/machine_sound
    /home/username/myproject/sound/gun_shot_sound

    or something along those lines.

    So at this level the if statement will never be entered, since your directory names do not end with '.wav'.

    Even if it did enter, you'd still have issues, as filename will actually be equal to machine_sound on the first loop and gun_shot_sound on the second.

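    Maybe you are using some other wav library (scipy.io.wavfile, for instance, does provide a read() function that returns the rate and the data). If not, note that the Python built-in module is called wave, and with it you need to call wave.open() on the file, not wav.read(). See here for the docs.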

    I'm not sure what you were trying to achieve with the call to os.path.splitext(filename)[0], but you can read about it here. In this case you will end up with the same thing that went in, so machine_sound and gun_shot_sound.
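For reference, os.path.splitext splits a name into a stem and an extension, and returns an empty extension when there is none. A small sketch, where `csv_name_for` and the file name shot1.wav are hypothetical:

```python
import os

def csv_name_for(filename):
    """Swap whatever extension the name has for .csv."""
    stem, _extension = os.path.splitext(filename)
    return stem + ".csv"

print(os.path.splitext("shot1.wav"))      # ('shot1', '.wav')
print(os.path.splitext("machine_sound"))  # ('machine_sound', '')
print(csv_name_for("shot1.wav"))          # shot1.csv
```

So for a real .wav file the call strips the extension, while for a directory name it returns the name unchanged.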

    Your output file will thus result in:

    /home/username/myproject/sound/MFCC/machine_sound.csv

    on the first loop, and

    /home/username/myproject/sound/MFCC/gun_shot_sound.csv

    the second time through.

    So in conclusion, I'm not sure what is happening when you say the MFCC folder keeps forming inside the already formed MFCC folder, but you have some reading ahead of you before you can fully understand your own code and fix it to do what you want. Assuming you read through the links I provided, you should be able to do that. Good luck!

    Additionally, you had quite a few typos in your code that I edited, including the immensely important whitespace characters. You should clean that up and ensure your code runs before posting it here, then double-check that your copy/paste action did not result in any errors. People will be much more willing to help if you clean up your presentation a bit.
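Putting the points above together, one plausible way nested MFCC folders can appear is that os.walk treats a previously created MFCC folder as just another subdirectory and the loop then creates a new MFCC folder inside it. The sketch below guards against that by pruning result folders from the walk. It is only an illustration under assumptions: `process_tree` and `fake_features` are hypothetical names, and `fake_features` stands in for the real wav.read() and logfbank() calls, which are not reproduced here.

```python
import csv
import os
import tempfile

RESULTS_NAME = "MFCC"  # same folder name as in the question

def process_tree(parent_dir, extract_features, results_name=RESULTS_NAME):
    """Write one CSV per .wav file into <directory>/<results_name>/.

    extract_features is a hypothetical callable: wav path -> rows of numbers.
    Returns the list of CSV paths written.
    """
    written = []
    for dirpath, dirnames, filenames in os.walk(parent_dir):
        # Prune result folders in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d != results_name]
        wav_files = sorted(f for f in filenames if f.lower().endswith(".wav"))
        if not wav_files:
            continue  # nothing to process here, so create no results folder
        results_dir = os.path.join(dirpath, results_name)
        os.makedirs(results_dir, exist_ok=True)  # no error if it already exists
        for filename in wav_files:
            stem = os.path.splitext(filename)[0]
            output_path = os.path.join(results_dir, stem + ".csv")
            rows = extract_features(os.path.join(dirpath, filename))
            with open(output_path, "w", newline="") as handle:
                csv.writer(handle).writerows(rows)
            written.append(output_path)
    return written

def fake_features(wav_path):
    # Placeholder rows; NOT real audio features.
    return [[0.0, 1.0], [2.0, 3.0]]

# Demo on a throwaway tree with made-up names.
with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "sound")
    for category in ("gunshot", "machine"):
        os.makedirs(os.path.join(root, category))
        open(os.path.join(root, category, "clip.wav"), "w").close()
    process_tree(root, fake_features)
    second_run = process_tree(root, fake_features)  # a second run adds no nesting
    demo_outputs = sorted(os.path.relpath(p, root) for p in second_run)
    nested_exists = os.path.isdir(os.path.join(root, "machine", "MFCC", "MFCC"))
    print(demo_outputs, nested_exists)
```

Using the filenames list from os.walk avoids the extra os.listdir call, and exist_ok=True replaces the manual isdir check.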