How to process files in FastAPI from multiple clients without saving the files to disk


I am using FastAPI to create an API that receives small audio files from a mobile app. The API processes the signal, classifies the sound, and returns a response. The final goal is to send the classification back to the user.

Here's what I am doing so far:

@app.post("/predict")
def predict(file: UploadFile = File(...)):  # the WAV audio uploaded by the mobile app user
    with open(name_file, "wb") as buffer:
        shutil.copyfileobj(file.file, buffer)  # write the received audio data to a file on disk
    ...
    prev = test.my_classification_module(name_file)  # some processing; the result ends up in prev

In my_classification_module(), I have this:

X, sr = librosa.load(sound_file)

I want to avoid writing a file to disk just so that librosa can classify it. I would like to do this with a temporary file, without using memory unnecessarily, and in a way that avoids files overlapping when the app is used by multiple users.


Solution

  • If your function supports a file-like object, you could use the .file attribute of UploadFile, e.g., file.file (which is a SpooledTemporaryFile instance). If your function requires the file contents as bytes, use the async .read() method, e.g., await file.read() (see the documentation). If you wish to keep your route defined with def instead of async def (have a look at this answer for more info on def vs async def), you could call the .read() method of the underlying file-like object directly, e.g., file.file.read().
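
To make the file-like vs. bytes distinction concrete, here is a minimal sketch that needs only the standard library. It makes several assumptions: a SpooledTemporaryFile stands in for file.file, the stdlib wave module stands in for librosa.load() or sf.read(), and make_wav_bytes and duration_seconds are hypothetical helpers invented for this illustration.

```python
import io
import wave
from tempfile import SpooledTemporaryFile


def make_wav_bytes(n_frames: int = 800, sample_rate: int = 8000) -> bytes:
    """Build a tiny silent mono 16-bit WAV in memory (test data only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)          # 2 bytes per sample = 16-bit audio
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def duration_seconds(audio_file) -> float:
    """Hypothetical consumer that accepts any binary file-like object."""
    with wave.open(audio_file, "rb") as w:
        return w.getnframes() / w.getframerate()


# Stand-in for `file.file`; FastAPI is not needed to run this sketch.
upload = SpooledTemporaryFile(max_size=1024 * 1024)
upload.write(make_wav_bytes())
upload.seek(0)

# Option 1: hand the file-like object straight to the consumer.
from_filelike = duration_seconds(upload)

# Option 2: read the bytes (what file.file.read() returns) and wrap them.
upload.seek(0)                     # rewind: the stream position moved above
contents = upload.read()
from_bytes = duration_seconds(io.BytesIO(contents))

upload.close()
```

Nothing here touches a named file on disk. Note the seek(0) before the second read: once a stream has been read, it has to be rewound before it can be read again from the start.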

    Update - How to resolve the "File contains data in an unknown format" error

    1. Make sure the audio file is not corrupted. If, let's say, you saved it and opened it with a media player, would the sound file play?

    2. Make sure you have the latest version of the librosa module installed.

    3. Try installing ffmpeg and adding it to the system path, as suggested here.

    4. As described here and in the documentation, librosa.load() can take a file-like object as an alternative to a file path. Thus, using file.file or file.file._file should normally be fine (as per the documentation, the _file attribute is either an io.BytesIO or io.TextIOWrapper object...).

      However, as described in the documentation here and here, as well as in this GitHub discussion, you could also use the soundfile module to read audio from file-like objects. Example:

      import soundfile as sf 
      
      data, samplerate = sf.read(file.file)
      
    5. You could also write the contents of the uploaded file to a BytesIO stream, and then pass it to either sf.read() or librosa.load():

      from io import BytesIO
      
      contents = file.file.read()
      buffer = BytesIO(contents)
      data, samplerate = librosa.load(buffer)  # using librosa module
      #data, samplerate = sf.read(buffer)      # using soundfile module
      buffer.close()
      
    6. Another option would be to save the file contents to a NamedTemporaryFile, which "has a visible name in the file system" that "can be used to open the file". Once you are done with it, you can manually delete it using the os.remove() or os.unlink() function.

      from fastapi import HTTPException
      from tempfile import NamedTemporaryFile
      import os
      
      contents = file.file.read()
      temp = NamedTemporaryFile(delete=False)
      try:
          with temp as f:
              f.write(contents)
          data, samplerate = librosa.load(temp.name)   # using librosa module
          #data, samplerate = sf.read(temp.name)       # using soundfile module
      except Exception:
          raise HTTPException(status_code=500, detail='Something went wrong')
      finally:
          #temp.close()  # the `with` statement above takes care of closing the file
          os.remove(temp.name)
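
On the concern about files overlapping between users: NamedTemporaryFile generates a unique name on every call, so concurrent requests should not collide. The write-then-delete pattern from option 6 could be wrapped in a small context manager, sketched below under some assumptions. The helper name temp_audio_path is hypothetical, the payload is placeholder bytes rather than real audio, and the librosa call appears only in a comment so that the sketch runs with the standard library alone.

```python
import os
from contextlib import contextmanager
from tempfile import NamedTemporaryFile


@contextmanager
def temp_audio_path(contents: bytes, suffix: str = ".wav"):
    """Write `contents` to a uniquely named temp file and yield its path.

    The file is deleted on exit, even if the caller's code raises.
    (Hypothetical helper, not part of FastAPI or librosa.)
    """
    temp = NamedTemporaryFile(delete=False, suffix=suffix)
    try:
        with temp as f:            # closes the handle; the file stays on disk
            f.write(contents)
        yield temp.name
    finally:
        os.remove(temp.name)


# Stand-in for `contents = file.file.read()`; not real audio data.
contents = b"placeholder audio payload"

# Two simulated concurrent requests: each gets its own path.
with temp_audio_path(contents) as path_a, temp_audio_path(contents) as path_b:
    paths_differ = path_a != path_b
    with open(path_a, "rb") as fh:
        round_trip = fh.read()     # a real handler might call librosa.load(path_a)

cleaned_up = not os.path.exists(path_a) and not os.path.exists(path_b)
```

Inside an endpoint, the body of the with block would be the place to call librosa.load() or sf.read() on the yielded path. Cleanup then happens in one place regardless of how the handler exits.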