google-cloud-platform · google-cloud-storage · google-cloud-speech

Re-encoding audio file to LINEAR16 for Google Cloud Speech API fails with '[Errno 30] Read-only file system'


I'm trying to convert an audio file to LINEAR16 format using the ffmpeg-python module. I've stored the audio file in one Cloud Storage bucket and want to move the converted file to a different bucket. The code works perfectly in VS Code and deploys successfully to Cloud Functions, but fails with [Errno 30] Read-only file system when run on the cloud.

Error message: [Errno 30] Read-only file system

Here's the code

from google.cloud import speech
from google.cloud import storage
import ffmpeg
import sys


out_bucket = 'encoded_audio_landing'
input_bucket_name = 'audio_landing'

def process_audio(input_bucket_name, in_filename, out_bucket):
    '''
    converts audio encoding for GSK call center call recordings to LINEAR16 encoding at a
    16,000 hertz sample rate

    Params:
        in_filename: a GSK call audio file
        input_bucket_name: location of the source file that needs to be re-encoded
        out_bucket: where to put the newly encoded file

    returns an audio file encoded so that the Google Speech-to-Text API can transcribe it
    '''
    storage_client = storage.Client()
    bucket = storage_client.bucket(input_bucket_name)

    blob = bucket.blob(in_filename)

    #download the blob to the local working directory -- this is the line that raises
    #[Errno 30] on Cloud Functions (see the solution below)
    blob.download_to_filename(blob.name)
    print('downloaded blob: ', blob.name)
    #print('blob name / len / type', blob.name, len(blob.name), type(blob.name))

    try:
        out, err = (
            ffmpeg.input(blob.name)
            .output('pipe:', format="s16le", acodec="pcm_s16le", ac=1, ar="16k")
            .overwrite_output()
            .run(capture_stdout=True, capture_stderr=True)
        )
        
    except ffmpeg.Error as e:
        print(e.stderr, file=sys.stderr)
        sys.exit(1)

    up_bucket = storage_client.bucket(out_bucket)
    up_blob = up_bucket.blob(blob.name)
    #print('type / len out', type(out), len(out))
    up_blob.upload_from_string(out)

    #delete source file
    blob.delete()




def hello_gcs(event, context):
    """Background Cloud Function to be triggered by Cloud Storage.
       This generic function logs relevant data when a file is changed,
       and works for all Cloud Storage CRUD operations.
    Args:
        event (dict):  The dictionary with data specific to this type of event.
                       The `data` field contains a description of the event in
                       the Cloud Storage `object` format described here:
                       https://cloud.google.com/storage/docs/json_api/v1/objects#resource
        context (google.cloud.functions.Context): Metadata of triggering event.
    Returns:
        None; the output is written to Cloud Logging
    """

    #print('Event ID: {}'.format(context.event_id))
    #print('Event type: {}'.format(context.event_type))
    print('Bucket: {}'.format(event['bucket']))
    print('File: {}'.format(event['name']))
    print('Metageneration: {}'.format(event['metageneration']))
    #print('Created: {}'.format(event['timeCreated']))
    #print('Updated: {}'.format(event['updated']))

    #convert audio encoding
    print('begin process_audio')
    process_audio(input_bucket_name, event['name'], out_bucket)
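
For local testing, here's a minimal sketch that fakes the event dict the Cloud Storage trigger sends (the field names follow the object resource format linked in the docstring; the filename is just an example):

if __name__ == '__main__':
    #simulate the payload Cloud Storage sends when an object is finalized
    fake_event = {
        'bucket': 'audio_landing',
        'name': 'call1.wav',   #hypothetical object name
        'metageneration': '1',
    }
    #note: process_audio still needs real GCS credentials and buckets to run
    hello_gcs(fake_event, context=None)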

Solution

  • The problem was that I was downloading the file to my local directory, which obviously wouldn't work on the cloud. I read another article where someone added a get_file_path() helper and used its result as the input to blob.download_to_filename(). I'm not sure why that worked.

    I did try just removing the download_to_filename() step entirely, but it didn't work without it.

    I'd very much appreciate an explanation if someone knows why; my best guess is in the comment below get_file_path() in the code.

    #these imports support the temp-path helper below
    import os
    import tempfile
    from werkzeug.utils import secure_filename

    #this gets around downloading the file to a local folder: it builds a path in a temp location instead
    def get_file_path(filename):
        file_name = secure_filename(filename)
        return os.path.join(tempfile.gettempdir(), file_name)
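
    #why this works (my best guess): Cloud Functions runs your code on a read-only
    #filesystem, and /tmp is the only writable path -- it's an in-memory tmpfs, so
    #files there count against the function's memory. tempfile.gettempdir() returns
    #'/tmp' there, so e.g. get_file_path('call1.wav') -> '/tmp/call1.wav', which
    #download_to_filename() is actually allowed to write to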
    
    
    def process_audio(input_bucket_name, in_filename, out_bucket):
        '''
        converts audio encoding for GSK call center call recordings to LINEAR16 encoding at a
        16,000 hertz sample rate

        Params:
            in_filename: a GSK call audio file
            input_bucket_name: location of the source file that needs to be re-encoded
            out_bucket: where to put the newly encoded file

        returns an audio file encoded so that the Google Speech-to-Text API can transcribe it
        '''
        storage_client = storage.Client()
        bucket = storage_client.bucket(input_bucket_name)
    
        blob = bucket.blob(in_filename)
    
        print(blob.name)
    
        #builds a path in a temp location (/tmp) for the file
        file_path = get_file_path(blob.name)
     
       
        blob.download_to_filename(file_path)
        print('downloaded to: ', file_path)
        #print('blob name / len / type', blob.name, len(blob.name), type(blob.name))
    
        #invokes the ffmpeg library to re-encode the audio file; it's really a command-line application
        #   that is available in Python and on Google Cloud. The keyword arguments in the .output() bit
        #   are ffmpeg options; you pass those options through to ffmpeg there
        try:
            out, err = (
                ffmpeg.input(file_path)
                .output('pipe:', format="s16le", acodec="pcm_s16le", ac=1, ar="16k")
                .overwrite_output()
                .run(capture_stdout=True, capture_stderr=True)
            )
            
        except ffmpeg.Error as e:
            print(e.stderr, file=sys.stderr)
            sys.exit(1)

        #upload the re-encoded bytes to the output bucket and delete the source (same as before)
        up_bucket = storage_client.bucket(out_bucket)
        up_blob = up_bucket.blob(blob.name)
        up_blob.upload_from_string(out)
        blob.delete()
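
    For what it's worth, here's a minimal sketch of an alternative that should avoid the temp file entirely: read the blob into memory with download_as_bytes() (available in recent versions of google-cloud-storage) and feed the bytes to ffmpeg over stdin via ffmpeg-python's pipe support. This assumes the function has enough memory for both the source and converted audio, and that the container format can be probed from a stream (WAV can; some formats don't pipe well).

    import sys
    import ffmpeg
    from google.cloud import storage

    def process_audio_in_memory(input_bucket_name, in_filename, out_bucket):
        storage_client = storage.Client()

        #read the whole source object into memory; nothing touches the filesystem
        data = storage_client.bucket(input_bucket_name).blob(in_filename).download_as_bytes()

        try:
            #'pipe:' on the input side means read from stdin; run(input=...) supplies the bytes
            out, err = (
                ffmpeg.input('pipe:')
                .output('pipe:', format="s16le", acodec="pcm_s16le", ac=1, ar="16k")
                .run(input=data, capture_stdout=True, capture_stderr=True)
            )
        except ffmpeg.Error as e:
            print(e.stderr, file=sys.stderr)
            sys.exit(1)

        storage_client.bucket(out_bucket).blob(in_filename).upload_from_string(out)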