python fastapi http-status-code-500

Calls to an external API work when running code as a script, but return `500 Internal Server Error` when the same code runs under FastAPI


I have an application that predicts the size of a fish in an image. I have built a FastAPI endpoint, `/predict/`, that runs the multi-step process to make that prediction. The steps include two calls to external APIs (not under my control, so I can't see more than what they return).

When I run my code directly as a script, for example through an IDE (I use PyCharm), the prediction steps run correctly and I get appropriate responses back from both APIs.

The first is to Roboflow. Here is an example of the output from running the script (again, I just call it from the command line or hit Run in PyCharm):

2024-03-30 10:59:36,073 - DEBUG - Starting new HTTPS connection (1): detect.roboflow.com:443
2024-03-30 10:59:36,339 - DEBUG - https://detect.roboflow.com:443 "POST /fish_measure/1?api_key=AY3KX4KMynZroEOyXUEb&disable_active_learning=False HTTP/1.1" 200 914

The second is to Fishial. Here is an example of the output from running the script (from the command line or through PyCharm); this one first has to obtain a token, an upload URL, etc.:

2024-03-30 11:02:31,866 - DEBUG - Starting new HTTPS connection (1): api-users.fishial.ai:443
2024-03-30 11:02:33,273 - DEBUG - https://api-users.fishial.ai:443 "POST /v1/auth/token HTTP/1.1" 200 174
2024-03-30 11:02:33,273 - INFO - Access token: eyJhbGciOiJIUzI1NiJ9.eyJleHAiOjE3MTE4MTE1NTMsImtpZCI6ImIzZjNiYWZlMTg2NGNjYmM3ZmFkNmE5YSJ9.YtlaecKMyxjipBDS97xNV3hYKcF3jRpOxTAVnwrxOcE
2024-03-30 11:02:33,273 - INFO - Obtaining upload url...
2024-03-30 11:02:33,582 - DEBUG - Starting new HTTPS connection (1): api.fishial.ai:443
2024-03-30 11:02:33,828 - DEBUG - https://api.fishial.ai:443 "POST /v1/recognition/upload HTTP/1.1" 200 1120
2024-03-30 11:02:33,829 - INFO - Uploading picture to the cloud...
2024-03-30 11:02:33,852 - DEBUG - Starting new HTTPS connection (1): storage.googleapis.com:443
2024-03-30 11:02:34,179 - DEBUG - https://storage.googleapis.com:443 "PUT /backend-fishes-storage-prod/6r9p24qp4llhat8mliso8xacdxm5?GoogleAccessId=services-storage-client%40ecstatic-baton-230905.iam.gserviceaccount.com&Expires=1711811253&Signature=gCGPID7bLuw%2FzUfv%2FLrTRPeQA060CaXQEqITPvW%2FWZ5GHXYKDRNCxVrUJ7UmpHVa0m60gIMFwFSQhYqsDmP3SkjI7ZnJSIEj53zxtOpcL7o2VGv6ZUuoowWwzmzqeM9yfbCHGI3TmtuW0lMhqAyi6Pc0wYhj73P12QU28wF8sdQMblHQLQVd1kFXtPl5yjSW12ADt4WEvB7dbnl7HmUTcL8WFS2SnJ1zcLljIbXTlRWcqc88MIcklSLG69z%2FJcUSh%2BeNxRp%2Fzotv5GitJBq9pF%2BzRt25lCt%2BYHGViJ46uu4rQapZBfACxsE762a1ZcrvTasy97idKRaijLJKAtZBRQ%3D%3D HTTP/1.1" 200 0
2024-03-30 11:02:34,180 - INFO - Requesting fish recognition...
2024-03-30 11:02:34,182 - DEBUG - Starting new HTTPS connection (1): api.fishial.ai:443
2024-03-30 11:02:39,316 - DEBUG - https://api.fishial.ai:443 "GET /v1/recognition/image?q=eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaHBBMksyUEE9PSIsImV4cCI6bnVsbCwicHVyIjoiYmxvYl9pZCJ9fQ==--d37fdc2d5c6d8943a59dbd11326bc8a651f9bd69 HTTP/1.1" 200 10195

Here is the code for the endpoint:

import os
import shutil
import tempfile
import time
from typing import Union

from fastapi import FastAPI, File, UploadFile, HTTPException, BackgroundTasks
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel

app = FastAPI()


class PredictionResult(BaseModel):
    prediction: Union[float, str]
    eyeball_estimate: Union[float, str]
    species: str
    elapsed_time: float


@app.post("/predict/", response_model=PredictionResult)
async def predict_fish_length(file: UploadFile = File(...)):
    try:
        # capture the start of the process so we can track duration
        start_time = time.time()
        # Create a temporary file
        temp_file = tempfile.NamedTemporaryFile(delete=False)
        temp_file_path = temp_file.name

        with open(temp_file_path, "wb") as buffer:
            shutil.copyfileobj(file.file, buffer)

        temp_file.close()

        # process_one_image (not shown) runs the prediction steps, including both API calls
        prediction = process_one_image(temp_file_path)

        end_time = time.time()  # Record the end time
        elapsed_time = end_time - start_time  # Calculate the elapsed time

        return PredictionResult(
            prediction=prediction["prediction"][0],
            eyeball_estimate=prediction["eye_ratio_len_est"][0],
            species=prediction["species"][0],
            elapsed_time=elapsed_time
        )

    except Exception as e:
        # Clean up the temp file in case of an error
        os.unlink(temp_file_path)
        raise HTTPException(status_code=500, detail=str(e)) from e

I run this through uvicorn, then try to call the endpoint through curl as follows:

curl -X POST http://127.0.0.1:8000/predict/ -F "file=@/path/to/image.jpg"

The Roboflow API calls work fine, but now I get this response from the Fishial (second) API:

2024-03-30 10:48:09,166 - DEBUG - Starting new HTTPS connection (1): api.fishial.ai:443
2024-03-30 10:48:10,558 - DEBUG - https://api.fishial.ai:443 "GET /v1/recognition/image?q=eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaHBBMWkyUEE9PSIsImV4cCI6bnVsbCwicHVyIjoiYmxvYl9pZCJ9fQ==--36e68766cd891eb0e57610e8fb84b76e205b639e HTTP/1.1" 500 89
INFO:     127.0.0.1:49829 - "POST /predict/ HTTP/1.1" 500 Internal Server Error

I'm not sure where to look, or what to print out or log, to get more information. I'm not even sure whether the error is on my side or coming from the API I'm calling (though the `500 89` at the end of the GET line makes me think it's coming from the API).

Many thanks!

EDIT: A request was made for more code. The function to process an image is just a series of calls to other functions. So I've included here only the code I use to call the second (Fishial) API:

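# Imports needed by this function. key_id, key_secret, DEPENDENCIES, err() and
# extract_from_json() are defined elsewhere in my code (not shown).
import base64
import hashlib
import logging
import mimetypes
import os

import requests


def recognize_fish(file_path, key_id=key_id, key_secret=key_secret, identify=False):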
    if not os.path.isfile(file_path):
        err("Invalid picture file path.")

    for dep in DEPENDENCIES:
        try:
            __import__(dep)
        except ImportError:
            err(f"Unsatisfied dependency: {dep}")

    logging.info("Identifying picture metadata...")

    name = os.path.basename(file_path)
    mime = mimetypes.guess_type(file_path)[0]
    size = os.path.getsize(file_path)
    with open(file_path, "rb") as f:
        csum = base64.b64encode(hashlib.md5(f.read()).digest()).decode("utf-8")

    logging.info(f"\n  file name: {name}")
    logging.info(f"  MIME type: {mime}")
    logging.info(f"  byte size: {size}")
    logging.info(f"   checksum: {csum}\n")

    if identify:
        return

    if not key_id or not key_secret:
        err("Missing key ID or key secret.")

    logging.info("Obtaining auth token...")

    data = {
        "client_id": key_id,
        "client_secret": key_secret
    }

    response = requests.post("https://api-users.fishial.ai/v1/auth/token", json=data)
    auth_token = response.json()["access_token"]
    auth_header = f"Bearer {auth_token}"

    logging.info(f"Access token: {auth_token}")

    logging.info("Obtaining upload url...")

    data = {
        "blob": {
            "filename": name,
            "content_type": mime,
            "byte_size": size,
            "checksum": csum
        }
    }

    headers = {
        "Authorization": auth_header,
        "Content-Type": "application/json",
        "Accept": "application/json"
    }

    response = requests.post("https://api.fishial.ai/v1/recognition/upload", json=data, headers=headers)
    signed_id = response.json()["signed-id"]
    upload_url = response.json()["direct-upload"]["url"]
    content_disposition = response.json()["direct-upload"]["headers"]["Content-Disposition"]

    logging.info("Uploading picture to the cloud...")

    with open(file_path, "rb") as f:
        requests.put(upload_url, data=f, headers={
            "Content-Disposition": content_disposition,
            "Content-MD5": csum,
            "Content-Type": ""
        })

    logging.info("Requesting fish recognition...")

    response = requests.get(f"https://api.fishial.ai/v1/recognition/image?q={signed_id}",
                            headers={"Authorization": auth_header})
    fish_count = len(response.json()["results"])

    logging.info(f"Fishial Recognition found {fish_count} fish(es) on the picture.")

    if fish_count == 0:
        return []

    species_names = []

    for i in range(fish_count):
        fish_data = extract_from_json(f"results[{i}]", response.json())

        if fish_data and "species" in fish_data:
            logging.info(f"Fish {i + 1} is:")

            for j in range(len(fish_data["species"])):
                species_data = fish_data["species"][j]
                if "fishangler-data" in species_data and "metaTitleName" in species_data["fishangler-data"]:
                    species_name = species_data["fishangler-data"]["metaTitleName"]
                    accuracy = species_data["accuracy"]

                    logging.info(f"  - {species_name} [accuracy {accuracy}]")
                    species_names.append(species_name)
                else:
                    logging.error("  - Species name not found in the response.")
        else:
            logging.error(f"\nFish {i + 1}: Species data not found in the response.")

    return species_names
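
On the question of what to log: the function above calls `response.json()` on every response without first checking the status code, so the 89-byte body that Fishial returned with the 500 never reaches the log. A minimal sketch of a check that could run before each `response.json()` call is shown here. The helper name `response_ok` and the 500-character limit are illustrative assumptions, not part of the original code; it only relies on the `status_code`, `url` and `text` attributes that a `requests.Response` provides.

```python
import logging

logger = logging.getLogger(__name__)


def response_ok(response, max_body_chars=500):
    """Return True for a non-error response; otherwise log its details and return False.

    `response` is any object with `status_code`, `url` and `text` attributes,
    such as a `requests.Response`. The name and signature are hypothetical.
    """
    if response.status_code < 400:
        return True
    # Log the (truncated) body: error responses often explain what was rejected.
    logger.error(
        "Request to %s failed with HTTP %s; body: %s",
        response.url,
        response.status_code,
        response.text[:max_body_chars],
    )
    return False
```

In `recognize_fish`, this could be used as `if not response_ok(response): err("Fish recognition request failed.")` right after the `requests.get(...)` call, so the upstream error message is visible before any JSON parsing is attempted.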

P.S. This feels like it's getting a little long. If putting this much code on Pastebin is more appropriate, I'm happy to edit.


Solution

  • I was able to fix the problem, though I am not sure I truly understand its cause.

    The solution was to give the Fishial code a "real", persistent file path instead of a temp file.

    My revised endpoint is below, and it works correctly:

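    # Note: this version also needs `from fastapi import Query`, and PredictionResult is
    # assumed to have been updated to declare the fields used below
    # (ml_prediction, elapsed_time_seconds, path).
    @app.post("/predict/", response_model=PredictionResult)
    async def predict_fish_length(file: UploadFile = File(...), get_species: bool = Query(False)):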
        try:
            # capture the start of the process so we can track duration
            start_time = time.time()
    
            # Create the uploads folder if it doesn't exist
            uploads_folder = "uploads"
            os.makedirs(uploads_folder, exist_ok=True)
    
            # Generate a unique file name
            file_name = f"{int(time.time())}_{file.filename}"
            file_path = os.path.join(uploads_folder, file_name)
    
            # Save the uploaded file to the uploads folder
            with open(file_path, "wb") as buffer:
                content = await file.read()
                buffer.write(content)
    
            prediction = process_one_image(file_path, get_species=get_species)
            logging.debug(f"Prediction result: {prediction}")
    
            if not isinstance(prediction, dict):
                raise ValueError("Invalid prediction format. Expected a dictionary.")
    
            required_keys = ["ml_prediction", "eye_ratio_len_est"]
            if get_species:
                required_keys.append("species")
    
            for key in required_keys:
                if key not in prediction:
                    raise KeyError(f"Missing required key '{key}' in the prediction dictionary.")
    
            end_time = time.time()  # Record the end time
            elapsed_time = end_time - start_time  # Calculate the elapsed time
    
            return PredictionResult(
                ml_prediction=prediction["ml_prediction"],
                eyeball_estimate=prediction["eye_ratio_len_est"],
                species=prediction.get("species", [None])[0],
                elapsed_time_seconds=elapsed_time,
                path=prediction["image_path"]
            )
    
        except Exception as e:
            error_message = f"An error occurred: {str(e)}"
            logging.error(error_message)
            raise HTTPException(status_code=500, detail=error_message) from e
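
The answer does not establish why the temp file failed. One possible explanation, which nothing in the question confirms, is the missing file extension: `recognize_fish` sends `mimetypes.guess_type(file_path)[0]` as the `content_type`, and `tempfile.NamedTemporaryFile(delete=False)` creates a path with no extension, for which `guess_type` returns `None`. The file saved under `uploads/` keeps the original name and therefore its extension. The sketch below illustrates the difference and shows a temp file that keeps the extension; `save_upload_to_temp` is a hypothetical helper, not code from the original application.

```python
import mimetypes
import os
import tempfile


def save_upload_to_temp(data: bytes, original_filename: str) -> str:
    """Write `data` to a uniquely named temp file that keeps the upload's extension."""
    # Only the extension of the client-supplied name is reused; tempfile picks the rest.
    suffix = os.path.splitext(os.path.basename(original_filename))[1]
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
        return tmp.name


if __name__ == "__main__":
    bare = tempfile.NamedTemporaryFile(delete=False)  # what the original endpoint did
    bare.close()
    kept = save_upload_to_temp(b"not a real image", "fish.jpg")
    print(mimetypes.guess_type(bare.name)[0])  # None: no extension to go on
    print(mimetypes.guess_type(kept)[0])       # image/jpeg
    os.unlink(bare.name)
    os.unlink(kept)
```

If this is indeed the cause, passing a suffix would let the endpoint keep using temp files. Because `tempfile` generates the unique part of the name, it also avoids two uploads with the same filename colliding within the same second, which `int(time.time())` alone does not rule out.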