Tags: aws-lambda, amazon-cloudwatch, cloudwatch-alarms, aws-canaries

Deploying Python AWS Synthetics Canary via S3


I have a frustrating problem. I must be missing something obvious. This is my first canary, and I can't find any examples of using Python scripting for this via S3.

I am uploading a Python script to be used as the Lambda handler. I build the archive via a secondary process and deploy it via Terraform. As I note below, even though I provide Terraform with the S3 reference for the uploaded code, there appears to be no reference to it in either the canary or the underlying Lambda function.

This is the structure of the archive:

$ unzip -l application/function.zip 
Archive:  application/function.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
      776  2025-06-19 21:53   canary.py
      119  2025-05-21 21:30   README.txt
---------                     -------
      895                     2 files

This is the content of the Python module. It's just a skeleton while I debug the deployment:

import logging
import json

_LOGGER = logging.getLogger(__name__)

def _get_pretty_json(data):
    # default=str keeps non-JSON-serializable values (such as the Lambda
    # context object) from raising a TypeError.
    return json.dumps(
        data,
        sort_keys=True,
        indent=4,
        separators=(',', ': '),
        default=str)

def handler(event, context):
    _LOGGER.debug("EVENT:\n{}\n\nCONTEXT:\n{}".format(
        _get_pretty_json(event), _get_pretty_json(context)))

This is the Terraform scripting:

module "assets_s3" {
  source = "../../datasource/s3"
}

resource "aws_s3_object" "function_archive_upload" {
  bucket = module.assets_s3.canary_reports_bucket_id
  key    = "functions/${data.aws_caller_identity.current.account_id}/${data.aws_region.current.name}/${var.name}/function.zip"

  source = "${path.module}/../../application/function.zip"
  etag   = filemd5("${path.module}/../../application/function.zip")
}

resource "aws_synthetics_canary" "canary_api_calls" {
  name = var.name

  artifact_s3_location = "s3://${module.assets_s3.canary_reports_bucket_id}/results/${data.aws_caller_identity.current.account_id}/${data.aws_region.current.name}/${var.name}"

  execution_role_arn = data.aws_iam_role.role.arn
  runtime_version    = var.runtime_version

  handler = "canary.handler"

  s3_bucket = module.assets_s3.canary_reports_bucket_id
  s3_key    = aws_s3_object.function_archive_upload.id

  schedule {
    expression          = var.schedule
    duration_in_seconds = 0
  }

  vpc_config {
    subnet_ids         = var.subnet_ids
    security_group_ids = [var.security_group_id]
  }

  depends_on = [
    aws_s3_object.function_archive_upload
  ]
}
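As an aside on the `etag = filemd5(...)` argument above: that pair is what makes Terraform re-upload the object when the archive changes, since `aws_s3_object` compares the local MD5 against the object's ETag, and for plain (non-multipart, non-KMS) uploads the ETag is exactly the MD5 of the contents. The same check can be reproduced in Python to verify that the uploaded object matches the local file (a sketch; comparing against the `head-object` output is left to the reader):

```python
import hashlib

def file_md5(path):
    """Hex MD5 digest of a file's contents, matching Terraform's filemd5().

    For plain (non-multipart, non-KMS) S3 uploads, this is the value S3
    reports as the object's ETag, which is how aws_s3_object detects that
    a re-upload is needed.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare this value against the ETag from `aws s3api head-object`.
```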

After I apply the Terraform, the S3 bucket has the archive (naturally, since there were no failures):

$ aws --profile dashlx_dev s3 ls s3://dashlx-canary/functions/086261225885/us-west-2/dustin-test-canary/
2025-06-20 01:38:51        833 function.zip

Yet, the "Canary script" section in the canary UI says:

Error: Unable to load the canary script. The canary handler may not match to the script file, or the size of the file exceeds the limit (5MB).

The canary bootstrap will fail with:

2025-06-20T05:39:20Z ERROR: Canary error:
Traceback (most recent call last):
  File "/var/task/index.py", line 89, in handle_canary
      raise ModuleNotFoundError('No module named: %s' % file_name)
ModuleNotFoundError: No module named: canary

I updated the bootstrap to show the contents of the current path:

...
if customer_canary_handler is not None:
    # Assuming handler format: fileName.functionName
    file_name, function_name = customer_canary_handler.split(".")
    logger.info("Customer canary entry file name: %s" % file_name)
    logger.info("Customer canary entry function name: %s" % function_name)

logger.info("Source path: {}".format(constants.PYTHON_SRC_PATH))

for path, folder, files in os.walk(constants.PYTHON_SRC_PATH):
    for filename in files:
        logger.info("FILE: [{}] [{}]".format(path, filename))

absolute_file_path = constants.PYTHON_SRC_PATH + file_name + ".py"
# Call customer's execution handler
# Canary file is located under /opt/python/
...

The output of that is:

Customer canary entry file name: canary
Customer canary entry function name: handler
Source path: /opt/python/
FILE: [/opt/python/aws_synthetics] [THIRD_PARTY_LICENSES.zip]
FILE: [/opt/python/aws_synthetics/common] [__init__.py]
FILE: [/opt/python/aws_synthetics/common/har_parser] [README.md]
FILE: [/opt/python/aws_synthetics/core] [__init__.py]
FILE: [/opt/python/aws_synthetics/reports] [__init__.py]
FILE: [/opt/python/aws_synthetics/selenium] [__init__.py]
FILE: [/opt/python/lib] [chromedriver]
FILE: [/opt/python/lib/chromium] [aws.tar.br]
FILE: [/opt/python/lib/python3.11/site-packages] [_brotli.cpython-311-x86_64-linux-gnu.so]
FILE: [/opt/python/lib/python3.11/site-packages] [brotli.py]
FILE: [/opt/python/lib/python3.11/site-packages] [typing_extensions.py]
FILE: [/opt/python/lib/python3.11/site-packages/Brotli-1.1.0.dist-info] [INSTALLER]
FILE: [/opt/python/lib/python3.11/site-packages/PyAmazonCACerts-1.0-py3.11.egg-info] [PKG-INFO]
FILE: [/opt/python/lib/python3.11/site-packages/amazoncerts] [__init__.py]
FILE: [/opt/python/lib/python3.11/site-packages/certifi] [__init__.py]
FILE: [/opt/python/lib/python3.11/site-packages/certifi-2024.7.4.dist-info] [INSTALLER]
FILE: [/opt/python/lib/python3.11/site-packages/selenium] [__init__.py]
FILE: [/opt/python/lib/python3.11/site-packages/selenium-4.21.0.dist-info] [INSTALLER]
FILE: [/opt/python/lib/python3.11/site-packages/trio] [__init__.py]
FILE: [/opt/python/lib/python3.11/site-packages/trio-0.24.0.dist-info] [INSTALLER]
FILE: [/opt/python/lib/python3.11/site-packages/trio_websocket] [__init__.py]
FILE: [/opt/python/lib/python3.11/site-packages/trio_websocket-0.11.1.dist-info] [INSTALLER]
FILE: [/opt/python/lib/python3.11/site-packages/typing_extensions-4.12.2.dist-info] [INSTALLER]
FILE: [/opt/python/lib/python3.11/site-packages/urllib3] [__init__.py]
Canary error:
Traceback (most recent call last):
  File "/var/task/index.py", line 90, in handle_canary
    raise ModuleNotFoundError('No module named: %s' % file_name)
ModuleNotFoundError: No module named: canary

For succinctness, I've removed the log prefixes from each line, as well as all but one file from each package, to show emphatically that my module is nowhere to be found.
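The resolution step the bootstrap performs can be reproduced locally to make the failure concrete. This is a sketch of the same logic as in index.py above (the handler string and source path are stand-ins for the real values):

```python
import os
import tempfile

def resolve_canary_module(handler, src_path):
    # Mimic the bootstrap: split "fileName.functionName" and require that
    # fileName.py exists directly under the source path (/opt/python/ in
    # the real runtime).
    file_name, function_name = handler.split(".")
    module_path = os.path.join(src_path, file_name + ".py")
    if not os.path.isfile(module_path):
        raise ModuleNotFoundError("No module named: %s" % file_name)
    return module_path, function_name

with tempfile.TemporaryDirectory() as src_path:
    # Nothing under the source path yet: same failure as the canary logs.
    try:
        resolve_canary_module("canary.handler", src_path)
    except ModuleNotFoundError as exc:
        print(exc)  # No module named: canary

    # Once canary.py sits directly under the source path, it resolves.
    open(os.path.join(src_path, "canary.py"), "w").close()
    module_path, function_name = resolve_canary_module("canary.handler", src_path)
    print(function_name)  # handler
```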

What could the problem possibly be? We've confirmed that the file got uploaded above, and the Canary definition states the handler name that we provided:

[Screenshot: canary handler name]

The Canary bootstrap clearly expects a filename and function name:

file_name, function_name = customer_canary_handler.split(".")
logger.info("Customer canary entry file name: %s" % file_name)
logger.info("Customer canary entry function name: %s" % function_name)

Its output matches what we've uploaded:

Customer canary entry file name: canary
Customer canary entry function name: handler

I think it's worth mentioning how odd this arrangement is: we provide the handler to the canary while provisioning, yet the canary points to Lambda, the Lambda points to its own bootstrap code, and that bootstrap expects the handler to already be in-context (which it isn't). What is supposed to be acquiring and expanding the archive? The event sent to the bootstrap code doesn't even reference it (just where the results are supposed to be pushed):

{
    "activeTracing": false,
    "artifactS3Location":
    {
        "s3Bucket": "XYZ-canary",
        "s3Key": "results/086261225885/us-west-2/dustin-test-canary/canary/us-west-2/dustin-test-canary/2025/06/20/06/03-16-122-dryrun"
    },
    "canaryName": "dustin-test-canary",
    "canaryRunId": "47c3b044-543b-4480-b700-a24e0f623de6",
    "canaryRunStartTime": 1750399405674,
    "customerCanaryCodeLocation": "arn:aws:lambda:us-west-2:086261225885:layer:cwsyn-dustin-test-canary-3b1266be-beae-4414-a495-a8a87c43bd22:6",
    "customerCanaryHandlerName": "canary.handler",
    "dryRunId": "d128b9d6-e35d-4b00-8add-70a8bd751c09",
    "invocationTime": 1750399396122,
    "logContextMap":
    {
        "canaryRunId": "47c3b044-543b-4480-b700-a24e0f623de6",
        "dryRunId": "d128b9d6-e35d-4b00-8add-70a8bd751c09"
    },
    "runtimeVersion": "syn-python-selenium-6.0",
    "s3BaseFilePath": "XYZ-canary/results/086261225885/us-west-2/dustin-test-canary"
}

I'd appreciate any wisdom from people who have experience with these.

I actually started this process by uploading the function directly to the canary (rather than via S3), but got an identical result no matter what I did. Since direct upload has such a restrictive size limitation, and I'll inevitably need libraries that will force us onto S3 eventually, I switched to it now in the hope that this problem would go away. It hasn't.

It's worth mentioning that this code started with the example at https://github.com/aws-samples/cloudwatch-synthetics-canary-terraform .


Solution

  • Resolved:

    A note on debugging:

    We effectively debugged this by manually creating a canary and comparing the settings, both in the UI and as dumped at the CLI (via a set of aliases and macros for exploring them). After seeing the manually-created canary work with directly-embedded code, we uploaded our existing package (the one we'd been having trouble with, above) directly to that canary, and it suddenly worked, confirming that there was nothing wrong with the archive itself. We also debugged with and without the blueprint (bootstrap) routines; those extra wrappers added a layer of complexity.

    Important, final note on updating Lambda bootstrap/blueprint code:

    The canary will trigger the bootstrap/blueprint function (e.g. Python/Selenium) in Lambda, and that code will invoke the function in the archive that you're storing in S3.
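    One detail that clears up the "what acquires and expands the archive" confusion from earlier: the customerCanaryCodeLocation in the invocation event is a Lambda layer-version ARN. The service publishes the uploaded zip as a layer attached to the underlying function, and Lambda extracts layers under /opt, which is why the bootstrap scans /opt/python/ rather than fetching anything from S3 at run time. A small, purely illustrative helper for picking such an ARN apart:

```python
def parse_layer_arn(arn):
    # Lambda layer-version ARNs have the shape:
    #   arn:aws:lambda:<region>:<account>:layer:<name>:<version>
    parts = arn.split(":")
    if len(parts) != 8 or parts[5] != "layer":
        raise ValueError("not a layer-version ARN: %s" % arn)
    return {
        "region": parts[3],
        "account": parts[4],
        "layer_name": parts[6],
        "version": int(parts[7]),
    }

info = parse_layer_arn(
    "arn:aws:lambda:us-west-2:086261225885:layer:"
    "cwsyn-dustin-test-canary-3b1266be-beae-4414-a495-a8a87c43bd22:6")
print(info["layer_name"], info["version"])
```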

    When you push an updated S3 canary function via Terraform, there will be downstream effects on the canary, the S3 object versions, the Lambda function versions, and the Lambda layers. However, if you touch the bootstrap code itself (as I did, to add verbosity), you will increment the version of the Lambda function, but the canary will still point to the old one.

    Over time, and through lots of refreshes and edits, I was somehow able to get the canary to pick up later versions of the Lambda function, but there appears to be no straightforward way of doing this via the UI or the CLI. That said, to revert any changes to the blueprint, just switch to a different version of the blueprint from the canary-edit screen and then switch back.
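    For reference, the packaging detail that matters here: per the CloudWatch Synthetics documentation, a Python canary script uploaded via zip must sit under a top-level python/ folder so that it extracts to /opt/python/<file>.py, where the bootstrap looks for it. A minimal sketch of building the archive that way (the script body and file name are placeholders):

```python
import io
import zipfile

def build_canary_archive(script_source, script_name="canary.py"):
    # Place the script under a top-level "python/" folder so the layer
    # extracts it to /opt/python/<script_name>, where the syn-python
    # bootstrap expects to find it.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("python/" + script_name, script_source)
    return buf.getvalue()

archive = build_canary_archive("def handler(event, context):\n    pass\n")
with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    print(zf.namelist())  # ['python/canary.py']
```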