
AWS WebSocket API with custom domain name not working in all stages?


I set up a WebSocket API with the AWS API Gateway service, and it's working fine for the staging stage. I've also set up a custom domain name for the WebSocket API, with API mappings for its stages.

I've now deployed the WebSocket API to the production stage as well, but I am having issues. Connecting from a client through my custom domain name, without a path suffix (as specified in the mapping for production), works perfectly. But when I send a postToConnection call from my server-side SDK to a specific WebSocket connection ID, the call fails with the following message from AWS:

AccessDeniedException (client): User: XXX is not authorized to perform: execute-api:Invoke on resource: arn:aws:execute-api:YYY/production/POST/@connections/@connections/LLDsmeF5ZicCJ8w%3D

This already seemed odd, because the permission policy of the IAM role (which is properly assumed; I verified this) is:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "execute-api:ManageConnections"
            ],
            "Resource": [
                "arn:aws:execute-api:AWSREGION:AWSACCOUNTID:APIID/production/DELETE/@connections/*",
                "arn:aws:execute-api:AWSREGION:AWSACCOUNTID:APIID/production/POST/@connections/*"
            ]
        }
    ]
}

For the staging stage, the permission policy attached to the IAM role used there is:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "execute-api:ManageConnections"
            ],
            "Resource": [
                "arn:aws:execute-api:AWSREGION:AWSACCOUNTID:APIID/staging/DELETE/@connections/*",
                "arn:aws:execute-api:AWSREGION:AWSACCOUNTID:APIID/staging/POST/@connections/*"
            ]
        }
    ]
}
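The two policies differ only in the stage name, so generating them from one template avoids copy-and-paste drift between stages. A minimal Python sketch, assuming the placeholder values used in the policies above; the function name `manage_connections_policy` is hypothetical and not part of any AWS SDK:

```python
import json


def manage_connections_policy(region, account_id, api_id, stage,
                              methods=("DELETE", "POST")):
    """Build an IAM policy document allowing @connections calls for one stage.

    Hypothetical helper: it only assembles the same JSON structure shown above.
    """
    resources = [
        f"arn:aws:execute-api:{region}:{account_id}:{api_id}/{stage}/{method}/@connections/*"
        for method in methods
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["execute-api:ManageConnections"],
                "Resource": resources,
            }
        ],
    }


def policy_json(region, account_id, api_id, stage):
    """Serialize the policy so it can be pasted into the IAM console or a template."""
    return json.dumps(
        manage_connections_policy(region, account_id, api_id, stage), indent=4
    )
```

Calling `policy_json("AWSREGION", "AWSACCOUNTID", "APIID", "production")` should reproduce the production policy, and swapping in `"staging"` the staging one.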

And for staging it works, so the execute-api:Invoke permission should not be required.

I then added the execute-api:Invoke permission to the policy anyway, and now get a different error when calling postToConnection (again only for the production stage):

NotFoundException (client): No method found matching route @connections/@connections/LLDsmeF5ZicCJ8w= for http method POST.

What on earth is going on? The CloudWatch Logs do not provide any more insight: they show exactly the same (successful) log entries up to the point where postToConnection is sent.


Solution

  • This GitHub answer solved my issue. It turns out that sending the postToConnection request to the staging API endpoint:

    https://APIID.execute-api.AWSREGION.amazonaws.com/staging/@connections

    worked properly, while sending it to

    https://APIID.execute-api.AWSREGION.amazonaws.com/production/@connections

    did not, and resulted in the access-denied error above. The @connections/@connections part in the error message made me think that one @connections path segment was being supplied twice. So, without changing the custom domain name configuration or its mappings, I changed my server-side SDK code (for staging as well) to send the postToConnection request to:

    https://APIID.execute-api.AWSREGION.amazonaws.com/staging/

    instead of:

    https://APIID.execute-api.AWSREGION.amazonaws.com/staging/@connections

    and, likewise, to send the production postToConnection request to:

    https://APIID.execute-api.AWSREGION.amazonaws.com/production/

    instead of:

    https://APIID.execute-api.AWSREGION.amazonaws.com/production/@connections

    and that now works fully, including for production.

    I still don't understand why the request worked for staging even with the @connections suffix. In any case, the SDK and CLI documentation both say to send the postToConnection call to the base stage URL, without the @connections suffix, so I assume the extra suffix was the cause.
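The doubled @connections in both error messages is consistent with the SDK appending /@connections/{connectionId} to whatever endpoint it is given. The following Python sketch illustrates that path composition and a defensive way to strip the suffix before creating a client. The helper names (`stage_endpoint`, `strip_connections_suffix`, `post_to_connection_url`, `make_management_client`) are hypothetical. The original post does not say which SDK was used, so the boto3 client is shown only as one possible way to apply the fix.

```python
from urllib.parse import quote

CONNECTIONS_SUFFIX = "/@connections"


def stage_endpoint(api_id: str, region: str, stage: str) -> str:
    """Base URL of a stage, without any @connections suffix."""
    return f"https://{api_id}.execute-api.{region}.amazonaws.com/{stage}"


def strip_connections_suffix(endpoint: str) -> str:
    """Drop trailing slashes and a trailing '/@connections' from an endpoint."""
    endpoint = endpoint.rstrip("/")
    if endpoint.endswith(CONNECTIONS_SUFFIX):
        endpoint = endpoint[: -len(CONNECTIONS_SUFFIX)]
    return endpoint


def post_to_connection_url(endpoint: str, connection_id: str) -> str:
    """Approximate the URL requested by PostToConnection for a given endpoint.

    The operation is POST /@connections/{connectionId}, with the connection ID
    URL-encoded, relative to the configured endpoint.
    """
    encoded_id = quote(connection_id, safe="")
    return f"{endpoint.rstrip('/')}{CONNECTIONS_SUFFIX}/{encoded_id}"


def make_management_client(endpoint: str):
    """Create a boto3 management client pointed at the bare stage URL.

    boto3 is a third-party package, imported lazily so the helpers above
    work without it.
    """
    import boto3

    return boto3.client(
        "apigatewaymanagementapi",
        endpoint_url=strip_connections_suffix(endpoint),
    )
```

With the suffix left in place, `post_to_connection_url(stage_endpoint("APIID", "AWSREGION", "production") + "/@connections", "LLDsmeF5ZicCJ8w=")` ends in `/production/@connections/@connections/LLDsmeF5ZicCJ8w%3D`, which matches the resource path in the AccessDeniedException. A client from `make_management_client(...)` would then be used as `client.post_to_connection(ConnectionId=..., Data=b"...")`.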