Tags: go, cassandra, cql, amazon-keyspaces

`gocql: no hosts available in the pool` with AWS Keyspaces


I'm using the gocql package to connect to AWS Keyspaces. My approach works about 95% of the time, but every few days I get this error: gocql: no hosts available in the pool. I've followed the official documentation on how to set up a connection and read from and write to the keyspace, but I haven't figured out how to avoid this issue. The code runs in a Lambda, and the session is initialized like this:

var session *gocql.Session

func handleRequest(ctx context.Context, req events.APIGatewayProxyRequest) (events.APIGatewayProxyResponse, error) {...}

func main() {
    cql, err := cassandra.GetKeyspacesSession()
    if err != nil {
        fmt.Printf("ERROR: could not initialize the Keyspaces CQL session: %v\n", err)
        panic(err)
    }
    session = cql
    defer session.Close()

    lambda.Start(handleRequest)
}

For context, this is how I initialize my gocql client:

func GetKeyspacesSession() (*gocql.Session, error) {
    // this works, never has an issue
    config, err := secrets.LoadSSLCertificate(context.Background())
    if err != nil {
        return nil, err
    }

    cluster := gocql.NewCluster(os.Getenv("KEYSPACES_ENDPOINT"))
    port, err := strconv.Atoi(os.Getenv("KEYSPACES_PORT"))
    if err != nil {
        fmt.Printf("Invalid port number: %v\n", os.Getenv("KEYSPACES_PORT"))
        return nil, err
    }
    cluster.Port = port

    auth := sigv4.NewAwsAuthenticator()
    auth.Region = os.Getenv("AWS_REGION")
    auth.AccessKeyId = os.Getenv("AWS_ACCESS_KEY_ID")
    auth.SecretAccessKey = os.Getenv("AWS_SECRET_ACCESS_KEY")

    cluster.Authenticator = auth

    cluster.SslOpts = &gocql.SslOptions{Config: config}
    cluster.Consistency = gocql.LocalQuorum
    cluster.DisableInitialHostLookup = true
    cluster.ReconnectionPolicy = ReconnectionPolicy{MaxRetries: 4, Delay: time.Duration(400) * time.Millisecond}
    debugCluster(cluster)

    session, err := cluster.CreateSession()
    if err != nil {
        fmt.Printf("Error creating session: %v\n", err)
    }
    return session, err
}

I followed the official AWS documentation for authenticating with the sigv4 plugin, as well as tips from this GitHub issue. Each change seems to almost fix it, but the error keeps coming back. What can I do to avoid this? Do I need to initialize the session inside the function handler rather than in main()? The documentation has been inconclusive.


Solution

  • I haven't yet taken the time to understand why this works, but initializing the client in the handler instead of in main() has prevented the issue. It hasn't come up in the last couple of months.

    Before

    var session *gocql.Session
    
    func handleRequest(ctx context.Context, req events.APIGatewayProxyRequest) (events.APIGatewayProxyResponse, error) {...}
    
    func main() {
        cql, err := cassandra.GetKeyspacesSession()
        if err != nil {
            panic(err)
        }
        session = cql
        defer session.Close()
    
        lambda.Start(handleRequest)
    }
    

    After

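    func handleRequest(ctx context.Context, req events.APIGatewayProxyRequest) (events.APIGatewayProxyResponse, error) {
        // Open a fresh session on every invocation instead of reusing one created in main().
        cql, err := cassandra.GetKeyspacesSession()
        if err != nil {
            panic(err)
        }
        defer cql.Close()

        // ... use cql to serve the request; the response below is only a placeholder ...
        return events.APIGatewayProxyResponse{StatusCode: 200}, nil
    }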
    
    func main() {
        lambda.Start(handleRequest)
    }