apache-spark, cassandra, spark-cassandra-connector, amazon-keyspaces

Problem reading and writing AWS Keyspaces with the Spark Cassandra connector


I'm trying to read and write some data on AWS Keyspaces, but the following error message appears.

[image: screenshot of the error message]

Versions: Spark: 2.4.6 Cassandra connector: 2.5.2 Scala: 2.11.10

The problem also occurs with newer and older versions.


Solution

  • This error is due to the connector not being able to see the system.peers table. Spark requires the peers table info to get the token information.

    1. Check that you have access to read the system tables by running the query below. If you are using a public endpoint you should see 9 peers, and if you are using a VPC endpoint you should see one for each availability zone.
      SELECT * FROM system.peers

    If you are using a VPC endpoint, check that you have set up the right permissions.

     {
        "Sid": "ListVPCEndpoints",
        "Effect": "Allow",
        "Action": [
           "ec2:DescribeNetworkInterfaces",
           "ec2:DescribeVpcEndpoints"
        ],
        "Resource": "*"
     }
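
    The statement above is only a fragment. In a complete IAM policy document it would sit inside a Statement array. A minimal sketch, assuming this is the only statement in the policy (the Version and Statement wrapper is standard IAM policy structure and is not part of the original answer):

```json
{
   "Version": "2012-10-17",
   "Statement": [
      {
         "Sid": "ListVPCEndpoints",
         "Effect": "Allow",
         "Action": [
            "ec2:DescribeNetworkInterfaces",
            "ec2:DescribeVpcEndpoints"
         ],
         "Resource": "*"
      }
   ]
}
```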
    

    The following example shows how to use Spark/Glue to export Keyspaces data to S3: https://github.com/aws-samples/amazon-keyspaces-examples/tree/main/scala/datastax-v4/aws-glue/export-to-s3
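
    The system.peers check from step 1 can also be run from inside the Spark application, which confirms that the job's own endpoint and credentials can see the table. This is a minimal sketch, not a definitive implementation. It assumes the spark.cassandra.* connection settings (host, port, SSL, credentials) are already supplied to the job, for example via spark-submit --conf. The object name PeersCheck and the app name are hypothetical.

```scala
import com.datastax.spark.connector.cql.CassandraConnector
import org.apache.spark.sql.SparkSession

object PeersCheck {
  def main(args: Array[String]): Unit = {
    // Assumption: connection settings come from the submitted Spark configuration.
    val spark = SparkSession.builder()
      .appName("keyspaces-peers-check") // hypothetical app name
      .getOrCreate()

    // Reuse the connector's own session so the query runs with the same
    // endpoint and permissions that the Spark job itself will use.
    val peerCount = CassandraConnector(spark.sparkContext.getConf).withSessionDo { session =>
      session.execute("SELECT * FROM system.peers").all().size()
    }

    println(s"system.peers returned $peerCount row(s)")
    spark.stop()
  }
}
```

    Compare the printed count with the expected values from step 1: 9 for a public endpoint, or one per availability zone for a VPC endpoint.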