I have an AWS Glue job that reads data from S3, transforms it, and loads it into multiple Redshift tables.
Glue Job Details:
Type: Spark
This job runs : A new script to be authored by you
Worker type: Standard
Maximum Capacity: 5

Connection Details:
The Glue job uses "Data catalog > connection" to connect to Redshift
Connection type: JDBC

Sometimes the Glue job fails with the error below:
The specified subnet does not have enough free addresses to satisfy the request (Service: AmazonEC2; Status Code: 400; Error Code: InsufficientFreeAddressesInSubnet)
Is there a way to calculate the number of IP addresses the Glue job requires based on the above configuration, or any other way to check, so that I can schedule the jobs in sequence?
The number of Data Processing Units (DPUs), or workers, determines how many IP addresses the job needs: Glue creates one elastic network interface, and therefore consumes roughly one private IP address, per DPU. If the subnet doesn't have enough free addresses, reduce the DPU count. The subnet you have configured on the Glue connection is the one that gets used; if there is another subnet in the same VPC with more free addresses, you can switch to it, as long as your Redshift cluster is reachable from that subnet.
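As an alternative to a fixed schedule, you can gate each job run on the subnet's free-address count. The sketch below (using boto3; the job name, subnet ID, and the size of the safety buffer are assumptions, not values from your setup) compares `AvailableIpAddressCount` against the worker count before starting the job:

```python
def ips_required(num_workers: int, buffer: int = 3) -> int:
    """Rough estimate: Glue needs about one private IP (one ENI) per
    worker/DPU; a small buffer covers the driver and any other ENIs
    that may appear in the subnet between the check and the launch."""
    return num_workers + buffer


def start_if_capacity(job_name: str, subnet_id: str, num_workers: int) -> bool:
    """Start the Glue job only if the subnet has enough free IPs.

    Returns True if the job was started, False if it was skipped.
    """
    import boto3  # imported here so the pure helper above needs no AWS setup

    ec2 = boto3.client("ec2")
    subnet = ec2.describe_subnets(SubnetIds=[subnet_id])["Subnets"][0]
    free = subnet["AvailableIpAddressCount"]

    if free < ips_required(num_workers):
        return False  # not enough addresses right now; retry later

    boto3.client("glue").start_job_run(JobName=job_name)
    return True


# Hypothetical usage, e.g. from a scheduler or a Lambda function:
# start_if_capacity("my-glue-job", "subnet-0abc123", num_workers=5)
```

Note the check is only advisory: another job could claim the addresses between the `describe_subnets` call and the launch, which is why a buffer and a retry loop are still worth having.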