I'm having S3 endpoint grief. When my instances initialize, they cannot install Docker. Details:
I have ASG instances sitting in a VPC with public and private subnets. Appropriate routing and EIP/NAT are all stitched up. Instances in the private subnets have outbound 0.0.0.0/0 routed to the NAT in their respective public subnets. The NACLs for the public subnets allow internet traffic in and out; the NACLs around the private subnets allow traffic from the public subnets in and out, traffic out to the internet, and traffic from the S3 CIDRs in and out. I want it pretty locked down.
Amazon EC2 instance can't update or use yum
another s3 struggle with resolution:
I have tried:
S3Endpoint:
  Type: 'AWS::EC2::VPCEndpoint'
  Properties:
    PolicyDocument:
      Version: 2012-10-17
      Statement:
        - Effect: Allow
          Principal: '*'
          Action:
            - 's3:GetObject'
          Resource:
            - 'arn:aws:s3:::prod-ap-southeast-2-starport-layer-bucket/*'
            - 'arn:aws:s3:::packages.*.amazonaws.com/*'
            - 'arn:aws:s3:::repo.*.amazonaws.com/*'
            - 'arn:aws:s3:::amazonlinux-2-repos-ap-southeast-2.s3.ap-southeast-2.amazonaws.com/*'
            - 'arn:aws:s3:::amazonlinux.*.amazonaws.com/*'
            - 'arn:aws:s3:::*.amazonaws.com'
            - 'arn:aws:s3:::*.amazonaws.com/*'
            - 'arn:aws:s3:::*.ap-southeast-2.amazonaws.com/*'
            - 'arn:aws:s3:::*.ap-southeast-2.amazonaws.com/'
            - 'arn:aws:s3:::*repos.ap-southeast-2-.amazonaws.com'
            - 'arn:aws:s3:::*repos.ap-southeast-2.amazonaws.com/*'
            - 'arn:aws:s3:::repo.ap-southeast-2-.amazonaws.com'
            - 'arn:aws:s3:::repo.ap-southeast-2.amazonaws.com/*'
    RouteTableIds:
      - !Ref PrivateRouteTableA
      - !Ref PrivateRouteTableB
    ServiceName: !Sub 'com.amazonaws.${AWS::Region}.s3'
    VpcId: !Ref BasicVpc
    VpcEndpointType: Gateway
(As you can see, very desperate.) The first rule is required for the ECR interface endpoints to pull the image layers from S3; all of the others are attempts to reach the amazon-linux-extras repos.
Below is the behavior that happens on initialization, which I reproduced by connecting with Session Manager through the SSM endpoint:
https://aws.amazon.com/premiumsupport/knowledge-center/connect-s3-vpc-endpoint/
I cannot yum install or update:
[root@ip-10-0-3-120 bin]# yum install docker -y
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
Could not retrieve mirrorlist https://amazonlinux-2-repos-ap-southeast-2.s3.ap-southeast-2.amazonaws.com/2/core/latest/x86_64/mirror.list error was
14: HTTPS Error 403 - Forbidden

One of the configured repositories failed (Unknown),
and yum doesn't have enough cached data to continue. At this point the only
safe thing yum can do is fail. There are a few ways to work "fix" this:
1. Contact the upstream for the repository and get them to fix the problem.
2. Reconfigure the baseurl/etc. for the repository, to point to a working
upstream. This is most often useful if you are using a newer
distribution release than is supported by the repository (and the
packages for the previous distribution release still work).
3. Run the command with the repository temporarily disabled
yum --disablerepo=<repoid> ...
4. Disable the repository permanently, so yum won't use it by default. Yum
will then just ignore the repository until you permanently enable it
again or use --enablerepo for temporary usage:
yum-config-manager --disable <repoid>
or
subscription-manager repos --disable=<repoid>
5. Configure the failing repository to be skipped, if it is unavailable.
Note that yum will try to contact the repo. when it runs most commands,
so will have to try and fail each time (and thus. yum will be be much
slower). If it is a very temporary problem though, this is often a nice
compromise:
yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true
Cannot find a valid baseurl for repo: amzn2-core/2/x86_64
and I cannot run:
amazon-linux-extras install docker
Catalog is not reachable. Try again later.
catalogs at https://amazonlinux-2-repos-ap-southeast-2.s3.ap-southeast-2.amazonaws.com/2/extras-catalog-x86_64-v2.json, https://amazonlinux-2-repos-ap-southeast-2.s3.ap-southeast-2.amazonaws.com/2/extras-catalog-x86_64.json
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/amazon_linux_extras/software_catalog.py", line 131, in fetch_new_catalog
    request = urlopen(url)
  File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib64/python2.7/urllib2.py", line 435, in open
    response = meth(req, response)
  File "/usr/lib64/python2.7/urllib2.py", line 548, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib64/python2.7/urllib2.py", line 473, in error
    return self._call_chain(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 556, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden
Any gotchas I've missed? I'm very stuck here. I am familiar with basic VPC networking, NACLs and VPC endpoints (the ones I've used, at least), and I have followed the troubleshooting guides (although I already had everything set up as outlined).
I feel the S3 endpoint policy is the problem here, OR the mirror list. Many thanks if you bothered to read all that! Thoughts?
By the looks of it, you are well aware of what you are trying to achieve. Even though you are saying that it is not the NACLs, I would check them one more time, as sometimes one can easily overlook something minor. Take into account the snippet below taken from this AWS troubleshooting article and make sure that you have the right S3 CIDRs in your rules for the respective region:
Make sure that the network ACLs associated with your EC2 instance's subnet allow the following:
- Egress on port 80 (HTTP) and 443 (HTTPS) to the Regional S3 service.
- Ingress on ephemeral TCP ports from the Regional S3 service. Ephemeral ports are 1024-65535.
The Regional S3 service is the CIDR for the subnet containing your S3 interface endpoint. Or, if you're using an S3 gateway, the Regional S3 service is the public IP CIDR for the S3 service. Network ACLs don't support prefix lists. To add the S3 CIDR to your network ACL, use 0.0.0.0/0 as the S3 CIDR. You can also add the actual S3 CIDRs into the ACL. However, keep in mind that the S3 CIDRs can change at any time.
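As a rough illustration of the quoted guidance, the private-subnet NACL rules could be sketched in CloudFormation as below. This is only a sketch under assumptions: PrivateNetworkAcl, the three entry names and the rule numbers are hypothetical and do not come from the question, and 0.0.0.0/0 is used as the S3 CIDR as the article suggests, since NACLs do not support prefix lists.

```yaml
# Sketch only. PrivateNetworkAcl is a hypothetical logical ID for the NACL
# attached to the private subnets; rule numbers are arbitrary placeholders.
PrivateNaclEgressHttp:
  Type: 'AWS::EC2::NetworkAclEntry'
  Properties:
    NetworkAclId: !Ref PrivateNetworkAcl
    RuleNumber: 100
    Protocol: 6            # TCP
    RuleAction: allow
    Egress: true           # outbound HTTP to S3
    CidrBlock: 0.0.0.0/0
    PortRange:
      From: 80
      To: 80
PrivateNaclEgressHttps:
  Type: 'AWS::EC2::NetworkAclEntry'
  Properties:
    NetworkAclId: !Ref PrivateNetworkAcl
    RuleNumber: 110
    Protocol: 6            # TCP
    RuleAction: allow
    Egress: true           # outbound HTTPS to S3
    CidrBlock: 0.0.0.0/0
    PortRange:
      From: 443
      To: 443
PrivateNaclIngressEphemeral:
  Type: 'AWS::EC2::NetworkAclEntry'
  Properties:
    NetworkAclId: !Ref PrivateNetworkAcl
    RuleNumber: 100
    Protocol: 6            # TCP
    RuleAction: allow
    Egress: false          # return traffic from S3 on ephemeral ports
    CidrBlock: 0.0.0.0/0
    PortRange:
      From: 1024
      To: 65535
```

Because NACLs are stateless, the ingress rule for ephemeral ports is needed even though the connection is initiated from the instance.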
Your S3 endpoint policy looks good to me on first look, but you are right that it is very likely that the policy or the endpoint configuration in general could be the cause, so I would re-check it one more time too.
One additional thing that I have observed before is that, depending on the AMI you use and your VPC settings (DHCP options set, DNS, etc.), the EC2 instance sometimes cannot properly set its default region in the yum config. Please check whether the files awsregion and awsdomain exist within the /etc/yum/vars directory and what they contain. In your use case, awsregion should contain:
$ cat /etc/yum/vars/awsregion
ap-southeast-2
You can check whether the DNS resolving on your instance is working properly with:
dig amazonlinux.ap-southeast-2.amazonaws.com
If DNS seems to be working fine, you can compare whether the IP in the output resides within the ranges you have allowed in your NACLs.
EDIT:
After a second look, this line will not match the repo bucket, because it uses the bucket's DNS hostname where an S3 ARN expects only the bucket name:
arn:aws:s3:::amazonlinux-2-repos-ap-southeast-2.s3.ap-southeast-2.amazonaws.com/*
According to the docs, it should be something like:
arn:aws:s3:::amazonlinux-2-repos-ap-southeast-2/*
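Putting that together, the endpoint policy could be trimmed to something like the fragment below. Treat it as a sketch, not a verified policy: it keeps only the two buckets that actually appear in the question (the ECR layer bucket and the repo bucket taken from the failing URLs), and it assumes nothing else on the instances needs S3 through this endpoint.

```yaml
# Sketch only: a replacement for the PolicyDocument of the S3Endpoint resource.
# Assumes these two buckets are the only ones the instances must reach.
PolicyDocument:
  Version: '2012-10-17'
  Statement:
    - Effect: Allow
      Principal: '*'
      Action:
        - 's3:GetObject'
      Resource:
        # ECR image layers (unchanged from the original policy)
        - 'arn:aws:s3:::prod-ap-southeast-2-starport-layer-bucket/*'
        # Amazon Linux 2 repos: bucket name only, without the
        # ".s3.ap-southeast-2.amazonaws.com" hostname suffix
        - 'arn:aws:s3:::amazonlinux-2-repos-ap-southeast-2/*'
```

Both the yum mirror list and the amazon-linux-extras catalog URLs in the error output are served from the amazonlinux-2-repos-ap-southeast-2 bucket, so a single ARN should cover both failures.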