eucalyptus

Euca 5.0 Ansible Console Task Failing


Background:

I can only get past the ansible console install/config tasks by adding --region localhost to every command that accepts that argument in /usr/share/eucalyptus-ansible/roles/cloud-post/tasks/console.yml.

Otherwise each sub task fails like this: ["euca-describe-images: error: connection error (('Connection aborted.', gaierror(-2, 'Name or service not known')))"]

Running the commands from that playbook directly on the euca server being configured gives the same result unless I specify --region localhost
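The gaierror(-2, 'Name or service not known') in that message is Python's socket.gaierror, raised when a hostname lookup fails. The following is a minimal sketch for checking which names resolve from the euca server. The helper name check_resolves and the injectable resolver parameter are illustrative assumptions and not part of euca2ools, and the exact endpoint hostname the tools look up is also an assumption here.

```python
import socket


def check_resolves(hostname, resolver=socket.getaddrinfo):
    """Return (True, sorted addresses) if hostname resolves, else (False, error text).

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so the
    lookup can be swapped out (hypothetical convenience, e.g. for offline checks).
    """
    try:
        results = resolver(hostname, None)
    except socket.gaierror as exc:
        # Same exception type that appears inside the euca-describe-images error.
        return False, "{}: {}".format(hostname, exc)
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    addresses = sorted({entry[4][0] for entry in results})
    return True, addresses
```

Calling check_resolves("ec2.cloud.lan.com") on the euca server (a service name under the cloud_system_dns_dnsdomain set in the inventory) and comparing it with check_resolves("localhost") should show whether the failure is a plain name lookup problem rather than something specific to the tools.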

Problem:

I'm stuck here: [cloud-post : update console route53 system domain for eucalyptus-cloud authentication]

Error: "euform-update-stack: error (ValidationError): No updates are to be performed.", "stderr_lines": ["euform-update-stack: error (ValidationError): No updates are to be performed."]

All services are running except ImagingBackend, which is Not Ready.

No instances are running according to euca-describe-instances

Images are available:

IMAGE   ami-5be483c81cf8bd65c   eucalyptus-console-image-5-0-823/eucalyptus-console-image-5-0-823.raw.manifest.xml  000216594841    available   private x86_64  machine             instance-store  hvm 
TAG image   ami-5be483c81cf8bd65c   type    eucalyptus-console-image
TAG image   ami-5be483c81cf8bd65c   version 5.0.823
IMAGE   ami-f31092ddb73e29af9   eucalyptus-service-image-v5.0.100/eucalyptus-service-image.raw.manifest.xml 000216594841    available   private x86_64  machine             instance-store  hvm 
TAG image   ami-f31092ddb73e29af9   provides    imaging,loadbalancing
TAG image   ami-f31092ddb73e29af9   type    eucalyptus-service-image
TAG image   ami-f31092ddb73e29af9   version 5.0.100

---
all:
  hosts:
    exp-euca.lan.com:
    exp-enc-[01:02].lan.com:

  vars:
    vpcmido_public_ip_range: "192.168.100.5-192.168.100.254"
    vpcmido_public_ip_cidr: "192.168.100.1/24"
    cloud_system_dns_dnsdomain: "cloud.lan.com"
    cloud_public_port: 443 
    eucalyptus_console_cloud_deploy: yes
    cloud_service_image_rpm: no
    cloud_properties:
      services.imaging.worker.ntp_server: "x.x.x.x"
      services.loadbalancing.worker.ntp_server: "x.x.x.x"


  children:
    cloud:
      hosts:
        exp-euca.lan.com:
    console:
      hosts:
        exp-euca.lan.com:
    node:
      hosts:
        exp-enc-[01:02].lan.com:

EDIT: Solved. Details are in the comments of the marked answer.


Solution

  • The name error most likely means that DNS for the domain cloud.lan.com is not being correctly delegated to your deployment. To test this, check if the nameserver is found:

    dig +short NS cloud.lan.com
    

    You should see "ns1.cloud.lan.com", and you should then be able to use that nameserver to resolve services, e.g.

    dig +short ec2.cloud.lan.com @ns1.cloud.lan.com
    

    This should return the IP address of the host running the compute service.

    The second error ("No updates are to be performed") is a bug in the ansible playbook that occurs when the stack is already present and up to date. To work around it, either update your playbook or delete the stack before running the playbook again. Depending on how far the playbook progressed, you may already have a script that does this:

    /usr/local/bin/console-manage-stack -a delete
    

    The related playbook change is https://github.com/AppScale/ats-deploy/pull/36
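As an illustration of the idea behind the second workaround, here is a minimal Python sketch that runs an update command and treats the "No updates are to be performed" validation error as an unchanged stack instead of a failure. This is not the content of the linked pull request, which changes the playbook itself; the function name run_update_tolerating_noop and its return values are hypothetical.

```python
import subprocess

# Text taken from the euform-update-stack error quoted in the question.
NOOP_MESSAGE = "No updates are to be performed"


def run_update_tolerating_noop(command):
    """Run `command` (a list of arguments) and classify the outcome.

    Returns "updated" on success and "unchanged" when the command failed only
    because the stack is already up to date. Any other failure is re-raised.
    """
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode == 0:
        return "updated"
    if NOOP_MESSAGE in result.stderr:
        # The stack already matches the template, so there is nothing to do.
        return "unchanged"
    raise subprocess.CalledProcessError(
        result.returncode, command, output=result.stdout, stderr=result.stderr
    )
```

To try it, pass the same euform-update-stack command line that the failing playbook task runs, as a list of arguments. Matching on the message text is a design choice of this sketch and assumes the wording of the error stays the same between versions.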