ansible

Is it possible to retry a task that fails due to an exception?


I understand it's possible to retry a task that executes a command and returns a non-zero exit status but I would like to know if it's possible to retry a task that, for example, executes a lookup plugin that raises an error. I have tried numerous things but it doesn't seem possible. Here is a contrived example for purposes of illustration.

My retry.yml playbook:

---
- name: Test retry
  hosts: all
  gather_facts: false
  tasks:
    - debug:
        msg: "{{ lookup('raise_error') }}"
      retries: 3
      delay: 3
      register: result
      until: not result.failed

My lookup plugin raise_error.py:

# python 3 headers, required if submitting to Ansible
from __future__ import (absolute_import, division, print_function)
__metaclass__ = type

from ansible.errors import AnsibleError, AnsibleParserError
from ansible.plugins.lookup import LookupBase
from ansible.utils.display import Display

display = Display()

class LookupModule(LookupBase):

    def run(self, terms, variables=None, **kwargs):

        # Always fail, to simulate a lookup that raises an exception.
        raise AnsibleError("testing of error handling")

The result is the task is never retried.

TASK [debug msg={{ lookup('raise_error') }}] ***************************************************************************
task path: /home/bob/test_retry.yml:6
fatal: [localhost]: FAILED! =>
  msg: 'An unhandled exception occurred while running the lookup plugin ''raise_error''. Error was a <class ''ansible.errors.AnsibleError''>, original message: testing of error handling. testing of error handling'

PLAY RECAP *************************************************************************************************************
localhost                  : ok=0    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0

Solution

  • First, I recommend rethinking the use case in general and describing in detail what you are really trying to do; see What is the XY problem?. Apart from that, it is recommended to implement the error handling within the lookup plugin itself, so that it does not fail but instead provides Common Return Values.

    Q: "The result is the task is never retried."

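    Right, that's the expected behavior for such a failed task and fatal error. The lookup is evaluated while the task's arguments are templated, before the action itself runs, so the exception ends the task before there is a result for until to check.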

    Q: "Is it possible to retry a task that fails due to an exception?"

    Sure, but I would consider the following example an anti-pattern or bad practice.

    For a task file called lookup_task.yml

    - name: raw result of running {{ item }}
      ansible.builtin.debug:
        msg: "{{ lookup('ansible.builtin.pipe', '{% if item | int < 2 %}false{% else %}echo {{ item }}{% endif %}') }}"
      register: result
      ignore_errors: true
    

    a minimal example playbook

    ---
    - hosts: localhost
      become: false
      gather_facts: false
    
      tasks:
    
        - name: Run lookup task from file
          include_tasks: lookup_task.yml
          loop: "{{ range(1, 4, 1) }}"
          register: loops
    
        - debug:
            var: result
    
        - debug:
            var: loops
    

    results in the following output

    TASK [Run lookup task from file] *************************************************
    included: lookup_task.yml for localhost => (item=1)
    included: lookup_task.yml for localhost => (item=2)
    included: lookup_task.yml for localhost => (item=3)
    
    TASK [raw result of running 1] ***************************************************
    fatal: [localhost]: FAILED! =>
      msg: 'An unhandled exception occurred while running the lookup plugin ''ansible.builtin.pipe''. Error was a <class ''ansible.errors.AnsibleError''>, original message: lookup_plugin.pipe(false) returned 1. lookup_plugin.pipe(false) returned 1'
    ...ignoring
    
    TASK [raw result of running 2] ***************************************************
    ok: [localhost] =>
      msg: '2'
    
    TASK [raw result of running 3] ***************************************************
    ok: [localhost] =>
      msg: '3'
    
    TASK [debug] *********************************************************************
    ok: [localhost] =>
      result:
        changed: false
        failed: false
        msg: '3'
    
    TASK [debug] *********************************************************************
    ok: [localhost] =>
      loops:
        changed: false
        msg: All items completed
        results:
        - ansible_loop_var: item
          include: lookup_task.yml
          include_args: {}
          item: 1
        - ansible_loop_var: item
          include: lookup_task.yml
          include_args: {}
          item: 2
        - ansible_loop_var: item
          include: lookup_task.yml
          include_args: {}
          item: 3
        skipped: false
    
    PLAY RECAP ***********************************************************************
    localhost                  : ok=8 changed=0 failed=0 skipped=0 rescued=0 ignored=1
    

    If you want to go with this example, you may need to implement a loop break, conditionals, or whatever additional logic your case requires. Note that you cannot use loop with import_tasks and need to use include_tasks instead. However, a condition such as when: result.failed | default(true) is then evaluated when the include is processed rather than as each iteration runs, so all loop cycles are executed.

    Also, a task like

    ---
    - hosts: localhost
      become: false
      gather_facts: false
    
      tasks:
    
        - name: Run lookup task from file
          import_tasks: lookup_task.yml
          retries: 3
    
        - debug:
            var: result
    

    wouldn't work as requested, because of the ignore_errors: true or failed_when: <condition> clause that the task needs.
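
As a rough sketch of the recommendation at the start of this answer, the retry could live inside the lookup plugin rather than in the task. This is only an illustration under assumptions: the helper call_with_retries, the _fetch method, and the retries and delay keyword options are hypothetical names that do not appear in the question, and the ImportError fallback exists only so the sketch can be run without Ansible installed.

```python
import time

try:
    from ansible.errors import AnsibleError
    from ansible.plugins.lookup import LookupBase
except ImportError:
    # Stand-ins so the sketch can be exercised outside of Ansible.
    class AnsibleError(Exception):
        pass

    class LookupBase(object):
        pass


def call_with_retries(func, retries=3, delay=3, sleep=time.sleep):
    """Call func() up to `retries` times (at least once).

    Sleeps `delay` seconds between attempts, returns the first successful
    result, and re-raises the last AnsibleError if every attempt fails.
    Hypothetical helper, not part of Ansible.
    """
    attempts = max(1, retries)
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except AnsibleError as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay)
    raise last_error


class LookupModule(LookupBase):

    def _fetch(self, term):
        # Hypothetical placeholder for the operation that may fail
        # transiently; here it always fails, like the question's plugin.
        raise AnsibleError("testing of error handling")

    def run(self, terms, variables=None, **kwargs):
        # `retries` and `delay` are assumed plugin options for this sketch.
        retries = int(kwargs.get("retries", 3))
        delay = float(kwargs.get("delay", 3))
        return [
            call_with_retries(lambda t=term: self._fetch(t), retries, delay)
            for term in terms
        ]
```

Under these assumptions the task would pass the options as keyword arguments, for example lookup('raise_error', retries=3, delay=3), and would fail only after the plugin has exhausted its own attempts. Keeping the loop in a plain function also makes it easy to test without running a playbook.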