I understand it's possible to retry a task that executes a command and returns a non-zero exit status but I would like to know if it's possible to retry a task that, for example, executes a lookup plugin that raises an error. I have tried numerous things but it doesn't seem possible. Here is a contrived example for purposes of illustration.
My retry.yml playbook:
---
- name: Test retry
  hosts: all
  gather_facts: false
  tasks:
    - debug:
        msg: "{{ lookup('raise_error') }}"
      retries: 3
      delay: 3
      register: result
      until: not result.failed
My lookup plugin raise_error.py:
# python 3 headers, required if submitting to Ansible
from __future__ import (absolute_import, division, print_function)
__metaclass__ = type

from ansible.errors import AnsibleError, AnsibleParserError
from ansible.plugins.lookup import LookupBase
from ansible.utils.display import Display

display = Display()


class LookupModule(LookupBase):

    def run(self, terms, variables=None, **kwargs):
        # Always fail, to test how the task handles a lookup error
        raise AnsibleError("testing of error handling")
The result is the task is never retried.
TASK [debug msg={{ lookup('raise_error') }}] ***************************************************************************
task path: /home/bob/test_retry.yml:6
fatal: [localhost]: FAILED! =>
msg: 'An unhandled exception occurred while running the lookup plugin ''raise_error''. Error was a <class ''ansible.errors.AnsibleError''>, original message: testing of error handling. testing of error handling'
PLAY RECAP *************************************************************************************************************
localhost : ok=0 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
First, I recommend rethinking the use case in general and providing a detailed description of what you are really trying to do; see What is the XY problem?. Apart from that, it is recommended to implement the error handling within the lookup plugin itself, so that it does not fail but instead provides Common Return Values.
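As a rough illustration of handling the error inside the plugin, the following Python sketch retries a callable a few times and returns a result dictionary instead of raising. The helper name call_with_retries, its parameters, and the keys of the returned dictionary are assumptions made for this example and are not part of any Ansible API.

```python
import time


def call_with_retries(func, retries=3, delay=3.0, sleep=time.sleep):
    """Call func() up to `retries` times and never raise.

    Returns a dict with `failed`, `attempts`, and either `value` or `msg`.
    All names here are hypothetical, chosen for this sketch.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return {"failed": False, "value": func(), "attempts": attempt}
        except Exception as exc:  # broad on purpose: the goal is to not fail
            last_error = exc
            if attempt < retries:
                sleep(delay)  # wait before the next attempt
    return {"failed": True, "msg": str(last_error), "attempts": retries}
```

Inside a lookup plugin, run() could wrap its real work in such a helper and return the dictionary in a list (lookups return lists), for example return [call_with_retries(lambda: fetch(terms))], where fetch is a hypothetical function doing the actual lookup. The task can then inspect the failed key of the returned value rather than aborting on an exception.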
Q: "The result is the task is never retried."
Right, that's the expected behavior for such a failed task and fatal error.
Q: "Is it possible to retry a task that fails due to an exception?"
Sure, but I would consider the following example an anti-pattern or bad practice.
For a task file called lookup_task.yml
- name: raw result of running {{ item }}
  ansible.builtin.debug:
    msg: "{{ lookup('ansible.builtin.pipe', '{% if item | int < 2 %}false{% else %}echo {{ item }}{% endif %}') }}"
  register: result
  ignore_errors: true
a minimal example playbook
---
- hosts: localhost
  become: false
  gather_facts: false
  tasks:
    - name: Run lookup task from file
      include_tasks: lookup_task.yml
      loop: "{{ range(1, 4, 1) }}"
      register: loops
    - debug:
        var: result
    - debug:
        var: loops
will result in an output of
TASK [Run lookup task from file] *************************************************
included: lookup_task.yml for localhost => (item=1)
included: lookup_task.yml for localhost => (item=2)
included: lookup_task.yml for localhost => (item=3)
TASK [raw result of running 1] ***************************************************
fatal: [localhost]: FAILED! =>
msg: 'An unhandled exception occurred while running the lookup plugin ''ansible.builtin.pipe''. Error was a <class ''ansible.errors.AnsibleError''>, original message: lookup_plugin.pipe(false) returned 1. lookup_plugin.pipe(false) returned 1'
...ignoring
TASK [raw result of running 2] ***************************************************
ok: [localhost] =>
msg: '2'
TASK [raw result of running 3] ***************************************************
ok: [localhost] =>
msg: '3'
TASK [debug] *********************************************************************
ok: [localhost] =>
  result:
    changed: false
    failed: false
    msg: '3'
TASK [debug] *********************************************************************
ok: [localhost] =>
  loops:
    changed: false
    msg: All items completed
    results:
    - ansible_loop_var: item
      include: lookup_task.yml
      include_args: {}
      item: 1
    - ansible_loop_var: item
      include: lookup_task.yml
      include_args: {}
      item: 2
    - ansible_loop_var: item
      include: lookup_task.yml
      include_args: {}
      item: 3
    skipped: false
PLAY RECAP ***********************************************************************
localhost : ok=8 changed=0 failed=0 skipped=0 rescued=0 ignored=1
If you would like to go with this example, you may need to implement a loop break, conditional handling, or whatever additional logic your case requires. Note that you cannot use loop on import_tasks statements and need to use include_tasks instead. However, a condition like when: result.failed | default(true) is then validated at compile time and no longer at run time, so all loop cycles are executed.
Also, a task like
---
- hosts: localhost
  become: false
  gather_facts: false
  tasks:
    - name: Run lookup task from file
      import_tasks: lookup_task.yml
      retries: 3
    - debug:
        var: result
wouldn't work as requested because of the necessary clause ignore_errors: true or failed_when: <condition>.