In my playbook, I have a list of dictionaries. I want to keep only duplicate dictionaries.
Basically it is something like:
{
    "l": [
        {
            "a": "d",
            "b": "e",
            "c": "f"
        },
        {
            "a": "ggg",
            "b": "hhh",
            "c": "iii"
        },
        {
            "a": "jjj",
            "b": "kkk",
            "c": "lll"
        },
        {
            "a": "d",
            "b": "e",
            "c": "f"
        }
    ]
}
I tried different things but at some point I realized I was perhaps reinventing the wheel. I couldn't find any module though.
EDIT
Here's what I've managed to do with jq. It's not very elegant because I'm reprocessing the list for each element (and I'm also producing nulls only to filter them out afterwards). What reassures me is that GPT hasn't managed to do it at all (despite my insistence). The list is wrapped in the data object because the filter references .l explicitly.
---
- name: Filter duplicates from a list of dictionaries
  hosts: localhost
  gather_facts: no
  vars:
    jq_filter: '[ .l[] as $item | if ((.l | map(select(. == $item)) | length) > 1) then $item else null end | select (. != null) ] | unique'
    data:
      l:
        - { a: "ddd", b: "eee", c: "fff" }
        - { a: "ggg", b: "hhh", c: "iii" }
        - { a: "jjj", b: "kkk", c: "lll" }
        - { a: "ddd", b: "eee", c: "fff" }
        - { a: "jjj", b: "kkk", c: "lll" }
  tasks:
    - name: debug list
      debug:
        var: data
    - name: Process list
      shell: echo '{{ data | to_json }}' | jq '{{ jq_filter }}'
      register: shell_output
    - name: get duplicates
      set_fact:
        duplicates: "{{ shell_output.stdout_lines | join | from_json }}"
    - name: debug duplicates
      debug:
        var: duplicates
I accepted the answer that uses a custom filter.
I think you're looking for something like:
    l | difference(l | unique) | unique
Unfortunately, the difference filter doesn't work with lists of dictionaries. However, you can write your own filter. For example:
shell> cat filter_plugins/duplicates.py
def duplicates(l):
    # Keep every item that occurs more than once (all copies are returned)
    return [i for i in l if l.count(i) > 1]


class FilterModule(object):
    def filters(self):
        return {
            'duplicates': duplicates,
        }
Then, the playbook below
- hosts: localhost
  vars:
    l:
      - {a: d, b: e, c: f}
      - {a: ggg, b: hhh, c: iii}
      - {a: jjj, b: kkk, c: lll}
      - {a: d, b: e, c: f}
    result: "{{ l | duplicates | unique }}"
  tasks:
    - debug:
        var: result | to_yaml
probably gives what you want:
result:
  - {a: d, b: e, c: f}
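The duplicates filter above returns every copy of a repeated item, which is why the playbook pipes its output through unique. As a minimal sketch, assuming you would rather get each repeated item only once, a variant could look like the following. The file and filter name duplicates_once are hypothetical, not part of Ansible. It relies only on equality comparison, so it should also accept dictionaries with nested values.

```python
# filter_plugins/duplicates_once.py (hypothetical file and filter name)
def duplicates_once(items):
    """Return each repeated item once, in the order the repeats are detected."""
    seen = []  # items encountered so far
    dups = []  # repeated items, reported once each
    for item in items:
        if item in seen:
            if item not in dups:
                dups.append(item)
        else:
            seen.append(item)
    return dups


class FilterModule(object):
    def filters(self):
        return {'duplicates_once': duplicates_once}
```

With such a filter in place, the declaration would shrink to result: "{{ l | duplicates_once }}".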
If you want to select items by their exact frequency, create the filter below:
shell> cat filter_plugins/dict_counter.py
def dict_counter(l):
    u = [dict(s) for s in set(frozenset(d.items()) for d in l)]
    return [{'dict': i, 'count': l.count(i)} for i in u]


class FilterModule(object):
    def filters(self):
        return {
            'dict_counter': dict_counter,
        }
Applied to the same list l, this filter gives:
l | dict_counter:
  - count: 2
    dict: {a: d, b: e, c: f}
  - count: 1
    dict: {a: ggg, b: hhh, c: iii}
  - count: 1
    dict: {a: jjj, b: kkk, c: lll}
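The first line of dict_counter relies on a general Python property: dictionaries are not hashable, so they cannot be placed in a set directly, but frozenset(d.items()) is hashable as long as every value is. The sketch below illustrates this outside Ansible, using made-up sample data:

```python
d1 = {'a': 'd', 'b': 'e'}
d2 = {'b': 'e', 'a': 'd'}  # same content, different key order

# A dict cannot be a member of a set.
try:
    {d1}
    dict_is_hashable = True
except TypeError:
    dict_is_hashable = False

# A frozenset of the items is hashable and ignores key order,
# so both dictionaries collapse to a single set member.
k1 = frozenset(d1.items())
k2 = frozenset(d2.items())
unique_dicts = [dict(k) for k in {k1, k2}]

# Limitation: the values themselves must be hashable.
try:
    frozenset({'a': ['x']}.items())
    nested_ok = True
except TypeError:
    nested_ok = False
```

So the approach fits flat dictionaries of strings like the ones in the question, but a dictionary holding a list or another dictionary would raise a TypeError.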
Then the declaration below gives the same result:
    result: "{{ l | dict_counter
                | selectattr('count', 'eq', 2)
                | map(attribute='dict') }}"
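Both filters call l.count(i) for each element, so they rescan the list repeatedly. For long lists, counting in a single pass is one possible alternative. The following is a sketch, not a drop-in replacement: the file and filter name dict_counter_fast are hypothetical, and it assumes every dictionary is JSON-serializable with string keys, so that json.dumps with sort_keys=True can serve as a canonical key.

```python
# filter_plugins/dict_counter_fast.py (hypothetical file and filter name)
import json
from collections import Counter


def dict_counter_fast(items):
    """Count identical dictionaries in one pass over the list."""
    first = {}          # canonical key -> first dictionary seen with that key
    counts = Counter()  # canonical key -> number of occurrences
    for item in items:
        # Assumption: items are JSON-serializable; sort_keys makes the
        # key independent of the dictionary's key order.
        key = json.dumps(item, sort_keys=True)
        first.setdefault(key, item)
        counts[key] += 1
    # Counter keeps insertion order, so results follow first appearance.
    return [{'dict': first[k], 'count': n} for k, n in counts.items()]


class FilterModule(object):
    def filters(self):
        return {'dict_counter_fast': dict_counter_fast}
```

It could be used in the same way as dict_counter, for example with selectattr('count', 'gt', 1) to keep everything that occurs more than once instead of an exact frequency.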