ansible jinja2

Select items in list where attribute does *not* match regex pattern


I have a list of dicts, something like:

[
  {
    "url": "bucket.amazonaws.com",
    "file": "file1.txt"
  },
  {
    "url": "github.com",
    "file": "file2.txt"
  }
]

I can filter the results to create a variable with files in AWS:

aws_files: "{{ array | selectattr('url', 'match', '.*amazonaws\\.com.*') }}"

Or files in GitHub:

github_files: "{{ array | selectattr('url', 'match', '.*github\\.com.*') }}"

How can I filter to find all files that don't match either of the above patterns?

This is what I have but I don't think it's right:

remaining_files: "{{ not (array | selectattr('url', 'match', '.*(amazonaws|github)\\.com.*')) }}"

This also doesn't work:

remaining_files: "{{ array | selectattr('url', 'notmatch', '.*(amazonaws|github)\\.com.*') }}"

Also, a secondary question: is there a way to match the pattern anywhere in the string, so I don't always have to add .* to the beginning and end of the pattern?


Solution

  • For example, given the patterns in a list

      patterns:
        - .*amazonaws\.com.*
        - .*github\.com.*
    

    and the array for testing

      array:
        - {file: file1.txt, url: bucket.amazonaws.com}
        - {file: file2.txt, url: github.com}
        - {file: file3.txt, url: example.com}
    

    Select the aws and github files

      aws_files: "{{ array | selectattr('url', 'match', patterns.0) }}"
      github_files: "{{ array | selectattr('url', 'match', patterns.1) }}"
    

    gives

      aws_files: [{'file': 'file1.txt', 'url': 'bucket.amazonaws.com'}]
      github_files: [{'file': 'file2.txt', 'url': 'github.com'}]
    

    You can now subtract the matching lists

      result: "{{ array | difference(aws_files + github_files) }}"
    

    to get what you want

      result: [{'file': 'file3.txt', 'url': 'example.com'}]
    

    Or, you can join the patterns with | into a single regex alternation and reject the items whose url matches it. The declaration below gives the same result

      result: "{{ array | rejectattr('url', 'match', patterns|join('|')) }}"
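
    To see why this works, it can help to print the joined pattern on its own. The play below is a minimal sketch, assuming the same patterns list as above; the gather_facts setting and the debug task are additions for illustration only.

    ```yaml
    - hosts: localhost
      gather_facts: false

      vars:
        patterns:
          - .*amazonaws\.com.*
          - .*github\.com.*

      tasks:

        # join('|') builds one regex alternation from the list, i.e. the string
        #   .*amazonaws\.com.*|.*github\.com.*
        # (the output callback may display the backslashes doubled).
        - debug:
            msg: "{{ patterns | join('|') }}"
    ```

    rejectattr then drops every item whose url matches either alternative, which is the inverse of what selectattr would keep for the same test.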
    

    Example of a complete playbook for testing

    - hosts: localhost
    
      vars:
    
        patterns:
          - .*amazonaws\.com.*
          - .*github\.com.*
    
        array:
          - {file: file1.txt, url: bucket.amazonaws.com}
          - {file: file2.txt, url: github.com}
          - {file: file3.txt, url: example.com}
    
        aws_files: "{{ array | selectattr('url', 'match', patterns.0) }}"
        github_files: "{{ array | selectattr('url', 'match', patterns.1) }}"
    
        result: "{{ array | difference(aws_files + github_files) }}"
        result2: "{{ array | rejectattr('url', 'match', patterns|join('|')) }}"
    
      tasks:
    
        - debug:
            msg: |
              aws_files: {{ aws_files }}
              github_files: {{ github_files }}
              result: {{ result }}
              result2: {{ result2 }}
    

    Note: You can simplify the patterns

      patterns:
        - amazonaws\.com
        - github\.com
    

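    and use search instead of match. Quoting the documentation, the test match

    "succeeds if it finds the pattern at the beginning of the string"

    whereas the test search

    "succeeds if it finds the pattern anywhere within the string"

    This also answers the secondary question: with search, the leading and trailing .* are no longer needed.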
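
    Putting the note together with the rejectattr approach, a playbook along the following lines should work. It is a sketch under stated assumptions: it reuses the test array from above, and the trailing list filter and the gather_facts setting are my additions, not part of the original declarations.

    ```yaml
    - hosts: localhost
      gather_facts: false

      vars:

        # Simplified patterns: no leading or trailing .* needed with 'search'
        patterns:
          - amazonaws\.com
          - github\.com

        array:
          - {file: file1.txt, url: bucket.amazonaws.com}
          - {file: file2.txt, url: github.com}
          - {file: file3.txt, url: example.com}

        # Reject every item whose url contains either pattern anywhere.
        # '| list' is an assumption for older Ansible releases, where
        # rejectattr returns a generator; it is harmless on newer ones.
        result: "{{ array | rejectattr('url', 'search', patterns | join('|')) | list }}"

      tasks:

        - debug:
            var: result
    ```

    With the test array above, this should leave only the file3.txt entry, the same result as the difference-based declaration.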