
How to Match SSH Configuration Block with Ansible regex


Backstory:

I'm writing an Ansible role that creates an SSH key for hosts that need one and automatically updates my SSH configuration file with the latest information (LAN IP, username, key, etc.). However, I can't get Ansible to delete the existing block correctly before adding the new one. Then, when it adds the new block, it overwrites the other hosts' blocks for some reason, so at the end only one block exists even if there are 10 hosts. I'll fix the latter later; my main concern is getting the old block deleted correctly.


Question(s) & tl;dr:

Is there a better approach to doing this? Does anyone see why this regex doesn't work in Ansible? And could my regex be simplified or improved?


Here are the relevant tasks from my Ansible role:

- name: Read existing SSH config
  slurp:
    src: "{{ ssh_config_dir }}/config"
  register: ssh_config_file

- name: Decode existing SSH config
  set_fact:
    ssh_config_content: "{{ ssh_config_file.content | b64decode }}"

- name: Parse existing SSH config into lines
  set_fact:
    ssh_config_lines: "{{ ssh_config_content.split('\n') }}"

- name: Check if existing host entry matches
  set_fact:
    host_entry_valid: >
      {{ ssh_config_lines | select('match', '^Host {{ inventory_hostname }}$') | list | length > 0 and
         ssh_config_lines | select('match', '^\\s*Hostname {{ hostvars[inventory_hostname].ansible_host }}$') | list | length > 0 and
         ssh_config_lines | select('match', '^\\s*User {{ ssh_remote_user }}$') | list | length > 0 and
         ssh_config_lines | select('match', '^\\s*Port {{ ssh_port }}$') | list | length > 0 and
         ssh_config_lines | select('match', '^\\s*IdentityFile {{ ssh_key_dir }}/{{ inventory_hostname }}{{ ssh_key_name_suffix }}$') | list | length > 0 }}

- name: Debug host entry validity
  debug:
    var: host_entry_valid

- name: Backup the existing SSH config
  copy:
    src: "{{ ssh_config_dir }}/config"
    dest: "{{ ssh_config_dir }}/config.bak"
  when: not host_entry_valid

- name: Define the regex pattern
  set_fact:
    my_regex: '^(\s+)?Host\s+{{ inventory_hostname }}(\s+)?$\n^(((\s+)?[A-Za-z0-9./_-]|#)+(\s+)?)$\n^(\s+)?$'

- name: Print regex pattern
  debug:
    msg: "{{ my_regex | quote }}"

- name: Remove existing host entry if it doesn't match
  lineinfile:
    path: "{{ ssh_config_dir }}/config"
    state: absent
    regexp: '^(\s+)?Host\s+{{ inventory_hostname }}(\s+)?$\n^(((\s+)?[A-Za-z0-9./_-]|#)+(\s+)?)$\n^(\s+)?$'
  when: not host_entry_valid

The "Check if existing host entry matches" task needs work, but it should trigger the "Remove existing host entry if it doesn't match" task, which is the one I'm having problems with. The regex matches a block perfectly in VSCode, but in Sublime it matches multiple blocks, and in Ansible it doesn't seem to work at all.


This started out as a ChatGPT creation, but I tweaked some of it and ended up writing the regex myself.

This works in VSCode:

^(\s+)?Host test(\s+)?$\n^(((\s+)?[A-Za-z0-9./_-]|#)+(\s+)?)$\n

but the version with the lookahead does not. It does work in Sublime, but Sublime matches multiple blocks for some reason.

^(\s+)?Host test(\s+)?$\n^(((\s+)?[A-Za-z0-9./_-]|#)+(\s+)?)$\n(?=Host)

The idea is to match Host hostname, then match every following line that contains text until there is a blank line, and finally use the lookahead to check whether a Host line follows the blank line. I'm not familiar with lookaheads, though, and this won't work if the block is the last one in the file, so I'll be removing it.


Here's the relevant Ansible output; the old SSH block doesn't get removed:

TASK [ssh-keys : Remove existing SSH agents and known hosts] ********************************************************************************************
included: /scripts/ansible/roles/ssh-keys/tasks/remove_existing_ssh_agents.yml for test

TASK [ssh-keys : Remove all SSH agents] *****************************************************************************************************************
skipping: [test]

TASK [ssh-keys : Remove host from known hosts] **********************************************************************************************************
skipping: [test]

TASK [ssh-keys : Copy SSH key to remote server] *********************************************************************************************************
included: /scripts/ansible/roles/ssh-keys/tasks/copy_key.yml for test

TASK [ssh-keys : Copy SSH key to remote server] *********************************************************************************************************
skipping: [test]

Example block:

Host test
   HostName 10.0.0.4
   User myuser
   IdentityFile ...

Host test2
  ...

Solution

  • As Zeitounator commented, there's a specific module for SSH, which will be far easier and safer than doing it all yourself.

    But regarding your question, your regular expression pattern can be rewritten like this:

    ^[\t  ]*host[\t  ]+(.+?)[\t  ]*\n(?:(?!(?:^[\t  ]*#.*\n)*[\t  ]*host\b)[\t  ]*(?:(\w+)\b(?:[\t  ]*=[\t  ]*|[\t  ]+)(.*)|#.*)?(?:\n|\Z))+
    

    Test it live here: https://regex101.com/r/LiGszL/2
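
    As for why the original pattern seems to do nothing in Ansible: according to its documentation, lineinfile looks for the regexp in every line of the file, one line at a time, so a pattern that contains \n never gets a chance to span two lines. The sketch below only imitates that behaviour in JavaScript to make the difference visible. It is not Ansible code, and sampleConfig and twoLinePattern are made-up names used for illustration.

    ```javascript
    // Made-up sample data for illustration; not taken from a real config file.
    const sampleConfig = [
      "Host test",
      "  HostName 10.0.0.4",
      "  User myuser",
      "",
      "Host test2",
      "  User other",
    ].join("\n");

    // A pattern that has to cross a line break: the Host line plus the line after it.
    const twoLinePattern = /^Host test\n[\t ]+HostName .*$/m;

    // Whole-text matching: the pattern can see the "\n" between the two lines.
    const wholeTextMatch = twoLinePattern.test(sampleConfig);

    // Line-by-line matching (imitating a per-line tool): no single line contains "\n".
    const perLineMatch = sampleConfig
      .split("\n")
      .some((line) => twoLinePattern.test(line));

    console.log({ wholeTextMatch, perLineMatch });
    ```

    A module that applies the pattern to the whole file content (the replace module, for instance) should not have this particular limitation, although the dedicated SSH module remains the safer route.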

    Details on the regular expression pattern

    The commented version of my pattern

    The single-line pattern above is the same as this commented original version:

    # Host entry:
    # Start of line followed by optional horizontal spaces,
    # The word "Host" case-insensitive, followed by anything (captured) and a new line.
    ^[\t  ]*host[\t  ]+(.+?)[\t  ]*\n
    # A configuration line or comment, multiple times:
    (?:
      # Negative lookahead to avoid matching a new "Host" entry, but
      # also with optional comment lines before it.
      (?!(?:^[\t  ]*\#.*\n)*[\t  ]*host\b)
      # Optional horizontal spaces.
      [\t  ]*
      # Config line, comment or empty line (done with the ? at the end).
      (?:
        # A) A config line, capturing it (with space or equal sign).
        (\w+)\b(?:[\t  ]*=[\t  ]*|[\t  ]+)(.*)   |
        # B) Or a comment.
        \#.*
      )?
      # New line or end of the config file.
      (?:\n|\Z)
    )+
    

    See it in action with explanation: https://regex101.com/r/LiGszL/1

    Note that the middle part, which matches config lines or comments, could be simplified. None of those checks are strictly needed: since the negative lookahead already stops the match at the next Host entry, we could simply match anything. The detailed version, however, shows how the host's configuration lines and comments could then be read with a second regular expression.
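
    A minimal sketch of that simplification, written for JavaScript and run against made-up sample data: every line is accepted as-is until the negative lookahead sees the next Host entry (including any comment lines directly above it). The names simplifiedHostEntry, sampleLines, sampleText and found are hypothetical, and the end of the input is written as $(?![\s\S]) because JavaScript has no \Z anchor.

    ```javascript
    // Simplified sketch: any line is accepted until the lookahead sees the next Host entry.
    const simplifiedHostEntry =
      /^[\t ]*host[\t ]+(?<host>.+?)[\t ]*\n(?<config>(?:(?!(?:^[\t ]*#.*\n)*[\t ]*host\b).*(?:\n|$(?![\s\S])))+)/gim;

    // Made-up sample data for illustration only.
    const sampleLines = [
      "Host test",
      "  HostName 10.0.0.4",
      "  User myuser",
      "",
      "# second host",
      "Host test2",
      "  User other",
    ];
    const sampleText = sampleLines.join("\n");

    // Collect the host name and the raw config text of every block.
    const found = [...sampleText.matchAll(simplifiedHostEntry)].map((match) => ({
      host: match.groups.host,
      config: match.groups.config,
    }));

    console.log(found);
    ```

    The comment line above the second Host entry is left out of the first block, because the lookahead treats comments directly preceding a Host line as belonging to that next entry.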

    Full example, in JavaScript:

    // JavaScript has no \Z anchor, so the end of the input is written as $(?![\s\S]).
    const regexHostEntry = /^[\t  ]*host[\t  ]+(?<host>.+?)[\t  ]*\n(?<config>(?:(?!(?:^[\t  ]*#.*\n)*[\t  ]*host\b)[\t  ]*(?:(\w+)\b(?:[\t  ]*=[\t  ]*|[\t  ]+)(.*)|#.*)?(?:\n|$(?![\s\S])))+)/gim;
    
    const regexConfigLine = /^[\t  ]*(\w+)\b(?:[\t  ]*=[\t  ]*|[\t  ]+)(.*)/gim;
    
    const input = `Host test 
      Hostname test.domain.com
      User james
      Port 22
      # Comment
      IdentityFile ~/.ssh/key.pub
    
    # With 2 aliases
    Host test2 test-2
      Hostname test2.domain.com
      User = james
      Port=22
      # Port 23
      IdentityFile = ~/.ssh/key2.pub
    
    # For all hosts except test2, activate compression and set log level:
    Host * !test2
      Compression yes
      LogLevel INFO
    
      IdentityFile ~/.ssh/id_rsa
    
    Host *.sweet.home
      Hostname 192.168.2.17
      User tom
      IdentityFile "~/.ssh/id tom.pub" # If has spaces, then quote it.
    
    # With a lot of spaces between lines
    Host localhost
    
        Hostname 127.0.0.*
    
        IdentityFile ~/.ssh/id_rsa
    
    # Without empty lines between Host definitions:
    Host dummy
      Hostname ssh.dummy.com
      User user
    Host dummy2
      Hostname ssh.dummy2.com
      User user`;
    
    // matchAll() returns an iterator, so loop over it directly.
    for (const match of input.matchAll(regexHostEntry)) {
      console.log(`Found match for Host ${match.groups.host}:`);
      console.log([...match.groups.config.matchAll(regexConfigLine)]);
    }
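
    To tie this back to the original goal (deleting one host's block before writing a fresh one), here is a rough sketch of how the same idea could remove a single block from the config text. It is JavaScript like the example above, not an Ansible task, and removeHostBlock, escapeRegExp and exampleConfig are hypothetical names. It assumes the Host line carries exactly one alias and is followed by at least one other line.

    ```javascript
    // Escape characters that have a special meaning in a regular expression.
    function escapeRegExp(text) {
      return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    }

    // Hypothetical helper: remove the block whose Host line is exactly `alias`.
    function removeHostBlock(configText, alias) {
      const pattern = new RegExp(
        // The Host line for this alias only.
        "^[\\t ]*host[\\t ]+" + escapeRegExp(alias) + "[\\t ]*\\n" +
          // Every following line, until the lookahead sees the next Host entry.
          "(?:(?!(?:^[\\t ]*#.*\\n)*[\\t ]*host\\b).*(?:\\n|$(?![\\s\\S])))+",
        "im"
      );
      // Without the g flag, only the first matching block is removed.
      return configText.replace(pattern, "");
    }

    // Made-up sample data for illustration only.
    const exampleConfig = [
      "Host test",
      "  HostName 10.0.0.4",
      "  User myuser",
      "",
      "Host test2",
      "  HostName 10.0.0.5",
    ].join("\n");

    const cleaned = removeHostBlock(exampleConfig, "test");
    console.log(cleaned);
    ```

    Escaping the alias matters because host names often contain dots or wildcards, which would otherwise be read as regex syntax. The blank line that follows the removed block is removed along with it, since it is consumed as part of the block.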