python regex python-textfsm

How to individually match three separate portions of a single line of CLI output (LLDP) with regex (TextFSM template)


I'm working with Ansible and TextFSM (Python) templates to dynamically pull LLDP info from network devices and then apply the LLDP output to the same devices' interface descriptions. I currently have a working model; however, I need to fine-tune what is written to the interface description to match our naming convention (hostname-interface), where the hostname cannot include the FQDN and the interface should be the first three letters of the interface name in lower case ("eth" in this case) followed immediately by the interface number (24). The final result would look like "lab-fr-sw01-eth24".

I am able to pull the appropriate output fine with (\S+) for each variable in the template:

Value NEIGHBOR (\S+)
Value LOCAL_INTERFACE (\S+)   
Value NEIGHBOR_INTERFACE (\S+)

Example CLI output: Et1 lab-fr-sw01.test.local Ethernet24 120

The only problem with this is that sometimes the switch pulls an FQDN for the "NEIGHBOR" variable, as above, and sometimes it does not. Right now I am trying to write a specific regex statement per variable (TextFSM templates only use regex). For the neighbor variable I'm trying to match the second run of non-whitespace characters, up to the "." if it exists. So far I have only been able to accurately grab the local interface (Et1) with (^\S+); when I attempt to grab only the hostname with ^[^.]+, the match also includes the local interface output "Et1". To test the matches, I've been using https://regex101.com
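The behaviour described above can be reproduced in plain Python. This is only a sketch using the `re` module, not TextFSM itself; the second pattern is an assumed illustration of skipping the first field explicitly before capturing up to the first dot or space.

```python
import re

line = "Et1 lab-fr-sw01.test.local Ethernet24 120"

# ^[^.]+ is anchored at the start of the line and runs to the first dot,
# so the match begins at "Et1" and includes the space and the hostname.
anchored = re.search(r"^[^.]+", line).group(0)

# Consume the first field and the whitespace, then capture everything
# up to the first dot or whitespace character.
hostname = re.search(r"^\S+\s+([^\s.]+)", line).group(1)

print(anchored)   # Et1 lab-fr-sw01
print(hostname)   # lab-fr-sw01
```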

Et1 lab-fr-sw01.test.local Ethernet24 120

Where:
LOCAL_INTERFACE = Et1 --> (^\S+)
NEIGHBOR = lab-fr-sw01.test.local --> ^[^.]+
NEIGHBOR_INTERFACE = Ethernet24 --> ?

The desired end result that would be written to the device's interface description would look something like "lab-fr-sw01-eth24". However, because we have several sites and each site name is included in the hostname, I cannot rely on matching the hostname letter by letter.


Solution

  • Assuming the goal is to capture the three parts of the line separately, a simple expression such as the following would do:

    ([a-z0-9]+)\s+([\w\-\.]+)\s([a-z0-9]+)\s([0-9]+)
    

    Demo 1

    Here the desired outputs are in groups #1, #2, and #3 (group #4 is the trailing number), and the i (case-insensitive) flag is applied.

    Test

    import re
    
    # Four fields: local interface, neighbor name, neighbor interface, trailing number
    regex = r"([a-z0-9]+)\s+([\w\-\.]+)\s([a-z0-9]+)\s([0-9]+)"
    
    test_str = "Et1 lab-fr-sw01.test.local Ethernet24 120"
    
    # \1, \2 and \3 refer to the first three capture groups
    subst = "LOCAL_INTERFACE = \\1\\nNEIGHBOR = \\2\\nNEIGHBOR_INTERFACE = \\3"
    
    # count=0 replaces every match; count and flags are passed by keyword
    # because passing them positionally is deprecated in recent Python versions
    result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE | re.IGNORECASE)
    
    if result:
        print(result)
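The substitution above only relabels the captured fields. To reach the hostname-interface form asked for in the question, some post-processing is still needed. Here is a minimal sketch, assuming the domain is everything after the first dot and the interface name is a run of letters followed by a number; `LLDP_LINE` and `build_description` are hypothetical names introduced for illustration.

```python
import re

LLDP_LINE = re.compile(
    r"([a-z0-9]+)\s+([\w\-.]+)\s+([a-z0-9]+)\s+([0-9]+)", re.IGNORECASE
)


def build_description(line):
    """Return 'hostname-intNN' for one LLDP line, or None if it does not match."""
    match = LLDP_LINE.search(line)
    if match is None:
        return None
    neighbor, neighbor_interface = match.group(2), match.group(3)
    # Assumption: the hostname is everything before the first dot.
    hostname = neighbor.split(".")[0]
    # Assumption: interface names look like letters followed by digits.
    intf = re.match(r"([a-z]+)(\d+)", neighbor_interface, re.IGNORECASE)
    if intf is None:
        return None
    # First three letters in lower case, then the interface number.
    return f"{hostname}-{intf.group(1)[:3].lower()}{intf.group(2)}"


print(build_description("Et1 lab-fr-sw01.test.local Ethernet24 120"))  # lab-fr-sw01-eth24
print(build_description("Et1 lab-fr-sw01 Ethernet24 120"))             # lab-fr-sw01-eth24
```

Doing the lower-casing and truncation outside the regex keeps the pattern itself simple, since a regex can select text but cannot change its case.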
    

    Demo


    Edit

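    To split the domain (.test.local) off rather than keep it with the hostname, we would simply remove . from the character class and capture the remainder in its own group, so that the bare hostname lands in group #2: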

    ([a-z0-9]+)\s+([\w\-]+)(.+?)\s([a-z0-9]+)\s([0-9]+)
    

    Demo 2
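One caveat with the last expression: `(.+?)` needs at least one character, so on a line where the neighbor has no domain it takes the final character of the hostname. A sketch of one way around this is to make the domain part optional and non-capturing. The named groups below reuse the question's Value names; `ORIGINAL`, `LLDP_ROW` and `parse_row` are hypothetical names, and this is plain Python `re`, untested against TextFSM itself.

```python
import re

# The expression from the edit above, for comparison.
ORIGINAL = re.compile(
    r"([a-z0-9]+)\s+([\w\-]+)(.+?)\s([a-z0-9]+)\s([0-9]+)", re.IGNORECASE
)

LLDP_ROW = re.compile(
    r"^(?P<LOCAL_INTERFACE>\S+)\s+"
    r"(?P<NEIGHBOR>[^\s.]+)(?:\.\S+)?\s+"   # hostname, then an optional, uncaptured domain
    r"(?P<NEIGHBOR_INTERFACE>\S+)\s+\d+"
)


def parse_row(line):
    """Return the three named fields as a dict, or None if the line does not match."""
    match = LLDP_ROW.search(line)
    return match.groupdict() if match else None


with_domain = "Et1 lab-fr-sw01.test.local Ethernet24 120"
without_domain = "Et1 lab-fr-sw01 Ethernet24 120"

# Without a domain, the original pattern gives up the last hostname character.
print(ORIGINAL.search(without_domain).group(2))   # lab-fr-sw0

print(parse_row(with_domain)["NEIGHBOR"])         # lab-fr-sw01
print(parse_row(without_domain)["NEIGHBOR"])      # lab-fr-sw01
```

The same idea should carry over to a TextFSM template by keeping the optional domain outside the NEIGHBOR value, though that is an assumption rather than something verified here.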