I'm trying to parse the output of the "display interface brief" Comware switch command with a RegEx so I can convert it to a CSV file. The command prints its output in the following format:
Interface Link Speed Duplex Type PVID Description
BAGG51 UP 4G(a) F(a) T 1
FGE1/0/42 DOWN auto A T 1 ### LIVRE ###
GE6/0/20 UP 100M(a) F(a) A 1 LIVRE (MGMT - [WAN8-P8]
This is proving quite challenging: no matter which RegEx I try, it doesn't properly handle values such as "DOWN auto" and "100M(a) F(a)", which are separated by only one space. I also couldn't find a way to handle the last field, which can contain one or more spaces. Most of the RegExes I tried create a separate capture group for each space-separated word instead of capturing the whole text.
I have tried countless approaches, and I couldn't find much about parsing non-uniform columns on the Internet or on Stack Overflow.
I need to parse it into the following format, with 7 capture groups per line, respecting the end of line:
BAGG51;UP;4G(a);F(a);T;1
FGE1/0/42;DOWN;auto;A;T;1;### LIVRE ###
GE6/0/20;UP;100M(a);F(a);A;1;LIVRE (MGMT - [WAN8-P8]
The most successful RegEx I have found so far is: ^(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)
with the replacement $1;$2;$3;$4;$5;$6;$7
in Notepad++, but it doesn't properly handle the "Description" field, which can be empty.
The following pattern seems to be working here:
^(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)(?:[ ]+(.*))?
This follows your pattern with six mandatory capture groups, followed by an optional seventh capture group. The (?:[ ]+(.*))?
at the end of the pattern matches one or more spaces followed by the rest of the line, so the description can be missing or can contain spaces. Note that this pattern should be used in multiline mode.
Here is a working demo
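If the conversion can be done outside Notepad++, the same idea can be sketched in Python. This is only an illustration under a few assumptions: the function name brief_to_csv and the sep parameter are made up for this example, the fields are separated with [ \t]+ rather than \s+ so that a match can never spill onto the next line, and an empty description is dropped rather than emitted as a trailing separator, to match the desired output in the question.

```python
import re

# Six whitespace-separated fields, then an optional free-text description.
# [ \t]+ (an assumption, instead of \s+) keeps every match on a single line.
ROW = re.compile(
    r"^(\S+)[ \t]+(\S+)[ \t]+(\S+)[ \t]+(\S+)[ \t]+(\S+)[ \t]+(\S+)(?:[ \t]+(.*))?$",
    re.MULTILINE,
)


def brief_to_csv(text, sep=";"):
    """Convert 'display interface brief' rows to sep-delimited lines.

    Hypothetical helper for illustration; lines with fewer than six
    fields do not match the pattern and are skipped.
    """
    rows = []
    for match in ROW.finditer(text):
        fields = list(match.groups()[:6])
        # Group 7 is None when the description is absent.
        description = (match.group(7) or "").strip()
        if description:
            fields.append(description)
        rows.append(sep.join(fields))
    return "\n".join(rows)


sample = """BAGG51 UP 4G(a) F(a) T 1
FGE1/0/42 DOWN auto A T 1 ### LIVRE ###
GE6/0/20 UP 100M(a) F(a) A 1 LIVRE (MGMT - [WAN8-P8]"""

print(brief_to_csv(sample))
```

Note that the header line ("Interface Link Speed ...") also has at least six fields, so it would be converted like any other row unless it is removed first.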