I'm processing an OFX (bank transactions) file. My bank doesn't use the <NAME> tag to specify the payee, but this information is a substring of the <MEMO> tag.
So, my file is something like:
...ofx headers and other stuff
...line below is a transaction
<STMTTRN>
<TRNTYPE>OTHER</TRNTYPE>
<DTPOSTED>20160609120000</DTPOSTED>
<TRNAMT>-4.00</TRNAMT>
<FITID>2016060914000</FITID>
<CHECKNUM>000000700132</CHECKNUM>
<REFNUM>700.132</REFNUM>
<MEMO>Credit Card Payment - 09/06 18:37 Walmart 2th street</MEMO>
</STMTTRN>
...continues other transactions and end of file
I would like to match every <MEMO> tag, extract the payee name (Walmart 2th street in this example) and write a new line with a <NAME> tag. My output would look like:
...ofx headers and other stuff
...line below is a transaction
<STMTTRN>
<TRNTYPE>OTHER</TRNTYPE>
<DTPOSTED>20160609120000</DTPOSTED>
<TRNAMT>-4.00</TRNAMT>
<FITID>2016060914000</FITID>
<CHECKNUM>000000700132</CHECKNUM>
<REFNUM>700.132</REFNUM>
<MEMO>Credit Card Payment - 09/06 18:37 Walmart 2th street</MEMO>
<NAME>Walmart 2th street</NAME>
</STMTTRN>
...continues other transactions and end of file
Another tool, such as awk, would also be an acceptable solution.
With GNU sed:
sed -r 's/.*<MEMO>.* [0-9]{2}:[0-9]{2} (.*)<.*/&\n <NAME>\1<\/NAME>/' file
Output:
<STMTTRN>
<TRNTYPE>OTHER</TRNTYPE>
<DTPOSTED>20160609120000</DTPOSTED>
<TRNAMT>-4.00</TRNAMT>
<FITID>2016060914000</FITID>
<CHECKNUM>000000700132</CHECKNUM>
<REFNUM>700.132</REFNUM>
<MEMO>Credit Card Payment - 09/06 18:37 Walmart 2th street</MEMO>
<NAME>Walmart 2th street</NAME>
</STMTTRN>
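Since the question also mentions awk, here is a rough awk equivalent, as a sketch only. The file names sample.ofx and sample_with_name.ofx are made up for the demo, and it assumes the payee always follows an HH:MM time inside the memo. One difference from the sed command: match() picks the first " HH:MM " on the line, while the greedy sed pattern picks the last one.

```shell
# Hypothetical sample file with one transaction, for demonstration only.
cat > sample.ofx <<'EOF'
<STMTTRN>
<TRNAMT>-4.00</TRNAMT>
<MEMO>Credit Card Payment - 09/06 18:37 Walmart 2th street</MEMO>
</STMTTRN>
EOF

# Print every line; after a MEMO line containing " HH:MM ", emit a NAME line.
awk '
{ print }
/<MEMO>/ && match($0, / [0-9][0-9]:[0-9][0-9] /) {
    payee = substr($0, RSTART + RLENGTH)   # text after the time
    sub(/<\/MEMO>.*/, "", payee)           # drop the closing tag
    print "<NAME>" payee "</NAME>"
}' sample.ofx > sample_with_name.ofx

cat sample_with_name.ofx
```

This only uses match(), substr() and sub(), so it should not need GNU awk specifically.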
If you want to edit your file "in place", use sed's option -i.
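If you are nervous about overwriting a bank export, GNU sed can keep a backup when you give -i a suffix. A minimal sketch, assuming GNU sed; the file name inplace.ofx is just a placeholder for the demo:

```shell
# Hypothetical sample file, for demonstration only.
cat > inplace.ofx <<'EOF'
<STMTTRN>
<MEMO>Credit Card Payment - 09/06 18:37 Walmart 2th street</MEMO>
</STMTTRN>
EOF

# GNU sed: edit inplace.ofx directly and keep the original as inplace.ofx.bak
sed -i.bak -r 's/.*<MEMO>.* [0-9]{2}:[0-9]{2} (.*)<.*/&\n <NAME>\1<\/NAME>/' inplace.ofx

# Review what was added (diff exits non-zero when the files differ)
diff inplace.ofx.bak inplace.ofx || true
```

The untouched original stays in inplace.ofx.bak, so you can restore it if the pattern matched something unexpected.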