xml erlang

Can I use pattern matching to parse XML in Erlang?


I've got a newbie question.

I'm trying to parse an XML message with pattern matching in function heads.

A sample of a message is:

    <msg> <action type="xxx"... />  </msg>

What I would like to be able to do is (sort of):

    decode_msg_in( << $<,$m,$s,$g,$>, Message/binary, $<,$/,$m,$s,$g,$> >>, R ) ->

The decode does not work (obviously, it's only an indication of what I'd like to do).

Is this even possible?

Does anyone have an idea? Or do I need to "iterate" the whole message as a list, building new "words"?


Solution

  • You probably need to read about bit syntax expressions and binary comprehensions, and to look at an XML parser library called erlsom. Together they will bring you up to speed on what you want to do.
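
    On the bit syntax side, the pattern in the question fails because a binary segment without a size is only allowed as the last segment of a pattern. A minimal sketch of a workaround, assuming the message starts with exactly `<msg>` and ends with exactly `</msg>` (the module name `msg_body` and function `body/1` are made up for illustration): compute the body length first and use it as an explicit size. This only peels off the outer tag; it is not an XML parser.

    ```erlang
    -module(msg_body).
    -export([body/1]).

    %% Extract the bytes between a leading "<msg>" and a trailing "</msg>".
    %% An unsized binary segment may only appear at the end of a pattern,
    %% so the body length is computed first and used as an explicit size.
    body(Bin) when is_binary(Bin), byte_size(Bin) >= 11 ->
        BodySize = byte_size(Bin) - 11,   % 5 bytes for "<msg>", 6 for "</msg>"
        case Bin of
            <<"<msg>", Body:BodySize/binary, "</msg>">> -> {ok, Body};
            _ -> error
        end;
    body(_) ->
        error.
    ```

    Anything with surrounding whitespace, attributes on `<msg>`, or nested `<msg>` elements falls through to `error`, which is why a real parser is the better route.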

    EDIT


    The XML message may reach your server as a binary or as a string. Either way, the XML parser can turn the XML data into Erlang terms. Here is an example for your XML structure using the erlsom library. I have the erlsom library in my code path.

    C:\Windows\System32>erl
    Eshell V5.9  (abort with ^G)
    1> XML = "<msg><action type=\"xxx\"/>message</msg>".
    "<msg><action type=\"xxx\"/>message</msg>"
    2> erlsom:simple_form(XML).
    {ok,{"msg",[],[{"action",[{"type","xxx"}],[]},"message"]},
        []}
    3> {_,Parsed,_} = erlsom:simple_form(XML).
    {ok,{"msg",[],[{"action",[{"type","xxx"}],[]},"message"]},
        []}
    4> Parsed.
    {"msg",[],[{"action",[{"type","xxx"}],[]},"message"]}
    5> {_,_,[{_,[{_,ActionType}],_},Message]} = Parsed.
    {"msg",[],[{"action",[{"type","xxx"}],[]},"message"]}
    6> ActionType.
    "xxx"
    7> Message.
    "message"
    8>
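
    The match on line 5 of the session can also be written as function clauses, which is closer to what the question asks for. This is only a sketch for the tuple shape shown in the session; the module name `msg_decode`, the function `decode/1` and the error atoms are made up for illustration.

    ```erlang
    -module(msg_decode).
    -export([decode/1]).

    %% Decode the simple-form tuple shown in the shell session.
    %% Returns {ok, ActionType, Rest}, where Rest is whatever follows the
    %% <action> element inside <msg>, or {error, Reason} for other shapes.
    decode({"msg", _MsgAttrs, [{"action", ActionAttrs, _Children} | Rest]}) ->
        case proplists:get_value("type", ActionAttrs) of
            undefined -> {error, missing_type};
            Type -> {ok, Type, Rest}
        end;
    decode(_Other) ->
        {error, unexpected_structure}.
    ```

    Looking up `"type"` by name instead of by position means the match still works if the action element carries more attributes.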
    

    As you can see above, it comes down to simple pattern matching. The resulting structure will be clean as long as the senders send well-formed XML. If you suspect that malformed XML may hit your server, you need to wrap the parser call in:

    try [CALL] of [GoodResult] -> [Action1] catch _Error:_Reason -> [Action2] end.
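
    Filled in, that template could look roughly like the sketch below. The module name `safe_xml` and the error tuples are my own choices, and the parser is passed in as a fun so the wrapper does not depend on how a particular parser reports bad input.

    ```erlang
    -module(safe_xml).
    -export([parse/2]).

    %% ParseFun is any one-argument parser, e.g. fun erlsom:simple_form/1.
    %% A result shaped {ok, Parsed, _Rest} is passed through as {ok, Parsed};
    %% anything else, including a raised exception, becomes {error, _}.
    parse(ParseFun, Xml) when is_function(ParseFun, 1) ->
        try ParseFun(Xml) of
            {ok, Parsed, _Rest} -> {ok, Parsed};
            Other -> {error, {unexpected_result, Other}}
        catch
            Class:Reason -> {error, {Class, Reason}}
        end.
    ```

    With erlsom in the code path you would call it as `safe_xml:parse(fun erlsom:simple_form/1, XML)` and match on `{ok, Parsed}` or `{error, _}`.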

    Note that if the XML body is very large, you should use the SAX method to parse it, to avoid a big memory footprint. SAX examples are included in the library documentation.
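
    To give an idea of the SAX style, here is a sketch of an event handler that counts elements by name without ever building the whole document in memory. It assumes, going by the erlsom documentation as I understand it, that start tags arrive as `{startElement, Uri, LocalName, Prefix, Attributes}` tuples and that the handler is called as `Fun(Event, Acc)`; the module name `sax_count` is made up.

    ```erlang
    -module(sax_count).
    -export([event/2]).

    %% The accumulator is an orddict from element name to occurrence count.
    %% Only startElement events are counted; all other events are ignored.
    event({startElement, _Uri, LocalName, _Prefix, _Attributes}, Counts) ->
        orddict:update_counter(LocalName, 1, Counts);
    event(_OtherEvent, Counts) ->
        Counts.
    ```

    Assuming `erlsom:parse_sax/3` takes the XML, an initial accumulator and the event fun, the call would be along the lines of `erlsom:parse_sax(XML, orddict:new(), fun sax_count:event/2)`. Check the library documentation for the exact event set and return value.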