
Create a JSON object from HTML using jQuery


Problem Overview

Let's say I have a shipment of candy. The shipment has a number of boxes, and each box has a number of unique candy types. Every box has a unique id, different from every other box; the same is true for candy types. Furthermore, a candy has additional traits, like color, flavor and quantity.

Example Code

Take the following HTML example:

<div class="shipment">
    <div class="box" data-boxid="a">
        <div class="candy" data-candyid="1" data-color="orange" data-flavor="orange" data-qty="7">
            <!-- unimportant content -->
        </div>
        <div class="candy" data-candyid="2" data-color="red" data-flavor="strawberry" data-qty="4">
            <!-- unimportant content -->
        </div>
    </div>
    <div class="box" data-boxid="b">
        <div class="candy" data-candyid="3" data-color="green" data-flavor="lime">
            <!-- unimportant content -->
        </div>
    </div>
</div>

Previous Attempts

I've seen similar examples of table parsing with jQuery's .map() function, and I've also seen mention of .each(), but I've been unable to generate any working code.

Desired Output

I want to generate (with jQuery) a JSON object similar to the following:

{
    "shipment": {
        "a": {
            "1": {
                "color": "orange",
                "flavor": "orange",
                "qty": "7"
            },
            "2": {
                "color": "red",
                "flavor": "strawberry",
                "qty": "4"
            }
        },
        "b": {
            "3": {
                "color": "green",
                "flavor": "lime"
            }
        }
    }
}

Additional Notes

My app already uses jQuery extensively, so it seems like a logical tool for the job. However, if plain ol' JavaScript is a more appropriate choice, feel free to say so.

The HTML is always going to be well-formed and always going to follow the format specified. However, in some cases, information may be incomplete. Note that the third candy has no quantity specified, so quantity is simply omitted when building the object.


Solution

  • This generates what you asked for:

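    var json = {};

    $('.shipment').each(function(i, a) {
        json.shipment = {};

        // One entry per box, keyed by its data-boxid
        $(a).find('.box').each(function(j, b) {
            var boxid = $(b).data('boxid');
            json.shipment[boxid] = {};

            // One entry per candy, keyed by its data-candyid
            $(b).find('.candy').each(function(k, c) {
                var $c = $(c),
                    candyid = $c.data('candyid'),
                    color = $c.data('color'),
                    flavor = $c.data('flavor'),
                    qty = $c.data('qty'),
                    candy = json.shipment[boxid][candyid] = {};
                // .data() returns undefined for a missing attribute, so skip those.
                // Comparing against undefined (not truthiness) keeps a qty of 0.
                // Note: .data() converts numeric strings, so qty is 7, not "7".
                if (color !== undefined) candy.color = color;
                if (flavor !== undefined) candy.flavor = flavor;
                if (qty !== undefined) candy.qty = qty;
            });
        });
    });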
    

    http://jsfiddle.net/mblase75/D22mD/

    As you can see, each level uses .each() to loop through the elements matching a given class, then .find().each() to loop through the matching descendants. In between, .data() grabs the desired data-* attributes, and the json object is built level by level.
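
Since you asked about plain JavaScript: the same level-by-level walk can be sketched without jQuery. This is only a rough outline, assuming a browser that supports querySelectorAll, NodeList forEach and dataset; the function name buildShipment is made up for illustration. Because dataset values are always strings, qty stays "7" as in your desired output, and a missing attribute is simply never copied.

```javascript
// Hypothetical helper: builds { shipment: { boxid: { candyid: {traits} } } }
// from a root element such as document.querySelector('.shipment').
function buildShipment(root) {
    var shipment = {};

    root.querySelectorAll('.box').forEach(function (box) {
        var candies = {};

        box.querySelectorAll('.candy').forEach(function (candy) {
            var traits = {};

            // Copy every data-* attribute that is present, except the id itself.
            Object.keys(candy.dataset).forEach(function (key) {
                if (key !== 'candyid') {
                    traits[key] = candy.dataset[key];
                }
            });

            candies[candy.dataset.candyid] = traits;
        });

        shipment[box.dataset.boxid] = candies;
    });

    return { shipment: shipment };
}
```

In the page you would call it as buildShipment(document.querySelector('.shipment')), and JSON.stringify(result, null, 4) turns the resulting object into JSON text. Copying whatever data-* attributes exist, rather than naming color, flavor and qty one by one, means new traits are picked up without changing the code; if you want a fixed whitelist instead, check each key against a list before copying.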