phparrayscsvmultidimensional-arraytext-parsing

Parse a tab-delimited, multiline text into an array of associative arrays using predefined column names


I am having a hard time doing this..so I have a file with the following text

The Boys in the Boat    Daniel James Brown  067002581X  19.99   5   16.99   9   16.99
Harry Potter and the Cursed Child   J. K. Rowling, Jack Thorne, John Tiffany    1338099133  18.95   25  17.98       0   17.98
Just Mercy  Bryan Stevenson 0812994520  17.50   8   16.25   10  16.25
Me Before You   Jojo Moyes  0670026603  18.95   2   17.50   1   17.25
A Thousand Splendid Suns    Khaled Hosseini 1594489505  19.00   7   15.50   4   14.95
The Wright Brothers David McCullough    1476728742  21.95   3   18.95   3   18.95

I need to somehow use a function to read it into a 2 dimensional associative array.. I have no problem creating the array from scratch infact I have already done, that...the array should look like

$books = array( 
    "The Boys in the Boat" => array (
        "author" => 'Daniel James Brown',
        "isbn" => '067002581X', 
        "hardcover" => 19.99,
        "quantity" => 5,
        "softcover" => 5.99,
        "e-book" => 6.99,
    ),
    "Jungle" => array (
        "author" => 'Upton Sinclair',
        "isbn" => '067002581',  
        "hardcover" => 19.99,
        "quantity" => 5,
        "softcover" => 5.99,
        "e-book" => 6.99,
    ),
);

I have no idea how to create a function that would go through the file text line by line and make a 2d array...I know I have to use explode but I don't know what delimiter to use, since if I used space it wouldn't work and there are no other delimiters in the file.


Solution

  • This is what I came up with.

    Please note the comments in the code and also note that I supposed your example file was changed when posted here (tabs to spaces).

    If you have a set number of spaces as separators this gets more difficult since we may have the same set of empty spaces inside a field (eg. Book name). This could make the script not run as expected. You can change "\t" to " " (four spaces) if that is what you have, you need to test the file/string to know.

    Tabs are fine, but you can't have tabs inside any text fields as well or this will fail.

    Because the tabs are being changed for spaces here, I hardcoded them as \t in the code below.

    <?php
    # This is basically just exploding the elements 2 times
    # and getting the elements in the right order
    
    # This will hold the array with info
    $finalResult = array();
    
    # Your input text to be parsed
    # I am supposing you have TABS as delimiters and the order never changes
    $inputText = <<<EOF
    The Boys in the Boat\tDaniel James Brown\t067002581X\t19.99\t5\t16.99\t9\t16.99
    Harry Potter and the Cursed Child\tJ. K. Rowling, Jack Thorne, John Tiffany\t1338099133\t18.95\t25\t17.98\t0\t17.98
    Just Mercy\tBryan Stevenson\t0812994520\t17.50\t8\t16.25\t10\t16.25
    Me Before You\tJojo Moyes\t0670026603\t18.95\t2\t17.50\t1\t17.25
    A Thousand Splendid Suns\tKhaled Hosseini\t1594489505\t19.00\t7\t15.50\t4\t14.95
    The Wright Brothers\tDavid McCullough\t1476728742\t21.95\t3\t18.95\t3\t18.95
    EOF;
    
    # First break each line into array of lines
    # If you are sure this file comes from *nix you can use \n as newline
    # If you are sure this comes from Windows you can use \r\n
    # Or you can use PHP_EOL but this use the current system as a basis for what to use
    $textLines = explode("\n", $inputText);
    
    # Now we go through each line getting all data
    foreach($textLines as $line) {
        # Get each tab-separated field
        $expLine  = explode("\t", $line);
        # Sanity check
        if (count($expLine) < 8) {
            # The line does not have enough items, deal with error
            echo "Item " . (isset($expLine[0]) ? $expLine[0]." " : "") . "ignored because of errors\n";
            continue;
        }
        # new item
        # I have changed this a bit as it seems you have more fields to get than
        # what shows on your example (2 quantities, for hard and softcovers)
        $finalResult[$expLine[0]] = array(
            "author"      => $expLine[1],
            "isbn"        => $expLine[2],  
            "hardcover"   => $expLine[3],
            "hc-quantity" => $expLine[4],
            "softcover"   => $expLine[5],
            "sc-quantity" => $expLine[6],
            "e-book"      => $expLine[7],
        );
    }
    
    # Just show the data structure
    var_dump($finalResult);
    
    ?>
    

    And then the result:

    array(6) {
      ["The Boys in the Boat"]=>
      array(7) {
        ["author"]=>
        string(18) "Daniel James Brown"
        ["isbn"]=>
        string(10) "067002581X"
        ["hardcover"]=>
        string(5) "19.99"
        ["hc-quantity"]=>
        string(1) "5"
        ["softcover"]=>
        string(5) "16.99"
        ["sc-quantity"]=>
        string(1) "9"
        ["e-book"]=>
        string(5) "16.99"
      }
      ["Harry Potter and the Cursed Child"]=>
      array(7) {
        ["author"]=>
        string(40) "J. K. Rowling, Jack Thorne, John Tiffany"
        ["isbn"]=>
        string(10) "1338099133"
        ["hardcover"]=>
        string(5) "18.95"
        ["hc-quantity"]=>
        string(2) "25"
        ["softcover"]=>
        string(5) "17.98"
        ["sc-quantity"]=>
        string(1) "0"
        ["e-book"]=>
        string(5) "17.98"
      }
      ["Just Mercy"]=>
      array(7) {
        ["author"]=>
        string(15) "Bryan Stevenson"
        ["isbn"]=>
        string(10) "0812994520"
        ["hardcover"]=>
        string(5) "17.50"
        ["hc-quantity"]=>
        string(1) "8"
        ["softcover"]=>
        string(5) "16.25"
        ["sc-quantity"]=>
        string(2) "10"
        ["e-book"]=>
        string(5) "16.25"
      }
      ["Me Before You"]=>
      array(7) {
        ["author"]=>
        string(10) "Jojo Moyes"
        ["isbn"]=>
        string(10) "0670026603"
        ["hardcover"]=>
        string(5) "18.95"
        ["hc-quantity"]=>
        string(1) "2"
        ["softcover"]=>
        string(5) "17.50"
        ["sc-quantity"]=>
        string(1) "1"
        ["e-book"]=>
        string(5) "17.25"
      }
      ["A Thousand Splendid Suns"]=>
      array(7) {
        ["author"]=>
        string(15) "Khaled Hosseini"
        ["isbn"]=>
        string(10) "1594489505"
        ["hardcover"]=>
        string(5) "19.00"
        ["hc-quantity"]=>
        string(1) "7"
        ["softcover"]=>
        string(5) "15.50"
        ["sc-quantity"]=>
        string(1) "4"
        ["e-book"]=>
        string(5) "14.95"
      }
      ["The Wright Brothers"]=>
      array(7) {
        ["author"]=>
        string(16) "David McCullough"
        ["isbn"]=>
        string(10) "1476728742"
        ["hardcover"]=>
        string(5) "21.95"
        ["hc-quantity"]=>
        string(1) "3"
        ["softcover"]=>
        string(5) "18.95"
        ["sc-quantity"]=>
        string(1) "3"
        ["e-book"]=>
        string(5) "18.95"
      }
    }