phpsqlsplitsql-parser

Parse and split comma-separated fields and expression from a SQL SELECT clause


I've been trying to extract something inside a string. I got the follwing string :

*, bob, DATE('gdfgfd', 'Fdsfds', ('fdsfdfsd')), george

I want to split by commas outside parentheses and it is suppose to give this:

[
    "*",
    "bob",
    "DATE('gdfgfd', 'Fdsfds', ('fdsfdfsd'))",
    "george"
]

I've been trying to use explode but it cut even inside ( and ) ... logic by the function mean.

So I've did this : [^(,\s]+|\([^)]+\) but it give cut even if a commas is found inside bracket.

Anyone know how to do what I mean?

EDIT :

Ok to be very clear and direct.

I got this : SELECT MyField, Field2, Blabla, Function(param), etc FROM table Blabla

I got the string MyField, Field2, Blabla, Function(param), etc already because the query is done by multiple function class like $DB->Select('MyField, Field2, Blabla, Function(param), etc'); but now I want to parse everything between commas so MyField, Field2, Blabla, Function(param), etc become this :


Solution

  • Here's what I cooked up, doesn't support multibyte characters:

    Edit: added string awareness

    <?php
    
    
    $stack = array();
    $stuff = array();
    
    $escaping = false;
    $input = "*, bob, [], DATE('g()d\\'f,gfd', ('Fd()sf)ds'), ('fdsfd\"\"()fsd')), ',(),() (,,'";
    $len = strlen( $input );
    $i = 0;
    $curstr = "";
    $char;
    
    while( $i < $len ) {
        $char = $input[$i++];
    
        if( $escaping ) {
            $curstr .= $char;
            $escaping = false;
            continue;
        }
    
        switch( $char ) {
    
            case "\\":
                $escaping = true;
                break;
    
            case '"':
                $top = end( $stack );
                if( $top === '"' ) {
                    array_pop( $stack );
                }
                else if( $top !== "'" ){
                    $stack[] = '"';
                }
    
                $curstr .= $char;
                break;
    
            case "'":
                $top = end( $stack );
                if( $top === "'" ) {
                    array_pop( $stack );
                }
                else if( $top !== '"' ) {
                    $stack[] = "'";
                }
    
                $curstr .= $char;           
                break;
    
            case ",":
                if( count( $stack ) ) {
                    $curstr .= $char;
                }
                else {
                    $stuff[] = trim($curstr);
                    $curstr = "";
                }
                break;
    
            case "(":
                $top = end( $stack );
                if( $top !== "'" && $top !== '"' ) {
                    $stack[] = "(";                   
                }
    
                $curstr .= $char;
                break;
    
            case ")":
                $top = end( $stack );
    
                if( $top !== "'" && $top !== '"' ) {
                    if( end($stack) !== "(" ) {
                        die( "Unbalanced parentheses" );
                    }
                    array_pop( $stack );
                }
    
                $curstr .= $char;
    
    
                break;
    
            default:
                $curstr .= $char;
                break;
    
        }
    }
    
    if( count( $stack ) ) {
        die( "Unbalanced ".end($stack) );
    }
    
    $stuff[] = trim( $curstr );
    
    print_r( $stuff );
    
    /*
        Array
    (
        [0] => *
        [1] => bob
        [2] => []
        [3] => DATE('g()d'f,gfd', ('Fd()sf)ds'), ('fdsfd""()fsd'))
        [4] => ',(),() (,,'
    )
    
    */