java split string tokenizer

Split a String with a semicolon token in Java


I'm having issues trying to split a String on semicolons.

The String is:

dsnSalarie;e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4; ;S21.G00.30.008;e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4;;;

The bold semicolon (the middle one of the final three) is a token and must not be treated as a delimiter, so I tried changing the delimiter to a String like "<;>":

dsnSalarie<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;> <;>S21.G00.30.008<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;>;<;>

With StringUtils.split or with StringTokenizer I can't get that semicolon back, even when using StringUtils.splitPreserveAllTokens.

The only workaround I have found is to surround the semicolon with spaces and then trim it after splitting:

dsnSalarie<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;> <;>S21.G00.30.008<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;> ; <;>

Thanks for your ideas.


Solution

  • I am not quite sure I understand, but the following code:

    public class Test {
        public static void main(String[] args) {
            String test = "dsnSalarie<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;> <;>S21.G00.30.008<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;>;<;>";
            String[] split = test.split("<;>");
            for (String string : split) {
                System.out.println(string);
            }
        }
    }
    

    yields:

    dsnSalarie
    e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4
     
    S21.G00.30.008
    e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4
    ;
    

    A tokenizer cannot differentiate between occurrences of the same character, such as the semicolon. If the ; carries a meaning of its own, you need a proper parser generator such as ANTLR to define a grammar for your language, which can then infer higher-level structure from the tokens.
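
A likely reason the attempts in the question lose the semicolon: StringTokenizer (and, as far as I know, StringUtils.split and StringUtils.splitPreserveAllTokens) treat the delimiter argument as a set of single characters, so "<;>" means "split on '<', on ';' or on '>'". String.split instead treats its argument as a regular expression that matches the whole sequence. The sketch below contrasts the two using only the JDK; the class and method names (DelimiterDemo, withTokenizer, withSplit) are made up for illustration. Pattern.quote is not strictly needed for "<;>", but it guards against delimiters that contain regex metacharacters, and the limit of -1 keeps trailing empty fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

public class DelimiterDemo {
    // The input from the question, using "<;>" as the field delimiter.
    static final String INPUT =
        "dsnSalarie<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;> <;>S21.G00.30.008"
        + "<;>e3f5c7c0-5f5e-4579-a262-3fd87aafe1e4<;>;<;>";

    // StringTokenizer treats every character of "<;>" as its own delimiter,
    // so the lone ';' field is consumed as a delimiter and never returned.
    static List<String> withTokenizer(String input) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, "<;>");
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    // String.split matches the whole quoted sequence "<;>", so a lone ';'
    // survives as a field. The limit -1 keeps trailing empty strings.
    static List<String> withSplit(String input) {
        return Arrays.asList(input.split(Pattern.quote("<;>"), -1));
    }

    public static void main(String[] args) {
        System.out.println(withTokenizer(INPUT)); // the ';' field is missing
        System.out.println(withSplit(INPUT));     // the ';' field is present
    }
}
```

If you would rather stay with Commons Lang, StringUtils.splitByWholeSeparator (or splitByWholeSeparatorPreserveAllTokens) should behave like the second method, since it matches the separator as a whole string; check the Javadoc of your version to confirm.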